Tuesday, May 25, 2004

Kernel BUG

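I wasn't doing anything that touches the kernel. This is just a server that
does a lot of CPU- and memory-heavy work such as string matching.
One day it died for no apparent reason, and after rebooting I found
'kernel BUG' in /var/log/messages.
(The same program had been running fine for several months before that.)
Would it be better to patch the kernel and move to a newer version?
I can't tell whether the problem is in the small program I wrote or in the
kernel itself (a kernel bug, of all things;;)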

$ uname -a
Linux search174.sayclub.com 2.4.20-20.9smp #1 SMP Fri Nov 7 21:27:40 KST 2003 i686 i686 i386 GNU/Linux

@ Is a Linux kernel dying at roughly this rate just something I have to live with?;;

May 21 18:21:58 search174 kernel: ------------[ cut here ]------------
May 21 18:21:58 search174 kernel: kernel BUG at page_alloc.c:277!
May 21 18:21:58 search174 kernel: invalid operand: 0000
May 21 18:21:58 search174 kernel: iptable_filter ip_tables e1000 keybdev mousedev hid input usb-ohci usbcore ext3 jbd raid0
May 21 18:21:58 search174 kernel: CPU:    2
May 21 18:21:58 search174 kernel: EIP:    0060:[<c01496a0>]    Not tainted
May 21 18:21:58 search174 kernel: EFLAGS: 00010206
May 21 18:21:58 search174 kernel:
May 21 18:21:58 search174 kernel: EIP is at rmqueue [kernel] 0x300 (2.4.20-20.9smp)
May 21 18:21:58 search174 kernel: eax: 010c0000   ebx: 00037000   ecx: 00001000   edx: 00010daa
May 21 18:21:58 search174 kernel: esi: c1000030   edi: c0343680   ebp: c13afd60   esp: e66fdd0c
May 21 18:21:58 search174 kernel: ds: 0068   es: 0068   ss: 0068
May 21 18:21:58 search174 kernel: Process issue-meter.out (pid: 31457, stackpage=e66fd000)
May 21 18:21:58 search174 kernel: Stack: 00001000 c1000030 00000000 0000fdaa 0000fdaa 00000202 00000000 c0343680
May 21 18:21:58 search174 kernel:        c0343680 c0345b84 00000000 00000001 c01497b7 c0345b8c c1000030 00000070
May 21 18:21:59 search174 kernel:        00000000 c0149b44 c0345b80 00000000 00000000 00000001 00000900 00f1ac18
May 21 18:21:59 search174 kernel: Call Trace:   [<c01497b7>] __alloc_pages_limit [kernel] 0x57 (0xe66fdd3c))
May 21 18:21:59 search174 kernel: [<c0149b44>] __alloc_pages [kernel] 0x354 (0xe66fdd50))
May 21 18:21:59 search174 kernel: [<c01508b3>] alloc_bounce_page [kernel] 0x13 (0xe66fdd90))
May 21 18:21:59 search174 kernel: [<c0150a3c>] create_bounce [kernel] 0x4c (0xe66fdd9c))
May 21 18:21:59 search174 kernel: [<c01b5cb8>] __make_request [kernel] 0x698 (0xe66fddbc))
May 21 18:21:59 search174 kernel: [<c01fdca2>] md_make_request [kernel] 0x82 (0xe66fddf8))
May 21 18:21:59 search174 kernel: [<c01b5d9a>] generic_make_request [kernel] 0xda (0xe66fde0c))
May 21 18:21:59 search174 kernel: [<c01b5e47>] submit_bh [kernel] 0x57 (0xe66fde34))
May 21 18:21:59 search174 kernel: [<c01566c7>] block_read_full_page [kernel] 0x257 (0xe66fde50))
May 21 18:21:59 search174 kernel: [<c0145342>] lru_cache_add [kernel] 0x1b2 (0xe66fde88))
May 21 18:21:59 search174 kernel: [<c013c0e8>] add_to_page_cache_unique [kernel] 0x68 (0xe66fdea0))
May 21 18:21:59 search174 kernel: [<c013c22a>] page_cache_read [kernel] 0xda (0xe66fdeb4))
May 21 18:21:59 search174 kernel: [<f8822a80>] ext3_get_block [ext3] 0x0 (0xe66fdebc))

-------------------------------------------------------
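It looks like the kernel code hit an assertion that failed (the BUG() check
at page_alloc.c:277, reached from rmqueue).
One of the kernel modules may be the cause.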
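
To check whether the same BUG keeps recurring, and which modules show up in
the trace, the log can be summarized with a small script. The following is
only a rough sketch in Python: the function name summarize_oops and both
regular expressions are my own assumptions based on the 2.4-style lines
pasted above, not a standard tool, and other kernel versions format their
oops output differently.

```python
import re

# Assumed pattern for the marker line, e.g. "kernel BUG at page_alloc.c:277!"
BUG_RE = re.compile(r"kernel BUG at (?P<file>[^:\s]+):(?P<line>\d+)!")

# Assumed pattern for a 2.4-style call-trace entry, e.g.
# "[<c0149b44>] __alloc_pages [kernel] 0x354 (0xe66fdd50))"
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]{8})>\]\s+(?P<sym>\w+)\s+\[(?P<mod>[^\]]+)\]"
)


def summarize_oops(text):
    """Return (bugs, frames) found in a chunk of syslog text.

    bugs   -> list of (source_file, line_number) for each BUG marker
    frames -> list of (symbol, module) for each call-trace entry, in log order
    """
    bugs = [(m.group("file"), int(m.group("line"))) for m in BUG_RE.finditer(text)]
    frames = [(m.group("sym"), m.group("mod")) for m in FRAME_RE.finditer(text)]
    return bugs, frames


# A few lines copied from the log above, used as sample input.
SAMPLE = """\
May 21 18:21:58 search174 kernel: kernel BUG at page_alloc.c:277!
May 21 18:21:58 search174 kernel: EIP:    0060:[<c01496a0>]    Not tainted
May 21 18:21:59 search174 kernel: Call Trace:   [<c01497b7>] __alloc_pages_limit [kernel] 0x57 (0xe66fdd3c))
May 21 18:21:59 search174 kernel: [<c0149b44>] __alloc_pages [kernel] 0x354 (0xe66fdd50))
May 21 18:21:59 search174 kernel: [<f8822a80>] ext3_get_block [ext3] 0x0 (0xe66fdebc))
"""

if __name__ == "__main__":
    bugs, frames = summarize_oops(SAMPLE)
    for src, line in bugs:
        print("BUG at %s:%d" % (src, line))
    for sym, mod in frames:
        print("  %s [%s]" % (sym, mod))
```

Frames tagged with something other than [kernel] (here only ext3) are the
ones that come from loadable modules, so they are a reasonable first place
to look when suspecting a module.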
