Java process is frequently killed by the OOM killer on upgraded hardware

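I am running solr alongside 4 other Java processes on an Ubuntu server. The current index size is 30GB. My solr process is frequently killed within a few hours, and the log clearly states that the OOM killer did it. I can't work out what exactly is causing the problem. Free swap shows as 0. Should I increase swap, or disable it? What are the possible solutions?

I was running the same processes and solr on a 4GB VPS without any problems. The problem started when I switched to a dedicated server, so I think it is related to configuration. How can I avoid the OOM killer? After looking through the kernel log I found the following: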

Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.686839] java invoked oom-killer: gfp_mask=0x200da, order=0, oom_score_adj=0
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.686842] java cpuset=/ mems_allowed=0
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.686845] CPU: 3 PID: 7207 Comm: java Not tainted 3.13.0-24-generic #47-Ubuntu
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.686847] Hardware name: Supermicro X9SCL/X9SCM/X9SCL/X9SCM, BIOS 1.1a 09/28/2011
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.686850]  0000000000000000 ffff880019b519b8 ffffffff81715ac4 ffff8800149bc7d0
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.686857]  ffff880019b51a40 ffffffff817103ff 0000000000000000 ffffffff81c3f820
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.686862]  ffff880019b51a70 0000000000000015 0000000000000000 0000000000000000
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.686867] Call Trace:
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.686877]  [<ffffffff81715ac4>] dump_stack+0x45/0x56
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.686882]  [<ffffffff817103ff>] dump_header+0x7f/0x1f1
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.686886]  [<ffffffff8115197e>] oom_kill_process+0x1ce/0x330
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.686890]  [<ffffffff812d0135>] ? security_capable_noaudit+0x15/0x20
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.686892]  [<ffffffff811520b4>] out_of_memory+0x414/0x450
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.686894]  [<ffffffff81158223>] __alloc_pages_nodemask+0x983/0xa20
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.686897]  [<ffffffff811976ba>] alloc_pages_vma+0x9a/0x140
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.686900]  [<ffffffff8118aaeb>] read_swap_cache_async+0xeb/0x160
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.686902]  [<ffffffff8118abf8>] swapin_readahead+0x98/0xe0
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.686906]  [<ffffffff81178c6e>] handle_mm_fault+0xa7e/0xf10
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.686910]  [<ffffffff81721a24>] __do_page_fault+0x184/0x560
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.686916]  [<ffffffff811112fc>] ? acct_account_cputime+0x1c/0x20
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.686920]  [<ffffffff8109d76b>] ? account_user_time+0x8b/0xa0
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.686923]  [<ffffffff8109dd84>] ? vtime_account_user+0x54/0x60
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.686925]  [<ffffffff81721e1a>] do_page_fault+0x1a/0x70
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.686929]  [<ffffffff8171e288>] page_fault+0x28/0x30
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.686932] Mem-Info:
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.686935] Node 0 DMA per-cpu:
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.686938] CPU    0: hi:    0, btch:   1 usd:   0
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.686940] CPU    1: hi:    0, btch:   1 usd:   0
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.686942] CPU    2: hi:    0, btch:   1 usd:   0
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.686943] CPU    3: hi:    0, btch:   1 usd:   0
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.686945] Node 0 DMA32 per-cpu:
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.686948] CPU    0: hi:  186, btch:  31 usd:   0
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.686949] CPU    1: hi:  186, btch:  31 usd:   2
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.686950] CPU    2: hi:  186, btch:  31 usd:   0
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.686951] CPU    3: hi:  186, btch:  31 usd:   0
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.686952] Node 0 Normal per-cpu:
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.686953] CPU    0: hi:  186, btch:  31 usd:  29
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.686954] CPU    1: hi:  186, btch:  31 usd:  24
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.686955] CPU    2: hi:  186, btch:  31 usd:  32
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.686956] CPU    3: hi:  186, btch:  31 usd:  16
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.686959] active_anon:607357 inactive_anon:174797 isolated_anon:32
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.686959]  active_file:53 inactive_file:142 isolated_file:0
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.686959]  unevictable:0 dirty:0 writeback:12 unstable:0
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.686959]  free:25861 slab_reclaimable:4362 slab_unreclaimable:10351
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.686959]  mapped:4655 shmem:4670 pagetables:12562 bounce:0
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.686959]  free_cma:0
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.686961] Node 0 DMA free:15900kB min:128kB low:160kB high:192kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15984kB managed:15900kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.686965] lowmem_reserve[]: 0 2954 7945 7945
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.686969] Node 0 DMA32 free:45092kB min:25080kB low:31348kB high:37620kB active_anon:2306664kB inactive_anon:576776kB active_file:44kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:3107092kB managed:3028172kB mlocked:0kB dirty:0kB writeback:0kB mapped:18416kB shmem:18508kB slab_reclaimable:6800kB slab_unreclaimable:12740kB kernel_stack:2040kB pagetables:26380kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:77 all_unreclaimable? yes
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.686973] lowmem_reserve[]: 0 0 4990 4990
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.686979] Node 0 Normal free:42452kB min:42368kB low:52960kB high:63552kB active_anon:122764kB inactive_anon:122412kB active_file:168kB inactive_file:600kB unevictable:0kB isolated(anon):128kB isolated(file):0kB present:5242880kB managed:5110564kB mlocked:0kB dirty:0kB writeback:48kB mapped:204kB shmem:172kB slab_reclaimable:10648kB slab_unreclaimable:28664kB kernel_stack:3336kB pagetables:23868kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:1649 all_unreclaimable? yes
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.686984] lowmem_reserve[]: 0 0 0 0
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.686986] Node 0 DMA: 1*4kB (U) 1*8kB (U) 1*16kB (U) 0*32kB 2*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (R) 3*4096kB (M) = 15900kB
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.687005] Node 0 DMA32: 724*4kB (UEMR) 958*8kB (UEM) 1360*16kB (UEM) 395*32kB (UEM) 5*64kB (U) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 45280kB
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.687015] Node 0 Normal: 572*4kB (UEMR) 853*8kB (UEM) 589*16kB (UEM) 276*32kB (UEM) 133*64kB (UEM) 27*128kB (UEM) 12*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 42408kB
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.687023] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.687024] 55331 total pagecache pages
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.687025] 50275 pages in swap cache
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.687027] Swap cache stats: add 13552969455, delete 13552919180, find 5504664044/6715155814
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.687027] Free swap  = 0kB
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.687029] Total swap = 3905532kB
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.687030] 2091489 pages RAM
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.687032] 0 pages HighMem/MovableOnly
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.687033] 33079 pages reserved
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.687034] [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.687043] [  303]     0   303     4902        0      13       99             0 upstart-udev-br
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.687047] [  308]     0   308    12804        1      28      145         -1000 systemd-udevd
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.687049] [  515]     0   515     3815        0      12       75             0 upstart-socket-
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.687051] [  582]     0   582     5883        0      15      100             0 vsftpd
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.687054] [  717]   102   717     9807        0      25      100             0 dbus-daemon
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.687056] [  801]     0   801    10863        1      27       89             0 systemd-logind
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.687058] [  872]   101   872    64154       98      42     7756             0 rsyslogd
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.687060] [  921]     0   921     3852        0      12       93             0 upstart-file-br
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.687062] [ 1020]     0  1020     3955        1      13       40             0 getty
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.687065] [ 1023]     0  1023     3955        1      15       38             0 getty
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.687067] [ 1028]     0  1028     3955        1      12       39             0 getty
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.687069] [ 1029]     0  1029     3955        1      13       38             0 getty
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.687071] [ 1032]     0  1032     3955        1      13       38             0 getty
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.687073] [ 1051]     0  1051     1092        0       8       37             0 acpid
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.687075] [ 1053]     0  1053     4785        0      13       46             0 atd
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.687076] [ 1054]     0  1054     5914       17      16       51             0 cron
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.687078] [ 1066]     0  1066    15341        0      33      182         -1000 sshd
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.687079] [ 1075]     0  1075     4797       28      14       30             0 irqbalance
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.687080] [ 1118]   106  1118   598658    16362     248    39647             0 mysqld
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.687082] [ 1338]     0  1338     3955        1      12       41             0 getty
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.687083] [25936]   108 25936  1018711    10052     508   173458             0 java
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.687085] [26010]     0 26010    96095       56     120     1914             0 apache2
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.687087] [25528]     0 25528     6833       93      17      111             0 screen
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.687088] [25529]     0 25529     5316        0      15      184             0 bash
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.687089] [ 2954]     0  2954  9691875   196411    7258   375033             0 java
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.687091] [24527]     0 24527    14910        0      34      114             0 cron
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.687093] [24529]     0 24529     1111        0       7       26             0 sh
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.687096] [24532]     0 24532     1795        0       9       23             0 flock
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.687098] [24534]     0 24534  1194110   175735     661    91027             0 java
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.687100] [ 6883]     0  6883    14910        0      34      114             0 cron
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.687102] [ 6890]     0  6890     1111        0       7       26             0 sh
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.687103] [ 6891]     0  6891     1795        0       9       24             0 flock
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.687105] [ 6896]     0  6896  1160035   117317     590   132623             0 java
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.687108] [ 7096]    33  7096    97330     4899     133     2340             0 apache2
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.687111] [ 7195]     0  7195    14910        2      34      111             0 cron
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.687113] [ 7197]     0  7197     1111        0       7       24             0 sh
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.687115] [ 7201]     0  7201     1795        0       9       23             0 flock
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.687117] [ 7203]     0  7203  1125170   194186     747   136458             0 java
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.687119] [ 7267]    33  7267    97545     4189     128     2166             0 apache2
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.687121] [ 7272]    33  7272    97552     4653     128     1790             0 apache2
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.687123] [ 7285]    33  7285    97134     4994     126     1263             0 apache2
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.687125] [ 7298]    33  7298    97573     6297     135     1326             0 apache2
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.687127] [ 7306]    33  7306    97775     5594     137     1981             0 apache2
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.687129] [ 7330]    33  7330    97550     4065     127     2434             0 apache2
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.687131] [ 7334]    33  7334    97350     4508     133     2480             0 apache2
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.687133] [ 7593]    33  7593    96230      212     113     1846             0 apache2
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.687135] [ 7599]    33  7599    97445     3916     127     2509             0 apache2
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.687137] [ 7607]    33  7607    97091     2245     125     1924             0 apache2
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.687139] [ 7640]    33  7640    96129      185     112     1814             0 apache2
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.687141] [ 7642]    33  7642    97318     3877     128     2279             0 apache2
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.687143] [ 7645]    33  7645    97385     4407     127     1808             0 apache2
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.687145] [ 7651]    33  7651    96121      159     112     1831             0 apache2
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.687147] Out of memory: Kill process 2954 (java) score 186 or sacrifice child
Feb  2 18:14:08 xxxxxxxxx kernel: [4247473.770503] Killed process 2954 (java) total-vm:38767500kB, anon-rss:785644kB, file-rss:0kB

My hardware is a 120GB SSD and 8GB of RAM.
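
As a rough aid for reading the process table in the log above, here is a minimal Python sketch that sums the `rss` and `swapents` columns per process name. It assumes 4 kB pages and the exact column layout shown in this log (pid, uid, tgid, total_vm, rss, nr_ptes, swapents, oom_score_adj, name). The names `PAGE_KB`, `ROW`, and `tally` are illustrative only and do not come from any existing tool.

```python
import re
from collections import defaultdict

PAGE_KB = 4  # assumption: 4 kB pages, the usual x86_64 default

# Matches process-table rows of the OOM report, e.g.
# "... [ 2954]     0  2954  9691875   196411    7258   375033             0 java"
# Columns: pid, uid, tgid, total_vm, rss, nr_ptes, swapents, oom_score_adj, name
ROW = re.compile(
    r"\[\s*(\d+)\]\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(-?\d+)\s+(\S+)\s*$"
)


def tally(lines):
    """Return {process name: (rss_kB, swap_kB)} summed over all table rows."""
    totals = defaultdict(lambda: [0, 0])
    for line in lines:
        m = ROW.search(line)
        if not m:
            continue  # not a process-table row (header, call trace, zone info, ...)
        rss_pages = int(m.group(5))
        swap_pages = int(m.group(7))
        name = m.group(9)
        totals[name][0] += rss_pages * PAGE_KB
        totals[name][1] += swap_pages * PAGE_KB
    return {name: (rss, swap) for name, (rss, swap) in totals.items()}


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python tally_oom.py < oom_excerpt.log
    for name, (rss_kb, swap_kb) in sorted(
        tally(sys.stdin).items(), key=lambda kv: -(kv[1][0] + kv[1][1])
    ):
        print(f"{name:20s} rss={rss_kb:>10d} kB  swap={swap_kb:>10d} kB")
```

As a sanity check on the 4 kB assumption: the row for PID 2954 lists an rss of 196411 pages, which is 785644 kB, the same figure the kernel reports as `anon-rss` on the "Killed process" line.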
