Date: 2014-05-16

Linux -- Memory Control: The OOM Killer Mechanism and a Code Walkthrough

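          Recently, some memory-sensitive applications in production were occasionally killed at traffic peaks, which forced the services to restart. The cause is a Linux mechanism called the out-of-memory (OOM) killer: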

              http://linux-mm.org/OOM_Killer

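          When memory runs short, the OOM killer kills the processes with the highest memory usage one after another by sending them SIGKILL (signal 9), and records each kill in /var/log/messages. The log entries include the pid, process name, gfp mask, cpuset, call trace and similar information, so this kind of problem can be caught through monitoring. Today I took a closer look at how the OOM killer selects its victims and dug through the code. The mechanism feels simple and blunt, but it is quite effective, so I am sharing my notes here.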
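As a starting point for the monitoring mentioned above, here is a minimal sketch in C that extracts the pid, process name and score from an "Out of memory: Kill process ..." line of the form seen in the kernel log later in this post. The struct and function names (oom_kill_event, parse_oom_kill_line) are made up for illustration, and the message format is assumed to match the 2.6.32-era log shown here; other kernel versions word this line differently.

```c
#include <stdio.h>
#include <string.h>

/* One parsed "Out of memory: Kill process ..." log line (hypothetical type). */
struct oom_kill_event {
    int  pid;        /* pid of the process chosen as the victim */
    char comm[64];   /* process name as printed inside the parentheses */
    int  score;      /* score reported by the kernel on that line */
};

/*
 * Parse a syslog line of the assumed form
 *   "... Out of memory: Kill process <pid> (<comm>) score <n> ..."
 * Returns 1 and fills *ev on a match, 0 otherwise.
 */
int parse_oom_kill_line(const char *line, struct oom_kill_event *ev)
{
    static const char marker[] = "Out of memory: Kill process ";
    const char *p = strstr(line, marker);

    if (p == NULL)
        return 0;
    p += sizeof(marker) - 1;   /* skip past the marker text */
    return sscanf(p, "%d (%63[^)]) score %d",
                  &ev->pid, ev->comm, &ev->score) == 3;
}
```

A monitor could feed this function each line read from /var/log/messages and raise an alert whenever it returns 1.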

  •  A first look at the OOM killer

        A simple code snippet that allocates heap memory (big_mm.c):
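#include <stdio.h>
#include <stdlib.h>
#include <strings.h>   /* bzero */
#include <unistd.h>    /* usleep */

#define MB 64L
#define block (1024L*1024L*MB)

int main(void)
{
    unsigned long total = 0L;
    for(;;) {
        // malloc a big block of memory and ZERO it !!
        char* mm = (char*) malloc(block);
        usleep(100000);
        if (NULL == mm)
            continue;
        bzero(mm,block);   // write to every page so physical memory is really allocated
        total += MB;
        fprintf(stdout,"alloc %lum mem\n",total);
    }
}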

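        Two things are worth noting here:
        
        1. malloc only reserves virtual address space. Unless the memory is then written to, for example with memset or bzero, no physical allocation is triggered and nothing is mapped to physical memory, which is why bzero is used to fill the block here.
        2. The size of each block matters. The Linux kernel distinguishes LowMemory from HighMemory, and LowMemory is the scarce resource. LowMemory has a threshold: free -lm and /proc/sys/vm/lowmem_reserve_ratio show the current low-memory size and the low threshold. The OOM killer is triggered only when memory falls below that threshold, so the block allocated here is kept smaller than the default 256M. Otherwise, if each request were 512M (larger than 128M), malloc could block in the underlying system call (glibc serves requests this large with mmap rather than brk) while the kernel triggers page cache writeback or slab reclaim.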
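Point 1 can be checked directly. The following is a minimal sketch, assuming a Linux system where /proc/self/status exposes a VmRSS line; the helper names read_vmrss_kb and measure_rss_growth are made up for illustration. It compares the resident set size right after malloc with the size after one byte has been written to every page.

```c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Return this process's resident set size in kB, or -1 if it cannot be read. */
long read_vmrss_kb(void)
{
    FILE *f = fopen("/proc/self/status", "r");
    char line[256];
    long kb = -1;

    if (f == NULL)
        return -1;
    while (fgets(line, sizeof line, f) != NULL) {
        if (sscanf(line, "VmRSS: %ld kB", &kb) == 1)
            break;
    }
    fclose(f);
    return kb;
}

/*
 * malloc `bytes`, then write one byte per page. Reports the RSS growth
 * measured right after malloc (*after_malloc_kb) and after the writes
 * (*after_touch_kb), both relative to the RSS before the allocation.
 * Returns 0 on success, -1 on failure.
 */
int measure_rss_growth(size_t bytes, long *after_malloc_kb, long *after_touch_kb)
{
    long page = sysconf(_SC_PAGESIZE);
    long base = read_vmrss_kb();
    char *p = malloc(bytes);
    volatile char *vp = p;   /* volatile so the writes cannot be optimized away */
    size_t i;

    if (p == NULL || base < 0 || page <= 0) {
        free(p);
        return -1;
    }
    *after_malloc_kb = read_vmrss_kb() - base;
    for (i = 0; i < bytes; i += (size_t)page)
        vp[i] = 1;           /* first write to a page triggers the page fault */
    *after_touch_kb = read_vmrss_kb() - base;
    free(p);
    return 0;
}
```

Called with a 32 MB request, the first number should stay close to zero while the second should be roughly the size of the block, which is the reason big_mm.c zeroes every block it allocates.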

       Test:

       gcc big_mm.c -o big_mm ; ./big_mm & ./big_mm & ./big_mm &

       (Start several big_mm processes at the same time so that they compete for memory.)
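While the processes are running, the kernel's current score for each of them can be watched through procfs. The sketch below assumes that /proc/<pid>/oom_score and /proc/<pid>/oom_score_adj exist on the running kernel (the log in this post prints an oom_score_adj value, but older kernels may only offer oom_adj); the helper name read_proc_int is made up for illustration.

```c
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Read a single integer from /proc/<pid>/<name>, for example
 * name = "oom_score" or name = "oom_score_adj".
 * Returns 0 on success and stores the value in *out, -1 on failure.
 */
int read_proc_int(pid_t pid, const char *name, long *out)
{
    char path[64];
    FILE *f;
    int ok;

    snprintf(path, sizeof path, "/proc/%ld/%s", (long)pid, name);
    f = fopen(path, "r");
    if (f == NULL)
        return -1;           /* process gone, or file not provided by this kernel */
    ok = (fscanf(f, "%ld", out) == 1);
    fclose(f);
    return ok ? 0 : -1;
}
```

Polling read_proc_int(pid, "oom_score", &score) for each big_mm pid gives a rough idea of which process the kernel currently ranks highest as a victim.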

       After startup, some of the big_mm processes are killed. Running tail -n 1000 /var/log/messages | grep -i oom shows:

Apr 18 16:56:16 v125000100.bja kernel: : [22254383.898423] Out of memory: Kill process 24894 (big_mm) score 277 or sacrifice child
Apr 18 16:56:16 v125000100.bja kernel: : [22254383.899708] Killed process 24894, UID 55120, (big_mm) total-vm:2301932kB, anon-rss:2228452kB, file-rss:24kB
Apr 18 16:56:18 v125000100.bja kernel: : [22254386.738942] big_mm invoked oom-killer: gfp_mask=0x280da, order=0, oom_adj=0, oom_score_adj=0
Apr 18 16:56:18 v125000100.bja kernel: : [22254386.738947] big_mm cpuset=/ mems_allowed=0
Apr 18 16:56:18 v125000100.bja kernel: : [22254386.738950] Pid: 24893, comm: big_mm Not tainted 2.6.32-220.23.2.ali878.el6.x86_64 #1
Apr 18 16:56:18 v125000100.bja kernel: : [22254386.738952] Call Trace:
Apr 18 16:56:18 v125000100.bja kernel: : [22254386.738961]  [<ffffffff810c35e1>] ? cpuset_print_task_mems_allowed+0x91/0xb0
Apr 18 16:56:18 v125000100.bja kernel: : [22254386.738968]  [<ffffffff81114d70>] ? dump_header+0x90/0x1b0
Apr 18 16:56:18 v125000100.bja kernel: : [22254386.738973]  [<ffffffff810e1b2e>] ? __delayacct_freepages_end+0x2e/0x30
Apr 18 16:56:18 v125000100.bja kernel: : [22254386.738979]  [<ffffffff81213ffc>] ? security_real_capable_noaudit+0x3c/0x70
Apr 18 16:56:18 v125000100.bja kernel: : [22254386.738982]  [<ffffffff811151fa>] ? oom_kill_process+0x8a/0x2c0
Apr 18 16:56:18 v125000100.bja kernel: : [22254386.738985]  [<ffffffff81115131>] ? select_bad_process+0xe1/0x120
Apr 18 16:56:18 v125000100.bja kernel: : [22254386.738989]  [<ffffffff811