Welcome to the Tech blog!


The OOM Killer is trending

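For some reason, the OOM Killer is getting an unprecedented amount of attention on my Twitter timeline.
And what exactly is #OOM Killerたん ("OOM Killer-tan")? (lol)
One of the tweets pointed to an explanation of the OOM Killer, so here it is.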

An aircraft company discovered that it was cheaper to fly its planes
with less fuel on board. The planes would be lighter and use less fuel
and money was saved. On rare occasions however the amount of fuel was
insufficient, and the plane would crash. This problem was solved by
the engineers of the company by the development of a special OOF
(out-of-fuel) mechanism. In emergency cases a passenger was selected
and thrown out of the plane. (When necessary, the procedure was
repeated.) A large body of theory was developed and many publications
were devoted to the problem of properly selecting the victim to be
ejected. Should the victim be chosen at random? Or should one choose
the heaviest person? Or the oldest? Should passengers pay in order not
to be ejected, so that the victim would be the poorest on board? And
if for example the heaviest person was chosen, should there be a
special exception in case that was the pilot? Should first class
passengers be exempted? Now that the OOF mechanism existed, it would
be activated every now and then, and eject passengers even when there
was no fuel shortage. The engineers are still studying precisely how
this malfunction is caused.


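In short: the company's engineers solved the problem by developing a special OOF (out-of-fuel) mechanism. In an emergency, one passenger is selected and thrown out of the plane (and the procedure is repeated if necessary).
Once the OOF mechanism existed, it was activated every now and then, ejecting passengers even when there was no fuel shortage. The engineers are still investigating exactly how this malfunction is caused.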

Here is the OOM killer code from kernel 3.7.6: the badness function, oom_badness(), which picks the worst process. I love that it calls itself a "heuristic" (lol).
/**
 * oom_badness - heuristic function to determine which candidate task to kill
 * @p: task struct of which task we should calculate
 * @totalpages: total present RAM allowed for page allocation
 *
 * The heuristic for determining which task to kill is made to be as simple and
 * predictable as possible. The goal is to return the highest value for the
 * task consuming the most memory to avoid subsequent oom failures.
 */
unsigned long oom_badness(struct task_struct *p, struct mem_cgroup *memcg,
                          const nodemask_t *nodemask, unsigned long totalpages)
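To make "return the highest value for the task consuming the most memory" a bit more concrete, here is a rough user-space sketch of that kind of scoring. It is not the kernel code: `struct sketch_task`, `sketch_badness()` and `sketch_pick_victim()` are names I made up for illustration. The scoring rule (RSS + swap + page-table pages, shifted by `oom_score_adj` in thousandths of `totalpages`) is my simplified reading of the heuristic and leaves out everything else the real function checks.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the kernel's task_struct. */
struct sketch_task {
    const char *name;
    unsigned long rss_pages;      /* resident pages */
    unsigned long swap_pages;     /* pages currently swapped out */
    unsigned long pgtable_pages;  /* pages used for page tables */
    int oom_score_adj;            /* -1000 (never kill) .. +1000 */
};

/*
 * Assumed scoring rule: memory footprint in pages, shifted by
 * oom_score_adj expressed in thousandths of totalpages.
 * Returns 0 for "never kill", otherwise at least 1.
 */
unsigned long sketch_badness(const struct sketch_task *t,
                             unsigned long totalpages)
{
    long points;

    if (t->oom_score_adj == -1000)
        return 0;

    points = (long)(t->rss_pages + t->swap_pages + t->pgtable_pages);
    points += (long)t->oom_score_adj * (long)(totalpages / 1000);

    return points > 0 ? (unsigned long)points : 1;
}

/*
 * Return the index of the task with the highest score,
 * or -1 if no task is killable.
 */
int sketch_pick_victim(const struct sketch_task *tasks, size_t n,
                       unsigned long totalpages)
{
    int victim = -1;
    unsigned long worst = 0;
    size_t i;

    assert(tasks != NULL || n == 0);

    for (i = 0; i < n; i++) {
        unsigned long score = sketch_badness(&tasks[i], totalpages);
        if (score > worst) {
            worst = score;
            victim = (int)i;
        }
    }
    return victim;
}
```

With this sketch, a process whose `oom_score_adj` is -1000 is never chosen no matter how much memory it holds, and among the rest the biggest consumer gets thrown off the plane first.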
And here is goodness(), from the 2.4-series kernels, which the scheduler used to select the best process to run.
/*
 * This is the function that decides how desirable a process is..
 * You can weigh different processes against each other depending
 * on what CPU they've run on lately etc to try to handle cache
 * and TLB miss penalties.
 *
 * Return values:
 *   -1000: never select this
 *       0: out of time, recalculate counters (but it might still be
 *          selected)
 *     +ve: "goodness" value (the larger, the better)
 *   +1000: realtime process, select this.
 */

static inline int goodness(struct task_struct * p, int this_cpu, struct mm_struct *this_mm)
{
        int weight;

        /*