[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: Two naive questions and a suggestion
On 25 Nov 1998 jfm2@club-internet.fr wrote:
> > I sounds remarkably like you want my Out Of Memory killer
> > patch. This patch tries to remove the randomness in killing
> > a process when you're OOM by carefully selecting a process
> > based on a lot of different factors (size, age, CPU used,
> > suid, root, IOPL, etc).
>
> Your scheme is (IMHO) far too complicated and (IMHO) falls short.
> The problem is that the kernel has no way to know what is the really
> important process in the box.
In my (and other people's) experience, an educated guess is
better than a random kill. Furthermore it is not possible to
get out of the OOM situation without killing one or more
processes, so we want to limit:
- the number of processes we kill (reducing the chance of
killing something important)
- the CPU time 'lost' when we kill something (so we don't
have to run that simulation for two weeks again)
- the risk of killing something important and stable, we
try to avoid this by giving less hitpoints to older
processes (which presumably are stable and take a long
time to 'recreate' the state in which they are now)
- the amount of work lost -- killing new processes that
haven't used much CPU is a way of doing this
- the probability of the machine hanging -- don't kill
IOPL programs and limit the points for old daemons
and root/suid stuff
Granted, we can never make a perfect guess. It will be a
lot better than a more or less random kill, however.
The large simulation that's taking 70% of your RAM and
has run for 2 weeks is the most likely victim under our
current scheme, but with my killer code it's priority
will be far less that that of a newly-started and exploded
GIMP or Netscape...
> Why not simply allow a root-owned process declare itself (and the
> program it will exec into) as "guaranteed"?
If the guaranteed program explodes it will kill the machine.
Even for single-purpose machines this will be bad since it
will increase the downtime with a reboot&fsck cycle instead
of just a program restart.
> Or a box used as a mail server using qmail: qmail starts sub-servers
> each one for a different task.
The children are younger and will be killed first. Starting
the master server from init will make sure that it is
restarted in the case of a real emergency or fluke.
> Of course this is only a suugestion for a mechanism but the important
> is allowing a human to have the final word.
What? You have a person sitting around keeping an eye on
your mailserver 24x7? Usually the most important servers
are tucked away in a closet and crash at 03:40 AM when
the sysadmin is in bed 20 miles away...
The kernel is there to prevent Murphy from taking over :)
cheers,
Rik -- slowly getting used to dvorak kbd layout...
+-------------------------------------------------------------------+
| Linux memory management tour guide. H.H.vanRiel@phys.uu.nl |
| Scouting Vries cubscout leader. http://www.phys.uu.nl/~riel/ |
+-------------------------------------------------------------------+
--
This is a majordomo managed list. To unsubscribe, send a message with
the body 'unsubscribe linux-mm me@address' to: majordomo@kvack.org