
Re: [Patch] shm cleanups



Ingo Molnar <mingo@chiara.csoma.elte.hu> writes:

> On Thu, 4 Nov 1999, Rik van Riel wrote:
> 
> > I think I see what is going on here. Kswapd sees that memory is
> > low and "frees" a bunch of high memory pages, causing those pages
> > to be shifted to low memory so the total number of free pages
> > stays just as low as when kswapd started.
> 
> hm, kswapd should really be immune against this.
> 
> > This can result in in-memory swap storms, we should probably
> > limit the number of in-transit async himem pages to 256 or some
> > other even smaller number.
> 
> i introduced some stupid balancing bugs, and i wrongly thought that the
> fixes are already in 2.3.25, but no, it's the pre1-2.3.26 kernel that is
> supposed to have balancing right. basically the fix is to restore the
> original behavior of not counting high memory in memory pressure. This
> might sound like an unfair policy, but the real critical resource is low
> memory. If this ever proves to be a problematic approach then we still can
> make it more sophisticated.
> 
> [Christoph, are you still seeing the same kind of bad swapping behavior
> with pre1-2.3.26?]

No, after applying the following patch, it is much better now:

--- 2.3.26-pre/ipc/shm.c        Fri Nov  5 10:25:40 1999
+++ make26/ipc/shm.c    Fri Nov  5 10:54:09 1999
@@ -897,10 +897,10 @@
                unlock_kernel();
                return 0;
        }
-       if (page_count(page_map))
+       if (page_count(page_map) != 1)
                goto check_table;
        if (!(page_map = prepare_highmem_swapout(page_map)))
-               goto check_table;
+               goto failed;
        SHM_ENTRY (shp, idx) = swp_entry_to_pte(swap_entry);
        swap_successes++;
        shm_swp++;
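
A rough user-space model of the first hunk, to show which reference counts each version of the test lets through. `struct model_page`, `old_check_skips` and `new_check_skips` are made-up names for illustration only, not kernel code, and the idea that an otherwise unused shm page sits at a count of exactly 1 is an assumption, not something stated in this mail:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's page structure. */
struct model_page {
    int count;   /* models what page_count() would return */
};

/* Old test: "if (page_count(page_map)) goto check_table;"
 * The page is skipped whenever any reference exists. */
bool old_check_skips(const struct model_page *p)
{
    assert(p != NULL);
    return p->count != 0;
}

/* Patched test: "if (page_count(page_map) != 1) goto check_table;"
 * The page is skipped unless exactly one reference exists. */
bool new_check_skips(const struct model_page *p)
{
    assert(p != NULL);
    return p->count != 1;
}
```

With a count of 1 the old test skips the page and the patched test lets it go on to prepare_highmem_swapout(); with any other count the patched test skips it.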


[root@ls3016 src]# ipcs -um

------ Shared Memory Status --------
segments allocated 274
pages allocated 2244608
pages resident  2043405
pages swapped   177175
Swap performance: 251917 attempts        241439 successes

[root@ls3016 src]# cat /proc/meminfo 
        total:    used:    free:  shared: buffers:  cached:
Mem:  4152516608 4133502976 19013632        0   512000 21954560
Swap: 4133269504 725725184 3407544320
MemTotal:   8249496 kB
MemFree:      18568 kB
MemShared:        0 kB
Buffers:        500 kB
Cached:       21440 kB
HighTotal:  7471104 kB
HighFree:         0 kB
SwapTotal:  4036396 kB
SwapFree:   3327680 kB
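
Since low memory is the critical resource in this thread, here is a small sketch that derives the low memory figures from the numbers above. It assumes that MemTotal and MemFree include high memory, so that low = total - high; `struct meminfo_kb` and the two helpers are hypothetical names, not an existing API:

```c
#include <assert.h>

/* Sizes in kB, as printed by /proc/meminfo. */
struct meminfo_kb {
    unsigned long mem_total;
    unsigned long mem_free;
    unsigned long high_total;
    unsigned long high_free;
};

/* Total low memory, assuming MemTotal counts high memory too. */
unsigned long low_total_kb(const struct meminfo_kb *m)
{
    assert(m->high_total <= m->mem_total);
    return m->mem_total - m->high_total;
}

/* Free low memory, assuming MemFree counts free high memory too. */
unsigned long low_free_kb(const struct meminfo_kb *m)
{
    assert(m->high_free <= m->mem_free);
    return m->mem_free - m->high_free;
}
```

For the figures above (MemTotal 8249496, HighTotal 7471104, MemFree 18568, HighFree 0) this gives 778392 kB of low memory in total, of which 18568 kB is free.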

And the output of vmstat 5:

   procs                     memory         swap          io      system         cpu
 r  b  w   swpd   free  buff  cache    si     so    bi    bo    in    cs  us  sy  id
17  3  1 424020   2688   244  25932   407  10280   104  2570   970  1490   0  92   7
20  0  1 461380   1688   240  24464   467   8048   120  2012   758  1188   0  87  13
18  1  1 484676  27728   240  20692   994   5644   250  1411   549   910   0  90  10
19  0  1 497068   2732   260  25700  2690   5055   675  1264   532  1038   0  81  18
11  8  1 529820   4692   272  25192  4232  10643  1063  2661  1191  2126   0  85  15
17  2  1 559572   1472   264  19860  2538   8473   641  2118   919  1653   0  81  19
12  7  1 609780   1944   268  24280  3620  13611   912  3404  1485  2799   0  79  21
15  3  1 648148   1836   272  16648  8061  15666  2025  3916  2227  3731   0  78  22
12  6  1 692208   3044   280  23192  5394  14147  1359  3538  1768  3163   0  75  25
12  5  0 742160   2256   276  38144  6208  16190  1559  4047  1936  3419   0  78  22

So we have very few failures. Probably we never get caught by the second
part of the above patch any more.
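
To put a number on "very few", a minimal sketch that turns the "Swap performance" line of ipcs -um into a failure count and rate. The function names are invented for this example, and it assumes every attempt that is not counted as a success counts as a failure:

```c
#include <assert.h>

/* Failed swap attempts: attempts that were not counted as successes. */
unsigned long swap_failures(unsigned long attempts, unsigned long successes)
{
    assert(successes <= attempts);
    return attempts - successes;
}

/* Failure rate in percent; 0.0 if nothing was attempted yet. */
double swap_failure_percent(unsigned long attempts, unsigned long successes)
{
    if (attempts == 0)
        return 0.0;
    return 100.0 * (double)swap_failures(attempts, successes)
                 / (double)attempts;
}
```

With the values above, swap_failures(251917, 241439) is 10478, which works out to a failure rate of roughly 4.2%.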

> -- mingo
> 
> ps. some people might ask why we want to swap on an 8GB box, but i think
> it's really an issue in production systems to provide some kind of 'rubber
> wall' instead of 'hard concrete' if the system is reaching its limits.
> adding (99% unused) swap space does exactly this.

Yes, we need it for ERP applications. You would not believe how much data
is sometimes processed in business applications.

And a hard limit on production servers is always a reason to
use something else.

        Christoph