Re: 2.5.47-mm2



William Lee Irwin III wrote:
> 
> On Wed, Nov 13, 2002 at 12:45:07AM -0800, Andrew Morton wrote:
> >>> page-reservation.patch
> >>>   Page reservation API
> 
> William Lee Irwin III wrote:
> >> Don't drop it yet, I've got a caller of this on the back burner.
> 
> On Wed, Nov 13, 2002 at 09:12:15AM -0800, Andrew Morton wrote:
> > Well so have I.  Right now, if pte_chain_alloc() fails the
> > kernel oopses.
> 
> That's the one. I keep choking on mm/slab.c though. =(
> 

Well my plan here is to go to all code paths which end up allocating
a pte chain and do:

	reserve_local_pages(GFP_KERNEL, 2);
	spin_lock(some_lock);
	<lotsa code>
	pte_alloc_map();	/* That's one */
	pte_chain_alloc();	/* That's two */
	spin_unlock(some_lock);
	release_local_pages(GFP_KERNEL, 2);

Between reserve_local_pages() and release_local_pages() you are running
atomically: preempt is disabled, because the reserved pages are per-cpu.

Consequently all those pagetable allocation functions can no longer
use GFP_KERNEL, and they cannot keep their sleep-and-try-again
stuff.  They must be atomic.  That's why the above code reserved
a page for them too.

This assumes that every architecture's pagetable allocation code
only uses zero-order pages.  If that's not true I am screwed.

Only allocations which use __GFP_RESERVE may dip into those pages.
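
To make the scheme concrete, here is a minimal userspace sketch of it.
This is a model under stated assumptions, not the patch: every model_*
name and MODEL_* flag is hypothetical, malloc() stands in for the page
allocator, a plain flag stands in for preempt being disabled, and as a
worst-case simplification any allocation made inside the atomic
section without the reserve flag simply fails.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MODEL_PAGE_SIZE	4096
#define MODEL_MAX_RESERVE	8

/* Hypothetical flags for this model only; not the kernel's gfp bits. */
#define MODEL_ATOMIC	0x1u
#define MODEL_RESERVE	0x2u

/* One simulated CPU's private reserve of zero-order pages. */
struct local_reserve {
	void *pages[MODEL_MAX_RESERVE];
	int nr;
	int in_atomic;	/* stands in for "preempt is disabled" */
};

static struct local_reserve this_cpu;

/* May sleep in the real thing, so call it before taking any spinlock. */
int model_reserve_local_pages(int nr)
{
	int i;

	if (this_cpu.in_atomic || nr < 0 || this_cpu.nr + nr > MODEL_MAX_RESERVE)
		return -1;
	for (i = 0; i < nr; i++) {
		void *page = malloc(MODEL_PAGE_SIZE);

		if (!page)
			goto undo;
		this_cpu.pages[this_cpu.nr++] = page;
	}
	this_cpu.in_atomic = 1;
	return 0;
undo:
	while (i-- > 0)
		free(this_cpu.pages[--this_cpu.nr]);
	return -1;
}

/* Inside the atomic section only reserve-flagged callers get a page. */
void *model_alloc_page(unsigned int flags)
{
	if (!this_cpu.in_atomic)
		return malloc(MODEL_PAGE_SIZE);
	if (!(flags & MODEL_RESERVE) || this_cpu.nr == 0)
		return NULL;
	return this_cpu.pages[--this_cpu.nr];
}

/* Leave the atomic section and hand back whatever was not consumed. */
void model_release_local_pages(void)
{
	assert(this_cpu.in_atomic);
	while (this_cpu.nr > 0)
		free(this_cpu.pages[--this_cpu.nr]);
	this_cpu.in_atomic = 0;
}

/* The calling pattern: reserve two, take the lock, allocate two. */
int model_fault_path(void)
{
	void *pte_page, *pte_chain, *extra;
	int ok;

	if (model_reserve_local_pages(2) != 0)
		return -1;
	/* spin_lock(some_lock) would go here */
	pte_page = model_alloc_page(MODEL_ATOMIC | MODEL_RESERVE);	/* one */
	pte_chain = model_alloc_page(MODEL_ATOMIC | MODEL_RESERVE);	/* two */
	extra = model_alloc_page(MODEL_ATOMIC | MODEL_RESERVE);	/* reserve is empty */
	/* spin_unlock(some_lock) would go here */
	model_release_local_pages();

	ok = pte_page && pte_chain && !extra;
	free(pte_page);
	free(pte_chain);
	return ok ? 0 : -1;
}
```

The point of the model is the ordering: the only step that could block
happens before the lock is taken, and everything under the lock either
succeeds out of the reserve or fails immediately.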

With this we _could_ take out all the (nasty) dropping of page_table_lock
at every place where we allocate a pagetable page.  But I figured
I'd keep that there because it works, and memsetting a whole page
while holding page_table_lock is unfriendly.


A similar bunch-o-crap needs to be done for ratnode allocations.

It isn't going to be pretty, but I haven't really been able to
come up with anything better.  A per-task reserved page pool
would not be very good - either we pin boatloads of memory or
we do tons more allocations and frees than necessary...

What do you think?