
Re: RFC: design for new VM




On Fri, 4 Aug 2000, Matthew Dillon wrote:
> :
> :There are architecture-specific special cases, of course. On ia64, the
> :..
> 
>     I spent a weekend a few months ago trying to implement page table 
>     sharing in FreeBSD -- and gave up, but it left me with the feeling
>     that it should be possible to do without polluting the general VM
>     architecture.
> 
>     For IA32, what it comes down to is that the page table generated by
>     any segment-aligned mmap() (segment == 4MB) made by two processes 
>     should be shareable, simply by sharing the page directory entry (and thus
>     the physical page representing 4MB worth of mappings).  This would be
>     restricted to MAP_SHARED mappings with the same protections, but the two
>     processes would not have to map the segments at the same VM address, they
>     need only be segment-aligned.

I agree that from a page table standpoint you should be correct. 
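To make the page-table side concrete, here is a rough sketch of the eligibility test described in the quoted paragraph. The `struct mapping` descriptor and both function names are hypothetical, made up for illustration; they are not FreeBSD or Linux structures. It assumes non-PAE IA32, where one page directory entry covers 4MB.

```c
#include <stdbool.h>
#include <stdint.h>

/* IA32 without PAE: one page directory entry covers 4MB (1024 PTEs x 4KB). */
#define SEG_SHIFT 22u
#define SEG_SIZE  ((uint32_t)1 << SEG_SHIFT)
#define SEG_MASK  (SEG_SIZE - 1)

/* Hypothetical mapping descriptor, for illustration only. */
struct mapping {
    uint32_t start;    /* virtual start address */
    uint32_t len;      /* length in bytes */
    uint32_t prot;     /* protection bits */
    bool     shared;   /* true for a MAP_SHARED mapping */
    uint32_t obj_id;   /* identity of the backing object */
    uint32_t obj_off;  /* offset into the backing object */
};

/* Index of the page directory entry that covers a virtual address. */
static unsigned pde_index(uint32_t va)
{
    return va >> SEG_SHIFT;
}

/*
 * Number of whole 4MB segments whose page table pages the two mappings
 * could share, or 0 if they do not qualify. The virtual addresses may
 * differ; each only has to be segment-aligned.
 */
static unsigned shareable_segments(const struct mapping *a,
                                   const struct mapping *b)
{
    uint32_t len;

    if (!a->shared || !b->shared)
        return 0;
    if (a->prot != b->prot)
        return 0;
    if (a->obj_id != b->obj_id || a->obj_off != b->obj_off)
        return 0;
    if ((a->start & SEG_MASK) || (b->start & SEG_MASK))
        return 0;

    len = a->len < b->len ? a->len : b->len;
    return len >> SEG_SHIFT;
}
```

Two 8MB shared mappings of the same object at 0x40000000 and 0x80400000 would use different directory slots (256 and 513) but could point both pairs of slots at the same two page table pages.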

I don't think that the other issues are as easily resolved, though.
Especially with address space IDs on other architectures it can get
_really_ interesting to do TLB invalidates correctly to other CPUs etc
(you need to keep track of who shares parts of your page tables etc).
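As a sketch of the bookkeeping this implies (not taken from any real kernel): each shared page table page would need a list of the address spaces using it, and for each one the address space ID and the set of CPUs that may still hold stale translations. All names below are hypothetical, and the fixed-size array is only there to keep the example short.

```c
#include <stdint.h>

#define MAX_SHARERS 8   /* arbitrary limit for the sketch */

/* One address space that has this page table page linked in. */
struct sharer {
    uint32_t asn;    /* address space ID its TLB entries are tagged with */
    uint64_t cpus;   /* bitmask of CPUs that may hold its translations */
};

/* Per-shared-page-table bookkeeping (hypothetical). */
struct shared_pt {
    unsigned      nsharers;
    struct sharer sharers[MAX_SHARERS];
};

/* Record a new sharer; returns 0 on success, -1 if the table is full. */
static int shared_pt_attach(struct shared_pt *pt, uint32_t asn, uint64_t cpus)
{
    if (pt->nsharers == MAX_SHARERS)
        return -1;
    pt->sharers[pt->nsharers].asn = asn;
    pt->sharers[pt->nsharers].cpus = cpus;
    pt->nsharers++;
    return 0;
}

/*
 * When a PTE in the shared table changes, every sharer needs a flush:
 * return the union of CPUs to interrupt and fill in the ASNs that each
 * of those CPUs has to invalidate. asns must have room for MAX_SHARERS.
 */
static uint64_t shared_pt_shootdown(const struct shared_pt *pt,
                                    uint32_t *asns, unsigned *nasns)
{
    uint64_t mask = 0;
    unsigned i;

    for (i = 0; i < pt->nsharers; i++) {
        mask |= pt->sharers[i].cpus;
        asns[i] = pt->sharers[i].asn;
    }
    *nasns = pt->nsharers;
    return mask;
}
```

Even in this toy form, one PTE change turns into a walk over every sharer, and the CPU masks have to be kept up to date as processes migrate, which is where the real complexity would be.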

>     This would be a transparent optimization wholly invisible to the process,
>     something that would be optionally implemented in the machine-dependent
>     part of the VM code (with general support in the machine-independent
>     part for the concept).  If the process did anything to create a mapping
>     mismatch, such as call mprotect(), the shared page table would be split.

Right. But what about the TLB?

It's not a problem on the x86, because the x86 doesn't have ASNs anyway.
But for it to be a valid notion, I feel that it should be able to be
portable too.
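The page-table half of the split quoted above is the easy part, and can be sketched roughly as a reference-counted copy. The `pt_page` structure and function names are hypothetical, and the sketch deliberately leaves out the hard parts: locking, and the TLB flush each sharer would need afterwards.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define PTES_PER_TABLE 1024   /* one 4KB page of 32-bit PTEs on IA32 */

/* A page table page with a count of address spaces linking to it. */
struct pt_page {
    unsigned refs;
    uint32_t pte[PTES_PER_TABLE];
};

/* Another address space links the same page table page. */
static struct pt_page *pt_share(struct pt_page *pt)
{
    pt->refs++;
    return pt;
}

/*
 * Called before a change (such as mprotect()) that would make this
 * address space's view differ from the other sharers'. Returns a table
 * the caller may modify freely, or NULL if allocation fails. The caller
 * must install the returned table in its page directory and flush its
 * own TLB entries for the range; neither step is shown here.
 */
static struct pt_page *pt_make_private(struct pt_page *pt)
{
    struct pt_page *copy;

    if (pt->refs == 1)
        return pt;              /* sole user, nothing to split */

    copy = malloc(sizeof *copy);
    if (copy == NULL)
        return NULL;
    memcpy(copy->pte, pt->pte, sizeof copy->pte);
    copy->refs = 1;
    pt->refs--;                 /* caller no longer uses the shared copy */
    return copy;
}
```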

You have to have some page table locking mechanism for SMP eventually: I
think you miss some of the problems because the current FreeBSD SMP stuff
is mostly still "big kernel lock" (outdated info?), and you'll end up
kicking yourself in a big way when you have the 300 processes sharing the
same lock for that region..

(Not that I think you'd necessarily have much contention on the lock - the
problem tends to be more in the logistics of keeping track of the locks of
partial VM regions etc).

>     (Linux falls on its face for other reasons, mainly the fact that it
>     maps all of physical memory into KVM in order to manage it).

Not true any more.. Trying to map 64GB of RAM convinced us otherwise ;)

>     I think the loss of MP locking for this situation is outweighed by the
>     benefit of a huge reduction in page faults -- rather than see 300
>     processes each take a page fault on the same page, only the first process
>     would and the pte would already be in place when the others got to it.
>     When it comes right down to it, page faults on shared data sets are not
>     really an issue for MP scalability.

I think you'll find that there are all these small details that just
cannot be solved cleanly. Do you want to be stuck with an x86-only
solution?

That said, I cannot honestly say that I have tried very hard to come up
with solutions. I just have this feeling that it's a dark ugly hole that I
wouldn't want to go down..

			Linus
