
Re: Thread implementations...



>>>>> "RG" == Richard Gooch <Richard.Gooch@atnf.CSIRO.AU> writes:

RG> Eric W. Biederman writes:
>> >>>>> "RG" == Richard Gooch <Richard.Gooch@atnf.CSIRO.AU> writes:

>> With madvise(3) following the traditional format with only one
RG>                ^
RG> Don't you mean 2?

My suggestion:
madvise(2)(struct madvise_struct *, int number_of_structs);
madvise(3)(caddr_t addr, size_t len, size_t strategy);

madvise(3) being in libc...
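A minimal sketch of that split, under stated assumptions: the fields of struct madvise_struct and the names madvise_batch and madvise_single are hypothetical (the proposal only names the struct), and the batch call is emulated here by looping over the existing single-range madvise() rather than by a new system call.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical layout: one piece of advice for one address range. */
struct madvise_struct {
    void  *addr;      /* start of the range, page aligned */
    size_t len;       /* length of the range in bytes */
    int    strategy;  /* an MADV_* value */
};

/* Stand-in for the proposed batch call. A kernel version would take the
 * whole array in one trap; this sketch just loops over madvise().
 * Returns 0 on success, -1 on the first range that is rejected. */
int madvise_batch(const struct madvise_struct *v, int number_of_structs)
{
    assert(v != NULL || number_of_structs == 0);
    for (int i = 0; i < number_of_structs; i++) {
        if (madvise(v[i].addr, v[i].len, v[i].strategy) != 0)
            return -1;
    }
    return 0;
}

/* The traditional three-argument form, expressed as a one-element batch. */
int madvise_single(void *addr, size_t len, int strategy)
{
    struct madvise_struct one = { addr, len, strategy };
    return madvise_batch(&one, 1);
}
```

The point of the array form is that a program with several disjoint ranges to advise pays for one call instead of one call per range.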

>> advisement can be done easily.  The reason I suggest multiple
>> arguments is that apps with random but predictable access
>> patterns will want to use MADV_WILLNEED & MADV_DONTNEED to implement
>> an optimum swapping algorithm.

RG> I'm not aware of madvise() being a POSIX standard. I've appended the
RG> man page from alpha_OSF1, which looks reasonable. It would be nice to
RG> be compatible with something.

According to the kernel source, it is available on alpha, mips, and
sparc, and the mips code thinks there is a POSIX version somewhere.

Does someone have the Sun/sparc man page?  Besides what is in the
kernel source I mean.

> 	    MADV_WILLNEED
	This needs to start an asynchronous pagein if necessary.

> 	    MADV_DONTNEED
>                     Do not need these pages

>                     The system will free any resident pages that are
>                     allocated to the region.  All modifications will be
>                     lost and any swapped out pages will be discarded.
>                     Subsequent access to the region will result in a
>                     zero-fill-on-demand fault as though it is being
>                     accessed for the first time.  Reserved swap space is
>                     not affected by this call.

This one is broken, for 3 reasons.
1) madvise should only give advice.
2) This can be done with mmap(start, len, PROT..., MAP_ANON, -1, 0)
3) There is a more reasonable interpretation from IRIX:


     MADV_DONTNEED    informs the system that the address range from addr to
                      addr + len will likely not be referenced in the near
                      future.  The memory to which the indicated addresses are
                      mapped will be the first to be reclaimed when memory is
                      needed by the system.
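For reason 2, a minimal sketch of getting the destructive OSF/1 effect from mmap alone. The function name discard_range is hypothetical, and the read/write protection plus the MAP_PRIVATE and MAP_FIXED flags are assumptions; the call as written above leaves the protection unspecified.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Replace [start, start + len) with fresh zero-fill-on-demand pages.
 * start must be page aligned and the old contents are thrown away.
 * Returns 0 on success, -1 on failure. */
int discard_range(void *start, size_t len)
{
    assert(start != NULL && len > 0);
    /* MAP_FIXED makes the new anonymous mapping land exactly on the
     * old range instead of somewhere the kernel picks. */
    void *p = mmap(start, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANON | MAP_FIXED, -1, 0);
    return p == MAP_FAILED ? -1 : 0;
}
```

After a successful call, the first touch of any page in the range reads as zero, which is the behaviour the OSF/1 text describes.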

With the IRIX reading, a smart programmer can implement the optimal
swapping algorithm for a process with MADV_DONTNEED and
MADV_WILLNEED and be relatively portable.
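A rough sketch of that pattern for a program walking a mapping chunk by chunk: ask for the next chunk early and release the previous one. The function advise_window and its parameters are hypothetical, and chunk is assumed to be a multiple of the page size. Note that on a system where MADV_DONTNEED has the destructive OSF/1 meaning for private anonymous memory (Linux behaves this way), this is only safe on file-backed or shared mappings.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Call just before working on chunk i of a mapping split into nchunks
 * pieces of chunk bytes each.
 * Returns 0 on success, -1 if either hint was rejected. */
int advise_window(char *base, size_t nchunks, size_t chunk, size_t i)
{
    assert(base != NULL && chunk > 0 && i < nchunks);
    int rc = 0;

    /* The next chunk will be needed soon: let the pagein start now. */
    if (i + 1 < nchunks &&
        madvise(base + (i + 1) * chunk, chunk, MADV_WILLNEED) != 0)
        rc = -1;

    /* The previous chunk is finished: make it the first to be reclaimed. */
    if (i > 0 &&
        madvise(base + (i - 1) * chunk, chunk, MADV_DONTNEED) != 0)
        rc = -1;

    return rc;
}
```

A program with a random but predictable access pattern would do the same thing with whatever ranges it knows come next, not just the neighbouring chunk.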

Of course MADV_SEQUENTIAL should handle the case of sending a file out
a socket, for a userspace sendfile.
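A minimal sketch of such a userspace sendfile, assuming a regular non-empty file and a blocking descriptor. The name send_file_mapped is hypothetical, and EINTR and partial-failure recovery are left out to keep it short.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Map the file at path, hint sequential access, and write it to out_fd.
 * Returns the number of bytes written, or -1 on error. */
long send_file_mapped(int out_fd, const char *path)
{
    assert(path != NULL);
    int in_fd = open(path, O_RDONLY);
    if (in_fd < 0)
        return -1;

    struct stat st;
    if (fstat(in_fd, &st) != 0 || st.st_size <= 0) {
        close(in_fd);
        return -1;                      /* a zero-length file cannot be mapped */
    }
    size_t len = (size_t)st.st_size;

    char *map = mmap(NULL, len, PROT_READ, MAP_SHARED, in_fd, 0);
    close(in_fd);                       /* the mapping outlives the descriptor */
    if (map == MAP_FAILED)
        return -1;

    /* Advice only: a failure here does not stop the transfer. */
    (void)madvise(map, len, MADV_SEQUENTIAL);

    size_t done = 0;
    while (done < len) {
        ssize_t n = write(out_fd, map + done, len - done);
        if (n <= 0) {
            munmap(map, len);
            return -1;
        }
        done += (size_t)n;
    }
    munmap(map, len);
    return (long)done;
}
```

With the sequential hint the kernel is free to read ahead aggressively and to drop pages behind the write position, which is what a one-pass copy wants.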

> 	    MADV_SPACEAVAIL
>                     Ensure that resources are reserved

This one also does more than advise and for that reason I don't like it.

Anyhow this looks like something to keep in mind for 2.3.
Currently I have too many projects in the air to do more than think
the interface through.  The mapping type could easily be stored in the
vma as a hint though.  Perhaps it could be ready for 2.2 but I
couldn't do it.

Eric