Re: some doubts on cache coherence
Hello Prabhat...
> In your first line you have mentioned if the bh executes in
> process context then it can sleep, i.e. workqueue.
> Now if it's executing in process context then it can access user space
> and user space memory can be swapped out. So the bh can actually be put to
> sleep in this manner whenever there is a page fault.
Actually, it's not only a matter of being in process context or not. It
is legal to access user space memory from kernel space (normally through
copy_from_user()/copy_to_user()), but such an access can page-fault and
sleep, so it belongs in code that is allowed to sleep. Process context
here only means that it is executing on behalf of a process, be it a
normal process or kernel thread. Keep in mind that a kernel thread has
no user address space of its own.
Now, since there are many types of bottom halves, you need to use one
that is allowed to sleep. One example is a workqueue (softirqs and
tasklets are not allowed to sleep).
> What is this temporary mapping, can you please explain? is it
> being done via mmap() system call?
It would take too long to explain here. Better read Linux Device Drivers,
3rd edition, in the chapters dealing with memory management. In short, no,
it is not mmap(); it is done using kmap(). On 32-bit x86, a temporary
mapping is needed to address memory above the 896 MB mark (high memory),
since those pages don't have a static virtual address within the 4 GB
address space.
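
To make the 896 MB figure concrete, here is a minimal user-space sketch. It
assumes 32-bit x86 with the default 3 GB/1 GB split, where the kernel's
static mapping starts at 0xC0000000 and the last 128 MB of the kernel's 1 GB
are kept for vmalloc and temporary mappings. The helper names are made up
for illustration; they are not kernel functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout: 32-bit x86, default 3 GB user / 1 GB kernel split. */
#define PAGE_OFFSET  0xC0000000u              /* start of kernel virtual space */
#define LOWMEM_LIMIT (896u * 1024u * 1024u)   /* first physical byte without a static mapping */

/* True if the physical address has a permanent kernel virtual address. */
bool has_static_mapping(uint32_t phys)
{
    return phys < LOWMEM_LIMIT;
}

/* Static (linear) mapping for low memory: virtual = physical + PAGE_OFFSET.
 * Only valid for addresses below LOWMEM_LIMIT. */
uint32_t lowmem_virt(uint32_t phys)
{
    assert(has_static_mapping(phys));
    return phys + PAGE_OFFSET;
}
```

Anything for which has_static_mapping() is false is the memory that has to
go through a temporary mapping such as kmap() before the kernel can touch it.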
> So to prevent the locking stuff, each CPU maintains per CPU
> data structures, i.e. any global data is copied onto the per CPU data
> structure to prevent synchronisation problems on that global data
> structure.
Well IMHO, if you already make a certain data structure per-cpu,
you don't need to maintain another global copy of the same data.
> But if we reduce the synchronisation overhead, won't we
> introduce another overhead of updating all those per CPU data
> structures whenever a change happens in the global data structure?
I think that will rarely happen. In the per-cpu data scenario, each
processor core only needs to access and update its own copy of the data,
not a global one. If your program doesn't work in this fashion, maybe
it has not been planned correctly to fit the per-cpu data allocation
strategy.
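
A minimal user-space sketch of that idea, using a counter as the example.
The core count, the cache line size and all names are assumptions for
illustration only. In the kernel itself you would use the per-cpu helpers
(DEFINE_PER_CPU(), get_cpu_var()/put_cpu_var()) instead of a plain array.

```c
#include <assert.h>

#define NR_CPUS    4    /* hypothetical core count */
#define CACHE_LINE 64   /* assumed cache line size in bytes */

/* One slot per core, each on its own cache line, so cores do not
 * bounce a shared line between them (false sharing). */
struct percpu_slot {
    _Alignas(CACHE_LINE) unsigned long count;
};

struct percpu_counter {
    struct percpu_slot slot[NR_CPUS];
};

/* Fast path: a core touches only its own slot, so no lock is taken. */
void percpu_inc(struct percpu_counter *c, int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    c->slot[cpu].count++;
}

/* Slow path: a global view is computed on demand by summing all slots;
 * there is no separate global copy to keep in sync. */
unsigned long percpu_sum(const struct percpu_counter *c)
{
    unsigned long total = 0;
    for (int cpu = 0; cpu < NR_CPUS; cpu++)
        total += c->slot[cpu].count;
    return total;
}
```

The point is that the global value is derived from the per-cpu slots only
when somebody asks for it, so the frequent update path never synchronises.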
> Example: routing table and flow cache in IPSec processing.
> Here we can say the flow cache is a subset of the routing cache.
> So the policy search is made on the flow cache which actually is a
> per CPU data structure maintained by each CPU separately. Now whenever
> the routing table is undergoing a change, the corresponding values
> must be updated in the flow cache. Won't this update introduce
> overhead?
I am not really sure about this routing table scenario, but allow me to
share my thoughts:
1. How often will someone or something change the routing table? I
think very rarely, if at all. Assuming there is actually one routing
table/flow cache per CPU core, it won't be a big problem to do such
a cache update. The kernel just needs to flush/clear all the relevant
entries in each per-cpu cache.
2. After all, I think maintaining a per-cpu routing table cache is not
necessary. I prefer to leave this kind of work to the L1/L2 caches and
try to optimize it by cache-aligning the relevant entries in the routing
table. I say this because the routing table is mostly read-only, so even
if it is read by many CPU cores at once, it won't be a problem.
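
For point 1, flushing does not have to mean walking every per-cpu cache.
One possible approach, sketched below in user-space C, is a generation
counter: each cached entry remembers the table generation it was filled
in, and a table change just bumps the counter so all older entries miss.
This is only an illustration under my own assumptions (names, sizes,
single-threaded code), not necessarily how the kernel's flow cache does it.

```c
#include <assert.h>

#define NR_CPUS     4   /* hypothetical core count */
#define CACHE_SLOTS 8   /* hypothetical per-core cache size */

struct flow_entry {
    unsigned int key;    /* e.g. a hashed flow identifier */
    int          value;  /* cached lookup result */
    unsigned int gen;    /* table generation the entry was filled in */
    int          valid;  /* 0 until the slot is first filled */
};

/* Bumped on every routing table change. A real multi-core version
 * would need an atomic counter; this sketch is single-threaded. */
static unsigned int table_gen = 1;
static struct flow_entry cache[NR_CPUS][CACHE_SLOTS];  /* one cache per core */

/* "Flush" all per-cpu caches at once by making every entry stale. */
void route_table_changed(void)
{
    table_gen++;
}

void flow_cache_insert(int cpu, unsigned int key, int value)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    struct flow_entry *e = &cache[cpu][key % CACHE_SLOTS];
    e->key = key;
    e->value = value;
    e->gen = table_gen;
    e->valid = 1;
}

/* Returns 1 and stores the result in *value on a hit, 0 on a miss. */
int flow_cache_lookup(int cpu, unsigned int key, int *value)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    const struct flow_entry *e = &cache[cpu][key % CACHE_SLOTS];
    if (!e->valid || e->key != key || e->gen != table_gen)
        return 0;   /* empty, different flow, or filled before the last change */
    *value = e->value;
    return 1;
}
```

With this, the cost of a table change is one increment, and each core
refills its own entries lazily on the next miss.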
I hope I have shared some valuable thoughts here.
regards
Mulyadi
--
Kernelnewbies: Help each other learn about the Linux kernel.
Archive: http://mail.nl.linux.org/kernelnewbies/
FAQ: http://kernelnewbies.org/faq/