
Re: [SMP lock BUG?] Re: Feedback on preemptible kernel patch



If the page is the correct one when it is found mapped, the code
should just exit, not BUG(), IMHO.
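
To make the suggestion concrete, here is a rough userspace sketch of the
two policies. It is not the kernel's kmap_atomic code: slot_table,
map_strict, map_tolerant and the result enum are all made-up names, and a
"page" is just any non-NULL pointer. The strict variant complains about any
occupied slot; the tolerant variant complains only when a different page
sits in the slot.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model only; none of these names are kernel APIs. */
#define NSLOTS 4

static const void *slot_table[NSLOTS];	/* NULL means the slot is free */

enum map_result { MAP_NEW, MAP_ALREADY, MAP_CONFLICT };

/* Strict policy: any occupied slot is reported as a conflict
 * (the place where a debug check would BUG()). */
enum map_result map_strict(int slot, const void *page)
{
	assert(slot >= 0 && slot < NSLOTS);
	assert(page != NULL);

	if (slot_table[slot] != NULL)
		return MAP_CONFLICT;
	slot_table[slot] = page;
	return MAP_NEW;
}

/* Tolerant policy: finding the same page already mapped is fine,
 * only a different page in the slot is reported as a conflict. */
enum map_result map_tolerant(int slot, const void *page)
{
	assert(slot >= 0 && slot < NSLOTS);
	assert(page != NULL);

	if (slot_table[slot] == page)
		return MAP_ALREADY;
	if (slot_table[slot] != NULL)
		return MAP_CONFLICT;
	slot_table[slot] = page;
	return MAP_NEW;
}
```

Whether the tolerant policy is actually safe in the kernel depends on who
clears the mapping afterwards, which this sketch does not model.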

George


Roger Larsson wrote:
> 
> Hi,
> 
> This is interesting. [Assumes UP Athlon - correct]
> Note that they all BUG out at highmem.h:95 (kmap_atomic),
> and that test is only active if you have enabled HIGHMEM_DEBUG.
> [My analysis was done with a 2.4.10-pre2 kernel, but I checked
> later patches and I do not think they fix it either...]
> 
> The preemptive kernel puts more SMP stress on the kernel than
> running with multiple CPUs.
> 
> So this might be a potential bug in the kernel proper when running
> on an SMP computer.
> 
> If I understand the bug correctly, a process gets a page fault.
> Starts to map in the page. But before the final part it checks -
> and the page is already there!!! Correct?
> 
> On Saturday 8 September 2001 19:33, Arjan Filius wrote:
> > Hello Robert,
> >
> >
> > I tried 2.4.10-pre4 with patch-rml-2.4.10-pre4-preempt-kernel-1.
> > But it seems to hit highmem (see below) (I do have 1.5GB RAM).
> > 2.4.10-pre4 plain runs just fine.
> >
> > With the kernel option mem=850M the patched kernel boots and seems to run
> > fine. However, I didn't do any stress testing yet, and I still notice
> > hiccups while playing mp3 files at -10 nice level with mpg123 on a 1.1GHz
> > Athlon while removing, for example, a _large_ file (reiser-on-lvm).
> >
> > My syslog output with highmem:
> >
> > Sep  8 18:10:16 sjoerd kernel: kernel BUG at /usr/src/linux-2.4.10-pre4/include/asm/highmem.h:95!
> > Sep  8 18:10:16 sjoerd kernel: invalid operand: 0000
> > Sep  8 18:10:16 sjoerd kernel: CPU:    0
> > Sep  8 18:10:16 sjoerd kernel: EIP:    0010:[do_wp_page+636/1088]
> > [- - -]
> > sjoerd kernel: Call Trace: [handle_mm_fault+141/224]
> > [do_page_fault+375/1136] [do_page_fault+0/1136] [__mmdrop+58/64]
> > [do_exit+595/640]
> > Sep  8 18:10:16 sjoerd kernel:    [error_code+52/64]
> 
> Let's look at this example. You need to add some inline functions...
> 
> handle_mm_fault
>         takes the mm->page_table_lock [this should prevent reschedules]
>         allocs pmd
>         allocs pte
>         handle_pte_fault(...)
> handle_pte_fault [inline, most likely path]
>         pte is present
>         it is a write access
>         but the pte is not writeable  - call do_wp_page
> do_wp_page
>         plays some games with the lock...
>         finally calls copy_cow_page [inline] with the page_table_lock
>         UNLOCKED!
> copy_cow_page
>         calls clear_user_highpage or copy_user_highpage
> both clear_user_highpage and copy_user_highpage
>         call kmap_atomic
> kmap_atomic
>         page is a highmem page
>         but during the time this process was unlocked some other
>         thread has allocated the page in question... BUG out.
> 
> So somewhere between the UNLOCK (might be a lot later) and the
> BUG test in kmap_atomic the process running in the kernel got preempted
> (most likely during the page copy, since it will take some time).
> 
> Another process (thread) started to run - hit the same page fault
> but succeeded in its alloc.
> 
> Back to the first process it continues, finally checks - the page
> is there... and BUGS.
> 
> Note that this can happen in a pure SMP kernel.
> 
> But let the processes (threads) run on two CPUs. And let the
> first get an interrupt/bh after unlock - the other can pass
> and add the page before the first one can continue - same
> result!
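
The interleaving described above can be replayed deterministically in a
small userspace model. This is only a sketch under stated assumptions:
pte_model, task_model, copy_unlocked, install_checked and
first_task_loses_race are invented names, no real locking or preemption
takes place, and the "check" is reduced to testing whether somebody else
already filled the entry.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model; these types and functions are not kernel code. */
struct pte_model { int installed_by; };	/* 0 = fault not yet resolved */
struct task_model { int id; bool copy_done; };

/* The long page copy, done with the lock dropped: the window in
 * which the task can be preempted or interrupted. */
void copy_unlocked(struct task_model *t)
{
	assert(t->id != 0);
	t->copy_done = true;
}

/* The final step: returns false if another task already filled the
 * entry, which is the case the strict debug check treats as a bug. */
bool install_checked(struct pte_model *pte, const struct task_model *t)
{
	assert(t->copy_done);
	if (pte->installed_by != 0)
		return false;
	pte->installed_by = t->id;
	return true;
}

/* Replays both orderings for two tasks faulting on the same page.
 * Returns true if the first task finds the entry already filled. */
bool first_task_loses_race(bool preempt_during_copy)
{
	struct pte_model pte = { 0 };
	struct task_model a = { 1, false }, b = { 2, false };
	bool a_ok;

	copy_unlocked(&a);
	if (preempt_during_copy) {
		/* Second task runs start to finish inside the window. */
		copy_unlocked(&b);
		(void)install_checked(&pte, &b);
		a_ok = install_checked(&pte, &a);
	} else {
		/* First task finishes before the second one starts. */
		a_ok = install_checked(&pte, &a);
		copy_unlocked(&b);
		(void)install_checked(&pte, &b);
	}
	return !a_ok;
}
```

Calling first_task_loses_race(true) models the preempted (or interrupted)
case and reports that the first task finds the entry filled;
first_task_loses_race(false) models the undisturbed case, where it does not.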
> 
> /RogerL
> 
> --
> Roger Larsson
> Skellefteå
> Sweden
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/