[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: mremap use-after-free [was Re: 2.5.52-mm2]



Hugh Dickins wrote:
> 
> ...
> I doubt that (but may be wrong, I haven't time right now to think as
> twistedly as mremap demands).

Maybe, maybe not.  Think about it - because VM_LOCKED is cleared
by slab poisoning, the chances of anyone noticing it are very small.
 
> The code (patently!) expects new_vma
> to be good at the end, it certainly wasn't intending to unmap it;
> but 2.5 split_vma has been through a couple of convulsions, either
> of which might have resulted in the potential for new_vma to be
> freed where before it was guaranteed to remain.

I see no "guarantees" around do_munmap() at all.  The whole thing
is foul; no wonder it keeps blowing up.

It died partway into kde startup.  It was in fact the first mremap
call.

> Do you know the vmas before and after, and the mremap which did this?

Breakpoint 1, move_vma (vma=0xcf1ed734, addr=1077805056, old_len=36864, new_len=204800, new_addr=1078050816) at mm/mremap.c:176
176     {
(gdb) n
183             next = find_vma_prev(mm, new_addr, &prev);
(gdb) 
177             struct mm_struct * mm = vma->vm_mm;
(gdb) 
180             int split = 0;
(gdb) 
182             new_vma = NULL;
(gdb) 
183             next = find_vma_prev(mm, new_addr, &prev);
(gdb) 
184             if (next) {
(gdb) p/x next
$1 = 0xce3bee84
(gdb) p/x *next
$2 = {vm_mm = 0xcf7cc624, vm_start = 0xbffcd000, vm_end = 0xc0000000, vm_next = 0x0, vm_page_prot = {pgprot = 0x25}, 
  vm_flags = 0x100177, vm_rb = {rb_parent = 0xcf1ed74c, rb_color = 0x1, rb_right = 0x0, rb_left = 0x0}, shared = {next = 0xce3beeac, 
    prev = 0xce3beeac}, vm_ops = 0x0, vm_pgoff = 0xffffffce, vm_file = 0x0, vm_private_data = 0x0}
(gdb) p prev
$3 = (struct vm_area_struct *) 0xcf1ed734
(gdb) p/x *prev
$4 = {vm_mm = 0xcf7cc624, vm_start = 0x403e0000, vm_end = 0x4041c000, vm_next = 0xce3bee84, vm_page_prot = {pgprot = 0x25}, 
  vm_flags = 0x100073, vm_rb = {rb_parent = 0xce3be4c4, rb_color = 0x0, rb_right = 0xce3bee9c, rb_left = 0xce3be1ac}, shared = {
    next = 0xcf1ed75c, prev = 0xcf1ed75c}, vm_ops = 0x0, vm_pgoff = 0x0, vm_file = 0x0, vm_private_data = 0x0}


> ...
> 
> Hmmm.  Am I right to suppose that all the changes above are "cleanup"
> which you couldn't resist making while you looked through this code,
> but entirely irrelevant to the bug in question?

to make the code readable so I could work on the bug, it then "accidentally"
got left in.

>  If those mods above
> were right, they should be the subject of a separate patch.

well yes it would be nice to clean that code up.  Like, documenting
it and working out what the locking rules are.

What does this do?
                        spin_lock(&mm->page_table_lock);
                        prev->vm_end = new_addr + new_len;
                        spin_unlock(&mm->page_table_lock);
and this?
                        spin_lock(&mm->page_table_lock);
                        next->vm_start = new_addr;
                        spin_unlock(&mm->page_table_lock);

clearly nothing.  OK, but is this effectively-unlocked code safe?

> There's certainly room for cleanup there, but my preference would be
> to remove "can_vma_merge" completely, or at least its use in mremap.c,
> using its explicit tests instead.  It looks like it was originally
> quite appropriate for a use or two in mmap.c, but obscurely unhelpful
> here - because in itself it is testing a bizarre asymmetric subset of
> what's needed (that subset which remained to be tested in its original
> use in mmap.c).

yes
 
> The problem with your changes above is, you've removed the !vma->vm_file
> tests, presumably because you noticed that can_vma_merge already tests
> !vma->vm_file.  But "vma" within can_vma_merge is "prev" or "next" here:
> they are distinct tests, and you're now liable to merge an anonymous
> mapping with a private file mapping - nice if it's from /dev/zero,
> but otherwise not.  Please just cut those hunks out.

ok
 
> ...
> > -             do_munmap(current->mm, addr, old_len);
> > -
> 
> Anguished cry!  There was careful manipulation of VM_ACCOUNT before and
> after do_munmap, now you've for no reason moved do_munmap down outside.

How do we know that *vma was not altered by the do_munmap() call?

With this change, nobody will touch any locally-cached vma*'s from
the do_munmap() right through to syscall exit.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/