Hello.
I found that accessing namespace_sem from security_inode_create()
causes lockdep warning when compiled with CONFIG_PROVE_LOCKING=y .
=======================================================
[ INFO: possible circular locking dependency detected ]
-------------------------------------------------------
klogd/1798 is trying to acquire lock:
(&namespace_sem){----}, at: [<e0f133c7>] _aa_perm_dentry+0x80/0x184 [apparmor]
but task is already holding lock:
(&inode->i_mutex){--..}, at: [<c02a883e>] mutex_lock+0x12/0x15
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (&inode->i_mutex){--..}:
[<c0137c89>] lock_acquire+0x4b/0x6a
[<c02a86e6>] __mutex_lock_slowpath+0xb0/0x1f6
[<c02a883e>] mutex_lock+0x12/0x15
[<c0180b02>] graft_tree+0x5c/0xd4
[<c0180e98>] do_add_mount+0x84/0x100
[<c0181b5f>] do_mount+0x602/0x659
[<c0181c1a>] sys_mount+0x64/0x9b
[<c0103d9d>] sysenter_past_esp+0x56/0x99
-> #0 (&namespace_sem){----}:
[<c0137c89>] lock_acquire+0x4b/0x6a
[<c0134e34>] down_read+0x1e/0x31
[<e0f133c7>] _aa_perm_dentry+0x80/0x184 [apparmor]
[<e0f14849>] aa_perm_dentry+0x62/0xa4 [apparmor]
[<e0f167c7>] apparmor_inode_create+0x40/0x63 [apparmor]
[<c01749e5>] vfs_create+0x84/0x13e
[<c01774ec>] open_namei+0x169/0x635
[<c0166f15>] do_filp_open+0x20/0x36
[<c0166f6b>] do_sys_open+0x40/0xbb
[<c0167012>] sys_open+0x16/0x18
[<c0103d9d>] sysenter_past_esp+0x56/0x99
other info that might help us debug this:
1 lock held by klogd/1798:
#0: (&inode->i_mutex){--..}, at: [<c02a883e>] mutex_lock+0x12/0x15
stack backtrace:
[<c010555d>] show_trace+0xd/0x10
[<c0105a99>] dump_stack+0x19/0x1b
[<c0136dc8>] print_circular_bug_tail+0x59/0x64
[<c01375bd>] __lock_acquire+0x7ea/0x973
[<c0137c89>] lock_acquire+0x4b/0x6a
[<c0134e34>] down_read+0x1e/0x31
[<e0f133c7>] _aa_perm_dentry+0x80/0x184 ...On Tue, 06 Nov 2007 13:00:41 +0900 sounds like you have an AB-BA deadlock... -- If you want to reach me at my work email, use arjan@linux.intel.com For development, discussion and tips for power savings, visit http://www.lesswatts.org -
sed /you/AppArmor shipped with OpenSuSE 10.1 and 10.2/ :) Though I don't think this deadlock should occur quite often, it occurs when it occurs. Care should be taken promptly. There should be no way around for this problem as its nature. Passing vfsmount parameter to VFS helper functions and LSM hooks seems to be a good choice to me. Cheers, Toshiharu Harada -
Any code except VFS internals has no business using it at all and doesn't do that in mainline either. I'd start looking for design bugs in whatever code you have using it first. -
Hello. Isn't security_inode_create() a part of VFS internals? I think security_inode_create() is a part of VFS internals because it is called from vfs_create(). Regards. -
It's not. security_inode_create is part of the LSM infrastructure, and the actual methods are part of security modules and definitively not VFS internals. -
Hello. The reason why I want to access namespace_sem inside security_inode_create() is that it doesn't receive "struct vfsmount" parameter. If "struct vfsmount" *were* passed to security_inode_create(), I have no need to access namespace_sem. And now, since calling down_read(&namespace_sem) causes deadlock, I'm looking for a solution. What you said ("I'd start looking for design bugs in whatever code you have using it first.") sounds "never try to implement pathname based access control at security_inode_create()", which makes AppArmor (for OpenSuSE 10.1/10.2) and TOMOYO unable to apply access control. At first, I thought that this lockdep's warning is a false positive, since "struct inode" is allocated/freed dynamically. But the warning still appears even after I disabled freeing memory at destroy_inode() in fs/namei.c (so that address of locking object in "struct inode" never be reused), it is likely genuine. Regards. -
Same argument as with the AA folks: it does not have any business looking at the vfsmount. If you create a file it can and in many setups will show up in multiple vfsmounts, so making decisions based on the particular one this creat happens through is wrong and actually dangerous. -
This has been said many times, and I have never been able to translate it into anything other than "pathname access control is bad". Pathname-based access control systems, among other things, let you write a policy that says "you may create files under /foo/bar/baz". So when you attempt to create a file "/foo/bar/baz/biff" the LSM needs to know which VFS mount you are creating it in. Multiple mount points, bind mounts, and other fun with mounting, do in fact allow you to create aliases. Because of that, an LSM that set a policy of "you can create files anywhere *except* /foo/bar/baz" would be trivially bypassable. But a policy that says "You may *only* create files under /foo/bar/baz" (plus whatever other explicit permissions it grants) does not seem to create any problems, so long as the confined processes are not permitted to create arbitrary aliases by using fun with mount. So just exactly what is dangerous about this? Caveat: complaints that you can create a policy that is bypassable are not interesting, you can do that with any policy system. To show "dangerous" you would have to show how a reasonable policy that should be secure is in fact bypassable. This threat from mount point aliases, this has often been conjectured but has never been shown. Crispin -- Crispin Cowan, Ph.D. http://crispincowan.com/~crispin CEO, Mercenary Linux http://mercenarylinux.com/ Itanium. Vista. GPLv3. Complexity at work -
| Greg KH | Og dreams of kernels |
| Jens Axboe | [PATCH 31/33] Fusion: sg chaining support |
| Arnd Bergmann | Re: finding your own dead "CONFIG_" variables |
| Mark Brown | [PATCH 2/2] Subject: natsemi: Allow users to disable workaround for DspCfg reset |
| Tony Breeds |
