xorl %eax, %eax

Linux kernel XFS double free()

I was reading the 2.6.30-rc2 ChangeLog and came across this interesting vulnerability. It was reported by Dave Chinner on 15 March 2009 and affects Linux kernels prior to 2.6.30-rc2. Here is the code from fs/xfs/xfs_iget.c in 2.6.29:

48 /*
49  * Allocate and initialise an xfs_inode.
50  */
51 STATIC struct xfs_inode *
52 xfs_inode_alloc(
53         struct xfs_mount        *mp,
54         xfs_ino_t               ino)
55 {
56        struct xfs_inode        *ip;
63        ip = kmem_zone_alloc(xfs_inode_zone, KM_SLEEP);
72        /*
73         * initialise the VFS inode here to get failures
74         * out of the way early.
75         */
76        if (!inode_init_always(mp->m_super, VFS_I(ip))) {
77                kmem_zone_free(xfs_inode_zone, ip);
78                return NULL;
79        }

This function is used to allocate an XFS inode. At line 63 it calls kmem_zone_alloc() to allocate an object from xfs_inode_zone, i.e. space for one struct xfs_inode. Then, at line 76, it invokes inode_init_always() from fs/inode.c to initialize the VFS inode structure. Its arguments are the superblock that the inode belongs to and the inode to initialize. The second argument, VFS_I(ip), is the struct inode embedded in the xfs_inode, and it has numerous members that inode_init_always() is supposed to initialize. A quick look at this function reveals this:

120 struct inode *inode_init_always(struct super_block *sb, struct inode *inode)
121 {
126        struct address_space * const mapping = &inode->i_data;
128        inode->i_sb = sb;
150        if (security_inode_alloc(inode)) {
151                if (inode->i_sb->s_op->destroy_inode)
152                        inode->i_sb->s_op->destroy_inode(inode);
155                return NULL;
156        }
192 }

If security_inode_alloc() returns a non-zero value, inode_init_always() calls the superblock's destroy_inode operation (if one is set), which it reaches through the inode->i_sb pointer assigned at line 128. The function called at line 150 is part of security/security.c:

356 int security_inode_alloc(struct inode *inode)
357 {
358        inode->i_security = NULL;
359        return security_ops->inode_alloc_security(inode);
360 }
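The hook dispatch above can be modeled in userspace. This is only a minimal sketch: every name prefixed with demo_ is hypothetical, the structures are stand-ins rather than the kernel definitions, and calloc() takes the place of the slab allocator. It shows how a failed allocation inside the hook comes back to the caller as -ENOMEM.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct inode; only the field we need. */
struct demo_inode {
    void *i_security;
};

/* Hypothetical stand-in for the security operations table. */
struct demo_security_ops {
    int (*inode_alloc_security)(struct demo_inode *inode);
};

/* Set to 1 to simulate the allocator running out of memory. */
static int demo_force_oom;

/* Models an LSM hook: allocate a security blob or report -ENOMEM. */
static int demo_lsm_inode_alloc(struct demo_inode *inode)
{
    void *blob = demo_force_oom ? NULL : calloc(1, 32);

    if (blob == NULL)
        return -ENOMEM;
    inode->i_security = blob;
    return 0;
}

static const struct demo_security_ops demo_ops = {
    .inode_alloc_security = demo_lsm_inode_alloc,
};

static const struct demo_security_ops *demo_active_ops = &demo_ops;

/* Mirrors the shape of security_inode_alloc(): reset, then dispatch. */
static int demo_security_inode_alloc(struct demo_inode *inode)
{
    assert(inode != NULL);
    inode->i_security = NULL;
    return demo_active_ops->inode_alloc_security(inode);
}
```

With demo_force_oom set to 1, demo_security_inode_alloc() returns -ENOMEM and leaves i_security as NULL. In the kernel, that non-zero return is what sends inode_init_always() down its error path.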

The inode_alloc_security() hook called at line 359 has its SELinux implementation in security/selinux/hooks.c. It allocates space using kmem_cache_zalloc() and returns -ENOMEM if the allocation fails. Clearly this is unlikely to happen, but if there is a way to make the slab allocator run out of memory, inode_init_always() will call the filesystem's ->destroy_inode operation at line 152. For comparison, the generic destroy_inode() in fs/inode.c basically does this:

209 void destroy_inode(struct inode *inode) 
210 {
212        security_inode_free(inode);
217 }

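security_inode_free() in turn leads to an SELinux hook that frees the security buffer. The important part for XFS, though, is the ->destroy_inode call at line 152: as the comment in the patch explains, it passes down through the reclaim path and frees the XFS inode. Now, if we move back to xfs_inode_alloc(), we’ll see that it calls kmem_zone_free() when inode_init_always() fails. Since the inode was already freed through ->destroy_inode, this second kmem_zone_free() is a double free. On top of that, the reclaim path runs before ip->i_ino and ip->i_mount have been initialized, so it operates on uninitialized fields of the XFS inode. The patch for this bug moves the call to inode_init_always() after the xfs_inode has been initialized and drops the extra kmem_zone_free():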

-       /*
-        * initialise the VFS inode here to get failures
-        * out of the way early.
-        */
-       if (!inode_init_always(mp->m_super, VFS_I(ip))) {
-               kmem_zone_free(xfs_inode_zone, ip);
-               return NULL;
-       }
        /* initialise the xfs inode */
        ip->i_ino = ino;
        ip->i_mount = mp;

 #ifdef XFS_DIR2_TRACE
        ip->i_dir_trace = ktrace_alloc(XFS_DIR2_KTRACE_SIZE, KM_NOFS);
+       /*
+       * Now initialise the VFS inode. We do this after the xfs_inode
+       * initialisation as internal failures will result in ->destroy_inode
+       * being called and that will pass down through the reclaim path and
+       * free the XFS inode. This path requires the XFS inode to already be
+       * initialised. Hence if this call fails, the xfs_inode has already
+       * been freed and we should not reference it at all in the error
+       * handling.
+       */
+       if (!inode_init_always(mp->m_super, VFS_I(ip)))
+               return NULL;
+       /* prevent anyone from using this yet */
+       VFS_I(ip)->i_state = I_NEW|I_LOCK;

         return ip;
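The ownership problem that the patch fixes can be reproduced in a small userspace model. This is a sketch under stated assumptions, not kernel code: all demo_ names are hypothetical, the "free" routine only counts calls instead of releasing memory (so the double free can be observed without undefined behaviour), and a zeroed i_ino stands in for an uninitialized field.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an XFS inode; not the kernel layout. */
struct demo_xfs_inode {
    unsigned long i_ino;        /* 0 models "not yet initialised" */
    int           freed;        /* number of times the object was freed */
    unsigned long ino_at_free;  /* i_ino as seen by the first free */
};

/* Counts frees instead of releasing memory. */
static void demo_zone_free(struct demo_xfs_inode *ip)
{
    assert(ip != NULL);
    if (ip->freed == 0)
        ip->ino_at_free = ip->i_ino;
    ip->freed++;
}

/*
 * Models inode_init_always(): on an internal failure the destroy
 * callback runs and frees the object before 0 is returned.
 */
static int demo_init_always(struct demo_xfs_inode *ip, int fail)
{
    if (fail) {
        demo_zone_free(ip);     /* the ->destroy_inode path */
        return 0;
    }
    return 1;
}

/* Pre-patch ordering: VFS init first, and the caller frees on failure. */
static int demo_alloc_old(struct demo_xfs_inode *ip, unsigned long ino, int fail)
{
    if (!demo_init_always(ip, fail)) {
        demo_zone_free(ip);     /* second free of the same object */
        return -1;
    }
    ip->i_ino = ino;
    return 0;
}

/* Post-patch ordering: XFS fields first, no free in the error path. */
static int demo_alloc_new(struct demo_xfs_inode *ip, unsigned long ino, int fail)
{
    ip->i_ino = ino;
    if (!demo_init_always(ip, fail))
        return -1;              /* already freed by the callee */
    return 0;
}
```

Running both variants with fail set to 1 on a zeroed struct leaves freed at 2 for demo_alloc_old() and at 1 for demo_alloc_new(). In the new ordering the free path also sees the real inode number rather than the placeholder 0.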

Obviously, this vulnerability is not trivial to exploit, since you have to force the slab allocator to run out of memory at the moment the kernel is setting up a new XFS inode. Even if you can trigger this situation, exploitation is still complicated, since it requires a deep understanding of slab internals. But of course, everything is possible :-P

Written by xorl

April 17, 2009 at 02:32

Posted in linux, vulnerabilities
