CVE-2010-0405: bzip2 Integer Overflow

As we can read at Ubuntu’s USN-986-1 security advisory, the popular libbz2 open source bzip2 implementation is vulnerable in to an integer overflow vulnerability. So, first of all from the CHANGES file of the 1.0.6 version we can read that this vulnerability was reported by Mikolaj Izdebski and we can also notice the buggy code which resides in decompress.c source code file. Specifically, here is the code snippet that demonstrates the bug:

Int32 BZ2_decompress ( DState* s )
{
   UChar      uc;
   Int32      retVal;
   Int32      minLen, maxLen;
   bz_stream* strm = s->strm;
        ...
   Int32  es;
   Int32  N;
        ...
   /*restore from the save area*/
        ...
   es          = s->save_es;
   N           = s->save_N;
        ...
      while (True) {

         if (nextSym == EOB) break;

         if (nextSym == BZ_RUNA || nextSym == BZ_RUNB) {

            es = -1;
            N = 1;
            do {
               if (nextSym == BZ_RUNA) es = es + (0+1) * N; else
               if (nextSym == BZ_RUNB) es = es + (1+1) * N;
               N = N * 2;
               GET_MTF_VAL(BZ_X_MTF_3, BZ_X_MTF_4, nextSym);
            }
               while (nextSym == BZ_RUNA || nextSym == BZ_RUNB);


            es++;
            uc = s->seqToUnseq[ s->mtfa[s->mtfbase[0]] ];
            s->unzftab[uc] += es;

First of all, you can see that both variables ‘es’ and ‘N’ are 32-bit long signed integers. Now, the ‘DState’ pointer ‘s’ points to the save area in the main decompress code as we can read in its definition at bzlib_private.h header file. Next, moving to the code we can read that if the next symbol is the End-of-Block (aka. EOB), it will break of the loop; otherwise, it will check if the next ones value is set to ‘RUNA’ or ‘RUNB’ (these two constants are special values used by the run-length encoding to represent the run-length as a binary number greater than one). If this is the case, it will start calculating the ‘es’ and ‘N’ values and execute the MTF (Move to Front) transform to the symbol.
As Mikolaj Izdebski noticed, the ‘N’ value can result in overflowing the ‘es’ signed integer during the multiplication and thus leading to an integer overflow. To fix this, the following code was used:

             do {
+               /* Check that N doesn't get too big, so that es doesn't
+                  go negative.  The maximum value that can be
+                  RUNA/RUNB encoded is equal to the block size (post
+                  the initial RLE), viz, 900k, so bounding N at 2
+                  million should guard against overflow without
+                  rejecting any legitimate inputs. */
+               if (N >= 2*1024*1024) RETURN(BZ_DATA_ERROR);
                if (nextSym == BZ_RUNA) es = es + (0+1) * N; else
                if (nextSym == BZ_RUNB) es = es + (1+1) * N;

Which limits the ‘N’ value to 2097152 to avoid the overflow. After a couple of lines in the same function you can read the following code…

      /*-- Set up cftab to facilitate generation of T^(-1) --*/
      s->cftab[0] = 0;
      for (i = 1; i <= 256; i++) s->cftab[i] = s->unzftab[i-1];
      for (i = 1; i <= 256; i++) s->cftab[i] += s->cftab[i-1];
      for (i = 0; i <= 256; i++) {
         if (s->cftab[i] < 0 || s->cftab[i] > nblock) {
            /* s->cftab[i] can legitimately be == nblock */
            RETURN(BZ_DATA_ERROR);
         }
      }

      s->state_out_len = 0;
      s->state_out_ch  = 0;
      BZ_INITIALISE_CRC ( s->calculatedBlockCRC );
      s->state = BZ_X_OUTPUT;
      if (s->verbosity >= 2) VPrintf0 ( "rt+rld" );

      if (s->smallDecompress) {

         /*-- Make a copy of cftab, used in generation of T --*/
         for (i = 0; i <= 256; i++) s->cftabCopy[i] = s->cftab[i];

As you can see in the initial code snippet, the possibly negative ‘es’ value is used to update the ‘s->unzftab[uc]’ value which would lead to initializing a ‘s->cftab[]’ value with an incorrect negative integer in the first ‘for’ loop of the above code. To avoid this, another check was added here too like this:

       /*-- Set up cftab to facilitate generation of T^(-1) --*/
+      /* Check: unzftab entries in range. */
+      for (i = 0; i <= 255; i++) {
+         if (s->unzftab[i] < 0 || s->unzftab[i] > nblock)
+            RETURN(BZ_DATA_ERROR);
+      }
+      /* Actually generate cftab. */
       s->cftab[0] = 0;
       for (i = 1; i <= 256; i++) s->cftab[i] = s->unzftab[i-1];
       for (i = 1; i <= 256; i++) s->cftab[i] += s->cftab[i-1];
+      /* Check: cftab entries in range. */
       for (i = 0; i <= 256; i++) {
          if (s->cftab[i] < 0 || s->cftab[i] > nblock) {

The newly added ‘for’ loop checks that the 255 values of the ‘s->unzftab[]’ array aren’t either negative or greater than the block size (stored in ‘nblock’ variable). At last, another check was placed before the code that makes a copy of the ‘cftab’ as you can read above to avoid any read out-of-bounds situation.

       }
+      /* Check: cftab entries non-descending. */
+      for (i = 1; i <= 256; i++) {
+         if (s->cftab[i-1] > s->cftab[i]) {
+            RETURN(BZ_DATA_ERROR);
+         }
+      }
 
       s->state_out_len = 0;

Written by xorl

September 21, 2010 at 01:19

Posted in vulnerabilities

One Response

Subscribe to comments with RSS.

Ah, I was wondering about the internals of this bug. Glad to have you back on your post xorl! :) Thanks ;)

Gynvael Coldwind

September 21, 2010 at 12:29

xorl %eax, %eax