CVE-2010-4260: ClamAV Multiple PDF Vulnerabilities

Continuing from the previous bug fixed in the 0.96.5 release of ClamAV, we have some other security bugs too. This time the buggy code can be found at libclamav/pdf.c which is the PDF parser of ClamAV.
Here are those bugs…

find_stream_bounds() read/write out of bounds

static int find_stream_bounds(const char *start, off_t bytesleft, off_t bytesleft2, off_t *stream, off_t *endstream)
{
    const char *q2, *q;
    if ((q2 = cli_memstr(start, bytesleft, "stream", 6))) {
        q2 += 6;
        if (q2[0] == '\xd' && q2[1] == '\xa')
            q2 += 2;
        if (q2[0] == '\xa')
            q2++;
        *stream = q2 - start;
        bytesleft2 -= q2 - start;
        if (bytesleft2 < 0)
            return 0;
        q = q2;
        q2 = cli_memstr(q, bytesleft2, "endstream", 9);
        if (!q2)
            q2 = q + bytesleft2-9; /* till EOF */
        *endstream = q2 - start;
        if (*endstream < *stream)
            *endstream = *stream;
        return 1;
    }
    return 0;
}

From its name you can easily understand its purpose. It uses cli_memstr() to find the beginning of a PDF stream from the ‘bytesleft’ offset. When it finds it, it increments ‘q2’ pointer by 6 to point after “stream” and checks for “0x0D 0x0A” sequences which is just an “\r\n” line breaking sequence. If this is the case, it skips it by incrementing ‘q2’ by two and if it deals with a single ‘\n’ character it increments it by one.
Finally, it uses that value to calculate a temporary offset of the Bytes left (stored in ‘bytesleft2’) and it’s checking it for possible integer underflows because of the subtraction.
A similar approach is used for the “endstream” and the equivalent pointers are updated.
The problem here arises if a “stream” is the last thing contained in the PDF file. This means that after incrementing ‘q2’ by 6, it will point to the end of file. The subsequent operations lead to read/write out of bounds and to fix this the following patch was applied.

        q2 += 6;
-       if (q2[0] == '\xd' && q2[1] == '\xa')
+       bytesleft -= q2 - start;
+       if (bytesleft < 1)
+           return 0;
+       if (bytesleft >= 2 && q2[0] == '\xd' && q2[1] == '\xa')
            q2 += 2;

What it does is immediately updating ‘byteleft’ after hitting a “stream” to check that it hasn’t reached then end of the file.

find_length() NULL Pointer Dereference

The next vulnerability resides in find_length() which is also pretty self-explanatory.

static int find_length(struct pdf_struct *pdf,
                       struct pdf_obj *obj,
                       const char *start, off_t len)
{
    int length;
    const char *q;
    q = cli_memstr(start, len, "/Length", 7);
  ...
    while (isdigit(*q)) q++;
    if (*q == ' ') {
  ...
        while(isdigit(*q)) q++;
        if (q[0] == ' ' && q[1] == 'R') {
            cli_dbgmsg("cli_pdf: length is in indirect object %u %u\n", length, genid);
            obj = find_obj(pdf, obj, (length << 8) | (genid&0xff));
            if (!obj) {
                cli_dbgmsg("cli_pdf: indirect object not found\n");
                return 0;
            }
            q = pdf_nextobject(pdf->map+obj->start, pdf->size - obj->start);
            length = atoi(q);
        }
    }
  ...
    return length;
}

This is a NULL pointer dereference since the function that retrieves the next PDF object returns NULL if it fails to do so. The fix for this bug is:

            q = pdf_nextobject(pdf->map+obj->start, pdf->size - obj->start);
+           if (!q) {
+               cli_dbgmsg("cli_pdf: next object not found\n");
+               return 0;
+           }
            length = atoi(q);
        }

Which checks the return value of pdf_nextobject().

cli_pdf() Incorrect Return Value

Here is a code snippet…

int cli_pdf(const char *dir, cli_ctx *ctx, off_t offset)
{
    struct pdf_struct pdf;
    fmap_t *map = *ctx->fmap;
    size_t size = map->len - offset;
    off_t versize = size > 1032 ? 1032 : size;
    off_t map_off, bytesleft;
    long xref;
  ...
    rc = run_pdf_hooks(&pdf, PDF_PHASE_PRE, -1, -1);
    if (rc) {
        cli_dbgmsg("cli_pdf: returning %d\n", rc);
        return rc == CL_BREAK ? CL_CLEAN : rc;
  ...
    }
    cli_dbgmsg("cli_pdf: returning %d\n", rc);
    free(pdf.objs);
    return rc;
}

Although the correct way is to check for ‘CL_BREAK’ which means that the operation was aborted, the routine will just return the value contained in ‘rc’ even in cases of abort. To fix this, it was patched like this:

     free(pdf.objs);
-    return rc;
+    /* PDF hooks may abort, don't return CL_BREAK to caller! */
+    return rc == CL_BREAK ? CL_CLEAN : rc;
 }

In order to return ‘CL_CLEAN’ on such cases.

Written by xorl

December 6, 2010 at 00:08

Posted in vulnerabilities

2 Responses

Subscribe to comments with RSS.

Nice writeup as usual. One minor correction: should be 0x0d and 0x0a, not 0xd0 and 0xa0.

Jordan

December 6, 2010 at 00:49
Oh… right, thanks :)

xorl

December 6, 2010 at 00:52

xorl %eax, %eax