xorl %eax, %eax

ClamAV cli_url_canon() Buffer Overflow

leave a comment »

This was reported by Nigel and disclosed on 8 April 2009. Once again, no official advisory was released. This is an easy to spot bug found at libclamav/phishcheck.c which is the phising detection routines based on URL spoofing detection methods. Those are written by Török Edvin. This bug can be found on ClamAV prior to 0.95.1 release. Here we are:

1225 #define URL_MAX_LEN 1024
1226 #define COMPONENTS 4
1227 int cli_url_canon(const char *inurl, size_t len, char *urlbuff, size_t   
1227 dest_len, char **host, size_t *hostlen, const char **path, size_t *pathlen)
1228 {
1229         char *url, *p, *last;
1230         char *host_begin, *path_begin;
1231         const char *urlend = urlbuff + len;
1232         size_t host_len, path_len;
1233
1234         dest_len -= 3;
1235         strncpy(urlbuff, inurl, dest_len);


As its name implies this function is used to canonicalize URLs to proceed with the phising detection. Keep an eye on urlend at line 1231 which is supposed to have a pointer at the end of the URL buffer and also dest_len which you can see at lines 1234-1235. Anyway, the urlbuff copied at line 1235 is then pointed by this pointer:

1237         url = urlbuff;
1238
1239         /* canonicalize only real URLs, with a protocol */
1240         host_begin = strchr(url, ':');


And here is the cool part:

1282         p = host_begin;
1283         while (p < urlend && p+2 < url + dest_len) {
    ...
1286                 /* convert non-ascii characters back to % escaped */
1287                 const char hexchars[] = "0123456789ABCDEF";
1288                 memmove(p+3, p+1, urlend - p - 1);


Do you see it? The while loop at line 1283 iterates as long as those two conditions remain true:
1) host_begin has not reached the URL end
2) host_begin + 2 is not longer than the URL
However, the URL end might be longer than the end of the URL buffer allocated. If this is the case, then memmove() at line 1288 will overwrite data beyond the bounds of the buffer and lead to a common buffer overflow condition. This was patched by adding the missing check:

        p = host_begin;
-       while (p < urlend && p+2 < url + dest_len) {
+       while (p < urlend && p+2 < url + dest_len && urlend < urlbuff+dest_len)
{
            unsigned char c = *p;


But a couple of lines later in the same function another bug is present. This is while determining the end of the given hostname:

1297         urlend = p;
1298         len = urlend - url;
1299         /* determine end of hostname */
1300         host_len = strcspn(host_begin, ":/?");
1301         path_begin = host_begin + host_len;
1302         if(host_len < len) {
1303                 /* url without path, use a single / */
1304                 memmove(path_begin + 2, path_begin + 1, len - host_len);   
1305                 *path_begin++ = '/';
1306                 *path_begin++ = '';


If host_len is less than len it could miss the last character since host_len should normaly be exactly the same as len (which is calculated at line 1298). Again, the copies being perfomed at lines 1304-1306 might lead to incorrect parsing. This was simply fixed like this:

        path_begin = host_begin + host_len;
-       if(host_len < len) {
+       if(host_len <= len) {
                /* url without path, use a single / */


Finally, if you wonder where is this function being called from. Here is the answer to your question:

1329 static int url_hash_match(const struct regex_matcher *rlist,
1329 const char *inurl, size_t len)
1330 {
        ...
1352         rc = cli_url_canon(inurl, len, urlbuff, sizeof(urlbuff),   
1352 &host_begin, &host_len, &path_begin, &path_len);
1353         if (rc == CL_PHISH_CLEAN)
        ...
1413 }


The above is also from libclamav/phishcheck.c. And another one from the test routines at unit_test/check_regex.c:

453 START_TEST (test_url_canon)
454 {
455     char urlbuff[1024+3];
456     char *host = NULL;
457     const char *path = NULL;
458     size_t host_len, path_len;
459     struct uc *u = &uc[_i];
460
461     cli_url_canon(u->in, strlen(u->in), urlbuff, sizeof(urlbuff),     
461  &host, &host_len, &path, &path_len);
462     fail_unless(!!host && !!path, "null results\n");
463     fail_unless_fmt(!strcmp(u->host, host), "host incorrect: %s\n", host);
464     fail_unless_fmt(!strcmp(u->path, path), "path incorrect: %s\n", path);
465 }
466 END_TEST


Have fun!

Written by xorl

April 12, 2009 at 00:09

Posted in bugs

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s