xorl %eax, %eax

Linux kernel Alsa (hda-intel) Division by Zero Crash

leave a comment »

This bug was discovered by Jody Bruchon who also provided a detailed bug report available in his site here. The susceptible code resides in sound/pci/hda/hda_intel.c and specifically in the function shown below.

static int bdl_pos_adj[SNDRV_CARDS] = {[0 ... (SNDRV_CARDS-1)] = -1};
 * Check whether the current DMA position is acceptable for updating
 * periods.  Returns non-zero if it's OK.
 * Many HD-audio controllers appear pretty inaccurate about
 * the update-IRQ timing.  The IRQ is issued before actually the
 * data is processed.  So, we need to process it afterwords in a
 * workqueue.
static int azx_position_ok(struct azx *chip, struct azx_dev *azx_dev)
        unsigned int pos;

        if (azx_dev->start_flag &&
            time_before_eq(jiffies, azx_dev->start_jiffies))
                return -1;      /* bogus (too early) interrupt */
        azx_dev->start_flag = 0;

        pos = azx_get_position(chip, azx_dev);
        if (chip->position_fix == POS_FIX_AUTO) {
                if (!pos) {
                               "hda-intel: Invalid position buffer, "
                               "using LPIB read method instead.\n");
                        chip->position_fix = POS_FIX_LPIB;
                        pos = azx_get_position(chip, azx_dev);
                } else
                        chip->position_fix = POS_FIX_POSBUF;

        if (!bdl_pos_adj[chip->dev_index])
                return 1; /* no delayed ack */
        if (pos % azx_dev->period_bytes > azx_dev->period_bytes / 2)
                return 0; /* NG - it's below the period boundary */
        return 1; /* OK, it's fine */

Let’s have a quick look at it. The first ‘if’ condition will check the existence of the “stream full start flag” as well as if the start jiffies is before the current jiffies which will result in immediate return with -1. Otherwise, it will update the “stream full start flag” and invoke azx_get_position() to retrieve the position buffer in the given device. If its position fix mode is set to ‘POS_FIX_AUTO’, it will issue a warning message using printk(), change it to ‘POS_FIX_LPIB’ and at last, call azx_get_position() to obtain the correct position buffer. In any other case, it will set its position fix flag to ‘POS_FIX_POSBUF’. After exiting this loop we have the more interesting segment. After checking that ‘bdl_pos_adj[]’ contains a value for the provided index it will attempt to calculate the period boundary and return either 0 or 1. However, as Jody Bruchon noticed, in some cases ‘azx_dev->period_bytes’ which represents the size of the period in bytes could be set to zero. As we can read in his bug report…

Using mp3blaster-3.2.5 (latest version) to play MP3 audio, I am able to
crash the kernel by stopping and restarting playback using the "5" key
repeatedly.  This happens as a normal user, not only as root.  Kernel
backtrace points to azx_position_ok() dividing by zero, so wrote a tiny
patch to investigate which reported via printk() values of pos and
azx_dev->period_bytes; on crash, both were 0.  The offending operation
does: if (pos % azx_dev->period_bytes > azx_dev->period_bytes / 2)
which obviously is the source of the crash.

Obviously, the division will result in a divide by zero situation if this is the case. To reproduce the bug, J. Bruchon did the following:

A small shell script or example program which triggers the problem (if possible)
mp3blaster-3.2.5 with repeated start/stop of playback consistently causes
this crash, without fail.  Just keep hitting "5" every three seconds.

And the patch was clearly…

        if (!bdl_pos_adj[chip->dev_index])
                return 1; /* no delayed ack */
+       if (azx_dev->period_bytes == 0) {
+               printk(KERN_WARNING
+                      "hda-intel: Divide by zero was avoided "
+                      "in azx_dev->period_bytes.\n");
+               return 0;
+       }
        if (pos % azx_dev->period_bytes > azx_dev->period_bytes / 2)

Which is a simple check against zero before moving on to the calculation process.

Written by xorl

February 22, 2010 at 09:59

Posted in bugs, linux

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s