xorl %eax, %eax

GCC Confuses Me

I was reversing some code a couple of hours ago and I noticed something weird. The same array element assignment was written in two different notations, which should normally generate exactly the same machine code, since C99 (section 6.5.2.1) defines E1[E2] as identical to (*((E1)+(E2))). To check this, I have this simple application:

int
main(void)
{
   char buf[2];
   buf[1] = 'A';
   return 0;
}

Let’s name this a1.c. The second one, which we’ll call a2.c, differs only in the notation used for the assignment. Here is a2.c:

int
main(void)
{
   char buf[2];
   (*((buf)+(1))) = 'A';
   return 0;
}

We compile both of them with GCC’s defaults (I used GCC 4.1.1-21 on Debian), that is, with no optimization or other additional options:

sh-3.1$ gcc a1.c -o a1 && gcc a2.c -o a2
sh-3.1$

Now, let’s see the difference. I opened a1 in GDB and here is the assignment operation:

0x08048335 <main+17>:   movb   $0x41,0xfffffffb(%ebp)

Simple, isn’t it? It just moves the byte 0x41 (the ASCII character ‘A’) to the offset from EBP where the array element lives. Now, the same assignment in a2 was translated as:

0x08048335 <main+17>:   lea    0xfffffffa(%ebp),%eax
0x08048338 <main+20>:   inc    %eax
0x08048339 <main+21>:   movb   $0x41,(%eax)

As you can see, a1’s assembly is clearly more efficient! In the a2 binary there are two additional instructions before the assignment itself:

lea    0xfffffffa(%ebp),%eax    ; EAX = EBP-6, the address of buf[0]
inc    %eax                     ; EAX = EBP-5, the address of buf[1]
movb   $0x41,(%eax)             ; Store 'A' at the address held in EAX

Clearly, the first notation produces more compact code: the second one adds two redundant instructions, LEA (3 bytes) and INC (1 byte). The MOVB through (%eax) needs no displacement byte, so it is one byte shorter than the EBP-relative form, and the whole sequence grows from 4 bytes to 7. I haven’t tried this with GCC’s -O options to check for any difference, but I think both sources should then translate to the same code, since according to C99 the two notations are exactly equivalent.

Written by xorl

January 20, 2009 at 18:56

Posted in C programming
