I must say I am surprised, stunned even. Is there anyone out there using Clang for any C++? For I have found a bug in the compiler. But not a subtle, use-the-very-latest-features kind of bug! No! The bug is triggered by plain old C++, and is probably the result of two different optimizations interacting.

The bug

Here is some sample code to demonstrate it:

// File test_assign_minim.cpp
struct A
{
  A() : a(0), b(true) { }
  long a;
  bool b;
};

struct B : public A
{
  B() : A(), c(0xDEAD) { }
  unsigned short c;
};

void assign(A& dst, const A& src)
{
  dst = src;
}

int main()
{
  A a;
  B b;
  assign(b, a);
  // Exit code is 1 if the assignment clobbered b.c, 0 otherwise.
  return (b.c != 0xDEAD);
}

If you build this program with a buggy compiler (any version of clang will do), the program returns a non-zero exit code. The simplest way to test:

$ clang++ -o test_assign_minim test_assign_minim.cpp
$ ./test_assign_minim; echo $?

This should print "1" if the error is there, "0" otherwise.


So what happens? First, we will assume a few things (that hold true for Intel platforms and the clang compiler):

  1. An n-byte value must be aligned on an n-byte boundary, up to the word size. In other words, a 2-byte integer must be at an even address and a 4-byte integer must be at an address that is a multiple of 4, but a 32-byte value only needs to be at a multiple of 8 on a 64-bit machine and a multiple of 4 on a 32-bit machine.
  2. "long" is the size of a word (i.e. 4 bytes on a 32-bit machine and 8 bytes on a 64-bit machine)
  3. "short" is 16 bits (or 2 bytes)
  4. "bool" is 1 byte long
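
These assumptions are easy to check for a given compiler and platform. The following is only a minimal sketch, assuming a C++11 compiler (for alignof); the helper name check_assumptions is hypothetical and not part of the original test case.

```cpp
#include <cassert>

// Hypothetical helper: aborts (via assert) if one of the four assumptions
// listed above does not hold for the current compiler and platform.
void check_assumptions()
{
  // 1. Small values are aligned on their own size.
  assert(alignof(short) == sizeof(short));
  assert(alignof(long) == sizeof(long));
  // 2. "long" has the size of a machine word (approximated here by the
  //    size of a pointer).
  assert(sizeof(long) == sizeof(void*));
  // 3. "short" is 2 bytes.
  assert(sizeof(short) == 2);
  // 4. "bool" is 1 byte.
  assert(sizeof(bool) == 1);
}
```

Calling check_assumptions() at the start of main is enough. Note that assumption 2 does not hold everywhere: on 64-bit Windows, for instance, "long" is 4 bytes, so the second group of checks would fail there.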

A first consequence is that the size of a structure must be a multiple of its largest alignment requirement. Indeed, suppose you have a 4-byte integer in your structure. This value always needs to be at an address that is a multiple of 4. Within the structure, the compiler will make sure that the variable starts at a multiple of 4 from the start of the structure. But this means the structure itself must be 4-aligned. Because in arrays the address of the next element is obtained by incrementing the pointer by the size of the structure, the structure's size must also be a multiple of the required alignment. So, on a 32-bit machine (I am lazy, but it would be almost the same on a 64-bit one), the structure A looks like this:

struct A:
0 1 2 3  4  5 6 7
| a      | b|//////|

Where "//" is some padding. From there, if you were to write an assignment operator, you might be tempted (and it seems the clang team was) by something like:

A& A::operator=(const A& other)
{
  // Copies every byte of A, trailing padding included.
  memcpy(this, &other, sizeof(A));
  return *this;
}

That's really fast, it works whatever the structure is, and you don't care about padding and other nasty features. Now, let's look at the structure B:

struct B:
0 1 2 3  4  5  6 7
| a      | b|//| c  |

To optimize for space, the compiler placed the extra variable c in part of the padding, which has the nice property that B is no bigger than A (i.e. you got a variable "for free"). Using the above assignment operator, you can see the problem: when the destination is really a B, the garbage in the padding at the end of the source A is copied over c.
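
The effect can be reproduced by hand, without relying on any compiler bug. The sketch below is an illustration under stated assumptions, not the code clang generated: B gets an extra constructor argument so that the bytes of c are known in both objects, and naive_assign, memberwise_assign and the other helper names are hypothetical. It assumes an ABI that reuses the tail padding of a base class (such as the one used by g++ and clang on Linux); elsewhere c simply sits after A and nothing is clobbered.

```cpp
#include <cstddef>
#include <cstring>

struct A
{
  A() : a(0), b(true) { }
  long a;
  bool b;
};

struct B : public A
{
  explicit B(unsigned short v) : A(), c(v) { }
  unsigned short c;
};

// Offset of B::c from the start of a B object, measured on a live object.
std::size_t offset_of_c()
{
  B b(0);
  return static_cast<std::size_t>(
      reinterpret_cast<const char*>(&b.c) -
      reinterpret_cast<const char*>(&b));
}

// True when c lives inside the first sizeof(A) bytes of B, i.e. inside
// what is padding in a standalone A.
bool c_is_in_tail_padding()
{
  return offset_of_c() < sizeof(A);
}

// The tempting assignment: copies sizeof(A) bytes, padding included.
void naive_assign(A& dst, const A& src)
{
  std::memcpy(static_cast<void*>(&dst),
              static_cast<const void*>(&src), sizeof(A));
}

// Member-by-member assignment: only touches the members of A.
void memberwise_assign(A& dst, const A& src)
{
  dst.a = src.a;
  dst.b = src.b;
}

// Value of dst.c after a naive copy from a source whose c is 0xBEEF.
unsigned short c_after_naive_assign()
{
  B dst(0xDEAD);
  B src(0xBEEF);
  naive_assign(dst, src);
  return dst.c;
}

// Value of dst.c after a member-by-member copy from the same source.
unsigned short c_after_memberwise_assign()
{
  B dst(0xDEAD);
  B src(0xBEEF);
  memberwise_assign(dst, src);
  return dst.c;
}
```

When c_is_in_tail_padding() is true, c_after_naive_assign() should come back as the source's 0xBEEF instead of 0xDEAD, while c_after_memberwise_assign() keeps 0xDEAD in every case. That is the difference between copying the bytes of A and copying the members of A.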


The easy way around this is to avoid ending your data structure with a small object. Just make sure the biggest object is the last one. In the case of A, you would end up with this structure:

struct A1:
0  1 2 3  4 5 6 7
| b|//////| a      |

which has no padding at the end, so the new B would be:

struct B1:
0  1 2 3  4 5 6 7  8 9  10 11
| b|//////| a      | c  |////|

On the downside, the structure is now bigger, but there won't be any problem.
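
A quick way to convince yourself that the reordered layout is safe is to measure where c ends up. This is a minimal sketch with hypothetical helper names; as before, B1 takes a constructor argument (not in the original) so the test values are known.

```cpp
#include <cstddef>
#include <cstring>

// Reordered variant of A: the largest member comes last, so the structure
// has no padding at its end.
struct A1
{
  A1() : b(true), a(0) { }
  bool b;
  long a;
};

struct B1 : public A1
{
  explicit B1(unsigned short v) : A1(), c(v) { }
  unsigned short c;
};

// Offset of B1::c from the start of a B1 object, measured on a live object.
std::size_t offset_of_c_in_B1()
{
  B1 b(0);
  return static_cast<std::size_t>(
      reinterpret_cast<const char*>(&b.c) -
      reinterpret_cast<const char*>(&b));
}

// Copies sizeof(A1) bytes over the A1 part of a B1 and returns the value
// of c afterwards.
unsigned short c_after_bytewise_copy()
{
  B1 dst(0xDEAD);
  B1 src(0xBEEF);
  std::memcpy(static_cast<void*>(static_cast<A1*>(&dst)),
              static_cast<const void*>(static_cast<const A1*>(&src)),
              sizeof(A1));
  return dst.c;
}
```

Since A1 ends with its largest member, c can only be placed at or after sizeof(A1), so even a byte-wise copy of the A1 part leaves it at 0xDEAD. The price is the one already mentioned: sizeof(B1) is larger than sizeof(A1).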


It took me quite a long time to find the bug. In my case, the extra variable was a reference counter, which ended up reset to a random value every time such an assignment happened. Structures were deleted many times, reference counts reached incomprehensible values... and of course I never suspected a compiler error.

I think this is a very simple error. I am really puzzled to be the first one to find it (or at least to report it). The only reason I can think of is that very few teams are actually using this compiler for C++.

Of course, I have submitted a bug report here: Bug report 13329. Now, let's hope they will correct these small issues, as I would really like to test my code on a different compiler, if only to ensure I am not relying on bugs or extensions of g++.



After I submitted the bug report and sent an email to the developer mailing list, the bug was fixed. The response was very quick, and the compiler now generates valid code for my program. The patch still has to be integrated into the trunk. But thanks to the clang team for being so responsive.