Nested Designated Initializers: When Things Go Sideways
I was playing with my TCP stack and wondered what would happen if I initialized my TCB struct with nested designated initializers and some strange tricks, which led to a tragedy.
struct TCB *ctrl_block = malloc(sizeof(struct TCB));
*ctrl_block = (struct TCB){
    .state = SYNRECVD,
    .send = {
        .iss = A_RANDOM_NUMBER,
        .una = 0,
        .nxt = ctrl_block->send.iss + 1, // <== DANGER! DANGER!
        ...
    },
    ...
};
Simplified Example
Okay. Now that you know what I’m going to talk about, let’s go over the C standard specification and a simplified implementation to see what occurs in compilers, why I felt it was a good idea to write that code in the first place, and what goes wrong. I’ll begin with a single designator:
struct foo {
    int first;
    int second;
};

int bar()
{
    struct foo instance = {
        .first = 20,
        .second = instance.first << 2, // <== is this OK?
    };
    return instance.second;
}
Here’s what C23 has to say about this code:
C23 § 6.7.10 ¶ 18
- Each brace-enclosed initializer list has an associated current object. When no designations are present, subobjects of the current object are initialized in order according to the type of the current object: array elements in increasing subscript order, structure members in declaration order, and the first named member of a union. In contrast, a designation causes the following initializer to begin initialization of the subobject described by the designator. Initialization then continues forward in order, beginning with the next subobject after that described by the designator.
Alright, we’re good! This means that if we follow the order correctly, everything should work for a single designator. We can use godbolt to validate the assembly generated by GCC 14.1: [link to godbolt]
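As you can see, the final assembly stores 20 into instance.first and then computes instance.second from it without issue, just as we expected. One caveat: the standard also says that the evaluations of the initialization list expressions are indeterminately sequenced with respect to one another, so treat this as what GCC does rather than as an ironclad guarantee.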
Spice it Up with Nesting
Now that we know a single designator can use its subobject to define another member while initializing and keeping the members in order, should we try the same thing with nested designators? I assume you said yes because you’re here, after all!
struct inner_foo {
    int first;
    int second;
};

struct foo {
    struct inner_foo inner;
};

int bar()
{
    struct foo instance;
    instance = (struct foo){
        .inner = {
            .first = 20,
            .second = instance.inner.first, // <== is this OK?
        },
    };
    return instance.inner.second;
}
If you continue reading the C23 specification, you will come across an interesting paragraph:
C23 § 6.7.10 ¶ 19
- Each designator list begins its description with the current object associated with the closest surrounding brace pair. Each item in the designator list (in order) specifies a particular member of its current object and changes the current object for the next designator (if any) to be that member. The current object that results at the end of the designator list is the subobject to be initialized by the following initializer.
This is bad news! The question is whether the variable being assigned to (instance, in this case) counts as “the current object” of the initializer list. The specification never says it does, so reading it from inside the initializer lands us in undefined behavior. But what does GCC do in this case?
# note:
# [rbp-8] = instance.inner.first
# [rbp-4] = instance.inner.second
mov eax, DWORD PTR [rbp-8] # isn't inner.first uninitialized?
mov DWORD PTR [rbp-8], 20 # now decided to put something inside of it?!
mov DWORD PTR [rbp-4], eax # moves the garbage value from eax to inner.second?!!
If you refer back to what the specification says, you may agree that the assembly GCC produces is logical: you get the behavior you asked for, but you asked for the wrong behavior! The initializer list fills in the compound literal, not the variable, so instance.inner.first still refers to the uninitialized object declared by struct foo instance;. You are reading an uninitialized variable, which is undefined behavior! Isn’t it beautiful?
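To watch the same mechanism without relying on undefined behavior, here is a minimal sketch that gives instance known sentinel values before the assignment. The helper name assign_with_self_reference and the sentinel -1 are my own choices for illustration; the struct definitions are the ones from above.

```c
struct inner_foo {
    int first;
    int second;
};

struct foo {
    struct inner_foo inner;
};

/* Hypothetical helper: performs the self-referencing assignment and
 * returns the resulting struct so the caller can inspect it. */
struct foo assign_with_self_reference(void)
{
    /* Known sentinel values, so every read below is of an initialized object. */
    struct foo instance = { .inner = { .first = -1, .second = -1 } };

    /* The initializer list fills an unnamed compound literal. `instance`
     * is only overwritten once that literal has been fully evaluated,
     * so the read below sees the old sentinel, not 20. */
    instance = (struct foo){
        .inner = {
            .first = 20,
            .second = instance.inner.first,
        },
    };
    return instance;
}
```

Calling assign_with_self_reference() should hand back first == 20 and second == -1: the self-reference picks up whatever instance held before the assignment, which in the original snippet was nothing at all.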
Go Crazy with volatile
Now that we understand what GCC does, you may think, “Okay, the compiler stores the current object value inside of a register before doing the assignments, but what if we put a volatile right before the inner.first?”
struct inner_foo {
    volatile int first;
    int second;
};
This should prompt the compiler to load inner.first before the assignment, right? It turns out this is not the case. To understand why just inserting volatile before inner.first does not deceive the compiler, we must first understand how volatile works in this context. Remember this tip:
Any object that is guaranteed to be local given TBAA and escape analysis is subject to whatever the compiler chooses, but if aliases or pointers to these objects escape, volatile actually applies.
By escaping, we mean that the address of an object leaves the scope in which it was declared. So you might think that a simple function would do the job:
void __attribute__((noinline)) nop(volatile void *ptr)
{
    return;
}

nop(&instance.inner.first);
Unfortunately, we come to a dead end in our rabbit hole. The compiler is smarter than I am: it keeps behaving the same way for the initializer, even though it reloads the variable every time it is read elsewhere in the code.
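For completeness, the boring way out of the rabbit hole is to stop reading the object from inside its own initializer. Below is a minimal sketch of the TCB setup from the top of the post, assuming the initial sequence number is handed in as a plain value. The names enum tcb_state, struct send_vars and tcb_create are hypothetical stand-ins, and only the fields that appear in the first snippet are included; the elided ones are left out.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the real definitions, reduced to the
 * fields visible in the original snippet. */
enum tcb_state { CLOSED, SYNRECVD };

struct send_vars {
    uint32_t iss; /* initial send sequence number */
    uint32_t una; /* oldest unacknowledged sequence number */
    uint32_t nxt; /* next sequence number to send */
};

struct TCB {
    enum tcb_state state;
    struct send_vars send;
};

/* Allocates a TCB and derives send.nxt from the `iss` parameter
 * instead of from the half-built object. Returns NULL if malloc fails. */
struct TCB *tcb_create(uint32_t iss)
{
    struct TCB *ctrl_block = malloc(sizeof *ctrl_block);
    if (ctrl_block == NULL)
        return NULL;

    *ctrl_block = (struct TCB){
        .state = SYNRECVD,
        .send = {
            .iss = iss,
            .una = 0,
            .nxt = iss + 1, /* reads the parameter, never ctrl_block */
        },
    };
    return ctrl_block;
}
```

Because every initializer expression only reads iss, the order in which the compiler evaluates them no longer matters. The caller owns the returned pointer and is expected to free() it.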