Given the `untyped_sequence` and `int_sequence` below:

    typedef struct {
        void* data;       // first item
        size_t size;      // number of items
        size_t item_size; // item byte size
    } untyped_sequence;

    typedef struct {
        int* data;        // first int
        size_t size;      // number of ints
        size_t item_size; // int byte size
    } int_sequence;

QUESTION: Is it UB to put them as two union members, initialize an instance of that union using the `int_sequence` member, then mutate the `int` data using the `untyped_sequence` member?
- If yes - why?
- If no - why?
GCC, Clang and MSVC give no warnings about this, but that doesn't necessarily mean anything.
Minimal runnable example (https://godbolt./z/PT6ahh4qq):

    #include <string.h>
    #include <stdio.h>

    typedef struct {
        void* data;       // first item
        size_t size;      // number of items
        size_t item_size; // item byte size
    } untyped_sequence;

    typedef struct {
        int* data;        // first int
        size_t size;      // number of ints
        size_t item_size; // int byte size
    } int_sequence;

    typedef union {
        int_sequence typed;
        untyped_sequence untyped;
    } sequence;

    void untyped_zero_first(untyped_sequence untyped) {
        memset(untyped.data, 0, untyped.size * untyped.item_size);
    }

    int main(void) {
        int ints[4] = {1, 2, 3, 4};
        sequence s = {
            .typed.data = ints,
            .typed.size = 4,
            .typed.item_size = sizeof(int)
        };
        untyped_zero_first(s.untyped);
        // prints "0, 0, 0, 0" for GCC, Clang, MSVC - but is it UB?
        printf("%d, %d, %d, %d\n", ints[0], ints[1], ints[2], ints[3]);
    }

asked Nov 17, 2024 at 14:34 by Johann Gerell

Comments:
- For me, in this case, I see no value in that union. `void *` can just be converted to an `int *` easily. – KamilCuk Commented Nov 17, 2024 at 15:25
- @Johann, since a `void *` and an `int *` may differ in size, the code risks UB. Consider `void *` not fully well defined when `int *` is smaller. – chux Commented Nov 17, 2024 at 15:26
- Although such architectures are uncommon, `untyped_sequence` and `int_sequence` could differ in size. – chux Commented Nov 17, 2024 at 15:39
- @KamilCuk: I agree, as far as the example goes. But this is a minimal example of a much, much bigger scenario where it adds a lot of value. – Johann Gerell Commented Nov 17, 2024 at 16:23
- @JohannGerell The tricky part about UB is that compilers use that excuse to make efficient code. Even if a compilation will emit desired functionality, a new compiler version (or perhaps with more optimizations enabled) may now do undesirable, yet efficient things. Best to avoid UB. – chux Commented Nov 17, 2024 at 18:22

2 Answers

Is this union pointer member type punning UB in C?
Yes, in that the language spec does not define the behavior (as opposed to explicitly declaring it undefined).
Unlike C++, C does not have a sense of an "active" member of a union. Accessing a different member than was initialized or last stored does not, in and of itself, produce undefined behavior. Since C17, the behavior is not even implementation-defined. You can just do it, which involves (as a note in the spec clarifies) reinterpreting the appropriate part of the stored value according to the type of the accessed member.
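
To make the "reinterpreting the appropriate part of the stored value" wording concrete, here is a minimal sketch of plain union punning between two non-pointer types. The helper name `float_bits` is hypothetical, and the sketch assumes `float` is 32 bits wide (as on IEEE-754 platforms); it is an illustration, not part of the question's code.

```c
#include <stdint.h>

/* Assumption: float occupies exactly 32 bits on this platform. */
_Static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits");

/* Hypothetical helper: store a float, then read the same bytes as uint32_t. */
static uint32_t float_bits(float f) {
    union { float f; uint32_t u; } pun;
    pun.f = f;     /* store through one member */
    return pun.u;  /* read through the other: the bytes are reinterpreted */
}
```

On an IEEE-754 binary32 platform, `float_bits(1.0f)` yields `0x3f800000`. The read is unproblematic here because `uint32_t` has no padding bits, so every bit pattern is a valid value; that property is exactly what is not guaranteed for pointer types.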
But in your particular case, that's not enough. C does not require that the size and representation of type `void *` be the same as the size and representation of type `int *`. As far as the spec is concerned, there is no telling, at the point where your example code calls `untyped_zero_first(s.untyped)`, what `s.untyped.data` points to. It might even be a trap representation if your implementation's `void *` representation affords those.
In practice, you're unlikely to run into a modern platform in which different object pointer types in fact do have different size or representation, so your code is likely to work as intended, but C does not guarantee that.
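
One way to sidestep the representation question is to convert the pointer instead of punning it, since converting `int *` to `void *` is always well defined. The sketch below reuses the two struct types from the question; the helpers `as_untyped` and `zero_and_sum` are hypothetical names introduced only for illustration.

```c
#include <stddef.h>
#include <string.h>

typedef struct {
    void  *data;       /* first item */
    size_t size;       /* number of items */
    size_t item_size;  /* item byte size */
} untyped_sequence;

typedef struct {
    int   *data;       /* first int */
    size_t size;       /* number of ints */
    size_t item_size;  /* int byte size */
} int_sequence;

/* Hypothetical helper: build the untyped view member by member. */
static untyped_sequence as_untyped(int_sequence typed) {
    untyped_sequence u = {
        .data = typed.data,  /* implicit int * -> void * conversion */
        .size = typed.size,
        .item_size = typed.item_size
    };
    return u;
}

static void untyped_zero_first(untyped_sequence untyped) {
    memset(untyped.data, 0, untyped.size * untyped.item_size);
}

/* Hypothetical demo: zero four ints through the untyped view, return their sum. */
static int zero_and_sum(void) {
    int ints[4] = {1, 2, 3, 4};
    int_sequence typed = { .data = ints, .size = 4, .item_size = sizeof(int) };
    untyped_zero_first(as_untyped(typed));
    return ints[0] + ints[1] + ints[2] + ints[3];
}
```

The cost is one small struct copy per call. In exchange, the code no longer depends on `void *` and `int *` sharing a size and representation.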

- Punning the pointers and the other fields through the union is implementation-defined.
Union Type-Punning Exception (C11, Section 6.5.2.3, Paragraph 3):
"A pointer to a union object, suitably converted, points to each of its members (or if a member is a bit-field, to the unit in which it resides), and vice versa."
"If the member used to access the contents of a union object is not the same as the member last stored into, the behavior is implementation-defined."
- Using the pointers (it may invoke UB)
Effective Type Rule (C11, Section 6.5, Paragraph 7):
"An object shall have its stored value accessed only by an lvalue expression that has one of the following types:
- a type compatible with the effective type of the object,
- a qualified version of a type compatible with the effective type of the object,
- a type that is the signed or unsigned type corresponding to the effective type of the object,
- a character type."
Strict Aliasing Rule (C11, Section 6.5, Paragraph 7):
- "An object shall have its stored value accessed only by an lvalue expression that has one of the following types: a type compatible with the effective type of the object..."
Answering in a few words:
- Union type punning is implementation-defined.
- Using the pointers depends on the referenced objects and the pointer types. It may invoke undefined behaviour (UB).
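
The effective type rule quoted above can be illustrated with a small sketch: access through a character type is always permitted, while reading an `int` object through a `float` lvalue is not. The helper names are hypothetical, and the `memcpy` variant assumes `float` and `int` have the same size.

```c
#include <stddef.h>
#include <string.h>

/* Assumption: float and int have the same size on this platform. */
_Static_assert(sizeof(float) == sizeof(int), "float and int sizes must match");

/* Allowed: any object may be inspected through a character type. */
static size_t count_zero_bytes(const void *obj, size_t n) {
    const unsigned char *bytes = obj;
    size_t zeros = 0;
    for (size_t i = 0; i < n; ++i) {
        if (bytes[i] == 0) {
            ++zeros;
        }
    }
    return zeros;
}

/* Not allowed: the object has effective type int, the lvalue has type float.
 *
 *     float bad(int *p) { return *(float *)p; }   // UB under 6.5p7
 *
 * A defined alternative copies the bytes into an object of the target type. */
static float int_bytes_as_float(int value) {
    float out;
    memcpy(&out, &value, sizeof out);
    return out;
}
```

The commented-out `bad` function is the same kind of access that the `float` sequence member performs in the example that follows.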
Example with and without UB, assuming the correctness of the implementation:

    #include <string.h>
    #include <stdio.h>

    typedef struct {
        void* data;       // first item
        size_t size;      // number of items
        size_t item_size; // item byte size
    } untyped_sequence;

    typedef struct {
        int* data;        // first int
        size_t size;      // number of ints
        size_t item_size; // int byte size
    } int_sequence;

    typedef struct {
        float* data;      // first float
        size_t size;      // number of floats
        size_t item_size; // float byte size
    } float_sequence;

    typedef union {
        int_sequence typed;
        untyped_sequence untyped;
        float_sequence floatseq;
    } sequence;

    void untyped_zero_first(untyped_sequence untyped) {
        memset(untyped.data, 0, untyped.size * untyped.item_size);
    }

    int main(void) {
        int ints[4] = {1, 2, 3, 4};
        // no UB here
        sequence s = {
            .typed.data = ints,
            .typed.size = 4,
            .typed.item_size = sizeof(int)
        };
        untyped_zero_first(s.untyped);
        printf("%d, %d, %d, %d\n", s.typed.data[0], s.typed.data[1], s.typed.data[2], s.typed.data[3]);
        // UB: the int objects are read through float lvalues
        printf("%f, %f, %f, %f\n", s.floatseq.data[0], s.floatseq.data[1], s.floatseq.data[2], s.floatseq.data[3]);
    }