No compliant way to convert signed/unsigned of same size
I fear I may be missing something trivial, but it appears there is no actual safe way to convert to/from a signed type if you wish to retain the original unsigned value.
On reinterpret_cast, 5.2.10 does not list an integer to integer conversion, thus it is not defined (and static_cast defines no additional conversion). On integral conversions 4.7.3 basically says conversion of a large unsigned will be implementation defined (thus not portable).
This seems limiting, since we know, for example, that a uint64_t should, on any hardware, be safely convertible to an int64_t and back without change in value. Plus, the rules on standard-layout types actually guarantee safe conversion if we were to memcpy between the two types instead of assigning.
Am I correct? Is there a legitimate reason why one cannot reinterpret_cast between integral types of sufficient size?
Clarification: The signed version of the unsigned value is definitely not guaranteed to have any particular value; it is only the round trip that I am considering (unsigned => signed => unsigned).
UPDATE: Looking closely at the answers and cross-checking the standard, I believe the memcpy is not actually guaranteed to work, as nowhere does it state that the two types are layout-compatible, and neither is a char type. Further update: digging into the C standard, this memcpy should work, as the size of the target is large enough and it simply copies the bytes.
ANSWER: There appears to be no technical reason why reinterpret_cast was not allowed to perform this conversion. For these fixed-size integer types a memcpy is guaranteed to work, and indeed any intermediate type can be used so long as it can represent all bit patterns (floats can be dangerous, as there may be trap patterns). In general you can't memcpy between arbitrary standard-layout types; they must be layout-compatible, or one must be a char type. Here the ints are special since they have additional guarantees.
As you point out, memcpy is safe:
uint64_t a = 1ull<<63;
int64_t b;
memcpy(&b,&a,sizeof a);
The value in b is still implementation-defined, since C++ does not require a two's complement representation, but converting it back will give you the original value.
As Bo Persson points out, int64_t will be two's complement. Therefore the memcpy should result in a signed value for which the simple integral conversion back to the unsigned type is well defined to be the original unsigned value.
uint64_t c = b;
assert( a == c );
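Putting the fragments above together, here is a minimal self-contained sketch of the round trip. The helper names to_signed_bits and round_trip are made up for illustration; they are not part of any library.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: copy the object representation of an unsigned
// 64-bit value into a signed 64-bit object (no value conversion involved).
inline std::int64_t to_signed_bits(std::uint64_t u) {
    static_assert(sizeof(std::int64_t) == sizeof(std::uint64_t),
                  "source and destination must have the same size");
    std::int64_t s;
    std::memcpy(&s, &u, sizeof s);
    return s;
}

// Hypothetical helper: unsigned -> signed via memcpy, then signed -> unsigned
// via the ordinary integral conversion, which reduces modulo 2^64.
inline std::uint64_t round_trip(std::uint64_t u) {
    std::int64_t s = to_signed_bits(u);
    std::uint64_t back = static_cast<std::uint64_t>(s);
    assert(back == u);  // holds because int64_t is two's complement
    return back;
}
```

Calling round_trip(1ull << 63) should hand back the same value, which is the property the question asks about.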
Also, you can implement your own 'signed_cast' to make conversions easy (I don't take advantage of the two's complement thing since these aren't limited to the intN_t types):
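#include <cstring>      // std::memcpy
#include <type_traits>  // std::enable_if, std::is_integral, std::make_signed, ...

// Unsigned -> signed: copy the bytes of v into an object of signed type T.
template<typename T>
typename std::enable_if<std::is_integral<T>::value && std::is_signed<T>::value,T>::type
signed_cast(typename std::make_unsigned<T>::type v) {
    T s;
    std::memcpy(&s,&v,sizeof v);
    return s;
}

// Signed -> unsigned: copy the bytes of v into an object of unsigned type T.
template<typename T>
typename std::enable_if<std::is_integral<T>::value && std::is_unsigned<T>::value,T>::type
signed_cast(typename std::make_signed<T>::type v) {
    T s;
    std::memcpy(&s,&v,sizeof v);
    return s;
}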
We know that you can't cast an arbitrary bit sequence to floating-point, because it might be a trap representation.
Is there any rule that says there can't be trap representations in the signed integral types? (Unsigned types can't have them: because of the way the range is defined, all representations are needed for valid values.)
Signed representations can also include equivalence classes (such as +0 == -0) and may coerce values in such a class to a canonical representation, thus breaking the round trip.
Here are the relevant rules from the Standard (section 4.7, [conv.integral]):
If the destination type is unsigned, the resulting value is the least unsigned integer congruent to the source integer (modulo 2^n where n is the number of bits used to represent the unsigned type). [ Note: In a two's complement representation, this conversion is conceptual and there is no change in the bit pattern (if there is no truncation). — end note ]
If the destination type is signed, the value is unchanged if it can be represented in the destination type (and bit-field width); otherwise, the value is implementation-defined.
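To make the two quoted rules concrete, here is a small sketch using 32-bit types. The names to_unsigned and check_conversions are illustrative only. The out-of-range unsigned-to-signed case is deliberately not asserted, since under the text quoted above its result is implementation-defined.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Signed -> unsigned: always well defined, the value is reduced modulo 2^32.
constexpr std::uint32_t to_unsigned(std::int32_t v) {
    return static_cast<std::uint32_t>(v);
}

inline void check_conversions() {
    // -1 is congruent to 2^32 - 1 modulo 2^32.
    assert(to_unsigned(-1) == std::numeric_limits<std::uint32_t>::max());
    // -2 is congruent to 2^32 - 2 modulo 2^32.
    assert(to_unsigned(-2) == 4294967294u);

    // Unsigned -> signed with a representable value: unchanged.
    std::uint32_t small = 7;
    assert(static_cast<std::int32_t>(small) == 7);

    // Unsigned -> signed with a value such as 0x80000000u does not fit in
    // int32_t, so the quoted rule leaves the result implementation-defined.
}
```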
If you mean using reinterpret_cast on a pointer or reference, rather than the value, you have to deal with the strict-aliasing rule. And what you find is that this case is expressly allowed.
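As a rough sketch of that reference-based approach (the function name via_reference is invented for illustration), the object is read through a reference to the signed type that corresponds to its unsigned type, which is the access the aliasing rule permits. This assumes int64_t and uint64_t are such a corresponding pair, which the C standard requires of the intN_t/uintN_t typedefs.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: read an unsigned object through a reference to the
// corresponding signed type, then convert the result back to unsigned.
inline std::uint64_t via_reference(std::uint64_t u) {
    // Reinterpret the lvalue, not the value: no integral conversion happens here.
    std::int64_t& s = reinterpret_cast<std::int64_t&>(u);
    std::int64_t copy = s;  // access through the corresponding signed type
    // Signed -> unsigned is the well-defined modulo 2^64 conversion.
    return static_cast<std::uint64_t>(copy);
}
```

As with the memcpy version, the intermediate signed value depends on the representation; only the round trip is being relied on.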
Presumably it's not allowed because, for machines with sign-magnitude representations, it would violate the principle of least surprise that signed 0 maps to unsigned 0 while a signed -0 would map to some other (probably very large) number.
Given that the memcpy solution exists, I assume the standards body decided not to support such an unintuitive mapping, probably because unsigned->signed->unsigned isn't as useful a sequence as pointer->integer->pointer.