Representing 128-Bit Numbers in C++

Is there a 128 bit integer in C++?

GCC and Clang support __int128

Representing 128-bit numbers in C++

Look into other libraries that have been developed. Lots of people have wanted to do this before you. :D

Try bigint C++

Assigning 128 bit integer in C

Am I doing something wrong or is this a bug in gcc?

The problem is in the 47942806932686753431 part, not in __uint128_t p. According to the GCC docs, there is no way to write a 128-bit integer constant:

There is no support in GCC for expressing an integer constant of type __int128 for targets with long long integer less than 128 bits wide.

So, it seems that while you can have 128 bit variables, you cannot have 128 bit constants, unless your long long is 128 bit wide.

A workaround is to construct the 128-bit value from "narrower" integral constants using basic arithmetic operations, and rely on the compiler to perform constant folding.
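As a rough sketch of that workaround, the value from the question can be assembled from pieces that each fit in 64 bits. This assumes GCC or Clang on a 64-bit target; the alias u128 and both function names are made up for illustration.

```cpp
#include <cstdint>

// Assumption: the compiler provides unsigned __int128 (GCC/Clang, 64-bit target).
typedef unsigned __int128 u128;   // hypothetical alias

// 47942806932686753431 split into two 10-digit decimal chunks.
// Each literal fits in unsigned long long, so each is a valid constant.
constexpr u128 big_constant()
{
    return static_cast<u128>(4794280693ULL) * 10000000000ULL + 2686753431ULL;
}

// The same value assembled from its high and low 64-bit halves:
// 47942806932686753431 == 2 * 2^64 + 11049318785267650199.
constexpr u128 big_constant_from_halves()
{
    return (static_cast<u128>(2) << 64) | 11049318785267650199ULL;
}
```

Because both functions are constexpr, the compiler can evaluate them at compile time, so no run-time arithmetic should be needed to produce the value.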

Represent 128 bit integer as two 64 bit integers in C++

Umm, why not just do:

void encode(__uint128_t src, __uint64_t &dest1, __uint64_t &dest2)
{
    constexpr const __uint128_t bottom_mask = (__uint128_t{1} << 64) - 1;
    constexpr const __uint128_t top_mask = ~bottom_mask;
    dest1 = src & bottom_mask;
    dest2 = (src & top_mask) >> 64;
}

void decode(__uint64_t src1, __uint64_t src2, __uint128_t &dest)
{
    dest = (__uint128_t{src2} << 64) | src1;
}

?

Of course, this might be kind of futile, since __uint128_t may already be just 2 64-bit values. Also, prefer returning a value rather than using lvalue-references:

std::pair<__uint64_t,__uint64_t> encode(__uint128_t src)
{
    constexpr const __uint128_t bottom_mask = (__uint128_t{1} << 64) - 1;
    constexpr const __uint128_t top_mask = ~bottom_mask;
    return { src & bottom_mask, (src & top_mask) >> 64 };
}

__uint128_t decode(__uint64_t src1, __uint64_t src2)
{
    return (__uint128_t{src2} << 64) | src1;
}
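For what it's worth, the masks are not strictly required: converting an unsigned 128-bit value to a 64-bit unsigned type keeps the low 64 bits, and shifting right by 64 leaves only the high half. Here is a minimal round-trip sketch along those lines, assuming GCC or Clang on a 64-bit target; the names u128, split_u128 and join_u128 are hypothetical and not from the answer above.

```cpp
#include <cstdint>
#include <utility>

// Assumption: unsigned __int128 is available (GCC/Clang, 64-bit target).
typedef unsigned __int128 u128;   // hypothetical alias

// Returns {low 64 bits, high 64 bits}.
inline std::pair<std::uint64_t, std::uint64_t> split_u128(u128 src)
{
    // The conversion to uint64_t truncates to the low half;
    // the shift moves the high half down before truncating.
    return { static_cast<std::uint64_t>(src),
             static_cast<std::uint64_t>(src >> 64) };
}

// Inverse of split_u128: widen the high half first, then shift and merge.
inline u128 join_u128(std::uint64_t low, std::uint64_t high)
{
    return (static_cast<u128>(high) << 64) | low;
}
```

Calling join_u128(parts.first, parts.second) on the result of split_u128(v) should give back v for any 128-bit input.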

Is there a 128 bit integer in gcc?

For GCC before C23, a primitive 128-bit integer type is only ever available on 64-bit targets, so you need to check for availability even if you have already detected a recent GCC version. In theory gcc could support TImode integers on machines where it would take 4x 32-bit registers to hold one, but I don't think there are any cases where it does.

In C++, consider a library such as boost::multiprecision::int128_t which hopefully uses compiler built-in wide types if available, for zero overhead vs. using your own typedef (like GCC's __int128 or Clang's _BitInt(128)). See also @phuclv's answer on another question.

ISO C23 will let you typedef unsigned _BitInt(128) u128, modeled on clang's feature originally called _ExtInt() which works even on 32-bit machines; see a brief intro to it. Current GCC -std=gnu2x doesn't even support that syntax yet.


GCC 4.6 and later has __int128 / unsigned __int128 defined as a built-in type. Use #ifdef __SIZEOF_INT128__ to detect it.

GCC 4.1 and later define __int128_t and __uint128_t as built-in types. (You don't need #include <stdint.h> for these, either. Proof on Godbolt.)

I tested on the Godbolt compiler explorer for the first versions of compilers to support each of these 3 things (on x86-64). Godbolt only goes back to gcc4.1, ICC13, and clang3.0, so I've used <= 4.1 to indicate that the actual first support might have been even earlier.

           legacy        recommended(?)        One way of detecting support
           __uint128_t | [unsigned] __int128 | #ifdef __SIZEOF_INT128__
gcc        <= 4.1      | 4.6                 | 4.6
clang      <= 3.0      | 3.1                 | 3.3
ICC        <= 13       | <= 13               | 16 (Godbolt doesn't have 14 or 15)

If you compile for a 32-bit architecture like ARM, or x86 with -m32, no 128-bit integer type is supported with even the newest version of any of these compilers. So you need to detect support before using, if it's possible for your code to work at all without it.
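Here is one possible shape for code that works either way: use the built-in type when __SIZEOF_INT128__ is defined, and otherwise fall back to a schoolbook multiply on 32-bit halves. This is only a sketch; the struct Wide and the function names are invented for illustration.

```cpp
#include <cstdint>

// Hypothetical result type: the two halves of a 128-bit product.
struct Wide {
    std::uint64_t hi;
    std::uint64_t lo;
};

// Portable 64x64 -> 128 multiply using four 32x32 -> 64 partial products.
inline Wide mul_64x64_portable(std::uint64_t a, std::uint64_t b)
{
    const std::uint64_t a_lo = a & 0xffffffffu, a_hi = a >> 32;
    const std::uint64_t b_lo = b & 0xffffffffu, b_hi = b >> 32;

    const std::uint64_t ll = a_lo * b_lo;
    const std::uint64_t lh = a_lo * b_hi;
    const std::uint64_t hl = a_hi * b_lo;
    const std::uint64_t hh = a_hi * b_hi;

    // Sum the three terms that land in bits 32..63; this cannot overflow 64 bits.
    const std::uint64_t mid = (ll >> 32) + (lh & 0xffffffffu) + (hl & 0xffffffffu);

    Wide result;
    result.lo = (mid << 32) | (ll & 0xffffffffu);
    result.hi = hh + (lh >> 32) + (hl >> 32) + (mid >> 32);
    return result;
}

// Uses the compiler's 128-bit type when it is advertised, else the fallback.
inline Wide mul_64x64(std::uint64_t a, std::uint64_t b)
{
#ifdef __SIZEOF_INT128__
    const unsigned __int128 p = static_cast<unsigned __int128>(a) * b;
    Wide result;
    result.hi = static_cast<std::uint64_t>(p >> 64);
    result.lo = static_cast<std::uint64_t>(p);
    return result;
#else
    return mul_64x64_portable(a, b);
#endif
}
```

Keeping the portable version as a separate function means it can be tested against the built-in path on machines that have both.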

The only direct CPP macro I'm aware of for detecting it is __SIZEOF_INT128__, but unfortunately some old compiler versions support the type without defining the macro. (And there's no macro for __uint128_t, only for the gcc4.6-style unsigned __int128.) See also: How to know if __uint128_t is defined.

Some people still use ancient compiler versions like gcc4.4 on RHEL (Red Hat Enterprise Linux), or similar crusty old systems. If you care about obsolete gcc versions like that, you probably want to stick to __uint128_t. And maybe detect 64-bitness in terms of sizeof(void*) == 8 as a fallback for __SIZEOF_INT128__ not being defined. (I think GNU systems always have CHAR_BIT==8, although I might be wrong about some DSPs.) That will give a false negative on ILP32 ABIs on 64-bit ISAs (like x86-64 Linux x32, or AArch64 ILP32), but this is already just a fallback / bonus for people using old compilers that don't define __SIZEOF_INT128__.
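One caveat when turning that into code: sizeof can't be evaluated in an #if, so the pointer-width check needs a preprocessor stand-in. The sketch below uses GCC's predefined __LP64__ macro for that, which is my substitution rather than something from the text above; HAVE_UINT128, u128_legacy and widest_unsigned_bits are hypothetical names.

```cpp
// Prefer the official macro; otherwise guess from "GNU-compatible compiler
// targeting an LP64 ABI". Like the sizeof(void*) idea, this gives a false
// negative on ILP32 ABIs such as x32.
#if defined(__SIZEOF_INT128__)
#  define HAVE_UINT128 1
#elif defined(__GNUC__) && defined(__LP64__)
#  define HAVE_UINT128 1   /* assumption: old GCC with __uint128_t but no macro */
#else
#  define HAVE_UINT128 0
#endif

#if HAVE_UINT128
typedef __uint128_t u128_legacy;   // hypothetical alias for the legacy spelling
#endif

// Width in bits of the widest unsigned integer type this sketch detected.
inline int widest_unsigned_bits()
{
#if HAVE_UINT128
    return static_cast<int>(sizeof(u128_legacy) * 8);
#else
    return static_cast<int>(sizeof(unsigned long long) * 8);
#endif
}
```

If the guess is wrong on some target, the typedef fails to compile, which is at least a loud failure rather than a silent one.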

There might be some 64-bit ISAs where gcc doesn't define __int128, or maybe even some 32-bit ISAs where gcc does define __int128, but I'm not aware of any.


Internally, GCC calls this integer mode TImode (see the GCC internals manual). (Tetra-integer = 4x width of a 32-bit int, vs. DImode = double width vs. SImode = plain int.) As the GCC manual points out, __int128 is supported on targets that support a 128-bit integer mode (TImode).

// __uint128_t is pre-defined equivalently to this
typedef unsigned uint128 __attribute__ ((mode (TI)));

There is an OImode in the manual, oct-int = 32 bytes, but current GCC for x86-64 complains "unable to emulate 'OI'" if you attempt to use it.


Random fact: ICC19 and g++/clang++ -E -dM define:

#define __GLIBCXX_TYPE_INT_N_0 __int128
#define __GLIBCXX_BITSIZE_INT_N_0 128

@MarcGlisse commented that's the way you tell libstdc++ to handle extra integer types (overload abs, specialize type traits, etc)

icpc defines that even with -xc (to compile as C, not C++), while g++ -xc and clang++ -xc don't. But compiling with actual icc (e.g. select C instead of C++ in the Godbolt language dropdown) doesn't define this macro.


The test function was:

#include <stdint.h>   // for uint64_t

#define uint128_t __uint128_t
//#define uint128_t unsigned __int128

uint128_t mul64(uint64_t a, uint64_t b) {
    return (uint128_t)a * b;
}

Compilers that support it all compile it efficiently, to

    mov       rax, rdi
    mul       rsi
    ret                 # return in RDX:RAX which mul uses implicitly

Getting a 128 bits integer from command line

Take a step back, and look at what you are trying to implement. The Tiny Encryption Algorithm does not work on an 128-bit integer, but on a 128-bit key; the key is composed of four 32-bit unsigned integers.

What you actually need, is a way to parse a decimal (or hexadecimal, or some other base) 128-bit unsigned integer from a string to four 32-bit unsigned integer elements.

I suggest writing a multiply-add function, which takes the quad-32-bit value, multiplies it by a 32-bit constant, and adds another 32-bit constant:

#include <stdint.h>

uint32_t muladd128(uint32_t quad[4], const uint32_t mul, const uint32_t add)
{
    uint64_t temp = 0;

    temp = (uint64_t)quad[3] * (uint64_t)mul + add;
    quad[3] = temp;

    temp = (uint64_t)quad[2] * (uint64_t)mul + (temp >> 32);
    quad[2] = temp;

    temp = (uint64_t)quad[1] * (uint64_t)mul + (temp >> 32);
    quad[1] = temp;

    temp = (uint64_t)quad[0] * (uint64_t)mul + (temp >> 32);
    quad[0] = temp;

    return temp >> 32;
}

The above uses most significant first word order. It returns nonzero if the result overflows; in fact, it returns the 32-bit overflow itself.

With that, it is very easy to parse a string describing a nonnegative 128-bit integer in binary, octal, decimal, or hexadecimal:

#include <stdlib.h>
#include <stdint.h>
#include <string.h>
#include <stdio.h>
#include <errno.h>

static void clear128(uint32_t quad[4])
{
    quad[0] = quad[1] = quad[2] = quad[3] = 0;
}

/* muladd128() */

static const char *parse128(uint32_t quad[4], const char *from)
{
    if (!from) {
        errno = EINVAL;
        return NULL;
    }

    while (*from == '\t' || *from == '\n' || *from == '\v' ||
           *from == '\f' || *from == '\r' || *from == ' ')
        from++;

    if (from[0] == '0' && (from[1] == 'x' || from[1] == 'X') &&
        ((from[2] >= '0' && from[2] <= '9') ||
         (from[2] >= 'A' && from[2] <= 'F') ||
         (from[2] >= 'a' && from[2] <= 'f'))) {
        /* Hexadecimal */
        from += 2;
        clear128(quad);

        while (1)
            if (*from >= '0' && *from <= '9') {
                if (muladd128(quad, 16, *from - '0')) {
                    errno = ERANGE;
                    return NULL;
                }
                from++;
            } else
            if (*from >= 'A' && *from <= 'F') {
                if (muladd128(quad, 16, *from - 'A' + 10)) {
                    errno = ERANGE;
                    return NULL;
                }
                from++;
            } else
            if (*from >= 'a' && *from <= 'f') {
                if (muladd128(quad, 16, *from - 'a' + 10)) {
                    errno = ERANGE;
                    return NULL;
                }
                from++;
            } else
                return from;
    }

    if (from[0] == '0' && (from[1] == 'b' || from[1] == 'B') &&
        (from[2] >= '0' && from[2] <= '1')) {
        /* Binary */
        from += 2;
        clear128(quad);

        while (1)
            if (*from >= '0' && *from <= '1') {
                if (muladd128(quad, 2, *from - '0')) {
                    errno = ERANGE;
                    return NULL;
                }
                from++;
            } else
                return from;
    }

    if (from[0] == '0' &&
        (from[1] >= '0' && from[1] <= '7')) {
        /* Octal */
        from += 1;
        clear128(quad);

        while (1)
            if (*from >= '0' && *from <= '7') {
                if (muladd128(quad, 8, *from - '0')) {
                    errno = ERANGE;
                    return NULL;
                }
                from++;
            } else
                return from;
    }

    if (from[0] >= '0' && from[0] <= '9') {
        /* Decimal */
        clear128(quad);

        while (1)
            if (*from >= '0' && *from <= '9') {
                if (muladd128(quad, 10, *from - '0')) {
                    errno = ERANGE;
                    return NULL;
                }
                from++;
            } else
                return from;
    }

    /* Not a recognized number. */
    errno = EINVAL;
    return NULL;
}

int main(int argc, char *argv[])
{
    uint32_t key[4];
    int arg;

    for (arg = 1; arg < argc; arg++) {
        const char *end = parse128(key, argv[arg]);
        if (end) {
            if (*end != '\0')
                printf("%s: 0x%08x%08x%08x%08x (+ \"%s\")\n", argv[arg], key[0], key[1], key[2], key[3], end);
            else
                printf("%s: 0x%08x%08x%08x%08x\n", argv[arg], key[0], key[1], key[2], key[3]);
            fflush(stdout);
        } else {
            switch (errno) {
            case ERANGE:
                fprintf(stderr, "%s: Too large.\n", argv[arg]);
                break;
            case EINVAL:
                fprintf(stderr, "%s: Not a nonnegative integer in binary, octal, decimal, or hexadecimal notation.\n", argv[arg]);
                break;
            default:
                fprintf(stderr, "%s: %s.\n", argv[arg], strerror(errno));
                break;
            }
        }
    }

    return EXIT_SUCCESS;
}

It is very straightforward to add support for Base64 and Base85, which are sometimes used; or indeed for any base less than 2^32.
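To illustrate how other bases could be added, here is a sketch of a table-driven variant in which the caller supplies the digit alphabet and the base is simply the alphabet's length. It treats the input as a positional numeral, most significant digit first. The function parse_alphabet128 and its calling convention are hypothetical; muladd128 is repeated (with explicit casts) only so that the block compiles on its own.

```cpp
#include <cstdint>
#include <cstring>

// Same multiply-add step as above: quad = quad * mul + add, returns the overflow.
static std::uint32_t muladd128(std::uint32_t quad[4], std::uint32_t mul, std::uint32_t add)
{
    std::uint64_t temp = static_cast<std::uint64_t>(quad[3]) * mul + add;
    quad[3] = static_cast<std::uint32_t>(temp);
    for (int i = 2; i >= 0; i--) {
        temp = static_cast<std::uint64_t>(quad[i]) * mul + (temp >> 32);
        quad[i] = static_cast<std::uint32_t>(temp);
    }
    return static_cast<std::uint32_t>(temp >> 32);
}

// Hypothetical helper: parse `text` using the digits listed in `alphabet`,
// where a character's position in the alphabet is its digit value.
// Returns a pointer to the first unparsed character, or nullptr if there
// were no digits, the alphabet is too short, or the value overflows 128 bits.
static const char *parse_alphabet128(std::uint32_t quad[4], const char *text, const char *alphabet)
{
    const std::uint32_t base = static_cast<std::uint32_t>(std::strlen(alphabet));
    if (base < 2)
        return nullptr;

    quad[0] = quad[1] = quad[2] = quad[3] = 0;

    const char *p = text;
    while (*p != '\0') {
        const char *hit = std::strchr(alphabet, *p);
        if (!hit)
            break;                      // first non-digit ends the number
        if (muladd128(quad, base, static_cast<std::uint32_t>(hit - alphabet)))
            return nullptr;             // overflowed 128 bits
        ++p;
    }
    return (p == text) ? nullptr : p;
}
```

For example, passing the 36-character alphabet "0123456789abcdefghijklmnopqrstuvwxyz" parses base-36 input, and a Base85 alphabet of your choice would work the same way.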

And, if you think about the above, it was all down to being precise about what you need.


