How to Print __int128 in g++

How to print __int128 in g++?

If you don't need any of the fancy formatting options, writing
your own << operator is trivial. Formally, I suspect that
writing one for __int128_t would be considered undefined
behavior, but practically, I think it would work, up until the
library starts providing actual support for it (at which point,
you'd retire your conversion operator).

Anyway, something like the following should work:

#include <iterator>
#include <ostream>

std::ostream&
operator<<( std::ostream& dest, __int128_t value )
{
    std::ostream::sentry s( dest );
    if ( s ) {
        __uint128_t tmp = value < 0 ? -value : value;
        char buffer[ 128 ];
        char* d = std::end( buffer );
        do
        {
            -- d;
            *d = "0123456789"[ tmp % 10 ];
            tmp /= 10;
        } while ( tmp != 0 );
        if ( value < 0 ) {
            -- d;
            *d = '-';
        }
        int len = std::end( buffer ) - d;
        if ( dest.rdbuf()->sputn( d, len ) != len ) {
            dest.setstate( std::ios_base::badbit );
        }
    }
    return dest;
}

Note that this is just a quick, temporary fix until the g++ library
supports the type. It relies on two's-complement wrap-around on
overflow for __int128_t, but I'd be very surprised if that weren't
the case (formally, it's undefined behavior). If not, you'll need to
fix up the initialization of tmp. And of course, it doesn't handle
any of the formatting options; you can add those as desired.
(Handling padding and the adjustfield correctly can be non-trivial.)
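If you'd rather not rely on wrap-around at all, the initialization of tmp can be fixed up so the negation happens in unsigned arithmetic, which is well defined even for the most negative value. A minimal sketch (magnitude is my own helper name, not part of any library):

```cpp
// Compute |value| in unsigned arithmetic. Converting a negative
// __int128_t to __uint128_t is well defined (modulo 2^128), and
// unsigned negation then yields the magnitude, even for the most
// negative value, where -value itself would overflow.
__uint128_t magnitude(__int128_t value)
{
    __uint128_t u = static_cast<__uint128_t>(value);
    return value < 0 ? -u : u;
}
```

Inside the operator, `__uint128_t tmp = magnitude(value);` then replaces the conditional expression.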

How to print a __uint128_t number using gcc?

No, there isn't support in the library for printing these types. They aren't even extended integer types in the sense of the C standard.

Your idea for starting the printing from the back is a good one, but you could use much larger chunks. In some tests for P99 I have such a function that uses

uint64_t const d19 = UINT64_C(10000000000000000000);

as the largest power of 10 that fits into a uint64_t.
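The chunked approach can be sketched as follows (my own reconstruction, not the actual P99 code): peel off 19 decimal digits at a time with one 128-bit division per chunk, so most of the digit extraction runs in cheap 64-bit arithmetic.

```cpp
#include <cstdint>
#include <string>

// Convert a __uint128_t to a decimal string, 19 digits per chunk
// (sketch; reconstruction of the idea described in the answer).
std::string to_string_u128(__uint128_t x)
{
    uint64_t const d19 = UINT64_C(10000000000000000000); // 10^19
    char buf[48];                  // 39 digits max, plenty of room
    char* p = buf + sizeof buf;    // build the string backwards
    for (;;) {
        uint64_t chunk = (uint64_t)(x % d19); // low 19 decimal digits
        x /= d19;
        if (x == 0) {
            // Most significant chunk: no leading zeros.
            do { *--p = '0' + chunk % 10; chunk /= 10; } while (chunk != 0);
            break;
        }
        // Inner chunk: exactly 19 digits, zero-padded.
        for (int i = 0; i < 19; ++i) { *--p = '0' + chunk % 10; chunk /= 10; }
    }
    return std::string(p, buf + sizeof buf);
}
```

At most two 128-bit divisions are ever needed before the remaining value fits the final chunk.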

In decimal, these big numbers become unreadable very quickly, so another, easier option is to print them in hex. Then you can do something like

uint64_t low = (uint64_t)x;
// "18446744073709551615" is UINT64_MAX, the longest decimal string
// the lower half can occupy, so the buffer is certainly big enough.
char buf[] = { "18446744073709551615" };
sprintf(buf, "%" PRIX64, low);

to get the lower half and then basically the same with

uint64_t high = (uint64_t)(x >> 64);

for the upper half.
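Putting the two halves together might look like this (a sketch; u128_hex is my own helper name): print the high half only when it is non-zero, and in that case zero-pad the low half to 16 hex digits so no digits are lost.

```cpp
#include <cinttypes>
#include <cstdio>
#include <string>

// Format a __uint128_t in hex via its two 64-bit halves.
std::string u128_hex(__uint128_t x)
{
    uint64_t high = (uint64_t)(x >> 64);
    uint64_t low  = (uint64_t)x;
    char buf[40];
    if (high != 0)
        // Zero-pad low to 16 digits; otherwise e.g. 2^64 would print as "10".
        snprintf(buf, sizeof buf, "%" PRIX64 "%016" PRIX64, high, low);
    else
        snprintf(buf, sizeof buf, "%" PRIX64, low);
    return buf;
}
```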

How to deal with `__int128` in gdb?

If you just need to print a constant value, the py print(expr128) idea works wonderfully thanks to Python's arbitrary-precision integers.

If, however, you need to work with an actual C variable of type __int128, you'll need to reinterpret it temporarily as something like unsigned long long[2] to operate on it in GDB. Remember that you're then working with an array of two 64-bit values, so X[0] << 64 will not behave as it would on the true 128-bit __int128 type. GDB can print the value; it just can't manipulate its bits. GCC can manipulate its bits; your libc just can't print the value using printf, and there may not even be any GCC-specific code that allows it to do so.

Here's a sample shell session showing how troublesome this compiler-specific type is to work with in GDB:

$ nl bar.c
     1	int main(void)
     2	{
     3	    __int128 v = 1;
     4	    v <<= 62;
     5	    v <<= 2;
     6	}
$ gcc -g -o bar bar.c
$ gdb -q ./bar
Reading symbols from ./bar...done.
(gdb) break 5
Breakpoint 1 at 0x5e8: file bar.c, line 5.
(gdb) run
Starting program: /home/luser/bar

Breakpoint 1, main () at bar.c:5
5 v <<= 2;
(gdb) print/x *(long long(*)[2])&v
$1 = {0x4000000000000000, 0x0}
(gdb) print/x (*(long long(*)[2])&v)[0]+1
$2 = {0x4000000000000001, 0x0}
(gdb) next
6 }
(gdb) print/x *(long long(*)[2])&v
$3 = {0x0, 0x1}
(gdb) print/x (*(long long(*)[2])&v)[0]+1
$4 = {0x1, 0x1}

Taking my machine's little-endian CPU into account, the results are (sort of) clear:

$1 = 0x0000 0000 0000 0000  4000 0000 0000 0000   # 1 << 62
$2 = 0x0000 0000 0000 0000  4000 0000 0000 0001   # (1 << 62) + 1
$3 = 0x0000 0000 0000 0001  0000 0000 0000 0000   # 1 << 64
$4 = 0x0000 0000 0000 0001  0000 0000 0000 0001   # (1 << 64) + 1

With values this large, even hexadecimal is getting to be a bit cumbersome, but you get the idea: working with these values in GDB might be a problem with all of the parentheses you need to deal with, plus you need to keep your target machine's endianness in mind when manipulating the value as well as tracking overflow.
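Reassembling the value GDB shows from its two little-endian words is just a shift and an or; a minimal sketch (from_words_le is a name I made up for illustration):

```cpp
#include <cstdint>

// Reassemble a __uint128_t from the two 64-bit words exactly as GDB
// displays them on a little-endian target: element [0] is the low
// word, element [1] the high word.
__uint128_t from_words_le(const uint64_t words[2])
{
    return ((__uint128_t)words[1] << 64) | words[0];
}
```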

My suggestion: link in some arithmetic routines that work with __int128 values to aid debugging, so you can use things like call negate128 (value) in GDB to obtain the result of the C expression -value where value has type __int128. No need for overflow checks either since the machine will handle that for you as it would with any other type, so go ahead and write things like this (assuming you're working with a system where overflow doesn't kill your program or the entire machine):

__int128 negate128(__int128 a) { return -a; }
__int128 add128(__int128 a, __int128 b) { return a + b; }
__int128 sub128(__int128 a, __int128 b) { return a - b; }
__int128 shl128(__int128 a, int n) { return a << n; }
__int128 shr128(__int128 a, int n) { return a >> n; }

Overload of std::ostream for __int128_t ambiguous (when called from a namespace)

With the using-declaration, you make the name ::operator<< part of the same namespace as the overload that is already present for ABC.
The compiler can then choose the most appropriate overload at the call site.

namespace Something
{
    using ::operator<<;

    void print_uint128(uint128 val)
    {
        std::cout << "An int128: " << val << std::endl;
    }
}

Without this artificial injection of the name ::operator<<, once a << for another type (ABC) exists in this namespace, name lookup has no need to look any further (into the global namespace).
Argument-dependent lookup (ADL) also injects << names from std:: because of std::ostream.
After that, the compiler chooses the best match among all of these accessible <<.
The one for ABC does not match, but the other ones (for integers, floating-point types...) could be used with a conversion; but which one is best?
This is ambiguous.

Conversely, when the << for ABC does not exist, there is no << name in the current namespace, so the name is looked up in the enclosing (global) namespace; there a perfect match exists for uint128, so the candidates from std:: (found via ADL) are not considered better matches.

This is not easy to follow because there are two stages.

First, look up the << name.
Lookup starts in the current namespace; if the name is found it stops there, otherwise it continues in the enclosing namespace, and so on up to the global namespace. ADL also takes place and injects << names from other namespaces based on the arguments at the call site (std:: here, because of std::ostream).

Second, choose the best match among all these collected <<.
If there is no perfect match, a conversion can be considered, but if several conversions are possible, the call is ambiguous.

Trying to illustrate the various situations:

• NO << (for ABC)  in current namespace,
<< (for uint128) in global namespace,
1 --> no << in the current namespace then look in the upper namespace,
find << (for uint128) in global namespace
+ many << from std:: via ADL
2 --> the one from the global namespace is a perfect match --> OK!

• << (for ABC) in current namespace,
<< (for uint128) in global namespace,
1 --> find << (for ABC) in _current_ namespace and _STOP_ here
+ many << from std:: via ADL
2 --> no one is a perfect match,
ABC does not match at all,
various integers/reals could match with conversions --> AMBIGUOUS!

• << (for ABC) in current namespace,
<< (for uint128) in global namespace,
using ::operator<< in current namespace
1 --> find << (for ABC _AND_ for uint128) in _current_ namespace and stop here
+ many << from std:: via ADL
2 --> ABC does not match at all,
various integers/reals could match with conversions,
the one for uint128 is the only perfect match --> OK!

• << (for ABC) in current namespace,
<< (for uint128 *) in global namespace,
1 --> find << (for ABC) in _current_ namespace and _STOP_ here
+ many << from std:: via ADL
2 --> ABC does not match at all,
various integers/reals do not match at all
std::operator<< for void * matches --> OK!!!!!!!!!!!!
_A_MATCH_IS_FOUND_BUT_NOT_YOURS_ (void * not uint128 *)!!!

use of 128 bit unsigned int in c language

  1. "Target" means the specific combination of CPU architecture and operating system that your compiler is configured to create programs for. There is a discussion at Does a list of all known target triplets in use exist?. But "integer mode" is really a concept used internally by the compiler, and only indirectly related to what the hardware can and can't do. So all this really says is "the compiler supports 128-bit integers on some targets and not on others". The easiest way to find out whether yours does is to just try to compile and run a small test program that uses __int128.

  2. Most systems' printf functions don't support __int128, so you have to write your own code to print them, or find third-party code somewhere. See How to print __int128 in g++? which is for C++ but still relevant.

  3. You don't need to include anything or use any special options.

atoi() for int128_t type

Here is a C++ implementation:

#include <cctype>
#include <stdexcept>
#include <string>

__int128_t atoint128_t(std::string const & in)
{
    __int128_t res = 0;
    size_t i = 0;
    bool sign = false;

    if (in[i] == '-')
    {
        ++i;
        sign = true;
    }

    if (in[i] == '+')
    {
        ++i;
    }

    for (; i < in.size(); ++i)
    {
        const char c = in[i];
        if (not std::isdigit(static_cast<unsigned char>(c)))
            throw std::runtime_error(std::string("Non-numeric character: ") + c);
        res *= 10;
        res += c - '0';
    }

    if (sign)
    {
        res *= -1;
    }

    return res;
}

int main()
{
    __int128_t a = atoint128_t("170141183460469231731687303715884105727");
}

If you want to test it, there is a stream operator in the first answer above.

Performance

I ran a few performance tests. I generated 100,000 random numbers uniformly distributed over the entire range of __int128_t, then converted each of them 2,000 times. All of these (200,000,000) conversions were completed within ~12 seconds.
Using this code:

#include <chrono>
#include <iostream>
#include <random>
#include <string>
#include <vector>

int main()
{
    std::mt19937 gen(0);
    std::uniform_int_distribution<> num(0, 9);
    std::uniform_int_distribution<> len(1, 38);
    std::uniform_int_distribution<> sign(0, 1);

    std::vector<std::string> str;

    for (int i = 0; i < 100000; ++i)
    {
        std::string s;
        int l = len(gen);
        if (sign(gen))
            s += '-';
        for (int u = 0; u < l; ++u)
            s += std::to_string(num(gen));
        str.emplace_back(s);
    }

    namespace sc = std::chrono;
    auto start = sc::duration_cast<sc::microseconds>(
        sc::high_resolution_clock::now().time_since_epoch()).count();
    __int128_t b = 0;
    for (int u = 0; u < 200; ++u)
    {
        for (std::size_t i = 0; i < str.size(); ++i)
        {
            __int128_t a = atoint128_t(str[i]);
            b += a;
        }
    }
    auto time = sc::duration_cast<sc::microseconds>(
        sc::high_resolution_clock::now().time_since_epoch()).count() - start;
    std::cout << time / 1000000. << 's' << std::endl;
}

