What Should Be the sizeof(int) on a 64-Bit Machine

What should be the sizeof(int) on a 64-bit machine?

It doesn't have to be any particular size; "64-bit machine" can mean many things, but it typically means that the CPU has registers that wide. The sizeof a type is determined by the compiler, which doesn't have to have anything to do with the actual hardware (though it typically does); in fact, different compilers on the same machine can have different values for these.
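As a quick illustration, here is a minimal C sketch; the values noted in the comment are only what typical LP64 and LLP64 compilers produce, not guarantees:

#include <stdio.h>

int main(void)
{
    /* The printed values are chosen by the compiler's data model, not by
       the hardware alone. On 64-bit Linux with GCC (LP64) this typically
       prints 2 4 8 8 8; on 64-bit Windows with MSVC (LLP64) it typically
       prints 2 4 4 8 8. */
    printf("short=%zu int=%zu long=%zu long long=%zu void*=%zu\n",
           sizeof(short), sizeof(int), sizeof(long),
           sizeof(long long), sizeof(void *));
    return 0;
}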

Size of int and sizeof int pointer on a 64 bit machine

No, the sizeof(int) is implementation-defined, and is usually 4 bytes.

On the other hand, in order to address more than the 4 GB of memory that 32-bit systems are limited to, you need your pointers to be 8 bytes wide. An int* just holds the address of "somewhere in memory", and you can't address more than 4 GB of memory with just 32 bits.
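A small sketch of this point, assuming C99's <stdint.h> and <inttypes.h> are available: to store an address in an integer you need uintptr_t, not int, because the address may need all 64 bits.

#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>

int main(void)
{
    int value = 42;
    int *p = &value;

    /* On a typical 64-bit platform this prints 4 and 8. */
    printf("sizeof(int) = %zu, sizeof(int*) = %zu\n", sizeof(int), sizeof p);

    /* uintptr_t is an unsigned integer wide enough for any object pointer;
       a plain 32-bit int could not hold a full 64-bit address. */
    uintptr_t addr = (uintptr_t)p;
    printf("address = 0x%" PRIxPTR "\n", addr);
    return 0;
}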

sizeof(int) on x64?

There are various 64-bit data models; Microsoft uses LP64 for .NET: both longs and pointers are 64 bits (although C-style pointers can only be used in C# in unsafe contexts, or as an IntPtr value, which cannot be used for pointer arithmetic). Note that this refers to C#'s long, i.e. System.Int64; native Windows C and C++ use the LLP64 model, where long stays 32 bits. Contrast this with ILP64, where ints are also 64 bits.

Thus, on all .NET platforms, int is 32 bits and long is 64 bits; you can see this in the names of the underlying types, System.Int32 and System.Int64.

Who decides the sizeof any datatype or structure (depending on 32 bit or 64 bit)?

It's ultimately the compiler. The compiler implementors can decide to emulate whatever integer size they see fit, regardless of what the CPU handles the most efficiently. That said, the C (and C++) standard is written such that the compiler implementor is free to choose the fastest and most efficient way. For many compilers, the implementers chose to keep int at 32 bits, although the CPU natively handles 64-bit ints very efficiently.

I think this was done in part to preserve portability with programs written when 32-bit machines were the most common, which expected an int to be 32 bits and no wider. (It could also be, as user3386109 points out, that 32-bit data was preferred because it takes less space and can therefore be accessed faster.)

So if you want to make sure you get 64-bit ints, you use int64_t instead of int to declare your variable. If you know your value will fit in 32 bits or you don't care about size, you use int to let the compiler pick the most efficient representation.

As for other data types such as structs, they are composed from the base types such as int.
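As a sketch of that composition (a hypothetical struct, assuming <stdint.h>; the exact total depends on each compiler's padding and alignment rules):

#include <stdio.h>
#include <stdint.h>

struct record {
    int32_t id;       /* exactly 32 bits on any conforming compiler */
    int64_t counter;  /* exactly 64 bits */
    int     flags;    /* whatever width this particular compiler chose */
};

int main(void)
{
    /* The struct's size is built from its member types plus padding,
       so it can differ between compilers even on the same machine. */
    printf("sizeof(struct record) = %zu\n", sizeof(struct record));
    return 0;
}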

C++ int vs long long in 64 bit machine

1) Is it best practice to use long long on x64 to achieve maximum performance, even for 1-4 byte data?

No, and it will probably make your performance worse. For example, if you use 64-bit integers where you could have gotten away with 32-bit integers, then you have just doubled the amount of data that must be sent between the processor and memory, and the memory is orders of magnitude slower. All of your caches and memory buses will be exhausted twice as fast.
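A toy illustration of that doubling (the element count is arbitrary; only the ratio matters):

#include <stdio.h>
#include <stdint.h>

#define N 1000000  /* arbitrary element count */

static int32_t narrow[N];
static int64_t wide[N];

int main(void)
{
    /* Same number of elements, twice the bytes to move through the
       caches and memory buses. */
    printf("32-bit array: %zu bytes\n", sizeof narrow);  /* ~4 MB */
    printf("64-bit array: %zu bytes\n", sizeof wide);    /* ~8 MB */
    return 0;
}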

2) Trade-off of using a type smaller than the word size (memory win vs. additional operations)

Generally, the dominant driver of performance in a modern machine is going to be how much data needs to be stored in order to run a program. You are going to see significant performance cliffs once the working set size of your program exceeds the capacity of your registers, L1 cache, L2 cache, L3 cache, and RAM, in that order.

In addition, using a smaller data type can be a win if your compiler is smart enough to figure out how to use your processor's vector instructions (such as SSE instructions). Modern vector processing units are smart enough to cram eight 16-bit short integers into the same space as two 64-bit long long integers, so you can do four times as many operations at once.
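For example, here is a minimal SSE2 sketch (assuming an x86-64 compiler that provides <emmintrin.h>): one 128-bit operation adds eight 16-bit shorts, where the same register width would hold only two 64-bit values.

#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdio.h>
#include <stdint.h>

int main(void)
{
    int16_t a[8] = { 1,  2,  3,  4,  5,  6,  7,  8};
    int16_t b[8] = {10, 20, 30, 40, 50, 60, 70, 80};
    int16_t c[8];

    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    __m128i vc = _mm_add_epi16(va, vb);   /* eight 16-bit additions at once */
    _mm_storeu_si128((__m128i *)c, vc);

    for (int i = 0; i < 8; ++i)
        printf("%d ", c[i]);              /* 11 22 33 44 55 66 77 88 */
    printf("\n");
    return 0;
}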

3) On an x64 computer where the word (and, supposedly, int) size is 64 bits, can a short be processed as a 16-bit value thanks to so-called backward compatibility? Or must the 16-bit value be widened to 64 bits, and is the fact that this can be done what defines the system as backward compatible?

I'm not sure what you're asking here. In general, 64-bit machines are capable of executing 32-bit and 16-bit executable files, because those earlier executable files use a subset of the 64-bit machine's capabilities (whether a given 64-bit operating system still ships support for launching 16-bit executables is another matter).

Hardware instruction sets are generally backwards compatible, meaning that processor designers tend to add capabilities, but rarely if ever remove capabilities.

4) Can we force the compiler to make the int 64 bit?

There are standard headers (and, for pre-C99 compilers, fairly standard extensions) that allow you to work with fixed-width data. For example, the header file stdint.h declares types such as int64_t, uint64_t, etc. Plain int itself, however, generally stays whatever width the compiler's data model dictates.
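A short sketch: you cannot redefine what plain int means, but the <stdint.h>/<inttypes.h> types give you an integer that is guaranteed to be exactly 64 bits wide.

#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>

int main(void)
{
    int64_t big = INT64_C(9000000000);   /* too large for a 32-bit int */
    printf("big = %" PRId64 ", sizeof = %zu\n", big, sizeof big);
    return 0;
}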

5) How to incorporate ILP64 into PC that uses LP64?

https://software.intel.com/en-us/node/528682

6) What are the possible problems of using code adapted to the above issues with other compilers, OSes, and architectures (e.g. a 32-bit processor)?

Generally the compilers and systems are smart enough to figure out how to execute your code on any given system. However, 32-bit processors are going to have to do extra work to operate on 64-bit data. In other words, correctness should not be an issue, but performance will be.

But it's generally the case that if performance is really critical to you, then you need to program for a specific architecture and platform anyway.

Clarification Request: Thanks a lot! I wanted to clarify question no. 1. You say that it is bad for memory. Let's take the example of a 32-bit int. When you send it to memory, because it is a 64-bit system, for a desired integer 0xEEEEEEEE, won't it become 0xEEEEEEEE plus 32 other bits when we send it? How can a processor send 32 bits when the word size is 64 bits? The 32 bits are the desired values, but won't they be combined with 32 unused bits and sent that way? If my assumption is true, then there is no difference for memory.

There are two things to discuss here.

First, the situation you describe does not occur. A processor does not need to "promote" a 32-bit value into a 64-bit value in order to use it appropriately. This is because modern processors have different access modes that are capable of dealing with different sizes of data appropriately.

For example, a 64-bit Intel processor has a 64-bit register named RAX. However, this same register can be used in 32-bit mode by referring to it as EAX, and even in 16-bit and 8-bit modes. The diagram below is borrowed from the question "x86_64 registers rax/eax/ax/al overwriting full register contents":

1122334455667788
================ rax (64 bits)
        ======== eax (32 bits)
            ==== ax (16 bits)
            ==   ah (8 bits)
              == al (8 bits)

Between the compiler and assembler, the correct code is generated so that a 32-bit value is handled appropriately.
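As a loose C analogy for that aliasing (only an illustration, assuming a little-endian machine; the real aliasing happens inside the CPU's register file), a union lets narrower views share the low bytes of a 64-bit value much as AL, AX, and EAX share the low bytes of RAX:

#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>

union reg_view {
    uint64_t rax;   /* full 64-bit view                        */
    uint32_t eax;   /* low 32 bits (little-endian assumption)  */
    uint16_t ax;    /* low 16 bits                             */
    uint8_t  al;    /* low 8 bits                              */
};

int main(void)
{
    union reg_view r;
    r.rax = 0x1122334455667788ULL;

    printf("rax = 0x%016" PRIx64 "\n", r.rax);
    printf("eax = 0x%08"  PRIx32 "\n", r.eax);   /* 0x55667788 */
    printf("ax  = 0x%04"  PRIx16 "\n", r.ax);    /* 0x7788     */
    printf("al  = 0x%02"  PRIx8  "\n", r.al);    /* 0x88       */
    return 0;
}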

Second, when we're talking about memory overhead and performance we should be more specific. Modern memory systems are composed of a disk, then main memory (RAM), and typically two or three caches (e.g. L3, L2, and L1). The smallest quantity of data that is transferred between the disk and memory is called a page, and page sizes are usually 4096 bytes (though they don't have to be). Likewise, the smallest quantity of data that is transferred between memory and the caches is called a cache line, which is usually much larger than 32 or 64 bits; on my computer the cache line size is 64 bytes. The processor is the only place where data is actually transferred and addressed at the word level and below.

So if you want to change one 64-bit word in a file that resides on disk, then, on my computer, this actually requires that you load 4096 bytes from the disk into memory, and then 64 bytes from memory into the L3, L2, and L1 caches, and then the processor takes a single 64-bit word from the L1 cache.

The result is that the word size means nothing for memory bandwidth. However, you can fit 16 of those 32-bit integers in the same space in which you can pack 8 of those 64-bit integers. Or you could even fit 32 16-bit values or 64 8-bit values in the same space. If your program uses a lot of different data values, you can significantly improve performance by using the smallest data type necessary.
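A quick sketch of that packing, assuming a 64-byte cache line (on Linux the real value can be queried with sysconf(_SC_LEVEL1_DCACHE_LINESIZE)):

#include <stdio.h>
#include <stdint.h>

#define CACHE_LINE 64u  /* bytes; assumed, not queried */

int main(void)
{
    printf("64-bit values per line: %zu\n", CACHE_LINE / sizeof(int64_t));  /*  8 */
    printf("32-bit values per line: %zu\n", CACHE_LINE / sizeof(int32_t));  /* 16 */
    printf("16-bit values per line: %zu\n", CACHE_LINE / sizeof(int16_t));  /* 32 */
    printf(" 8-bit values per line: %zu\n", CACHE_LINE / sizeof(int8_t));   /* 64 */
    return 0;
}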

On a 64-bit machine is the size of an int in Java 32 bits or 64 bits?

32 bits. It's a feature of the Java language that the size of an int does not vary with the underlying computer. See the relevant section of the spec.

C/C++: sizeof(short), sizeof(int), sizeof(long), sizeof(long long), etc... on a 32-bit machine versus on a 64-bit machine

Looks right to me. In C/C++, int isn't defined in terms of a specific bit size. When creating a project you can select a "console application". VS2012 still supports C, but it mostly lumps projects into C/C++. There is a compiler option (/TC, I think) which will force the compiler into C compliance; by default it infers the language from the file extension. MS C support isn't ideal; it doesn't include stdbool.h, for instance.

If you want to control the bit size of your data you can use stdint.h, which contains exact-width integer types.

What decides the sizeof an integer?

Taken from http://en.wikipedia.org/wiki/64-bit (under 64-bit data models)

There are various models; Microsoft decided that sizeof(int) == 4, while some (a few) others didn't.

The HAL Computer Systems port of Solaris to SPARC64 and UNICOS seem to be the only ones where sizeof(int) == 8; these are called the ILP64 and SILP64 models.

The true "war" was over sizeof(long), where Microsoft decided on sizeof(long) == 4 (LLP64) while nearly everyone else decided on sizeof(long) == 8 (LP64).

Note that in truth it's the compiler that "decides" which model to use, but as the Wikipedia article puts it:

Note that a programming model is a choice made on a per-compiler basis, and several can coexist on the same OS. However, the programming model chosen as the primary model for the OS API typically dominates.


