What determines the size of primitive data types?
What determines the size of a primitive data type?
It depends on the compiler. The compiler, in turn, usually depends on the architecture, processor, development environment, and so on, because it takes them into account. So you could say it is a combination of all of these.
What is the reason to choose an integer to have a size of 2 bytes on some systems and 4 bytes on others? Is there any reason it cannot stay at 2 bytes anymore?
The C++ standard does not specify the size of integral types in bytes, but it specifies the minimum ranges they must be able to hold. You can infer the minimum size in bits from the required range, and the minimum size in bytes from that and the value of the CHAR_BIT macro, which defines the number of bits in a byte (on all but the most obscure platforms it is 8, and it cannot be less than 8).
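As a rough illustration (not part of the original answer), Python's ctypes module reports the sizes used by the C compiler that built the running interpreter. The dictionary names c_types and sizes are chosen here for the example only, and the printed values depend on your platform.

```python
import ctypes

# C integer types as exposed by ctypes; their sizes come from the
# C compiler and ABI that this Python interpreter was built with.
c_types = {
    "char": ctypes.c_char,
    "short": ctypes.c_short,
    "int": ctypes.c_int,
    "long": ctypes.c_long,
    "long long": ctypes.c_longlong,
}

# Size of each type in bytes on this particular platform.
sizes = {name: ctypes.sizeof(ctype) for name, ctype in c_types.items()}

for name, size in sizes.items():
    print(f"{name:9s} {size} byte(s)")
```

On a typical 64-bit Linux build long is 8 bytes, while on 64-bit Windows it is 4, which is exactly the kind of variation the standard allows.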
Why do Primitive Data Types have a Fixed Size?
As low-level programming languages, C and C++ closely follow what common hardware is capable of. The primitive building blocks (fundamental types) correspond to entities that common CPUs natively support. CPUs typically can handle bytes and words very efficiently; C called these char and int. (More precisely, C defined int in such a way that a compiler could use the target CPU's word size for it.) There has also been CPU support for double-sized words, which historically corresponded to the long data type in C, and later to the long long types of C and C++. Half-words corresponded to short. The basic integer types correspond to things a CPU can handle well, with enough flexibility to accommodate different architectures. (For example, if a CPU did not support half-words, short could be the same size as int.)
If there were hardware support for integers of unbounded size (limited only by available memory), then there could be an argument for adding that as a fundamental type in C (and C++). Until that happens, support for big integers (see bigint) in C and C++ remains relegated to libraries.
Some of the newer, higher-level languages do have built-in support for arbitrary-precision arithmetic.
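Python is one such language. The following is a small sketch, with the variable name big chosen only for this example, showing that its int type is not tied to a machine word and that the object grows as the value grows. The exact byte counts are CPython implementation details.

```python
import sys

# 2**200 needs 201 bits, far more than any 64-bit machine integer.
big = 2 ** 200
print(big.bit_length())

# The int object simply uses more memory for the larger value.
print(sys.getsizeof(1), sys.getsizeof(big))
```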
Getting the size of primitive data types in Python
Running sys.getsizeof(float) does not return the size of any individual float; it returns the size of the float class. That class contains a lot more data than any single float, so the returned size is also much bigger.
If you just want to know the size of a single float, the easiest way is to simply instantiate some arbitrary float. For example:
sys.getsizeof(float())
Note that float() simply returns 0.0, so this is actually equivalent to:
sys.getsizeof(0.0)
This returns 24 bytes in your case (and probably for most other people as well). In CPython (the most common Python implementation), every float object contains a reference counter and a pointer to its type (a pointer to the float class), each of which is 8 bytes on 64-bit CPython or 4 bytes on 32-bit CPython. The remaining bytes (24 - 8 - 8 = 8 in your case, which is very likely 64-bit CPython) hold the actual float value itself.
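The arithmetic above can be checked with a short sketch. It assumes a standard CPython build whose object header is one reference counter plus one type pointer, each the size of a native pointer; the variable names are made up for this example.

```python
import struct
import sys

class_size = sys.getsizeof(float)    # size of the float type object itself
instance_size = sys.getsizeof(0.0)   # size of one float object
pointer_size = struct.calcsize("P")  # native pointer size in bytes

# Assumption: the header is a reference counter plus a type pointer,
# each as wide as a pointer (true for standard CPython builds).
payload = instance_size - 2 * pointer_size

print(class_size, instance_size, pointer_size, payload)
```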
This is not guaranteed to work out the same way for other Python implementations though. The language reference says:
These represent machine-level double precision floating point numbers. You are at the mercy of the underlying machine architecture (and C or Java implementation) for the accepted range and handling of overflow. Python does not support single-precision floating point numbers; the savings in processor and memory usage that are usually the reason for using these are dwarfed by the overhead of using objects in Python, so there is no reason to complicate the language with two kinds of floating point numbers.
and I'm not aware of any runtime methods to accurately tell you the number of bytes used. However, note that the quote above from the language reference does say that Python only supports double precision floats, so in most cases (depending on how critical it is for you to always be 100% right) it should be comparable to double precision in C.
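For what it's worth, the standard library can at least describe the underlying C double, though this says nothing about the per-object overhead. A minimal sketch, with variable names that are only illustrative:

```python
import struct
import sys

# Native size in bytes of a C double on this platform.
double_size = struct.calcsize("d")

# Number of bits in the significand of a Python float
# (53 for an IEEE 754 double).
mantissa_bits = sys.float_info.mant_dig

print(double_size, mantissa_bits)
```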
Java primitive data types size
There's no need for such a thing in Java since the sizes of the primitives are set for all JVMs (unlike C where they can vary).
char is 16 bits (and is actually an unsigned quantity), int is 32 bits, and long is 64 bits.
boolean is the ugly sister in all this. Internally it is typically manipulated as a 32-bit int, but arrays of booleans typically use 1 byte per element.
Primitive data types and portability in Java
In "lower" languages, the sizes of primitive data types are often derived from the CPU's ability to handle them.
E.g., in C, an int is defined as being "at least 16 bits in size", but its size may vary between architectures in order to ensure that "The type int should be the integer type that the target processor is most efficient working with." This means that if your code makes careless assumptions about an int's size, it may very well break if you port it from 32-bit x86 to 64-bit PowerPC.
Java, as noted above, is different. An int, for example, will always be 32 bits. This means you don't have to worry about its size changing when you run the same code on a different architecture. The tradeoff, as also mentioned above, is performance: on any architecture that doesn't natively handle 32-bit calculations, these ints need to be expanded to the native size the CPU can handle (which carries a small penalty), or worse, if the CPU can only handle smaller ints, every operation on an int may require several CPU operations.
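Python's struct module happens to expose both models side by side, which makes for a loose analogy: native mode follows the platform's C compiler, while standard mode fixes the sizes regardless of platform, much as Java does. This is only an illustrative sketch, and the variable names are invented for it.

```python
import struct

# Native mode ("@"): sizes follow the platform's C compiler,
# so a C long is commonly 4 or 8 bytes depending on the system.
native_long = struct.calcsize("@l")

# Standard mode ("<"): sizes are fixed by the format specification,
# independent of the platform.
standard_int = struct.calcsize("<i")    # always 4
standard_long = struct.calcsize("<l")   # always 4
standard_llong = struct.calcsize("<q")  # always 8

print(native_long, standard_int, standard_long, standard_llong)
```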
Does Java define the size of its primitive types anywhere?
Not a class, but you have Integer.SIZE, and so on for Long and the floating point classes too. You also have *.BYTES.
Therefore Integer.SIZE is 32, Integer.BYTES is 4, Double.SIZE is 64 and Double.BYTES is 8, and so on; all of these are ints, in case you were wondering.
NOTE: *.BYTES are only defined since Java 8 (thanks @Slanec for noticing)
(*.SIZE appeared in Java 5, but you do use at least that, right?)
And yes, this is defined by the JDK since the JLS itself defines the size of primitive types; you are therefore guaranteed that you'll have the same values for these constants on whatever Java implementation on whatever platform.