why reference size is always 4 bytes - c++
The standard is pretty clear on sizeof (C++11, 5.3.3/2):
When applied to a reference or a reference type, the result is the size of the referenced type.
So if you really are taking sizeof(double&), the compiler is telling you that sizeof(double) is 4.
Update: So, what you really are doing is applying sizeof to a class type. In that case,
When applied to a class, the result is the number of bytes in an object of that class [...]
So we know that the presence of the reference inside A causes it to take up 4 bytes. That's because even though the standard does not mandate how references are to be implemented, the compiler still has to implement them somehow. This somehow might be very different depending on the context, but for a reference member of a class type the only approach that makes sense is sneaking in a double* behind your back and calling it a double& in your face.
So if your architecture is 32-bit (in which pointers are 4 bytes long) that would explain the result.
Just keep in mind that the concept of a reference is not tied to any specific implementation. The standard allows the compiler to implement references however it wants.
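A minimal sketch of both cases. The class A from the question is not shown here, so the A below is a hypothetical stand-in with a single double& member. The first check is guaranteed by the standard; the size comparison only reflects how mainstream compilers typically lay out a reference member.

```cpp
#include <cstddef>

// Hypothetical stand-in for the class in the question: one reference member.
struct A {
    explicit A(double& target) : ref(target) {}
    double& ref;
};

// Guaranteed: sizeof applied to a reference type yields the size of the referenced type.
static_assert(sizeof(double&) == sizeof(double), "sizeof(T&) is sizeof(T)");

// Not guaranteed, but typical: the member is stored as a hidden double*.
constexpr std::size_t size_of_A = sizeof(A);
constexpr std::size_t size_of_pointer = sizeof(double*);
constexpr bool member_is_pointer_sized = (size_of_A == size_of_pointer);
```

On a 32-bit target both sizes would typically be 4, and on a 64-bit target typically 8.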
Why does sizeof(MACRO) give an output of 4 bytes when MACRO does not hold any memory location?
sizeof(max) is replaced by the preprocessor with sizeof('A'). sizeof('A') is the same as sizeof(int), and the latter is 4 on your platform.
For the avoidance of doubt, 'A' is an int constant in C, not a char. (Note that in C++ 'A' is a char literal, and sizeof(char) is fixed at 1 by the standard.)
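The C++ side of that note can be checked at compile time. This is only a sketch: MAX_LETTER is a hypothetical macro name standing in for the one in the question, and the same line compiled as C would yield sizeof(int) instead of 1.

```cpp
#include <cstddef>

// Hypothetical macro standing in for the one in the question.
#define MAX_LETTER 'A'

// After preprocessing this reads sizeof('A'); no object named MAX_LETTER exists.
constexpr std::size_t size_of_macro = sizeof(MAX_LETTER);

// In C++ a character literal has type char, so the result is 1.
static_assert(size_of_macro == sizeof(char), "character literal has type char in C++");
```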
why the char buffer content size is 4 bytes?
Diagnosis
In the second example, buff is a char buffer and plain char is a signed type on your machine, and you're storing values which are negative in buff, so when they're converted to int in the call to printf(), they are negative integers (of small magnitude), printed in hex.
ISO/IEC 9899:2018
Actually, the links are to an online draft of C11, not C18, in HTML which allows links to the relevant paragraphs in the standard. AFAIK, these details have not changed between C90, C99, C11 and C18 anyway.
The standard says that the plain char type is equivalent to either signed char or unsigned char.
§6.2.5 Types ¶15:
The three types char, signed char, and unsigned char are collectively called the character types. The implementation shall define char to have the same range, representation, and behavior as either signed char or unsigned char.45)
45) CHAR_MIN, defined in <limits.h>, will have one of the values 0 or SCHAR_MIN, and this can be used to distinguish the two options. Irrespective of the choice made, char is a separate type from the other two and is not compatible with either.
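The CHAR_MIN test from the footnote can be written as a compile-time check. The sketch below uses C++ and <climits> rather than C's <limits.h>, and the name plain_char_is_signed is made up for illustration.

```cpp
#include <climits>
#include <type_traits>

// CHAR_MIN is 0 when plain char is unsigned and SCHAR_MIN when it is signed.
constexpr bool plain_char_is_signed = (CHAR_MIN < 0);

// The type trait agrees with the macro-based test.
static_assert(plain_char_is_signed == std::is_signed<char>::value,
              "CHAR_MIN reflects the signedness of plain char");

// Whichever choice the implementation makes, char remains a distinct type.
static_assert(!std::is_same<char, signed char>::value, "char is not signed char");
static_assert(!std::is_same<char, unsigned char>::value, "char is not unsigned char");
```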
§6.3.1.1 Boolean, characters and integers ¶2,3:
2 The following may be used in an expression wherever an int or unsigned int may be used:
- An object or expression with an integer type (other than int or unsigned int) whose integer conversion rank is less than or equal to the rank of int and unsigned int.
- A bit-field of type _Bool, int, signed int, or unsigned int.
If an int can represent all values of the original type (as restricted by the width, for a bit-field), the value is converted to an int; otherwise, it is converted to an unsigned int. These are called the integer promotions.58) All other types are unchanged by the integer promotions.
3 The integer promotions preserve value including sign. As discussed earlier, whether a "plain" char is treated as signed is implementation-defined.
58) The integer promotions are applied only: as part of the usual arithmetic conversions, to certain argument expressions, to the operands of the unary +, -, and ~ operators, and to both operands of the shift operators, as specified by their respective subclauses.
§6.5.2.2 Function calls ¶6,7:
6 If the expression that denotes the called function has a type that does not include a prototype, the integer promotions are performed on each argument, and arguments that have type float are promoted to double. These are called the default argument promotions. If the number of arguments does not equal the number of parameters, the behavior is undefined. If the function is defined with a type that includes a prototype, and either the prototype ends with an ellipsis (, ...) or the types of the arguments after promotion are not compatible with the types of the parameters, the behavior is undefined. If the function is defined with a type that does not include a prototype, and the types of the arguments after promotion are not compatible with those of the parameters after promotion, the behavior is undefined, except for the following cases:
- one promoted type is a signed integer type, the other promoted type is the corresponding unsigned integer type, and the value is representable in both types;
- both types are pointers to qualified or unqualified versions of a character type or void.
7 If the expression that denotes the called function has a type that does include a prototype, the arguments are implicitly converted, as if by assignment, to the types of the corresponding parameters, taking the type of each parameter to be the unqualified version of its declared type. The ellipsis notation in a function prototype declarator causes argument type conversion to stop after the last declared parameter. The default argument promotions are performed on trailing arguments.
Exegesis
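Note the last two sentences of §6.5.2.2 ¶7: the ellipsis stops argument conversion after the last declared parameter, and the default argument promotions are applied to the trailing arguments. When the char values are promoted by the 'integer promotions', they are promoted to a (signed) int, and the negative values remain negative. Since an int has 4 bytes, and all the machines you're likely to have available use two's-complement arithmetic, the most significant 3 bytes of the value will be 0xFF each. For example, a char holding -128 is printed as FFFFFF80.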
Prescription
To always print 2-digit hex for the characters, use %.2X (or %.2x if you prefer; you can also use either %02X or %02x) and pass either (unsigned char)rbuff[r_n-1] or rbuff[r_n-1] & 0xFF as the argument (using the variables from the first example). Or, using the variables from the second example:
printf("%.2X\n", (unsigned char)buff[i]);
printf("%.2X\n", buff[i] & 0xFF);
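The promotion and both fixes can be traced with constants. This is a sketch in C++ rather than C, and it uses signed char explicitly so that it does not depend on whether plain char is signed on a given machine; all variable names are illustrative.

```cpp
#include <climits>

// One byte with bit pattern 0x80, held in a signed character type.
constexpr signed char stored = -128;

// Integer promotion preserves the value, so the int is still -128.
constexpr int promoted = stored;

// Viewed as unsigned int (how %X interprets its argument), the upper bits are all ones.
constexpr unsigned int as_unsigned = static_cast<unsigned int>(promoted);
constexpr bool high_bits_set = (as_unsigned == UINT_MAX - 127u);

// Both fixes from the answer recover the single byte before printing.
constexpr unsigned int via_cast = static_cast<unsigned char>(stored);
constexpr unsigned int via_mask = stored & 0xFF;
```

With a 32-bit int, as_unsigned is 0xFFFFFF80, while via_cast and via_mask are both 0x80.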
Why reference variable inside class always taking 4 bytes irrespective of type? (on 32-bit system)
Those four bytes are the reference itself. A reference member is typically implemented as a pointer internally, and pointers typically use 4 bytes on a 32-bit system, irrespective of the data type, because a pointer holds just an address, not the value itself.
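A short sketch of the "irrespective of type" part. RefHolder and Big are hypothetical names, and the equalities reflect typical compiler behaviour rather than anything the standard promises.

```cpp
// Hypothetical wrapper: a class whose only member is a reference to T.
template <typename T>
struct RefHolder {
    explicit RefHolder(T& target) : ref(target) {}
    T& ref;
};

// A deliberately large referenced type.
struct Big {
    char bytes[1024];
};

// The holder's size does not follow the size of the referenced type.
constexpr bool same_size_for_all =
    sizeof(RefHolder<char>) == sizeof(RefHolder<double>) &&
    sizeof(RefHolder<double>) == sizeof(RefHolder<Big>);

// Typically it matches the size of a pointer on the target.
constexpr bool pointer_sized = (sizeof(RefHolder<Big>) == sizeof(void*));
```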
How the size of this structure comes out to be 4 byte
Source: http://geeksforgeeks.org/?p=9705
In sum: it is packing the bits as tightly as possible (that's what bit-fields are meant for) without compromising on alignment.
A variable’s data alignment deals with the way the data is stored in the memory banks. For example, the natural alignment of int on a 32-bit machine is 4 bytes. When a data type is naturally aligned, the CPU fetches it in the minimum number of read cycles.
Similarly, the natural alignment of short int is 2 bytes. It means a short int can be stored in the bank 0 – bank 1 pair or the bank 2 – bank 3 pair. A double requires 8 bytes and occupies two rows in the memory banks. Any misalignment of a double will force more than two read cycles to fetch the double data.
Note that a double variable will be allocated on an 8-byte boundary on a 32-bit machine and requires two memory read cycles. On a 64-bit machine, based on the number of banks, a double variable will be allocated on an 8-byte boundary and requires only one memory read cycle.
So the compiler will introduce an alignment requirement for every structure: it will be that of the member with the largest alignment. If you remove char from your struct, you will still get 4 bytes.
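The rule that a structure takes the alignment of its most strictly aligned member can be probed with alignof. Mixed is a hypothetical struct, and the first check describes common ABIs rather than a guarantee; only the second is required by the language.

```cpp
#include <cstddef>

// Hypothetical struct with members of different natural alignments.
struct Mixed {
    char   c;
    short  s;
    double d;
};

// The strictest alignment among the members.
constexpr std::size_t strictest =
    alignof(double) > alignof(short) ? alignof(double) : alignof(short);

// On common ABIs the struct inherits exactly that alignment.
constexpr bool struct_uses_strictest = (alignof(Mixed) == strictest);

// Guaranteed: the size is a multiple of the alignment, so arrays of Mixed stay aligned.
static_assert(sizeof(Mixed) % alignof(Mixed) == 0, "size is a multiple of alignment");
```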
In your struct, char is 1-byte aligned. It is followed by an int bit-field. A full int would be 4-byte aligned, but you defined an 8-bit bit-field.
8 bits = 1 byte, and a char can sit on any byte boundary. So char + int:8 = 2 bytes. That is not a multiple of 4, so the compiler adds 2 more bytes of padding to maintain the 4-byte boundary.
For it to be 8 bytes, you would have to declare an actual int (4 bytes) and a char (1 byte). That's 5 bytes, which again is not a multiple of 4, so the struct is padded to 8 bytes.
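The two layouts can be compared directly. The question's exact struct is not shown, so both definitions below are hypothetical, and bit-field layout is implementation-defined: the 4-byte result is what GCC and Clang typically produce, and other compilers may differ.

```cpp
#include <cstddef>

// Hypothetical: a char followed by an 8-bit int bit-field.
struct WithBitField {
    char c;
    int  bits : 8;   // only 8 bits of int storage requested
};

// Hypothetical: a char followed by a whole int.
struct WithFullInt {
    char c;
    int  value;      // must start on an int boundary, so padding follows c
};

constexpr std::size_t bitfield_size = sizeof(WithBitField);  // typically 4
constexpr std::size_t full_int_size = sizeof(WithFullInt);   // typically 8
```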
What I have commonly done in the past to control the padding is to place fillers in my struct to always maintain the 4-byte boundary. So if I have a struct like this:
struct s {
    int id;
    char b;
};
I am going to insert a filler as follows:
struct d {
    int id;
    char b;
    char temp[3];   /* explicit filler up to the 4-byte boundary */
};
That would give me a struct with a size of 4 bytes + 1 byte + 3 bytes = 8 bytes! This way I can ensure that my struct is padded the way I want it, especially if I transmit it somewhere over the network. Also, if I ever change my implementation (such as if I were to save this struct into a binary file), the fillers were there from the beginning, so as long as I maintain my initial structure, all is well!
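A compile-time check of that arithmetic, as a sketch that assumes a 4-byte int with 4-byte alignment (typical, but not guaranteed). The names same_size and filler_offset are illustrative.

```cpp
#include <cstddef>

struct s {
    int  id;
    char b;          // the compiler typically appends 3 bytes of hidden tail padding
};

struct d {
    int  id;
    char b;
    char temp[3];    // explicit filler occupying those same bytes
};

// Making the padding explicit does not change the overall size.
constexpr bool same_size = (sizeof(s) == sizeof(d));

// The filler starts right after b.
constexpr std::size_t filler_offset = offsetof(d, temp);
```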
Finally, you can read this post on C Structure size with bit-fields for more explanation.
What is array to pointer decay?
It's said that arrays "decay" into pointers. A C++ array declared as int numbers [5] cannot be re-pointed, i.e. you can't say numbers = 0x5a5aff23. More importantly the term decay signifies loss of type and dimension; numbers decays into int* by losing the dimension information (count 5) and the type is not int [5] any more. Look here for cases where the decay doesn't happen.
If you're passing an array by value, what you're really doing is copying a pointer: a pointer to the array's first element is copied to the parameter (whose type should also be a pointer to the array element's type). This works due to the array's decaying nature; once decayed, sizeof no longer gives the complete array's size, because the array has essentially become a pointer. This is why it's preferred (among other reasons) to pass by reference or pointer.
Three ways to pass in an array [1]:
void by_value(const T* array);   // const T array[] means the same
void by_pointer(const T (*array)[U]);
void by_reference(const T (&array)[U]);
The last two will give proper sizeof info, while the first one won't since the array argument has decayed to be assigned to the parameter.
[1] The constant U should be known at compile-time.
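The three signatures can be turned into templates that report what sizeof sees inside each function. This is a sketch: T and U become template parameters, and the array numbers is made const here only so that it matches the const T parameters exactly.

```cpp
#include <cstddef>

// The argument has decayed: sizeof sees only a pointer.
template <typename T>
constexpr std::size_t by_value(const T* array) {
    return sizeof(array);
}

// Pointer to the whole array: the bound U is still part of the type.
template <typename T, std::size_t U>
constexpr std::size_t by_pointer(const T (*array)[U]) {
    return sizeof(*array);
}

// Reference to the whole array: the bound U is still part of the type.
template <typename T, std::size_t U>
constexpr std::size_t by_reference(const T (&array)[U]) {
    return sizeof(array);
}

constexpr int numbers[5] = {1, 2, 3, 4, 5};
```

Calling by_value(numbers) yields the size of a pointer, while by_pointer(&numbers) and by_reference(numbers) both yield 5 * sizeof(int).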
Why isn't the size of my array 4 bytes in C
Ok, I'll take this one question at a time...
Why isn't the result of sizeof(foo) 4?
It's because you've only set the size of the foo array to 3 in the first statement char foo[3]. There would be no reason to expect 4 as a result when you've explicitly defined the bound as 3 chars, which is 3 bytes.
Why isn't the result 1?
You're correct in saying that in some cases, foo is the same as &foo[0]. The most relevant of these cases to your question is when the array is passed into a function as an argument, where it decays to a pointer. sizeof, however, is an operator, not a function, and its operand does not undergo array-to-pointer conversion. The compiler knows that the type of foo is char[3] and yields the total size of the array, 3, at compile time, rather than the size of a single element or of a pointer.
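A few compile-time values make the distinction concrete. The sketch is in C++ (the same sizeof expressions are valid in C), and the variable names are illustrative.

```cpp
#include <cstddef>

char foo[3];

// The operand keeps its array type, so the whole array is measured.
constexpr std::size_t whole_array = sizeof(foo);

// A single element is one char.
constexpr std::size_t first_element = sizeof(foo[0]);

// An explicit pointer to the first element is measured as a pointer.
constexpr std::size_t decayed = sizeof(&foo[0]);

// Common idiom for the element count of a true array.
constexpr std::size_t element_count = sizeof(foo) / sizeof(foo[0]);
```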