Can Pointers Be of Different Sizes

Can pointers be of different sizes?

Yes, it's entirely possible. On some machines, a pointer to a byte contains two values: A pointer to the WORD address of the memory word containing the byte, and a "byte index" that gives the position of the byte within the word. E.g. on a 32-bit machine, the "byte index" is 0..3.

This would require more storage space than a "int *", which is just a pointer to the relevant word.

What is the size of a pointer?

Pointers generally have a fixed size, for ex. on a 32-bit executable they're usually 32-bit. There are some exceptions, like on old 16-bit windows when you had to distinguish between 32-bit pointers and 16-bit... It's usually pretty safe to assume they're going to be uniform within a given executable on modern desktop OS's.

Edit: Even so, I would strongly caution against making this assumption in your code. If you're going to write something that absolutely has to have a pointers of a certain size, you'd better check it!

Function pointers are a different story -- see Jens' answer for more info.

Is the sizeof(some pointer) always equal to four?

The guarantee you get is that sizeof(char) == 1. There are no other guarantees, including no guarantee that sizeof(int *) == sizeof(double *).

In practice, pointers will be size 2 on a 16-bit system (if you can find one), 4 on a 32-bit system, and 8 on a 64-bit system, but there's nothing to be gained in relying on a given size.

Are there any platforms where pointers to different types have different sizes?

Answer from the C FAQ:

The Prime 50 series used segment 07777, offset 0 for the null pointer, at least for PL/I. Later models used segment 0, offset 0 for null pointers in C, necessitating new instructions such as TCNP (Test C Null Pointer), evidently as a sop to all the extant poorly-written C code which made incorrect assumptions. Older, word-addressed Prime machines were also notorious for requiring larger byte pointers (char *'s) than word pointers (int *'s).

The Eclipse MV series from Data General has three architecturally supported pointer formats (word, byte, and bit pointers), two of which are used by C compilers: byte pointers for char * and void *, and word pointers for everything else. For historical reasons during the evolution of the 32-bit MV line from the 16-bit Nova line, word pointers and byte pointers had the offset, indirection, and ring protection bits in different places in the word. Passing a mismatched pointer format to a function resulted in protection faults. Eventually, the MV C compiler added many compatibility options to try to deal with code that had pointer type mismatch errors.

Some Honeywell-Bull mainframes use the bit pattern 06000 for (internal) null pointers.

The CDC Cyber 180 Series has 48-bit pointers consisting of a ring, segment, and offset. Most users (in ring 11) have null pointers of 0xB00000000000. It was common on old CDC ones-complement machines to use an all-one-bits word as a special flag for all kinds of data, including invalid addresses.

The old HP 3000 series uses a different addressing scheme for byte addresses than for word addresses; like several of the machines above it therefore uses different representations for char * and void * pointers than for other pointers.

The Symbolics Lisp Machine, a tagged architecture, does not even have conventional numeric pointers; it uses the pair (basically a nonexistent handle) as a C null pointer.

Depending on the ``memory model'' in use, 8086-family processors (PC
compatibles) may use 16-bit data pointers and 32-bit function
pointers, or vice versa.

Some 64-bit Cray machines represent int * in the lower 48 bits of a
word; char * additionally uses some of the upper 16 bits to indicate a
byte address within a word.

Additional links: A message from Chris Torek with more details
about some of these machines.

Can the size of pointers vary between data and function pointers?

> type ppp.c
#include <stdio.h>
#include <stdlib.h>

int global = 0;

int main(void) {
int local = 0;
static int staticint = 0;
int *mall;
int (*fx)(void);

fx = main;
mall = malloc(42); /* assume it worked */
printf("#sizeof pointer to local: %d\n", (int)sizeof &local);
printf("#sizeof pointer to static: %d\n", (int)sizeof &staticint);
printf("#sizeof pointer to malloc'd: %d\n", (int)sizeof mall);
printf("#sizeof pointer to global: %d\n", (int)sizeof &global);
printf("#sizeof pointer to main(): %d\n", (int)sizeof fx);
free(mall);
return 0;
}
> tcc -mc ppp.c
Turbo C Version 2.01 ...
warnings about unused variables elided ...
Turbo Link Version 2.0 ...
> ppp
#sizeof pointer to local: 4
#sizeof pointer to static: 4
#sizeof pointer to malloc'd: 4
#sizeof pointer to global: 4
#sizeof pointer to main(): 2
> tcc -mm ppp.c
> ppp
#sizeof pointer to local: 2
#sizeof pointer to static: 2
#sizeof pointer to malloc'd: 2
#sizeof pointer to global: 2
#sizeof pointer to main(): 4

tcc -mc generates code in the "compact" model; tcc -mm generates code in the "medium" model

Can pointers to different types have different binary representations?

Yep

As a concrete example, there is a C++ implementation where pointers to single-byte elements are larger than pointers to multi-byte elements, because the hardware uses word (not byte) addressing. To emulate byte pointers, C++ uses a hardware pointer plus an extra byte offset.

void* stores that extra offset, but int* does not. Converting int* to char* works (as it must under the standard), but char* to int* loses that offset (which your note implicitly permits).

The Cray T90 supercomputer is an example of such hardware.

I will see if I can find the standards argument why this is valid thing for a compliant C++ compiler to do; I am only aware someone did it, not that it is legal to do it, but that note rather implies it is intended to be legal.

The rules are going to be in the to-from void pointer casting rules. The paragraph you quoted implicitly forwards the meaning of the conversion to there.

7.6.1.9 Static cast [expr.static.cast]

A prvalue of type “pointer to cv1 void” can be converted to a prvalue of type “pointer to cv2 T”, where T is an object type and cv2 is the same cv-qualification as, or greater cv-qualification than, cv1.
If the original pointer value represents the address A of a byte in memory and A does not satisfy the alignment requirement of T, then the resulting pointer value is unspecified.
Otherwise, if the original pointer value points to an object a, and there is an object b of type T (ignoring cv-qualification) that is pointer-interconvertible with a, the result is a pointer to b.
Otherwise, the pointer value is unchanged by the conversion.

This demonstrates that converting to more-aligned types generates an unspecified pointer, but converting to equal-or-less aligned types that aren't actually there does not change the pointer value.

Which is permission to make a cast from a pointer to 4 byte aligned data converted to a pointer to 8 byte aligned data result in garbage.

Every object unrelated pointer cast needs to logically round-trip through a void* however.

An object pointer can be explicitly converted to an object pointer of a different type. When a prvalue v of object pointer type is converted to the object pointer type “pointer to cv T”, the result is static_­cast<cv T*>(static_­cast<cv void*>(v)).

(From the OP)

That covers void* to T*; I have yet to find the T* to void* conversion text to make this a complete language-lawyer level answer.

Are different pointers considered to be different data types?

Yes, they are different data types, as they point to different types of data.

In other words, their usage and properties varies (example: same pointer arithmetic on different pointer type will yield different result) and they have different alignment requirements.



Related Topics



Leave a reply



Submit