How to Create a Type in C++ That Takes Less Than One Byte of Memory

Is it possible to create a type in C++ that takes less than one byte of memory?

Not really. Inside a struct, you can use bit-fields. If you know you'll need a fixed number of entries, this is a way to save a few bits (but note that the struct is always padded up to a whole number of bytes). Also note that because ordinary CPUs can't address anything smaller than a byte, access to bit-field values may be slower: the compiler has to generate extra instructions to load or store a value "in the middle" of a byte. So to save a few bits, you spend some CPU time.
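A minimal sketch of the bit-field approach described above. The names PackedFlags and makeFlags are made up for illustration, and the exact size and bit layout of the struct are implementation-defined, so the only size check made here is the guaranteed one:

```cpp
#include <cassert>

// Hypothetical example: three 1-bit flags and a 5-bit counter (8 bits total).
struct PackedFlags {
    unsigned ready   : 1;
    unsigned dirty   : 1;
    unsigned locked  : 1;
    unsigned counter : 5;
};

// sizeof counts whole bytes, so even 8 declared bits are padded up to at
// least one byte (and often to sizeof(unsigned) on common compilers).
static_assert(sizeof(PackedFlags) >= 1, "a struct occupies at least one byte");

// Builds a PackedFlags value. On typical CPUs each store below becomes a
// mask-and-merge on the containing storage unit, not a plain byte store.
inline PackedFlags makeFlags(bool ready, unsigned counter) {
    assert(counter < 32u);  // must fit in the 5-bit field
    PackedFlags f{};        // all fields start at zero
    f.ready = ready;
    f.counter = counter;
    return f;
}
```

Printing sizeof(PackedFlags) on your own compiler is the quickest way to see how much padding it actually adds.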

The C++11 standard says in section 1.7, The C++ memory model:

The fundamental storage unit in the C++ memory model is the byte. A byte is at least large enough to contain any member of the basic execution character set (2.3) and the eight-bit code units of the Unicode UTF-8 encoding form and is composed of a contiguous sequence of bits, the number of which is implementation-defined.

In other words: the smallest addressable unit in C++ is at least 8 bits wide.

Side-note: In case you're wondering: there are machines like DSPs that can only address units larger than 8 bits at a time; for such a machine, the compiler may define "byte" to be, for example, 16 bits wide.
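These guarantees can be checked at compile time. The sketch below assumes nothing beyond the standard header <climits>; the helper bitsOf is a hypothetical name introduced for illustration:

```cpp
#include <climits>
#include <cstddef>

// Guaranteed by the standard: a byte has at least 8 bits. On a DSP with
// 16-bit bytes, CHAR_BIT would simply be 16 and this still holds.
static_assert(CHAR_BIT >= 8, "a byte has at least 8 bits");

// sizeof(char) is 1 by definition, whatever CHAR_BIT happens to be.
static_assert(sizeof(char) == 1, "char is exactly one byte");

// Hypothetical helper: number of bits in the storage occupied by a T.
template <typename T>
constexpr std::size_t bitsOf() {
    return sizeof(T) * CHAR_BIT;
}
```

Because sizeof never yields zero for a complete object type, bitsOf<T>() can never be smaller than CHAR_BIT.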

Is it possible to create a data type of length one bit in C

It is not really possible to create a type that occupies one bit. The smallest addressable unit in C is the char (which is by definition one byte and usually, but not necessarily, 8 bits long; it might be longer but isn't allowed to be shorter than 8 bits in Standard C).

You can approximate it with:

typedef _Bool uint1_t;

or:

#include <stdbool.h>
typedef bool uint1_t;

but it will occupy (at least) one byte, even though a Boolean variable only stores the values 0 or 1, false or true.
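The same point can be checked in C++ (used here for all added examples). This is only a sketch; bytesForBoolFlags is a hypothetical helper name:

```cpp
#include <cstddef>

// A bool carries one bit of information but is still a complete object,
// so it occupies at least one addressable byte.
static_assert(sizeof(bool) >= 1, "bool occupies at least one byte");

// Array elements never share a byte, so eight bools cost eight times
// sizeof(bool), not a single byte.
static_assert(sizeof(bool[8]) == 8 * sizeof(bool),
              "bools in an array are not packed");

// Hypothetical helper: bytes needed to store `count` flags as plain bools.
constexpr std::size_t bytesForBoolFlags(std::size_t count) {
    return count * sizeof(bool);
}
```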

You could, in principle, use a bit-field:

typedef struct
{
    unsigned int x : 1;
} uint1_t;

but that will also occupy at least one byte (and possibly as many bytes as an unsigned int, which is usually 4 bytes), and you'll need to use .x to access the value. Bit-fields are problematic because most aspects of them are implementation-defined, such as how much space the storage unit that holds the field occupies. Don't use a bit-field.

Including amendments suggested by Drew McGowen, Drax and Fiddling Bits.

Built in datatype with size less than 1 byte

For languages that have manual memory management/address juggling at all, the hardware dictates some restrictions on those features. Very few, if any, architectures support addressing a single bit. Typically, the smallest unit of storage is a byte, so they use that.

Making all addresses refer to bits either requires a larger-than-average address representation (a performance hit: twice as many instructions for anything touching addresses) or vastly limits the available address space. Adding a special case (and a special kind of address) complicates the language for something that is rarely needed. Note that C has a related, but IMHO more general, feature: bit-fields in structs. The structs still have a sizeof measured in bytes, but a struct with 8 one-bit members may be one byte large overall. The bit-fiddling operators that are included anyway allow emulating single-bit storage in user code.
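As a rough illustration of emulating single-bit storage with the ordinary bit operators, here is a sketch that treats one byte as eight 1-bit slots. The function names are hypothetical, and the index is assumed to be in the range 0..7:

```cpp
// Returns `byte` with bit `index` set to 1.
constexpr unsigned char setBit(unsigned char byte, unsigned index) {
    return static_cast<unsigned char>(byte | (1u << index));
}

// Returns `byte` with bit `index` cleared to 0.
constexpr unsigned char clearBit(unsigned char byte, unsigned index) {
    return static_cast<unsigned char>(byte & ~(1u << index));
}

// Reports whether bit `index` of `byte` is 1.
constexpr bool testBit(unsigned char byte, unsigned index) {
    return ((byte >> index) & 1u) != 0;
}
```

The "address" of a bit here is just a (byte, index) pair that user code carries around, which is exactly the special kind of address the language itself declines to provide.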

In higher-level languages that don't have a notion of addressing at all, the size is an implementation detail. The implementations are, of course (directly or indirectly), again written in lower-level languages that default to bytes over bits. That, and other requirements and limitations (e.g. objects need to be accessed through pointers), make it impractical in general (though it exists, e.g. BitVector for Python) to expose tricks like "use a machine word, then index the bits through shifting/masking" to the language being implemented.

Is it possible to write less than 1 byte to a file

No, you can't. Files are organized in bytes; a byte is the smallest piece of data you can save.

And, actually, that 1 byte will generally occupy more than 1 byte of disk space. Depending on the OS, the file system type, and so on, everything you save as a file uses at least one block, and the block size varies with the file system you're using. So a single bit will be written as 1 byte and can occupy as much as 4 kB of your disk.
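If you do need to store individual bits, the usual workaround is to pack them into bytes yourself and accept the padding. A sketch of that idea follows; packBits is a hypothetical helper, and the most-significant-bit-first order is an arbitrary choice made for this example:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Packs bits into bytes, most significant bit first. A trailing partial
// byte is padded with zero bits, because only whole bytes can be written.
inline std::vector<unsigned char> packBits(const std::vector<bool>& bits) {
    std::vector<unsigned char> bytes((bits.size() + 7) / 8, 0);
    for (std::size_t i = 0; i < bits.size(); ++i) {
        if (bits[i]) {
            bytes[i / 8] |= static_cast<unsigned char>(0x80u >> (i % 8));
        }
    }
    // Even a single bit costs a whole byte of output.
    assert(bytes.size() * 8 >= bits.size());
    return bytes;
}
```

The resulting bytes can then be handed to any byte-oriented writer, for example std::ostream::write on a stream opened in binary mode. A reader needs to know the real bit count separately, since the padding bits are indistinguishable from data.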

On Wikipedia you can read about the byte being the smallest addressable unit of data in many computers.

Is there a C99 data type guaranteed to be at least two bytes?

Since int must have a range of at least 16 bits, int will meet your criterion on most practical systems. So would short (and long, and long long). If you want exactly 16 bits, you have to look to see whether int16_t and uint16_t are declared in <stdint.h>.

If you are worried about systems where CHAR_BIT is greater than 8, then you have to work harder. If CHAR_BIT is 32, then only long long is guaranteed to hold two characters.
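The distinction between "at least 16 bits" and "at least two bytes" can be made explicit at compile time. The sketch below is written in C++ for consistency with the other added examples (the same limits apply via <climits> and <cstdint>); holdsTwoChars is a hypothetical helper name:

```cpp
#include <climits>
#include <cstdint>

// Guaranteed minimum widths, in bits, regardless of CHAR_BIT.
static_assert(sizeof(int) * CHAR_BIT >= 16, "int has at least 16 bits");
static_assert(sizeof(std::int_least16_t) * CHAR_BIT >= 16,
              "int_least16_t has at least 16 bits");
static_assert(sizeof(long long) * CHAR_BIT >= 64,
              "long long has at least 64 bits");

// Hypothetical helper: true when T spans at least two bytes (two chars).
// With CHAR_BIT == 8 this holds for short and int; with CHAR_BIT == 32
// it may hold only for long long.
template <typename T>
constexpr bool holdsTwoChars() {
    return sizeof(T) >= 2;
}
```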


What the C standard says about sizes of integer types

In a comment, Richard J Ross III says:

The standard says absolutely nothing about the size of an int except that it must be larger than or equal to short, so, for example, it could be 10 bits on some systems I've worked on.

On the contrary, the C standard has specifications on the lower bounds on the ranges that must be supported by different types, and a system with 10-bit int would not be conformant C.

Specifically, in ISO/IEC 9899:2011 §5.2.4.2.1 Sizes of integer types <limits.h>, it says:

¶1 The values given below shall be replaced by constant expressions suitable for use in #if
preprocessing directives. Moreover, except for CHAR_BIT and MB_LEN_MAX, the
following shall be replaced by expressions that have the same type as would an
expression that is an object of the corresponding type converted according to the integer
promotions. Their implementation-defined values shall be equal or greater in magnitude
(absolute value) to those shown, with the same sign.

- number of bits for smallest object that is not a bit-field (byte)
  CHAR_BIT 8

[...]

- minimum value for an object of type short int
  SHRT_MIN -32767 // -(2^15 - 1)

- maximum value for an object of type short int
  SHRT_MAX +32767 // 2^15 - 1

- maximum value for an object of type unsigned short int
  USHRT_MAX 65535 // 2^16 - 1

- minimum value for an object of type int
  INT_MIN -32767 // -(2^15 - 1)

- maximum value for an object of type int
  INT_MAX +32767 // 2^15 - 1

- maximum value for an object of type unsigned int
  UINT_MAX 65535 // 2^16 - 1
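The lower bounds quoted above can be turned into compile-time checks. This sketch uses C++ and <climits>, which exposes the same macros as <limits.h>; intRangeConforms is a hypothetical helper added to show why a 10-bit int fails the requirement:

```cpp
#include <climits>

// Each check restates one of the minimum magnitudes listed above.
static_assert(CHAR_BIT >= 8, "CHAR_BIT is at least 8");
static_assert(SHRT_MIN <= -32767 && SHRT_MAX >= 32767, "short covers 16 bits");
static_assert(USHRT_MAX >= 65535, "unsigned short covers 16 bits");
static_assert(INT_MIN <= -32767 && INT_MAX >= 32767, "int covers 16 bits");
static_assert(UINT_MAX >= 65535u, "unsigned int covers 16 bits");

// Hypothetical helper: does a candidate int range meet the minimum
// required by the standard?
constexpr bool intRangeConforms(long long minValue, long long maxValue) {
    return minValue <= -32767 && maxValue >= 32767;
}
```

A 10-bit two's complement int would range from -512 to 511, so intRangeConforms(-512, 511) is false.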

How can I define a datatype with 1 bit size in C?

Maybe you are looking for a bit-field:

struct bitfield
{
    unsigned b0:1;
    unsigned b1:1;
    unsigned b2:1;
    unsigned b3:1;
    unsigned b4:1;
    unsigned b5:1;
    unsigned b6:1;
    unsigned b7:1;
};

There are so many implementation-defined features to bit-fields that it is almost unbelievable, but each of the elements of the struct bitfield occupies a single bit. However, the size of the structure may be 4 bytes even though you only use 8 bits (1 byte) of it for the 8 bit-fields.

There is no other way to create a single bit of storage. Otherwise, C provides bytes as the smallest addressable unit, and a byte must have at least 8 bits (historically, there were machines with 9-bit or 10-bit bytes, but most machines these days provide 8-bit bytes only — unless perhaps you're on a DSP where the smallest addressable unit may be a 16-bit quantity).

Why is the size of the datatype byte one byte?

It is impossible to allocate memory in units of less than one byte, since a byte is the smallest unit of addressable memory. So a bool, although it could be represented by only one bit, still takes up one byte of memory. One byte is one byte because it can be one byte. There's no reason it should be any bigger.

23-bit user-defined type in C++

You can use a bit-field.

struct TwentyThreeBits {
    int x : 23;

    TwentyThreeBits & operator = (int y) {
        x = y;
        return *this;
    }
};

This allows you to manipulate the member x as a 23-bit value. The actual size of the type is likely larger (sizeof(TwentyThreeBits) is likely at least sizeof(int)).

If you would like to represent many items that only occupy 23 bits, you could create an array of bits (either with vector<bool> or with bitset) and access the right multiple of 23 into that array to get to the "object".
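One possible shape for the vector<bool> idea is sketched below. Packed23Array and kBitsPerItem are hypothetical names, and for simplicity the sketch stores unsigned 23-bit values (unlike the signed bit-field above), one bit per vector<bool> element:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::size_t kBitsPerItem = 23;

// Stores unsigned 23-bit values back to back, so item i starts at bit
// 23 * i of the underlying bit array.
class Packed23Array {
public:
    explicit Packed23Array(std::size_t count)
        : bits_(count * kBitsPerItem, false) {}

    std::size_t size() const { return bits_.size() / kBitsPerItem; }

    void set(std::size_t index, std::uint32_t value) {
        assert(index < size());
        assert(value < (std::uint32_t{1} << kBitsPerItem));  // fits in 23 bits
        for (std::size_t b = 0; b < kBitsPerItem; ++b) {
            bits_[index * kBitsPerItem + b] = ((value >> b) & 1u) != 0;
        }
    }

    std::uint32_t get(std::size_t index) const {
        assert(index < size());
        std::uint32_t value = 0;
        for (std::size_t b = 0; b < kBitsPerItem; ++b) {
            if (bits_[index * kBitsPerItem + b]) {
                value |= std::uint32_t{1} << b;
            }
        }
        return value;
    }

private:
    std::vector<bool> bits_;
};
```

This trades speed for space: every access loops over 23 individual bits. A version that shifts and masks whole words would be faster but needs more care at word boundaries.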
