Reading "Integer" Size Bytes from a Char* Array

Reading from memory into a char array. What if one char in it holds a single byte that is supposed to be a 1-byte length, where the length is a number?

But in C, a number is an int or a long.

char is an integral type which can be used to represent numbers, the same as short, int or long.

It has a maximum value of CHAR_MAX.

One problem is that the signedness of char is implementation-defined. When in doubt, be explicit with signed char (SCHAR_MAX) or unsigned char (UCHAR_MAX).

Alternatively, use fixed width integer types to make it easier to reason about the byte width of the data you are working with.

The fields in the EPS table are denoted as being of size BYTE, WORD, and DWORD. These can be represented by uint8_t, uint16_t, and uint32_t respectively, as you almost certainly want unsigned integers.


This code

char x1[3];
memcpy(x1,&mem[4],2);
x1[2]='\0';
long v=strtol(x1,'\0',16);

printf("max size %lx\n",v);

that attempts to parse the Maximum Structure Size as if it were a number represented by two characters is incorrect. The Maximum Structure Size is a 16-bit integer.

mem=mem+length; does not make much sense, as this would place you in memory beyond the table. I am not sure what the two printf calls that follow are trying to print.

Additionally, your example includes some errant code (unused variables: i, y, j).

Everything else is more-or-less correct, if messy.


Below is a simple example that seemingly works on my machine, using the smbios_entry_point table file. You should be able to use it as reference to adjust your program accordingly.

$ uname -rmo
5.14.21-210.current x86_64 GNU/Linux

#include <fcntl.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define EPS_SIZE 31
#define TARGET_FILE "/sys/firmware/dmi/tables/smbios_entry_point"

/* Multi-byte fields are stored little-endian in the table. Copying them
   into a fixed-width integer with memcpy reads them correctly on a
   little-endian host such as x86. */
void print_buffer(const uint8_t *eps) {
    printf("Anchor string: %c%c%c%c\n", eps[0], eps[1], eps[2], eps[3]);
    printf("Checksum: %02Xh\n", eps[4]);
    printf("Entry point length: %02Xh\n", eps[5]);
    printf("Major version: %02Xh\n", eps[6]);
    printf("Minor version: %02Xh\n", eps[7]);

    uint16_t mss;
    memcpy(&mss, eps + 8, sizeof mss);
    printf("Maximum structure size: %" PRIu16 " bytes\n", mss);

    printf("Entry point revision: %02Xh\n", eps[10]);
    printf("Formatted area: %02Xh %02Xh %02Xh %02Xh %02Xh\n",
           eps[11], eps[12], eps[13], eps[14], eps[15]);
    printf("Intermediate anchor string: %c%c%c%c%c\n",
           eps[16], eps[17], eps[18], eps[19], eps[20]);
    printf("Intermediate checksum: %02Xh\n", eps[21]);

    uint16_t stl;
    memcpy(&stl, eps + 22, sizeof stl);
    printf("Structure table length: %" PRIu16 " bytes\n", stl);

    uint32_t sta;
    memcpy(&sta, eps + 24, sizeof sta);
    printf("Structure table address: 0x%08" PRIx32 "\n", sta);

    uint16_t nsmbs;
    memcpy(&nsmbs, eps + 28, sizeof nsmbs);
    printf("Number of SMBIOS structures: %" PRIu16 "\n", nsmbs);

    /* The BCD revision packs two digits into one byte: high and low nibble. */
    printf("SMBIOS BCD revision: %02Xh %02Xh\n",
           eps[30] >> 4, eps[30] & 0x0f);
}

int main(void) {
    uint8_t buf[EPS_SIZE];
    int fd = open(TARGET_FILE, O_RDONLY);

    if (fd == -1) {
        perror(TARGET_FILE);
        return 1;
    }

    /* Anything other than a full EPS_SIZE-byte read is treated as an error. */
    ssize_t n = read(fd, buf, sizeof buf);
    close(fd);

    if (n != (ssize_t) sizeof buf) {
        fprintf(stderr, "Could not read %d bytes from %s\n", EPS_SIZE, TARGET_FILE);
        return 1;
    }

    print_buffer(buf);
}

stdout:

Anchor string: _SM_
Checksum: C2h
Entry point length: 1Fh
Major version: 02h
Minor version: 07h
Maximum structure size: 184 bytes
Entry point revision: 00h
Formatted area: 00h 00h 00h 00h 00h
Intermediate anchor string: _DMI_
Intermediate checksum: DCh
Structure table length: 2229 bytes
Structure table address: 0x000ed490
Number of SMBIOS structures: 54
SMBIOS BCD revision: 02h 07h

You may also be interested in dmidecode and its source code.

C++, pack integer bytes into a char array?

If num is 127 or less, then the output is correct, but if it's 128 or more, then the output is all messed up.

It appears you're talking about the following part of the code where you try to read your integer back out of the char array and then display it:

int out = *(x + 3);
cout << "out: " << out << endl;

The problem is that x + 3 is of type char*, and when you dereference that it becomes a char value. On your system, that value is signed and integers are obviously stored in little-endian form. So, you think it "works" for values less than 128. But actually that's not even true. It will also break for values less than -128.

You see, even though you're assigning this to an int, it's still only a char. That char value is simply being copied into a larger integer type.

It looks like you actually wanted this:

int out = *(int*)(x + 3);  // DON'T DO THIS!

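This is called type punning (treating some memory as if it contained another type) and should be avoided: reading a char buffer through an int* is undefined behavior, and x + 3 may not even be suitably aligned for an int.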

Instead, you should just copy the bytes out the same way you copied them in:

int out;
memcpy(&out, x + 3, sizeof(out));

Also beware that if you plan to transfer binary data over a network, any machine reading this data must be aware of its endianness.

How can bytes in char array represent integers?

I have char array that I read from binary file (like ext2 formatted filesystem image file).

Open the file in binary mode

const char *file_name = ...;
FILE *infile = fopen(file_name, "rb"); // b is for binary
if (infile == NULL) {
fprintf(stderr, "Unable to open file <%s>.\n", file_name);
exit(1);
}

I need to read integer starting at offset byte 1024 ...

long offset = 1024; 
if (fseek(infile, offset, SEEK_SET)) {
fprintf(stderr, "Unable to seek to %ld.\n", offset);
exit(1);
}

So I believe it can be represented in an integer of size 4 bytes on my system

Rather than use int, which may not be 4 bytes wide, consider int32_t from <stdint.h>.

int32_t data4;
if (fread(&data4, sizeof data4, 1, infile) != 1) {
fprintf(stderr, "Unable to read data.\n");
exit(1);
}

Account for endianness.

As the file data is little-endian, convert it to the native byte order. See <endian.h>.

data4 = le32toh(data4);

Clean up when done

// Use data4

fclose(infile);


believe I need to use strtol like

No. strtol() examines a string and returns a long. File data is binary and not a string.

Read n bytes in a char array and return it as a double

You want to write and read the memory using the same data type. So if your image is stored as ints, use int* to write to your data structure and use int* to read from your data structure. Then at the last moment, cast the data to double (as opposed to casting the pointers).

Get int from char[] of bytes

You need to cast it like so:

int foo = *((int *) &buffer[0x8]);

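This first casts the address of that spot to an int pointer and then dereferences it to get the int itself.

[Watch out for byte ordering across different machine types, though: some store the high byte first, some the low byte. The cast also assumes that &buffer[0x8] is suitably aligned for an int and that the compiler tolerates reading a char array through an int pointer, which the C standard does not guarantee.]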

And just to make sure the example is well understood, here's some code showing the results:

#include <stdio.h>

int main(void) {
    char buffer[14] = { 0,1,2,3,4,5,6,7,8,9,10,11,12,13 };
    int foo = *((int *) &buffer[0x8]);   /* reads bytes 8..11 as one int */
    int bar = (int) buffer[0x8];         /* converts the single byte at index 8 */

    printf("raw: %d\n", buffer[8]);
    printf("foo: %d\n", foo);
    printf("bar: %d\n", bar);
}

And the results from running it:

raw: 8
foo: 185207048
bar: 8

size in bytes of a char number and int number

Yes, all of your answers are correct.

An int always takes up sizeof(int) bytes. Assuming a 32-bit int, the value 8 stored as an int takes 4 bytes, whereas the value 8 stored as a char takes one byte.

The way to think about your last question, IMO, is that data is stored as bytes. char and int are ways of interpreting bytes. In text files you also write bytes, but if you want to write a human-readable "8" into a text file, you must write it in some encoding, such as ASCII, where byte values correspond to human-readable characters. So, to write "8" you would need to write the byte 0x38 (the ASCII code of the character '8').

So, in files you have data, not int or chars.

What's the proper way to copy a char array of a given size to an integer in C?

Since you specify Len is at most (8), it's reasonable to assume little-endian storage, i.e., the least-significant byte at Arr[0].

If Len were fixed at (8), the compiler might be able to replace memcpy simply by loading the value from memory. That would also depend on whether the platform can do unaligned reads - if the compiler can't prove alignment - and may involve a byte swap (something like the bswap instruction on x86-64) if the byte order of the data differs from that of the architecture.

The fact that Len is a run-time value will likely generate a call to memcpy. The overhead of the call itself is not trivial. All things considered, it's probably best just to handle this in an endian-independent way using byte arithmetic. The code assumes 8-bit bytes, which seems consistent with your question.

uint64_t Word = 0;

while (Len--)
    Word = (Word << 8) | Arr[Len];

On more exotic platforms, where (CHAR_BIT > 8), you can replace the right-hand side of the OR expression with (Arr[Len] & 0xff). In fact, this is optimised away on platforms with 8-bit (normative) bytes, so you might as well add it for completeness. Or just keep these issues in mind.

There are platforms with legal C implementations where char, short, int are 32-bit values, for example. These are quite common in the embedded world.

Reading 4 bytes from the end of a char array

Let's say the returned array is of size 8; it would look something like this in memory:


+---+
| c |
+---+
  |
  v
+---+---+---+---+---+---+---+---+
| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
+---+---+---+---+---+---+---+---+

(The numbers inside are the indexes.)

Now if you make a new variable e to point at c + size, it will point to one beyond the end of the data:


+---+                           +---+
| c |                           | e |
+---+                           +---+
  |                               |
  v                               v
+---+---+---+---+---+---+---+---+
| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
+---+---+---+---+---+---+---+---+

If you subtract 1 from e it now points to index 7:


+---+                       +---+
| c |                       | e |
+---+                       +---+
  |                           |
  v                           v
+---+---+---+---+---+---+---+---+
| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
+---+---+---+---+---+---+---+---+

If you subtract two (in total) e would point to index 6, subtract 3 and e would be pointing at index 5 and subtract 4 and the index pointed to would be 4. If you subtract 5 the pointer e would point to index 3:


+---+       +---+
| c |       | e |
+---+       +---+
  |           |
  v           v
+---+---+---+---+---+---+---+---+
| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
+---+---+---+---+---+---+---+---+

And that's not four bytes from the end, that's five bytes from the end.

So you should be doing e.g.

char* end = c + size - 4;  /* Subtract by 4 and not 5 */

You should also be careful of the endianness, if the data comes from other systems e.g. over the Internet.


