Linux Kernel Code in Memory Check with SHA-256 Sum

Linux kernel module - accessing memory mapping

From within the module's code, the THIS_MODULE macro is a pointer to the module's own struct module object. Its module_init and module_core fields point to the memory regions into which all of the module's sections are loaded.

As I understand it, the per-section breakdown is not accessible from the module's code (struct load_info is discarded once the module has been loaded into memory). But given the module's file, you can easily deduce the sections' addresses after loading:

module_init:
- init sections with code (.init.text)
- init sections with readonly data
- init sections with writable data

module_core:
- sections with code (.text)
- sections with readonly data
- sections with writable data

If several sections fall into the same category, they are placed in the same order as they appear in the module's file.
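
For kernels old enough that struct module still exposes these fields directly (around 4.5 they were folded into core_layout/init_layout), a minimal sketch that prints both regions from a module's init function might look like this (the function name is mine):

#include <linux/kernel.h>
#include <linux/module.h>

static int __init layout_show_init(void)
{
	struct module *mod = THIS_MODULE;

	/* Region holding the persistent sections: .text, read-only data,
	 * writable data. */
	pr_info("module_core: %p, core_size: %u\n",
	        mod->module_core, mod->core_size);

	/* Region holding the .init.* sections, freed after init finishes. */
	pr_info("module_init: %p, init_size: %u\n",
	        mod->module_init, mod->init_size);

	return 0;
}

module_init(layout_show_init);
MODULE_LICENSE("GPL");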

Within the module's code you can also print the address of any of its symbols and then calculate the start of the section that contains that symbol.
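
For example (a sketch that could be dropped into the same module as above; the helper name is mine): take the address of any function defined in the module and subtract the core base to get its offset, which you can then match against the section layout read from the .ko file:

static void where_am_i(void)
{
	struct module *mod = THIS_MODULE;
	unsigned long addr = (unsigned long)where_am_i;
	unsigned long base = (unsigned long)mod->module_core;

	/* Comparing this offset against the section sizes taken from the
	 * .ko file tells you which section, and hence which section start,
	 * the symbol belongs to. */
	pr_info("where_am_i at %p, offset 0x%lx into module_core\n",
	        where_am_i, addr - base);
}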

How to calculate sha256 file checksum in Go

The SHA256 hasher implements the io.Writer interface, so one option would be to use the io.Copy() function to copy the data from an appropriate io.Reader in blocks. Something like this should do:

hasher := sha256.New() // from "crypto/sha256"; satisfies io.Writer
f, err := os.Open(os.Args[1])
if err != nil {
	log.Fatal(err)
}
defer f.Close()
if _, err := io.Copy(hasher, f); err != nil {
	log.Fatal(err)
}
sum := hasher.Sum(nil) // the final 32-byte SHA-256 digest
log.Printf("%x", sum)

Get a file SHA256 Hash code and Checksum

  1. My best guess is that there's some additional buffering in the Mono implementation of the File.Read operation. Having recently looked into checksumming large files, I'd say that on a decent-spec Windows machine you should expect roughly 6 seconds per GB if all is running smoothly.

    Oddly it has been reported in more than one benchmark test that SHA-512 is noticeably quicker than SHA-256 (see 3 below). One other possibility is that the problem is not in allocating the data, but in disposing of the bytes once read. You may be able to use TransformBlock (and TransformFinalBlock) on a single array rather than reading the stream in one big gulp—I have no idea if this will work, but it bears investigating.

  2. The difference between hashcode and checksum is (nearly) semantics. They both calculate a shorter 'magic' number that is fairly unique to the data in the input, though if you have 4.6GB of input and 64B of output, 'fairly' is somewhat limited.

    • A checksum is not secure: with a bit of work and enough outputs you can figure out the input, work backwards from output to input, and do all sorts of other insecure things.
    • A Cryptographic hash takes longer to calculate, but changing just one bit in the input will radically change the output and for a good hash (e.g. SHA-512) there's no known way of getting from output back to input.
  3. MD5 is broken: on an ordinary PC you can fabricate two different inputs that produce the same output (a collision), which rules it out wherever tampering is a concern. SHA-256 is (probably) still secure, but may not be in a few years' time; if your project has a lifespan measured in decades, assume you'll need to change it. SHA-512 has no known attacks and probably won't have for quite a while, and since it can be quicker than SHA-256 on 64-bit hardware I'd recommend it anyway. Benchmarks show it takes about 3 times longer to calculate SHA-512 than MD5, so if your speed issue can be dealt with, it's the way to go.

  4. No idea, beyond those mentioned above. You're doing it right.

For a bit of light reading, see Crypto.SE: SHA512 faster than SHA256?

Edit in response to a question in the comments

The purpose of a checksum is to allow you to check if a file has changed between the time you originally wrote it, and the time you come to use it. It does this by producing a small value (512 bits in the case of SHA512) where every bit of the original file contributes at least something to the output value. The purpose of a hashcode is the same, with the addition that it is really, really difficult for anyone else to get the same output value by making carefully managed changes to the file.

The premise is that if the checksums are the same at the start and when you check it, then the files are the same, and if they're different the file has certainly changed. What you are doing above is feeding the file, in its entirety, through an algorithm that rolls, folds and spindles the bits it reads to produce the small value.

As an example: in the application I'm currently writing, I need to know if parts of a file of any size have changed. I split the file into 16K blocks, take the SHA-512 hash of each block, and store it in a separate database on another drive. When I come to see if the file has changed, I reproduce the hash for each block and compare it to the original. Since I'm using SHA-512, the chances of a changed file having the same hash are unimaginably small, so I can be confident of detecting changes in 100s of GB of data whilst only storing a few MB of hashes in my database. I'm copying the file at the same time as taking the hash, and the process is entirely disk-bound; it takes about 5 minutes to transfer a file to a USB drive, of which 10 seconds is probably related to hashing.
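
Not the original C# code, but to sketch the same block-by-block scheme in C (assuming OpenSSL 1.1 or later and its EVP interface, with the 16K block size mentioned above):

/* Sketch only: hash a file in independent 16 KB blocks with SHA-512. */
#include <openssl/evp.h>
#include <stdio.h>

static int hash_blocks(const char *path)
{
	unsigned char buf[16 * 1024];
	unsigned char md[EVP_MAX_MD_SIZE];
	unsigned int mdlen;
	size_t n, block = 0;
	FILE *f = fopen(path, "rb");

	if (!f)
		return -1;

	while ((n = fread(buf, 1, sizeof(buf), f)) > 0) {
		EVP_MD_CTX *ctx = EVP_MD_CTX_new();

		EVP_DigestInit_ex(ctx, EVP_sha512(), NULL);
		EVP_DigestUpdate(ctx, buf, n);
		EVP_DigestFinal_ex(ctx, md, &mdlen);
		EVP_MD_CTX_free(ctx);

		/* In the scheme above, md (64 bytes) would now be stored in
		 * the separate database, keyed by (path, block). */
		printf("block %zu hashed (%u-byte digest)\n", block++, mdlen);
	}

	fclose(f);
	return 0;
}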

Lack of disk space to store hashes is a problem I can't solve in a post—buy a USB stick?

Using MD5 in kernel space of Linux

Use the kernel Crypto API instead of rolling your own.
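
For reference, a minimal sketch using the synchronous hash (shash) interface of the Crypto API; it assumes a kernel recent enough to provide SHASH_DESC_ON_STACK, and the function name is mine:

#include <crypto/hash.h>
#include <linux/err.h>
#include <linux/module.h>

/* Compute the 16-byte MD5 digest of a buffer via the kernel Crypto API. */
static int md5_buffer(const u8 *data, unsigned int len, u8 out[16])
{
	struct crypto_shash *tfm;
	int ret;

	tfm = crypto_alloc_shash("md5", 0, 0);
	if (IS_ERR(tfm))
		return PTR_ERR(tfm);

	{
		SHASH_DESC_ON_STACK(desc, tfm);

		desc->tfm = tfm;
		/* Note: older kernels also have a desc->flags field to clear. */
		ret = crypto_shash_digest(desc, data, len, out);
	}

	crypto_free_shash(tfm);
	return ret;
}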

When are the contents of .exit.text section in an ARM vmlinux file discarded once it is loaded in memory?

The files vmlinux.lds.S and module.c handle this. The handling depends on your kernel version and configuration, neither of which has been given. I assume you mean the .exit.text section in the kernel itself and not in some user-space task. Generally the kernel doesn't exit, so these sections are discarded by the linker script unless you have certain debugging configurations enabled.

Edit: The Android kernel was built from a .config file that enables Linux kernel features at compile time. The relevant option here is CONFIG_HOTPLUG_CPU, which lets CPUs be taken offline and brought back online at run time; some big servers can keep running even while a CPU is exchanged. It is hard to answer your question precisely: I don't know where your Android kernel source is or which .config options it was built with.

The sections are kept when GENERIC_BUG is enabled, so it is possible your kernel was built with this. In the stock vmlinux.lds.S, the .exit.data section is placed before __init_end, so it is freed and returned to memory after the init portion runs, i.e. it is present only during boot. Under these circumstances you will see it in the vmlinux ELF file, but not in the runtime kcore or however you are dumping memory.

Specifically, start_kernel(), rest_init(), kernel_init(), and free_initmem() in main.c are where this memory is actually discarded.


