Principle of Qemu CPU Emulation

principle of QEMU CPU emulation

Please see this file for the C-level modelling of the state of an ARM CPU as done by QEMU.

It's pretty straight-forward, and (of course) as you suspect the registers (and all other state) are modelled as C variables.

The core structure begins:

typedef struct CPUARMState {
/* Regs for current mode. */
uint32_t regs[16];
/* Frequently accessed CPSR bits are stored separately for efficiency.
This contains all the other bits. Use cpsr_{read,write} to access
the whole CPSR. */
uint32_t uncached_cpsr;
uint32_t spsr;

Emulation and Energy Consumption in QEMU

QEMU is neither cycle accurate or a microarchictecture simulation. You may be able to build your own energy model with the recently added plugins but I suspect you need more low level tools.

Timer supply to CPU in QEMU

QEMU's emulation does not generally try to emulate actual clock lines which send pulses at megahertz rates (this would be incredibly inefficient). Instead when the guest programs a timer device the model of the timer device sets up an internal QEMU timer to fire after the appropriate duration (and the handler for that then raises the interrupt line or does whatever is necessary for emulating the hardware behaviour). The duration is calculated from the values the guest has written to the device registers together with a value for what the clock frequency should be.

QEMU doesn't have any infrastructure for handling things like programmable clock dividers or a "clock tree" that routes clock signals around the SoC (one could be added, but nobody has got around to it yet). Instead timer devices are usually either written with a hard-coded frequency, or may be written to have a QOM property that allows the frequency to be set by the board or SoC model code that creates them.

In particular for the SysTick device in the Cortex-M models the current implementation will program the QEMU timer it uses with durations corresponding to a frequency of:

  • 1MHz, if the guest has set the CLKSOURCE bit to 1 (processor clock)
  • something which the board model has configured via the 'system_clock_scale' global variable (eg 25MHz for the mps2 boards), if the guest has set CLKSOURCE to 0 (external reference clock)

(The system_clock_scale global should be set to NANOSECONDS_PER_SECOND / clk_frq_in_hz.)

The 1MHz is just a silly hardcoded value that nobody has yet bothered to improve upon, because we haven't run into guest code that cares yet. The system_clock_scale global is clunky but works.

None of this affects the speed of the emulated QEMU CPU (ie how many instructions it executes in a given time period). By default QEMU CPUs will run "as fast as possible". You can use the -icount option to specify that you want the CPU to run at a particular rate relative to real time, which sort of implicitly sets the 'cpu frequency', but this will only sort of roughly set an average -- some instructions will run much faster than others, in a not very predictable way. In general QEMU's philosophy is "run guest code as fast as we can", and we don't make any attempt at anything approaching cycle-accurate or otherwise tightly timed emulation.

Update as of 2020: QEMU now has some API and infrastructure for modelling clock trees, which is documented in docs/devel/clocks.rst in the source tree. This is basically a formalized version of the concepts described above, to make it easier for one device to tell another "my clock rate is 20MHz now" without hacks like the "system_clock_scale" global variable or ad-hoc QOM properties.



Related Topics



Leave a reply



Submit