How to Run a MIPS Binary on an x86 Platform

How can I execute MIPS assembly programs on x86 Linux?

You will need either a cross-compilation toolchain, or to build your own cross binutils.
For a prebuilt toolchain, you can visit CodeSourcery. If you just want to assemble, then all
you need is binutils. There are some guidelines on the Linux-MIPS wiki.

For the emulation part, QEMU would be my choice.
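
If you go that route, the whole workflow fits in a couple of commands. Below is a minimal sketch, assuming the Debian-style mips-linux-gnu- cross compiler and QEMU's user-mode emulator are installed; package names and tool prefixes vary by distribution, so treat the commands in the comment as illustrative:

/*
 * hello.c -- build and run it roughly like this:
 *
 *   mips-linux-gnu-gcc -static -o hello hello.c
 *   qemu-mips ./hello
 *
 * Linking statically avoids having to point qemu-mips at a MIPS sysroot
 * (via -L) for the dynamic loader and shared libraries.
 */
#include <stdio.h>

int main(void)
{
    printf("hello from a MIPS binary running under QEMU\n");
    return 0;
}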

Running MIPS instructions on a real machine [not on a simulator]

Is it possible to create an executable file from MIPS instructions?

Yes. An assembler is just a program that reads some input and produces some output files, so you can assemble for any architecture on any other architecture, just as you can cross-compile for other architectures from your PC.

But executing the output file is an entirely different thing. Each architecture has its own machine language. Just as humans cannot understand people speaking a language they do not know, a machine only understands programs written in its own language. That means MIPS hardware can only understand and run MIPS binaries; other architectures such as x86 cannot interpret an instruction set with a different instruction encoding.

So your only option is to use an emulator or simulator.

See also

  • How can I execute MIPS assembly programs on an x86 linux?
  • How to run a MIPS binary on x86 platform?
  • How to run i386 binary on MIPS platform?

Floating point differences when porting MIPS code to x86_64

According to the MIPSpro ABI:

the MIPS processors conform to the IEEE 754 floating point standard

Your target platform, x86_64, shares this quality.

As such, double means IEEE-754 double-precision float on both platforms.

When it comes to endianness, x86_64 processors are little-endian; but, according to the MIPSpro assembly programmers' guide, some MIPS processors are big-endian:

For R4000 and earlier systems, byte ordering is configurable into either big-endian or little-endian byte ordering (configuration occurs during hardware reset). When configured as a big-endian system, byte 0 is always the most-significant (leftmost) byte. When configured as a little-endian system, byte 0 is always the least-significant (rightmost) byte.

The R8000 CPU, at present, supports big-endian only.

So, you will have to check the datasheet for the original platform and see whether any byte swapping is needed.
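
For illustration, here is a minimal sketch of what that byte swapping might look like when raw doubles written by a big-endian MIPS system are read on a little-endian x86_64 host; the helper name is made up for this example:

#include <stdint.h>
#include <string.h>

/* Reassemble 8 bytes stored most-significant-byte-first (big-endian MIPS)
 * into a native double.  Because both platforms use IEEE-754 doubles,
 * only the byte order has to be fixed, not the encoding itself. */
static double double_from_big_endian(const unsigned char buf[8])
{
    uint64_t bits = 0;
    for (int i = 0; i < 8; i++)
        bits = (bits << 8) | buf[i];   /* byte 0 is the most significant */

    double d;
    memcpy(&d, &bits, sizeof d);       /* reinterpret the bit pattern natively */
    return d;
}

If the data was produced by a little-endian MIPS configuration, no swapping is needed and the bytes can be copied straight into a double.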

Cross compiling for MIPS router from x86

You are right: you need a proper MIPS toolchain to cross-compile your application, and Buildroot can build one. But you may need to tweak Buildroot's menuconfig options.
Depending on the output of file, your options may differ. On my system, file reports the following for the binaries:

ELF 32-bit MSB executable, MIPS, MIPS32 rel2 version 1 (SYSV)

These are the options I have enabled in Buildroot's menuconfig:

Target Architecture (mips)  ---> 
Target Architecture Variant (mips 32r2) --->
Target ABI (o32) --->
Target options --->
Build options --->
(/opt/cross-mips-buildroot) Toolchain and header file location?
Toolchain --->
Toolchain type (Buildroot toolchain) --->
Kernel Headers (Linux 2.6.34.x kernel headers) --->
uClibc C library Version (uClibc 0.9.31.x) --->
[*] Build/install a shared libgcc?
[*] Enable compiler tls support
[*] Build gdb debugger for the Target
[*] Build gdb server for the Target
[*] Build gdb for the Host
GDB debugger Version (gdb 6.8) --->
[*] Enable large file (files > 2 GB) support?
[*] Enable WCHAR support
[*] Use software floating point by default
[*] Enable stack protection support
[*] Build/install c++ compiler and libstdc++?
[*] Include target utils in cross toolchain
Package Selection for the target --->
[*] BusyBox
[*] Run BusyBox's own full installation
Libraries --->
Networking --->
[*] libcurl
Text and terminal handling --->
[*] icu
-*- ncurses
Target filesystem options --->
Bootloaders --->
Kernel --->

The toolchain itself is installed at /opt/cross-mips-buildroot. You can find the compiler and the other tools in /opt/cross-mips-buildroot/usr/bin/.

Try to compile a simple hello world application and see if you can run it on the MIPS system.
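
Since libcurl is enabled in the package selection above, a slightly bigger smoke test than hello world is to fetch a URL with it. The tool prefix under /opt/cross-mips-buildroot/usr/bin/ in the comment below is a placeholder; check what Buildroot actually generated on your machine:

/*
 * curltest.c -- illustrative build command (adjust the prefix to your toolchain):
 *
 *   /opt/cross-mips-buildroot/usr/bin/<prefix>-gcc -o curltest curltest.c -lcurl
 */
#include <stdio.h>
#include <curl/curl.h>

int main(void)
{
    CURL *curl = curl_easy_init();
    if (!curl) {
        fprintf(stderr, "failed to initialize libcurl\n");
        return 1;
    }
    curl_easy_setopt(curl, CURLOPT_URL, "http://example.com/");
    CURLcode res = curl_easy_perform(curl);   /* body is printed to stdout by default */
    if (res != CURLE_OK)
        fprintf(stderr, "transfer failed: %s\n", curl_easy_strerror(res));
    curl_easy_cleanup(curl);
    return res == CURLE_OK ? 0 : 1;
}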

Note: this configuration will not build a C++ compiler. If you need one, grep LIBSTDCPP .config to check whether it is enabled, change it to your liking, and then run make menuconfig to apply the change.

How is machine code stored in the EXE file?

Are ARM opcodes very different from x86 opcodes?

Yes, they are. You should assume that all instruction sets for different processor families are completely different and incompatible. An instruction set first defines an encoding, which specifies one or more of these:

  • the instruction opcode;
  • the addressing mode;
  • the operand size;
  • the address size;
  • the operands themselves.

The encoding further depends on how many registers the architecture has to address, whether it has to be backwards compatible, whether it has to be quick to decode, and how complex an instruction can be.

On the complexity: the ARM instruction set requires all operands to be moved between memory and registers with dedicated load/store instructions, whereas x86 instructions can encode a memory address as one of their operands and therefore do not need separate load/store instructions.
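
To make that difference concrete, here is a small C function together with the kind of instruction sequences a compiler might emit for it; the assembly in the comments is simplified pseudo-assembly, not exact compiler output:

int total;

void add_to_total(const int *p)
{
    total += *p;
    /* A load/store architecture such as ARM typically has to do something
     * like this (pseudo-assembly):
     *     ldr r0, [p]        ; load *p into a register
     *     ldr r1, [total]    ; load total into a register
     *     add r1, r1, r0     ; add in registers
     *     str r1, [total]    ; store the result back
     *
     * x86 can fold the memory access into the arithmetic instruction:
     *     mov eax, [p]
     *     add [total], eax   ; read-modify-write on memory directly
     */
}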

Then the instruction set itself: different processors will have specialized instructions to deal with specific situations. Even if two processor families have the same instruction for the same thing (e.g. an add instruction), it is encoded very differently in each and may have slightly different semantics.

As you can see, every CPU designer gets to decide on all these factors, which makes the instruction set architectures of different processor families completely different and incompatible.

Are registers, int/float/double and SIMD very different concepts on different architectures?

No, they are very similar. Every modern architecture has registers and can handle integers, and most can handle IEEE 754-compatible floating-point values of some size. For example, the x86 architecture has 80-bit floating-point values that are rounded to fit the 32-bit or 64-bit floating-point values you know. The idea behind SIMD instructions is also the same on every architecture that supports them, but many do not, and most have different requirements or restrictions for them.
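
As a quick illustration of that x86 extended format, the following snippet prints the size and mantissa width of each C floating-point type. The result noted in the comment is what one typically sees on x86-64 Linux with GCC, where long double is the 80-bit x87 format padded to 16 bytes; the exact sizes are implementation-defined:

#include <stdio.h>
#include <float.h>

int main(void)
{
    printf("float:       %zu bytes, %d mantissa bits\n", sizeof(float), FLT_MANT_DIG);
    printf("double:      %zu bytes, %d mantissa bits\n", sizeof(double), DBL_MANT_DIG);
    printf("long double: %zu bytes, %d mantissa bits\n", sizeof(long double), LDBL_MANT_DIG);
    /* Typical x86-64/GCC result: 4/24, 8/53 and 16/64 -- the last one is
     * the 80-bit extended format mentioned above, padded for alignment. */
    return 0;
}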

Are the opcodes common across all platforms like Windows, Mac and Unix?

Given three Intel x86 systems, one running Windows, one running Mac OS X and one running Unix/Linux, then yes the opcodes are exactly the same since they run on the same processor. However, each operating system is different. Many aspects such as memory allocation, graphics, device driver interfacing and threading require operating system specific code. So you generally can't run an executable compiled for Windows on Linux.

Does the PE format store the exact set of opcodes supported by the processor, or is it a more generic format that the OS converts to match the CPU?

No, the PE format does not store the set of opcodes. As explained earlier, the instruction set architectures of different processor families are simply too different to make this possible. A PE file usually stores machine code for one specific processor family and operating system family, and will only run on such processors and operating systems.

There is however one exception: .NET assemblies are also PE files, but they contain generic instructions that are not specific to any processor or operating system. Such PE files can be 'run' on other systems, but not directly. For example, Mono on Linux can run such .NET assemblies.

How does the EXE file indicate the instruction set extensions needed (like 3DNow! or SSE/MMX)?

While the executable can indicate the instruction set for which it was built (see Chris Dodd's answer), I don't believe the executable can indicate the extensions that are required. However, the executable code, when run, can detect such extensions. For example, the x86 instruction set has a CPUID instruction that returns all the extensions and features supported by that particular CPU. The executable would just test that and abort when the processor does not meet the requirements.
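
As a sketch of that test-and-abort idea, GCC and Clang expose CPUID through a builtin, so the check can be as small as the following; the particular features tested here are just examples:

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    __builtin_cpu_init();   /* run the CPU detection code (wraps CPUID) */

    if (!__builtin_cpu_supports("sse2") || !__builtin_cpu_supports("avx")) {
        fprintf(stderr, "this program requires SSE2 and AVX\n");
        return EXIT_FAILURE;
    }
    puts("all required CPU extensions are present");
    return EXIT_SUCCESS;
}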

.NET versus native code

You seem to know a thing or two about .NET assemblies and their instruction set, called CIL (Common Intermediate Language). Each CIL instruction follows a specific encoding and uses the evaluation stack for its operands. The CIL instruction set is kept very general and high-level. When it is run (on Windows by mscoree.dll, on Linux by mono) and a method is called, the Just-In-Time (JIT) compiler takes the method's CIL instructions and compiles them to machine code. Depending on the operating system and processor family the compiler has to decide which machine instructions to use and how to encode them. The compiled result is stored somewhere in memory. The next time the method is called the code jumps directly to the compiled machine code and can execute just as efficiently as a native executable.

How are ARM instructions encoded?

I have never worked with ARM, but from a quick glance at the documentation I can tell you the following. An ARM instruction is always 32 bits in length. There are many exceptions (e.g. for branch and coprocessor instructions), but the general format of an ARM instruction looks like this:


 31      28 27  26  25  24     21 20  19      16
+----------+---+---+---+---------+---+----------+--
|Condition | 0 | 0 |R/I| Opcode  | S |Operand 1 | ...
+----------+---+---+---+---------+---+----------+--

      15          12 11                    0
   --+--------------+-----------------------+
  ...| Destination  |       Operand 2       |
   --+--------------+-----------------------+

The fields mean the following:

  • Condition: A condition that, when true, causes the instruction to be executed. This looks at the Zero, Carry, Negative and Overflow flags. When set to 1110, the instruction is always executed.
  • R/I: When 0, operand 2 is a register. When 1, operand 2 is a constant value.
  • Opcode: The instruction's opcode.
  • S: When 1, the Zero, Carry, Negative and Overflow flags are set according to the instruction's result.
  • Operand 1: The index of a register that is used as the first operand.
  • Destination: The index of a register that is used as the destination operand.
  • Operand 2: The second operand. When R/I is 0, the index of a register. When R/I is 1, an unsigned 8-bit constant value. In addition to either one of these, some bits in operand 2 indicate whether the value is shifted/rotated.

For more detailed information you should read the documentation for the specific ARM version you want to know about. I used this ARM7TDMI-S Data Sheet, Chapter 4 for this example.
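
To tie the diagram to something concrete, here is a small sketch that extracts those fields from one 32-bit data-processing instruction word; a real decoder would of course have to recognise the other encodings (branches, loads/stores, coprocessor instructions) as well:

#include <stdint.h>
#include <stdio.h>

static void decode_data_processing(uint32_t insn)
{
    unsigned cond      = (insn >> 28) & 0xF;    /* bits 31-28: condition   */
    unsigned immediate = (insn >> 25) & 0x1;    /* bit 25:     R/I         */
    unsigned opcode    = (insn >> 21) & 0xF;    /* bits 24-21: opcode      */
    unsigned set_flags = (insn >> 20) & 0x1;    /* bit 20:     S           */
    unsigned rn        = (insn >> 16) & 0xF;    /* bits 19-16: operand 1   */
    unsigned rd        = (insn >> 12) & 0xF;    /* bits 15-12: destination */
    unsigned operand2  =  insn        & 0xFFF;  /* bits 11-0:  operand 2   */

    printf("cond=%X opcode=%X S=%u Rn=r%u Rd=r%u op2=%03X (%s operand 2)\n",
           cond, opcode, set_flags, rn, rd, operand2,
           immediate ? "immediate" : "register");
}

int main(void)
{
    decode_data_processing(0xE0811002);  /* ADD r1, r1, r2, always executed */
    return 0;
}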

Note that each ARM instruction, no matter how simple, takes 4 bytes to encode. To reduce this overhead, modern ARM processors let you use an alternative 16-bit instruction set called Thumb. It cannot express everything the 32-bit instruction set can, but its instructions are only half the size.

On the other hand, x86-64 instructions have a variable-length encoding and use all kinds of modifiers to adjust the behavior of individual instructions. If you want to compare the ARM instructions with how x86 and x86-64 instructions are encoded, you should read the x86-64 Instruction Encoding article that I wrote on OSDev.org.


Your original question is very broad. If you want to know more, you should do some research and create a new question with the specific thing you want to know.


