Is C++ Platform Dependent

Is C++ platform dependent?

This is actually a relatively extensive topic. For simplicity, it comes down to two things: operating system and CPU architecture.

First of all, *.exe is generally only Windows since it is binary code which the windows operating system knows how to execute. Furthermore, the operating system knows how to translate this to the proper code for the architecture (this is why Windows is "just compatible"). Note that a lot more is going on, but this is a (very) high-level abstraction of what is going on.

Now, compilers will take C++ code and generate its corresponding assembly code for the architecture (i.e. x86, MIPS, etc.). Usually the compiler also has an assembler (or one which it can rely on). The assembler the links code and generates binary code which the hardware can execute. For more information on this topic look for more information on co-generation.

Additional Note

Consider Java which is not platform-dependent. Java compilers generate Java bytecode which is run on the Java virtual machine (JVM). It is important to notice that any time you wish to run a Java application you must run the Java virtual machine. Since the precompiled JVM knows how to operate on your operating system and CPU architecture, it can run its Java bytecode and effectively run the corresponding actions for your particular system.

In a compiled binary file (i.e. one from C++ code), you have system bytecode. So the kind of instructions which Java simulates for you are directly hard-coded into the .exe or whatever binary format you are using. Consider the following example:

Notice that this java code must eventually be run in the JVM and cannot stand-alone.

Java Code:
System.out.println("hello") (To be compiled)

Compiled Java bytecode:
Print "hello" (To be run in JVM)

JVM:
(... some translation, maybe to architecture code - I forget exactly ...)
system_print_code "hello" (JVM translation to CPU specific)

Versus the C++ (which can be run in stand-alone mode):

C++ Code:
cout<< "hello";

Architecture Code:
some_assembly_routine "hello"

Binary output:
system_print_code "hello"

A real example

If you're curious about how this may actually look in a real-life example, I have included one below.

C++ Source
I placed this in a file called hello.cpp

#include <iostream>

int main() {
using namespace std;
cout << "Hello world!" << endl;
return 0;
}

Assembly (Generated from C++ source)
Generated via g++ -S hello.cpp

    .file   "test.c"
.section .rodata
.type _ZStL19piecewise_construct, @object
.size _ZStL19piecewise_construct, 1
_ZStL19piecewise_construct:
.zero 1
.local _ZStL8__ioinit
.comm _ZStL8__ioinit,1,1
.LC0:
.string "Hello world!"
.text
.globl main
.type main, @function
main:
.LFB1493:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
leaq .LC0(%rip), %rsi
leaq _ZSt4cout(%rip), %rdi
call _ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc@PLT
movq %rax, %rdx
movq _ZSt4endlIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_@GOTPCREL(%rip), %rax
movq %rax, %rsi
movq %rdx, %rdi
call _ZNSolsEPFRSoS_E@PLT
movl $0, %eax
popq %rbp
.cfi_def_cfa 7, 8
ret
.cfi_endproc
.LFE1493:
.size main, .-main
.type _Z41__static_initialization_and_destruction_0ii, @function
_Z41__static_initialization_and_destruction_0ii:
.LFB1982:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
subq $16, %rsp
movl %edi, -4(%rbp)
movl %esi, -8(%rbp)
cmpl $1, -4(%rbp)
jne .L5
cmpl $65535, -8(%rbp)
jne .L5
leaq _ZStL8__ioinit(%rip), %rdi
call _ZNSt8ios_base4InitC1Ev@PLT
leaq __dso_handle(%rip), %rdx
leaq _ZStL8__ioinit(%rip), %rsi
movq _ZNSt8ios_base4InitD1Ev@GOTPCREL(%rip), %rax
movq %rax, %rdi
call __cxa_atexit@PLT
.L5:
nop
leave
.cfi_def_cfa 7, 8
ret
.cfi_endproc
.LFE1982:
.size _Z41__static_initialization_and_destruction_0ii, .-_Z41__static_initialization_and_destruction_0ii
.type _GLOBAL__sub_I_main, @function
_GLOBAL__sub_I_main:
.LFB1983:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
movl $65535, %esi
movl $1, %edi
call _Z41__static_initialization_and_destruction_0ii
popq %rbp
.cfi_def_cfa 7, 8
ret
.cfi_endproc
.LFE1983:
.size _GLOBAL__sub_I_main, .-_GLOBAL__sub_I_main
.section .init_array,"aw"
.align 8
.quad _GLOBAL__sub_I_main
.hidden __dso_handle
.ident "GCC: (GNU) 7.2.1 20171128"
.section .note.GNU-stack,"",@progbits

Binary output (Generated from assembly)
This is the unlinked form (i.e. not yet fully populated with symbol locations) of the binary output generated via g++ -c in hexadecimal form. I generated the hexadecimal representation using xxd.

00000000: 7f45 4c46 0201 0100 0000 0000 0000 0000  .ELF............
00000010: 0100 3e00 0100 0000 0000 0000 0000 0000 ..>.............
00000020: 0000 0000 0000 0000 0807 0000 0000 0000 ................
00000030: 0000 0000 4000 0000 0000 4000 0f00 0e00 ....@.....@.....
00000040: 5548 89e5 488d 3500 0000 0048 8d3d 0000 UH..H.5....H.=..
00000050: 0000 e800 0000 0048 89c2 488b 0500 0000 .......H..H.....
00000060: 0048 89c6 4889 d7e8 0000 0000 b800 0000 .H..H...........
00000070: 005d c355 4889 e548 83ec 1089 7dfc 8975 .].UH..H....}..u
00000080: f883 7dfc 0175 3281 7df8 ffff 0000 7529 ..}..u2.}.....u)
00000090: 488d 3d00 0000 00e8 0000 0000 488d 1500 H.=.........H...
000000a0: 0000 0048 8d35 0000 0000 488b 0500 0000 ...H.5....H.....
000000b0: 0048 89c7 e800 0000 0090 c9c3 5548 89e5 .H..........UH..
000000c0: beff ff00 00bf 0100 0000 e8a4 ffff ff5d ...............]
000000d0: c300 4865 6c6c 6f20 776f 726c 6421 0000 ..Hello world!..
000000e0: 0000 0000 0000 0000 0047 4343 3a20 2847 .........GCC: (G
000000f0: 4e55 2920 372e 322e 3120 3230 3137 3131 NU) 7.2.1 201711
00000100: 3238 0000 0000 0000 1400 0000 0000 0000 28..............
00000110: 017a 5200 0178 1001 1b0c 0708 9001 0000 .zR..x..........
00000120: 1c00 0000 1c00 0000 0000 0000 3300 0000 ............3...
00000130: 0041 0e10 8602 430d 066e 0c07 0800 0000 .A....C..n......
00000140: 1c00 0000 3c00 0000 0000 0000 4900 0000 ....<.......I...
00000150: 0041 0e10 8602 430d 0602 440c 0708 0000 .A....C...D.....
00000160: 1c00 0000 5c00 0000 0000 0000 1500 0000 ....\...........
00000170: 0041 0e10 8602 430d 0650 0c07 0800 0000 .A....C..P......
00000180: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000190: 0000 0000 0000 0000 0100 0000 0400 f1ff ................
000001a0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
000001b0: 0000 0000 0300 0100 0000 0000 0000 0000 ................
000001c0: 0000 0000 0000 0000 0000 0000 0300 0300 ................
000001d0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
000001e0: 0000 0000 0300 0400 0000 0000 0000 0000 ................
000001f0: 0000 0000 0000 0000 0000 0000 0300 0500 ................
00000200: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000210: 0800 0000 0100 0500 0000 0000 0000 0000 ................
00000220: 0100 0000 0000 0000 2300 0000 0100 0400 ........#.......
00000230: 0000 0000 0000 0000 0100 0000 0000 0000 ................
00000240: 3200 0000 0200 0100 3300 0000 0000 0000 2.......3.......
00000250: 4900 0000 0000 0000 6200 0000 0200 0100 I.......b.......
00000260: 7c00 0000 0000 0000 1500 0000 0000 0000 |...............
00000270: 0000 0000 0300 0600 0000 0000 0000 0000 ................
00000280: 0000 0000 0000 0000 0000 0000 0300 0900 ................
00000290: 0000 0000 0000 0000 0000 0000 0000 0000 ................
000002a0: 0000 0000 0300 0a00 0000 0000 0000 0000 ................
000002b0: 0000 0000 0000 0000 0000 0000 0300 0800 ................
000002c0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
000002d0: 7100 0000 1200 0100 0000 0000 0000 0000 q...............
000002e0: 3300 0000 0000 0000 7600 0000 1000 0000 3.......v.......
000002f0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000300: 8000 0000 1000 0000 0000 0000 0000 0000 ................
00000310: 0000 0000 0000 0000 9600 0000 1000 0000 ................
00000320: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000330: ce00 0000 1000 0000 0000 0000 0000 0000 ................
00000340: 0000 0000 0000 0000 0901 0000 1000 0000 ................
00000350: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000360: 1a01 0000 1000 0000 0000 0000 0000 0000 ................
00000370: 0000 0000 0000 0000 3201 0000 1002 0000 ........2.......
00000380: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000390: 3f01 0000 1000 0000 0000 0000 0000 0000 ?...............
000003a0: 0000 0000 0000 0000 5701 0000 1000 0000 ........W.......
000003b0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
000003c0: 0074 6573 742e 6300 5f5a 5374 4c31 3970 .test.c._ZStL19p
000003d0: 6965 6365 7769 7365 5f63 6f6e 7374 7275 iecewise_constru
000003e0: 6374 005f 5a53 744c 385f 5f69 6f69 6e69 ct._ZStL8__ioini
000003f0: 7400 5f5a 3431 5f5f 7374 6174 6963 5f69 t._Z41__static_i
00000400: 6e69 7469 616c 697a 6174 696f 6e5f 616e nitialization_an
00000410: 645f 6465 7374 7275 6374 696f 6e5f 3069 d_destruction_0i
00000420: 6900 5f47 4c4f 4241 4c5f 5f73 7562 5f49 i._GLOBAL__sub_I
00000430: 5f6d 6169 6e00 5f5a 5374 3463 6f75 7400 _main._ZSt4cout.
00000440: 5f47 4c4f 4241 4c5f 4f46 4653 4554 5f54 _GLOBAL_OFFSET_T
00000450: 4142 4c45 5f00 5f5a 5374 6c73 4953 7431 ABLE_._ZStlsISt1
00000460: 3163 6861 725f 7472 6169 7473 4963 4545 1char_traitsIcEE
00000470: 5253 7431 3362 6173 6963 5f6f 7374 7265 RSt13basic_ostre
00000480: 616d 4963 545f 4553 355f 504b 6300 5f5a amIcT_ES5_PKc._Z
00000490: 5374 3465 6e64 6c49 6353 7431 3163 6861 St4endlIcSt11cha
000004a0: 725f 7472 6169 7473 4963 4545 5253 7431 r_traitsIcEERSt1
000004b0: 3362 6173 6963 5f6f 7374 7265 616d 4954 3basic_ostreamIT
000004c0: 5f54 305f 4553 365f 005f 5a4e 536f 6c73 _T0_ES6_._ZNSols
000004d0: 4550 4652 536f 535f 4500 5f5a 4e53 7438 EPFRSoS_E._ZNSt8
000004e0: 696f 735f 6261 7365 3449 6e69 7443 3145 ios_base4InitC1E
000004f0: 7600 5f5f 6473 6f5f 6861 6e64 6c65 005f v.__dso_handle._
00000500: 5a4e 5374 3869 6f73 5f62 6173 6534 496e ZNSt8ios_base4In
00000510: 6974 4431 4576 005f 5f63 7861 5f61 7465 itD1Ev.__cxa_ate
00000520: 7869 7400 0000 0000 0700 0000 0000 0000 xit.............
00000530: 0200 0000 0500 0000 fdff ffff ffff ffff ................
00000540: 0e00 0000 0000 0000 0200 0000 0f00 0000 ................
00000550: fcff ffff ffff ffff 1300 0000 0000 0000 ................
00000560: 0400 0000 1100 0000 fcff ffff ffff ffff ................
00000570: 1d00 0000 0000 0000 2a00 0000 1200 0000 ........*.......
00000580: fcff ffff ffff ffff 2800 0000 0000 0000 ........(.......
00000590: 0400 0000 1300 0000 fcff ffff ffff ffff ................
000005a0: 5300 0000 0000 0000 0200 0000 0400 0000 S...............
000005b0: fcff ffff ffff ffff 5800 0000 0000 0000 ........X.......
000005c0: 0400 0000 1400 0000 fcff ffff ffff ffff ................
000005d0: 5f00 0000 0000 0000 0200 0000 1500 0000 _...............
000005e0: fcff ffff ffff ffff 6600 0000 0000 0000 ........f.......
000005f0: 0200 0000 0400 0000 fcff ffff ffff ffff ................
00000600: 6d00 0000 0000 0000 2a00 0000 1600 0000 m.......*.......
00000610: fcff ffff ffff ffff 7500 0000 0000 0000 ........u.......
00000620: 0400 0000 1700 0000 fcff ffff ffff ffff ................
00000630: 0000 0000 0000 0000 0100 0000 0200 0000 ................
00000640: 7c00 0000 0000 0000 2000 0000 0000 0000 |....... .......
00000650: 0200 0000 0200 0000 0000 0000 0000 0000 ................
00000660: 4000 0000 0000 0000 0200 0000 0200 0000 @...............
00000670: 3300 0000 0000 0000 6000 0000 0000 0000 3.......`.......
00000680: 0200 0000 0200 0000 7c00 0000 0000 0000 ........|.......
00000690: 002e 7379 6d74 6162 002e 7374 7274 6162 ..symtab..strtab
000006a0: 002e 7368 7374 7274 6162 002e 7265 6c61 ..shstrtab..rela
000006b0: 2e74 6578 7400 2e64 6174 6100 2e62 7373 .text..data..bss
000006c0: 002e 726f 6461 7461 002e 7265 6c61 2e69 ..rodata..rela.i
000006d0: 6e69 745f 6172 7261 7900 2e63 6f6d 6d65 nit_array..comme
000006e0: 6e74 002e 6e6f 7465 2e47 4e55 2d73 7461 nt..note.GNU-sta
000006f0: 636b 002e 7265 6c61 2e65 685f 6672 616d ck..rela.eh_fram
00000700: 6500 0000 0000 0000 0000 0000 0000 0000 e...............
00000710: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000720: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000730: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000740: 0000 0000 0000 0000 2000 0000 0100 0000 ........ .......
00000750: 0600 0000 0000 0000 0000 0000 0000 0000 ................
00000760: 4000 0000 0000 0000 9100 0000 0000 0000 @...............
00000770: 0000 0000 0000 0000 0100 0000 0000 0000 ................
00000780: 0000 0000 0000 0000 1b00 0000 0400 0000 ................
00000790: 4000 0000 0000 0000 0000 0000 0000 0000 @...............
000007a0: 2805 0000 0000 0000 0801 0000 0000 0000 (...............
000007b0: 0c00 0000 0100 0000 0800 0000 0000 0000 ................
000007c0: 1800 0000 0000 0000 2600 0000 0100 0000 ........&.......
000007d0: 0300 0000 0000 0000 0000 0000 0000 0000 ................
000007e0: d100 0000 0000 0000 0000 0000 0000 0000 ................
000007f0: 0000 0000 0000 0000 0100 0000 0000 0000 ................
00000800: 0000 0000 0000 0000 2c00 0000 0800 0000 ........,.......
00000810: 0300 0000 0000 0000 0000 0000 0000 0000 ................
00000820: d100 0000 0000 0000 0100 0000 0000 0000 ................
00000830: 0000 0000 0000 0000 0100 0000 0000 0000 ................
00000840: 0000 0000 0000 0000 3100 0000 0100 0000 ........1.......
00000850: 0200 0000 0000 0000 0000 0000 0000 0000 ................
00000860: d100 0000 0000 0000 0e00 0000 0000 0000 ................
00000870: 0000 0000 0000 0000 0100 0000 0000 0000 ................
00000880: 0000 0000 0000 0000 3e00 0000 0e00 0000 ........>.......
00000890: 0300 0000 0000 0000 0000 0000 0000 0000 ................
000008a0: e000 0000 0000 0000 0800 0000 0000 0000 ................
000008b0: 0000 0000 0000 0000 0800 0000 0000 0000 ................
000008c0: 0800 0000 0000 0000 3900 0000 0400 0000 ........9.......
000008d0: 4000 0000 0000 0000 0000 0000 0000 0000 @...............
000008e0: 3006 0000 0000 0000 1800 0000 0000 0000 0...............
000008f0: 0c00 0000 0600 0000 0800 0000 0000 0000 ................
00000900: 1800 0000 0000 0000 4a00 0000 0100 0000 ........J.......
00000910: 3000 0000 0000 0000 0000 0000 0000 0000 0...............
00000920: e800 0000 0000 0000 1b00 0000 0000 0000 ................
00000930: 0000 0000 0000 0000 0100 0000 0000 0000 ................
00000940: 0100 0000 0000 0000 5300 0000 0100 0000 ........S.......
00000950: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000960: 0301 0000 0000 0000 0000 0000 0000 0000 ................
00000970: 0000 0000 0000 0000 0100 0000 0000 0000 ................
00000980: 0000 0000 0000 0000 6800 0000 0100 0000 ........h.......
00000990: 0200 0000 0000 0000 0000 0000 0000 0000 ................
000009a0: 0801 0000 0000 0000 7800 0000 0000 0000 ........x.......
000009b0: 0000 0000 0000 0000 0800 0000 0000 0000 ................
000009c0: 0000 0000 0000 0000 6300 0000 0400 0000 ........c.......
000009d0: 4000 0000 0000 0000 0000 0000 0000 0000 @...............
000009e0: 4806 0000 0000 0000 4800 0000 0000 0000 H.......H.......
000009f0: 0c00 0000 0a00 0000 0800 0000 0000 0000 ................
00000a00: 1800 0000 0000 0000 0100 0000 0200 0000 ................
00000a10: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000a20: 8001 0000 0000 0000 4002 0000 0000 0000 ........@.......
00000a30: 0d00 0000 0e00 0000 0800 0000 0000 0000 ................
00000a40: 1800 0000 0000 0000 0900 0000 0300 0000 ................
00000a50: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000a60: c003 0000 0000 0000 6401 0000 0000 0000 ........d.......
00000a70: 0000 0000 0000 0000 0100 0000 0000 0000 ................
00000a80: 0000 0000 0000 0000 1100 0000 0300 0000 ................
00000a90: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000aa0: 9006 0000 0000 0000 7200 0000 0000 0000 ........r.......
00000ab0: 0000 0000 0000 0000 0100 0000 0000 0000 ................
00000ac0: 0000 0000 0000 0000 ........

These instructions correspond to an x86_64 machine. If you're interested in following along and matching the op codes, you can look at this reference or download the Intel manual for completeness. Likewise, it is an ELF file so you can observe that we see things we expect (i.e. starting magic number of 0x7f, etc.).

In any case, once linked against the system (i.e. run g++ test.cpp or g++ test.s or g++ test.o), this executable runs directly on top of your OS. There are no additional translation layers between this and the OS. Even so, the OS still does OS things like abstracting hardware interfaces, manage system resources, etc.

Tying this back to the original question, the output binary will look very different on a windows machine (for the same C++ code). At the very least, on a windows machine you would expect to see the file in the Portable Executable (PE) format which is distinctly not ELF.

This is unlike the following Java example which requires a JVM to run:

Java Source File
This is placed in a file called Test.java

package mytest;

public class Test {
public static void main(String[] args) {
System.out.println("Hello world!");
}
}

Java Byte Code (Generated from Java Source)
This is generated by running javac -d . Test.java and running the output file (i.e. mytest/Test.class) through xxd

00000000: cafe babe 0000 0034 001d 0a00 0600 0f09  .......4........
00000010: 0010 0011 0800 120a 0013 0014 0700 1507 ................
00000020: 0016 0100 063c 696e 6974 3e01 0003 2829 .....<init>...()
00000030: 5601 0004 436f 6465 0100 0f4c 696e 654e V...Code...LineN
00000040: 756d 6265 7254 6162 6c65 0100 046d 6169 umberTable...mai
00000050: 6e01 0016 285b 4c6a 6176 612f 6c61 6e67 n...([Ljava/lang
00000060: 2f53 7472 696e 673b 2956 0100 0a53 6f75 /String;)V...Sou
00000070: 7263 6546 696c 6501 0009 5465 7374 2e6a rceFile...Test.j
00000080: 6176 610c 0007 0008 0700 170c 0018 0019 ava.............
00000090: 0100 0c48 656c 6c6f 2077 6f72 6c64 2107 ...Hello world!.
000000a0: 001a 0c00 1b00 1c01 000b 6d79 7465 7374 ..........mytest
000000b0: 2f54 6573 7401 0010 6a61 7661 2f6c 616e /Test...java/lan
000000c0: 672f 4f62 6a65 6374 0100 106a 6176 612f g/Object...java/
000000d0: 6c61 6e67 2f53 7973 7465 6d01 0003 6f75 lang/System...ou
000000e0: 7401 0015 4c6a 6176 612f 696f 2f50 7269 t...Ljava/io/Pri
000000f0: 6e74 5374 7265 616d 3b01 0013 6a61 7661 ntStream;...java
00000100: 2f69 6f2f 5072 696e 7453 7472 6561 6d01 /io/PrintStream.
00000110: 0007 7072 696e 746c 6e01 0015 284c 6a61 ..println...(Lja
00000120: 7661 2f6c 616e 672f 5374 7269 6e67 3b29 va/lang/String;)
00000130: 5600 2100 0500 0600 0000 0000 0200 0100 V.!.............
00000140: 0700 0800 0100 0900 0000 1d00 0100 0100 ................
00000150: 0000 052a b700 01b1 0000 0001 000a 0000 ...*............
00000160: 0006 0001 0000 0003 0009 000b 000c 0001 ................
00000170: 0009 0000 0025 0002 0001 0000 0009 b200 .....%..........
00000180: 0212 03b6 0004 b100 0000 0100 0a00 0000 ................
00000190: 0a00 0200 0000 0500 0800 0600 0100 0d00 ................
000001a0: 0000 0200 0e .....

As one would expect, the byte code output starts with 0xCAFEBABE.

The critical distinction here, however, is that this code cannot be run directly. It is still a binary output, but it's not intended to be executed directly by the operating system. If you tried to run this without a JVM on your system, you would just get an error. However, this code can be run on any operating system that contains a compatible JVM. The set of compatible JVM's depends on how you've set your source and target. By default, it's equivalent to the Java version you're using to compile. In this case, I used Java 8.

The way this works is that your JVM is compiled for each system specifically (similarly to the C++ example above) and translates its binary Java byte code into something your system can now execute.

At the end of the day, there is no free lunch-- as DanielKO mentioned in the comments, the JVM is still a "platform" but it's one-level higher than the OS so it can seem a bit more portable. Eventually, somewhere along the way, the code must translate to instructions valid for your specific operating system family and CPU architecture. However, in the case of Java and the JVM, you only have to compile a single application (i.e. the JVM itself) for all system flavors. At that point, everything written on top of the JVM has system support "for free" so to speak (as long as your application is entirely written in Java and isn't using native interfaces, etc.).

As I mentioned before, there are many caveats to this information :) This was a pretty straightforward example intended to illustrate what you might actually observe. That said, we didn't get into calling native code, using custom JVM agents, or anything else which may affect this answer slightly. In general, however, these more often fall into the category of "special cases" and you wouldn't often be using these things unless you understood why and (hopefully) their implications to portability.

What does platform independent languages really mean?

Isn't different compilers and JVM doing same thing and achieving platform independence.

Not really. Your Java program is running within the JVM, which acts as a translation layer between the Java byte code and the native machine code. It hides the platform-specific details from the Java application code.

This is not the case with C. C code (typically) runs natively, so there is no translation layer isolating it from platform-specific details. Your C code can be directly affected by platform-specific differences (word sizes, type representations, byte order, etc.).

A strictly conforming C program, which uses nothing outside of the standard library and makes no assumptions about type sizes or representation beyond the minimums guaranteed by the language standard, should exhibit the same behavior on any platform for which it is compiled. All you need to do is recompile it for the target platform.

The problem is that most useful real-world C and C++ code isn't strictly conforming; to do almost anything interesting you have to rely on third-party and system-specific libraries and utilities, and as soon as you do you lose that platform independence. I could write a command-line tool manipulates files in the local file system that would run on Windows and MacOS and Linux and VMS and MPE; all I would need to do is recompile it for the different targets. However, if I wanted to write something GUI-driven, or something that communicated over a network, or something that had to navigate the file system, or anything like that, then I'm reliant on system-specific tools and I can't just rebuild the code on different platforms.

Why is C platform-dependent?

  1. Different CPUs have different instruction architectures (e.g., x86 vs ARM).
    • Early Macs used the Motorola 68k architecture; later ones used PowerPC; and still later ones used x86. During each of these transitions, developers had to ship their executables as fat binaries, which would contain object code for both architectures.
  2. Current x86 CPUs have 32-bit and 64-bit modes.
    • This is why you have 32-bit and 64-bit versions of Windows, Ubuntu, etc.
  3. Different operating systems provide different system calls, libraries, etc.
    • Different OS versions can provide different system calls, libraries, etc. also (although OS vendors do aim to be backward compatible as much as possible).
  4. Even on the same operating system, the calling convention is not guaranteed to be the same between different compilers.
    • Even on the same OS, different executable file formats may be in use. For example, on many Unix systems, a.out used to be the format used, but most eventually switched to ELF. During the transition period, libraries had to be provided in both formats.

Is C platform (H/W + S/W) dependent or independent

There's more to an executable binary than being a combination of the machine language of the sources and libraries required, but skipping that for now the answer would still be no: libraries you're linking in are specific to the system they are going to run on. For example, you question mac vs linux... each system has its own way of making calls that trap into the system (for I/O, for example), and if your executable has those items linked into it (rather than through shared objects), you're doomed from the start.

Back to "specific to the system"... not only are things like system calls an issue, but there's more than one executable binary format and this will cause you more than mild pain much of the time. For example, OS X uses the "Mach-O" binary format while Linux uses the "ELF" format. Certainly within these files you'll find similar code for the same executable, but you'll find different binary layouts of that data that make what appears to be straight forward on one system a complete unknown on another. One of many examples... the Mach-O format embeds within the executable the place its dynamically loaded libraries are expected to be -- either absolute locations or locations relative to the loading executable for example. This concept (as far as I know) isn't in either ELF or the Windows "PE" format.

Why was C not made a platform independent language?

What the Dragon Book is describing is the following process:

  1. Compile the source code into an intermediate machine-independent byte code format
  2. Perform optimizations and analyses on that IR
  3. Translate the IR to the target platform's actual machine code

The upside of this is that if you want to support additional systems, you just need to add a new code generator for step 3 without having to touch steps 1 and 2.

All common C compilers work this way. So if your question is "Why don't C compilers do what the Dragon Book describes?", the answer is: "They do".

Now you mentioned Java. What a Java compiler does is the following:

  1. Compile the Java code into Java byte code. As far as the Java compiler is concerned, this is not an intermediate format, but the actual target language.
  2. The end

Now to run this byte code you need a JVM, which interprets the byte code and/or JIT-compiles it. The optimizations and analyses usually happen during JIT-compilation. This is not the process described in the Dragon Book.

From the language implementers' point of view, this doesn't change the effort of supporting a new target system very much. You no longer have to change the compiler, but instead you have to change the JVM: Instead of having to add a new backend to the javac compiler, you instead add a new backend to the JIT-compiler. The effort remains basically the same.

The major difference is for the Java programmers: Instead of compiling the program for every target platform and distributing packages for each platform, you can now compile the code once and give the resulting package to everyone. Now the people running your code need to install an JVM to be able to use the package, so you basically moved the effort from the programmer to the end user, but installing a JVM is something you need to do only once (not for every Java program you want to run).

So instead of "write once, compile everywhere", you now have "compile once, run everywhere".

So why didn't C do the same thing that Java does? Performance. Interpreting byte code is slow (compared to running compiled code) and JIT-compilation leads to increased start-up time.

How is Python platform independent?

C is platform independent in the sense that it can be compiled for any machine for which a compiler is made to target that machine. That's why the Python source code is platform independent, even if a Python binary can only work on one platform.

Low level languages and their dependencies

In the end processors executes machine code which is basicly a collection of binary numbers. The processor decode each binary number to figure out what it is supposed to do. One binary number could mean "Add register X to register Y and store the result in register Z". Another binary number could mean "Store the content of register X into the memory address held by register Y". And so on...

The complete description of these decoding rules (i.e. binary number into operation) represents the processors instruction set (aka ISA).

A low level language is a language where the code you can write maps very closely to the specific processors instruction set. Assembly is one obvious example. Since different processor may have different instruction sets, it's clear that an assembly program written for one processors ISA can't be used on a processor with a different ISA.

Let's take for example C, well if it is machine-dependent does it mean that if it was compiled on one computer it might not be able to run on another?

Correct. A program compiled for one processor (family) can't run on another processor with (completely) different ISA. The program needs to be recompiled.

Also notice that the target OS also plays a role. If you use the same processor but use different OS you'll also need to recompile.

There are at least 3 different kind of languages.

  1. A languages that is so close to the target systems ISA that the source code can only be used on that specific target. Example: Assembly

  2. A language that allows you to write code that can be used on many different targets using a target specific compilation. Example: C



  3. Related Topics



Leave a reply



Submit