How to Share Memory Between Linux Program and Windows Program Running Through Wine (Same Computer)

How can I share memory between a Linux program and a Windows program running through Wine on the same computer?

The purpose of Wine is to provide a WinAPI-like environment on Unix(-like) systems. This implies that Wine may be considered a separate, API-facaded, "independent" operating system running on top of and alongside a Unix-like system. Thus, the machine you describe actually has two OSes, one on top of the other. First, the "real" one (the one controlling the real hardware), that is, GNU/Linux. Second, the WinAPI implementation known as Wine, built on top of the POSIX/SUS interfaces.

And, as far as I know, there is only one truly portable way to set up inter-process communication between machines with different operating systems, and, as you may have already guessed, I am referring to sockets.

The Wine subsystem may be considered a semi-virtual machine in its own right, isolated from the Linux kernel, but tightly coupled to it at the same time.

For efficiency purposes, my proposal is to use sockets in conjunction with what I call the SHMNP (Shared Memory Network Protocol) to provide network-wide shared memory. Again, remember, both "machines" (although physically there is just one) shall be thought of as independent. The Wine implementation is too dirty for its clumsy details to be easily worked around (although that's nothing compared to Cygwin's hacks).

The SHMNP works this way. Note, however, that the SHMNP does not exist! It's just theoretical, and the protocol structures et al are not presented for obvious reasons.

Both machines create their own sockets and shared-memory areas (it's assumed they negotiated the area's size beforehand). At the same time, they choose a port number; one of the machines becomes the server and the other becomes the client. The connection is then established.
Initially, the "shared" memory on both machines contains uninitialized data (the other machine may hold different values in any given shared memory block).
Until the connection is closed, whenever either machine writes to any address in the shared memory area, a message shall be sent to the other machine describing what changed. The Linux kernel's funky features may be exploited to make even raw pointers work perfectly fine with this (see below). I am, however, not aware of a way to do this on Windows other than through specialized ReadNetworkShared()- and WriteNetworkShared()-like procedures.
The implementation may provide some sort of synchronization mechanism, so as to allow network-wide semaphores, mutexes, et al.
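
To make the explicit-call variant a bit more concrete, here is a minimal sketch of what a WriteNetworkShared()-like procedure and its receiving counterpart might look like. Since the SHMNP does not exist, everything here is an assumption made up for illustration: the `shmnp_update` header layout, the `shm_peer` structure, the function names, and the fixed `SHM_SIZE`. It uses POSIX sockets; a program on the Wine side would use the Winsock equivalents. Both peers live on the same host, so no byte-order conversion is attempted.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

#define SHM_SIZE 4096u  /* size both peers are assumed to have negotiated */

/* Hypothetical wire header: which bytes of the area changed. */
struct shmnp_update {
    uint32_t offset;
    uint32_t length;
};

struct shm_peer {
    int sock;                      /* connected stream socket */
    unsigned char area[SHM_SIZE];  /* this peer's copy of the shared area */
};

/* Read exactly n bytes; returns 0 on success, -1 on error or EOF. */
static int read_all(int fd, void *buf, size_t n) {
    unsigned char *p = buf;
    while (n > 0) {
        ssize_t r = read(fd, p, n);
        if (r <= 0) return -1;
        p += r;
        n -= (size_t)r;
    }
    return 0;
}

/* Write exactly n bytes; returns 0 on success, -1 on error. */
static int write_all(int fd, const void *buf, size_t n) {
    const unsigned char *p = buf;
    while (n > 0) {
        ssize_t r = write(fd, p, n);
        if (r <= 0) return -1;
        p += r;
        n -= (size_t)r;
    }
    return 0;
}

/* WriteNetworkShared()-like call: update the local copy, then notify the peer. */
int shmnp_write(struct shm_peer *self, uint32_t offset,
                const void *data, uint32_t length) {
    if (offset > SHM_SIZE || length > SHM_SIZE - offset) return -1;
    memcpy(self->area + offset, data, length);
    struct shmnp_update hdr = { offset, length };
    if (write_all(self->sock, &hdr, sizeof hdr) != 0) return -1;
    return write_all(self->sock, data, length);
}

/* Receive one update from the peer and apply it to the local copy. */
int shmnp_receive(struct shm_peer *self) {
    struct shmnp_update hdr;
    if (read_all(self->sock, &hdr, sizeof hdr) != 0) return -1;
    if (hdr.offset > SHM_SIZE || hdr.length > SHM_SIZE - hdr.offset) return -1;
    return read_all(self->sock, self->area + hdr.offset, hdr.length);
}
```

A ReadNetworkShared()-like call would then just read from `area` after draining pending updates with `shmnp_receive()`. Synchronization (the network-wide mutexes mentioned above) is left out of this sketch entirely.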

Linux kernel specific quirks:
Most modern general-purpose hardware architectures and operating systems provide a way to protect memory from malicious/buggy/unintended use by a user process. Whenever you read or write memory that isn't mapped in your process's virtual address space (or that is mapped without the required permission), the CPU notifies the operating system kernel that a page fault has occurred. Subsequently, the kernel (if Unix(-like)) sends a segmentation violation signal to the offending process; in other words, you receive SIGSEGV.

The hidden magical secret is that SIGSEGV may be caught and handled. Thus, we may mmap() some memory (the shared memory area) and mark it read-only with mprotect(); then, whenever we try to write to an address in the shared memory area, the process receives a SIGSEGV. The signal handler then inspects the siginfo_t passed in by the kernel and takes one of two actions.

If the faulting address is not in the shared memory area, abort() or whatever.
Otherwise, the page about to be written shall be copied to temporary storage (maybe with the help of splice()?). Then mark that page read/write and set up a timer so that, once a timeout expires, the page is marked read-only again and the (maybe compressed) difference between the old copy and the now-written page is sent through the socket (SIMD may help you here). The handler then returns, allowing the write (and maybe other writes!) to complete without further intervention until the timer fires.
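
The two handler branches above could be sketched roughly like this. It is only an illustration under several assumptions: the names (`track_init`, `track_flush`, `g_area` and friends) are made up, the area is a fixed four pages, and the timer is left out, with `track_flush()` standing in for what the timer callback would do before diffing and sending. Note also that memcpy() and mprotect() are not formally async-signal-safe; calling them from the handler works on Linux in practice for a synchronous fault like this one, but that is a Linux-specific liberty.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <signal.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define TRACK_PAGES 4

static unsigned char *g_area;      /* start of the tracked shared area */
static unsigned char *g_snapshot;  /* pre-write copy of each page */
static size_t g_page;              /* system page size */
static volatile sig_atomic_t g_dirty[TRACK_PAGES];

static void on_segv(int sig, siginfo_t *info, void *ctx) {
    (void)sig;
    (void)ctx;
    uintptr_t addr = (uintptr_t)info->si_addr;
    uintptr_t base = (uintptr_t)g_area;

    /* Branch 1: the fault is not ours, so it is a genuine crash. */
    if (addr < base || addr >= base + TRACK_PAGES * g_page)
        abort();

    /* Branch 2: save the old contents, then let the write go through. */
    size_t idx = (addr - base) / g_page;
    unsigned char *page = g_area + idx * g_page;
    memcpy(g_snapshot + idx * g_page, page, g_page);
    g_dirty[idx] = 1;
    if (mprotect(page, g_page, PROT_READ | PROT_WRITE) != 0)
        abort();
    /* Returning from the handler restarts the faulting instruction. */
}

/* Map the area read-only and install the SIGSEGV handler. */
int track_init(void) {
    g_page = (size_t)sysconf(_SC_PAGESIZE);
    size_t len = TRACK_PAGES * g_page;
    g_area = mmap(NULL, len, PROT_READ,
                  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    g_snapshot = mmap(NULL, len, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (g_area == MAP_FAILED || g_snapshot == MAP_FAILED) return -1;

    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_segv;
    sa.sa_flags = SA_SIGINFO;  /* needed to receive the siginfo_t */
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGSEGV, &sa, NULL);
}

/* Stand-in for the timer callback: re-protect a dirty page so the next
   write faults again. Returns 1 if the page was dirty, 0 if clean. */
int track_flush(size_t idx) {
    if (!g_dirty[idx]) return 0;
    g_dirty[idx] = 0;
    return mprotect(g_area + idx * g_page, g_page, PROT_READ) == 0 ? 1 : -1;
}
```

After `track_init()`, an ordinary store such as `g_area[10] = 1;` faults once, gets recorded in `g_dirty[0]`, and then completes as if nothing had happened; further writes to the same page run at full speed until `track_flush(0)` re-arms it.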

Whenever a machine receives compressed data through the socket, it simply decompresses it and writes it where it belongs.
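
As for the "difference between the old copy and the now-written page", one simple way to represent it is a list of changed byte runs. The sketch below is my own assumption of how that might look (the `byte_run` structure and both function names are hypothetical, and compression is omitted). The point of sending runs rather than whole pages is that the receiver only overwrites bytes the sender actually changed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One contiguous run of bytes that differ between snapshot and page. */
struct byte_run {
    size_t offset;
    size_t length;
};

/* Compare the saved copy with the current page and collect the changed
   runs. Returns the number of runs, or (size_t)-1 if there are more than
   max_runs (the caller could then fall back to sending the whole page). */
size_t diff_page(const unsigned char *old_page, const unsigned char *new_page,
                 size_t page_size, struct byte_run *runs, size_t max_runs) {
    size_t count = 0;
    size_t i = 0;
    while (i < page_size) {
        if (old_page[i] == new_page[i]) {
            i++;
            continue;
        }
        size_t start = i;
        while (i < page_size && old_page[i] != new_page[i])
            i++;
        if (count == max_runs) return (size_t)-1;
        runs[count].offset = start;
        runs[count].length = i - start;
        count++;
    }
    return count;
}

/* Receiver side: patch only the bytes named by the runs. Here the payload
   is read straight from the sender's page; on the wire, each run would be
   followed by its bytes instead. */
void apply_runs(unsigned char *dest_page, const unsigned char *src_page,
                const struct byte_run *runs, size_t count) {
    for (size_t k = 0; k < count; k++)
        memcpy(dest_page + runs[k].offset, src_page + runs[k].offset,
               runs[k].length);
}
```

One limitation worth keeping in mind: a byte that was overwritten with the value it already had is invisible to this kind of diff.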

Hope this helps you!

Edit: I just found an obvious flaw in the pre-edit design. If a (compressed) page were sent to the other machine, that machine would be unable to tell which data within the page had been modified and which had not. This creates a race condition in which the receiving machine may lose changes it hasn't yet sent. However, some more Linux-kernel-specific stuff fixes it.

How a Windows Developer can most easily get his software to work well under Wine

Download VMware and an Ubuntu virtual machine (Ubuntu is a popular Linux distribution) from the VMware site. This will provide you with a working Linux OS inside your Windows environment without needing to install Linux manually.

You can then use the instructions here to install Wine; that wiki page also provides some instructions on how to use it.

If you follow what Adam Rosenfield suggested and just try running your application in Wine unmodified, you will be able to determine quickly whether there are problems. My guess would be that there are some, otherwise your users would not have contacted you about it :)

There are many ways to get help with debugging applications in Wine; consult the website for options and pick a few that suit you. As always, it's best not to rely on a single channel of communication.

Also, if you are more comfortable with developing in Windows, the approach of using a virtual machine will allow you to compile your code as usual in Windows and copy the binary into the virtual machine for testing (Ubuntu supports browsing/mounting Windows shares).

How do I reserve memory regions before Windows maps my program's DLLs?

I think I finally got it using a method similar to what dxiv suggested in the comments. Instead of using a dummy DLL, I build a basic executable that loads at the beginning of my reserved region using the /FIXED and /BASE linker flags. The code for the executable contains an uninitialized array that ensures the image covers the needed addresses in memory, but doesn't take up any extra space in the file:

unsigned char Reserved[4194304]; // 4MB

At runtime, the executable copies itself to a new location in memory and updates a couple of fields in the Process Environment Block to point to it. Without updating the fields, calling certain functions like FormatMessage would cause a crash.

#include <intrin.h>
#include <windows.h>
#include <winternl.h>

#pragma intrinsic(__movsb)

void Relocate() {
    void *Base, *NewBase;
    ULONG SizeOfImage;
    PEB *Peb;
    LIST_ENTRY *ModuleList, *NextEntry;

    /* Get info about the PE image. */
    Base = GetModuleHandleW(NULL);
    SizeOfImage = ((IMAGE_NT_HEADERS *)(((ULONG_PTR)Base) +
        ((IMAGE_DOS_HEADER *)Base)->e_lfanew))->OptionalHeader.SizeOfImage;

    /* Allocate memory to hold a copy of the PE image. */
    NewBase = VirtualAlloc(NULL, SizeOfImage, MEM_COMMIT, PAGE_READWRITE);
    if (!NewBase) {
        ExitProcess(GetLastError());
    }

    /* Copy the PE image to the new location using __movsb since we don't have
       a C library. */
    __movsb(NewBase, Base, SizeOfImage);

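    /* Locate the Process Environment Block. Reading FS:[0x30] is specific to
       32-bit x86; a 64-bit build would use __readgsqword(0x60) instead. */
    Peb = (PEB *)__readfsdword(0x30);

    /* Update the ImageBaseAddress field of the PEB. winternl.h does not name
       this field; 0x08 is its offset in the 32-bit PEB layout. */
    *((PVOID *)((ULONG_PTR)Peb + 0x08)) = NewBase;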

    /* Update the base address in the PEB's loader data table. */
    ModuleList = &Peb->Ldr->InMemoryOrderModuleList;
    NextEntry = ModuleList->Flink;
    while (NextEntry != ModuleList) {
        LDR_DATA_TABLE_ENTRY *LdrEntry = CONTAINING_RECORD(
            NextEntry, LDR_DATA_TABLE_ENTRY, InMemoryOrderLinks);
        if (LdrEntry->DllBase == Base) {
            LdrEntry->DllBase = NewBase;
            break;
        }
        NextEntry = NextEntry->Flink;
    }
}

I built the executable with /NODEFAULTLIB just to reduce its size and the number of DLLs loaded at runtime, hence the use of the __movsb intrinsic. You could probably get away with linking to MSVCRT if you wanted to and then replace __movsb with memcpy. You can also import memcpy from ntdll.dll or write your own.

Once the executable is moved out of the way, I call a function in a DLL that contains the rest of my code. The DLL uses UnmapViewOfFile to get rid of the original PE image, which gives me a nice 4MB+ chunk of memory to work with, guaranteed not to contain mapped files, thread stacks, or heaps.

A few things to keep in mind with this technique:

This is a huge hack. I felt dirty writing it, and it could very well fall apart in future versions of Windows. This code works on Windows 7 and Windows 10, at least.
Since the executable is built with /FIXED /BASE, its code is not position-independent and you can't just jump to the relocated executable.
If the DLL function that calls UnmapViewOfFile returns, the program will crash because the code section we called from doesn't exist anymore. I use ExitProcess to ensure the function never returns.
Some sections in the relocated PE image, such as those containing code, can be released using VirtualFree to free up some physical memory.
My code doesn't bother re-sorting the loader data table entries. It seems to work fine that way, but it could break if something were to depend on the entries being ordered by image address.
Some anti-virus programs might get suspicious about this stuff. Microsoft Security Essentials didn't complain, at least.
In hindsight, dxiv's dummy DLL method may have been easier, because I wouldn't need to mess with the PEB. But I stuck with this technique because the executable is more likely to be loaded at its desired base address. The dummy DLL method didn't work for me. DLLs are loaded by Ntdll after Windows has already reserved regions of memory that I need.
