Deterministic builds under Windows
I solved this to an extent.
Currently we have a build system that places every new build on a path of constant length (builds/001, builds/002, etc.), which avoids shifts in the PE layout. After the build, a tool compares the old and new binaries, ignoring the relevant PE fields and other locations with known superficial changes. It also runs some simple heuristics to detect dynamic ignorable changes. Here is the full list of things to ignore:
- PE timestamp and checksum
- Digital signature directory entry
- Export table timestamp
- Debugger section timestamp
- PDB signature, age and file path
- Resources timestamp
- All file/product versions in VS_VERSION_INFO resource
- Digital signature section
- MIDL vanity stub for embedded type libraries (contains timestamp string)
- __FILE__, __DATE__ and __TIME__ macros when they are used as literal strings (can be wide or narrow char)
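As a rough illustration of the masking idea (this is not the tool mentioned in this answer), the sketch below zeroes only the first item on the list, the PE header timestamp and checksum, before comparing two images. The function names are hypothetical. The offsets follow the published PE layout: e_lfanew is at 0x3C, TimeDateStamp is 8 bytes past the PE signature, and CheckSum is 64 bytes into the optional header.

```python
import struct

def mask_pe_header_fields(data: bytes) -> bytes:
    """Return a copy of a PE image with TimeDateStamp and CheckSum zeroed."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # e_lfanew: file offset of the "PE\0\0" signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")

    # COFF header follows the 4-byte signature; TimeDateStamp is its 3rd field.
    timestamp_offset = pe_offset + 8
    # Optional header starts after signature (4) + COFF header (20);
    # CheckSum sits at offset 64 in both PE32 and PE32+.
    checksum_offset = pe_offset + 24 + 64
    if len(data) < checksum_offset + 4:
        raise ValueError("file too short for a PE optional header")

    masked = bytearray(data)
    masked[timestamp_offset:timestamp_offset + 4] = b"\0" * 4
    masked[checksum_offset:checksum_offset + 4] = b"\0" * 4
    return bytes(masked)

def same_ignoring_header_fields(old: bytes, new: bytes) -> bool:
    """Compare two PE images, ignoring only the two header fields above."""
    return mask_pe_header_fields(old) == mask_pe_header_fields(new)
```

A real comparison would have to mask the remaining items on the list as well, such as the debug directory, PDB information, and version resources.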
Once in a while the linker makes some PE sections bigger without throwing anything else out of alignment. It looks like it moves the section boundary inside the padding. The padding is zeros all around anyway, but because of this I get binaries with a 1-byte difference.
UPDATE: we recently open-sourced the tool on GitHub. See the Compare section in the documentation.
How to program in Windows 7.0 to make it more deterministic?
Real time solutions for Windows such as LabVIEW Real-time or RTX are expensive; a stand-alone RTOS would often be less expensive (or even free), but if you need Windows functionality as well, you are perhaps no further forward.
If cost is critical, you might run a free or low-cost RTOS in a virtual machine. This can work, though there is no cooperation over hardware access between the RTOS and Windows, and no direct communication mechanism (you could use TCP/IP over a virtual or real network, I suppose).
Another alternative is to perform the real-time data acquisition on stand-alone hardware (a microcontroller development board or SBC, for example) and communicate with Windows via USB or TCP/IP. That way it is possible to get timing jitter down to the microsecond level or better.
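One way to picture this split: the device timestamps each sample on its own clock, so scheduling jitter on the Windows side only delays delivery and does not affect the recorded timing. The sketch below shows the Windows-side parsing of such a stream. The record layout (a little-endian uint32 device timestamp in microseconds followed by an int16 sample) and the names are assumptions made for illustration, not a real protocol.

```python
import struct

# Hypothetical wire format: uint32 device timestamp (microseconds), int16 sample.
RECORD = struct.Struct("<Ih")

def split_records(buffer: bytes):
    """Parse all complete records in buffer.

    Returns (records, remainder), where remainder holds the trailing bytes of
    an incomplete record to be prepended to the next chunk read from the
    socket or serial port.
    """
    usable = len(buffer) - len(buffer) % RECORD.size
    records = list(RECORD.iter_unpack(buffer[:usable]))
    return records, buffer[usable:]
```

Because the timestamps come from the device, the receiving loop can read chunks whenever Windows schedules it and still reconstruct the sample timing afterwards.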
Deterministic Library Build Using CMake
CMAKE_CXX_ARCHIVE_FINISH worked for me.
CMakeLists.txt :
cmake_minimum_required(VERSION 3.10)
project(Test)
SET(CMAKE_CXX_ARCHIVE_CREATE "<CMAKE_AR> -crD <TARGET> <LINK_FLAGS> <OBJECTS>")
SET(CMAKE_CXX_ARCHIVE_APPEND "<CMAKE_AR> -rD <TARGET> <LINK_FLAGS> <OBJECTS>")
SET(CMAKE_CXX_ARCHIVE_FINISH "<CMAKE_RANLIB> -D <TARGET>")
add_library(Test Main.cpp)
Is BCryptGetProperty call deterministic?
Yes, the size of a SHA256 hash is always the same. Getting the size by asking the crypto provider is useful if you are working at a higher level.
Imagine you have a generic hash class:
class Hash {
public:
    bool Init(LPCWSTR pszAlgId) {
        BCryptGetProperty(m_AlgoProvider, BCRYPT_OBJECT_LENGTH, ...);  // size of the internal hash object
        m_data = malloc(...); ...
        BCryptCreateHash(..., pszAlgId, m_data, ...); ...
    }
    void AddData(LPCVOID p, SIZE_T cb) { ... }
    DWORD GetHashSize() { BCryptGetProperty(m_HashObj, BCRYPT_HASH_LENGTH, ...); }  // queried at run time
    bool Finalize(LPVOID pHash) { ... }
};
The class does not know the hash algorithm nor the hash size at compile time.
BCRYPT_OBJECT_LENGTH is the size of the internal data used by the hashing function. It is the same for all hashes of a specific type implemented by a specific crypto provider. If you only support Windows 7 and later, you can ask Windows to allocate this memory for you and you don't have to query the object size.
I believe all BCRYPT properties are deterministic after the crypto object has been properly created/initialized, and you can cache obvious constant fields like sizes and modes. Things like BCRYPT_INITIALIZATION_VECTOR are obviously a per-object property and should only be cached for that specific object.
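The caching point can be illustrated outside CNG. The sketch below is only an analogy using Python's hashlib, not the BCrypt API: a generic wrapper learns the algorithm at run time, asks the hash object for its digest size once, and caches it, since the size depends only on the algorithm. The class name GenericHash is hypothetical.

```python
import hashlib

class GenericHash:
    """Hash wrapper that learns the algorithm only at run time."""

    def __init__(self, algorithm_name: str):
        self._hash = hashlib.new(algorithm_name)
        # The digest size depends only on the algorithm, so caching it is safe.
        self.hash_size = self._hash.digest_size

    def add_data(self, data: bytes) -> None:
        self._hash.update(data)

    def finalize(self) -> bytes:
        return self._hash.digest()
```

Per-object state, such as the data fed in so far, stays inside the individual instance, which mirrors the distinction drawn above between constant properties and per-object ones.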
Deterministic python script behaves in non-deterministic way
In general, linalg libraries on Windows give different answers on different runs at machine precision level. I never heard of an explanation why this happens only or mainly on Windows.
If your matrix is ill-conditioned, then the result of inv will be largely numerical noise. On Windows the noise is not always the same in consecutive runs; on other operating systems the noise might always be the same, but it can differ depending on the details of the linear algebra library, threading options, cache usage and so on.
I've seen several examples of this on Windows on the scipy mailing list, and posted some myself. I was mostly using the official 32-bit binaries with ATLAS BLAS/LAPACK.
The only solution is to make the outcome of your calculation not depend so much on floating point precision issues and numerical noise, for example regularize the matrix inverse, use generalized inverse, pinv, reparameterize or similar.
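A minimal pure-Python sketch of the regularization idea, assuming a 2x2 system and a hand-picked ridge parameter of 1e-6 (both chosen only for illustration). A perturbation of 1e-10 in the right-hand side stands in for numerical noise. The helper names are hypothetical. In practice you would reach for something like numpy.linalg.pinv rather than Cramer's rule.

```python
def solve2(m, c):
    """Solve the 2x2 system m @ x = c by Cramer's rule."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [
        (c[0] * m[1][1] - m[0][1] * c[1]) / det,
        (m[0][0] * c[1] - m[1][0] * c[0]) / det,
    ]

def ridge_solve2(a, b, lam):
    """Tikhonov-regularized solve: (a^T a + lam * I) x = a^T b."""
    ata = [[sum(a[k][i] * a[k][j] for k in range(2)) + (lam if i == j else 0.0)
            for j in range(2)] for i in range(2)]
    atb = [sum(a[k][i] * b[k] for k in range(2)) for i in range(2)]
    return solve2(ata, atb)

A = [[1.0, 1.0], [1.0, 1.0 + 1e-10]]  # nearly singular, so ill-conditioned
b1 = [2.0, 2.0]
b2 = [2.0, 2.0 + 1e-10]               # tiny perturbation standing in for noise

# How far the first solution component moves when b is perturbed.
plain_shift = abs(solve2(A, b1)[0] - solve2(A, b2)[0])
ridge_shift = abs(ridge_solve2(A, b1, 1e-6)[0] - ridge_solve2(A, b2, 1e-6)[0])
print(plain_shift, ridge_shift)
```

The plain solve moves by an amount of order one under the tiny perturbation, while the regularized solve barely moves. The price is a small bias in the regularized answer, which is the usual trade-off.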