Is 1.0 a valid output from std::generate_canonical?
The problem is in the mapping from the codomain of std::mt19937 (std::uint_fast32_t) to float; the algorithm described by the standard gives results inconsistent with its own description of the output whenever loss of precision occurs, unless the current IEEE 754 rounding mode is round-toward-negative-infinity (note that the default is round-to-nearest).
The 7549723rd output of mt19937 with your seed is 4294967257 (0xffffffd9u), which when rounded to 32-bit float gives 0x1p+32; this is equal to the max value of mt19937, 4294967295 (0xffffffffu), when that is also rounded to 32-bit float.
The standard could ensure correct behavior if it specified that, when converting from the output of the URNG to the RealType of generate_canonical, rounding is to be performed towards negative infinity; this would give a correct result in this case. As a matter of QoI (quality of implementation), it would be good for libstdc++ to make this change.
With this change, 1.0 will no longer be generated; instead, the boundary values 0x1.fffffep-N for 0 < N <= 8 will be generated more often (approximately 2^(8 - N - 32) of the time per N, depending on the actual distribution of MT19937).
I would recommend not using float with std::generate_canonical directly; rather, generate the number in double and then round towards negative infinity:
double rd = std::generate_canonical<double,
    std::numeric_limits<float>::digits>(rng);
float rf = rd;
if (rf > rd) {
    rf = std::nextafter(rf, -std::numeric_limits<float>::infinity());
}
This problem can also occur with std::uniform_real_distribution<float>; the solution is the same: specialize the distribution on double and round the result towards negative infinity when converting to float.
Is generate_canonical output consistent across platforms?
The difficulties encountered in the linked question point to the basic problem with consistency: rounding mode. The clear intent of the mathematical definition of generate_canonical in the standard is that the URNG be called several times, each call producing a non-overlapping block of entropy to fill the result with; that would be entirely consistent across platforms. The problem is that no indication is given as to what to do with the extra bits below the LSB. Depending on rounding mode and summation order, these can round upwards, spilling into the next block (which is what allows a 1.0 result).
Now, the precise wording is "the instantiation's results...are distributed as uniformly as possible as specified below". If the rounding mode is round-to-nearest, an implementation that produces 1.0 is not as uniform as possible (because 1-eps is less likely than 1-2*eps), but it is still "as specified below". So depending on how you parse that sentence, generate_canonical is either fully specified and consistent, or has delegated some extra, undiscussed bits to the implementation.
In any case, the fact that certain implementations produce 1.0 makes it quite clear that the current behavior is not cross-platform consistent. If you want consistency, the most straightforward approach is to wrap your URNG in an independent_bits_engine producing some factor of bits bits, so there is never anything to round.
How to use std::generate_canonical to generate a random number in the range [0,1)?
26.5.7.2 Function template generate_canonical [rand.util.canonical]
Each function instantiated from the template described in this section 26.5.7.2 maps the result of one or more invocations of a supplied uniform random number generator g to one member of the specified RealType such that, if the values g_i produced by g are uniformly distributed, the instantiation's results t_j, 0 ≤ t_j < 1, are distributed as uniformly as possible as specified below.

template<class RealType, size_t bits, class URNG>
  RealType generate_canonical(URNG& g);
Also, the Standard specifies that this function returns S / R^k, where R = g.max() - g.min() + 1, k = ceil(b / log2 R) with b = min(bits, numeric_limits<RealType>::digits), and S = (g_0 - g.min()) + (g_1 - g.min())·R + ... + (g_(k-1) - g.min())·R^(k-1).
So this function should return a value from zero up to, but not including, one. I think the Microsoft implementation is wrong here.
std::uniform_real_distribution inclusive range
This is easier to think about if you start by looking at integers. If you pass [-1, 1) you would expect to get -1, 0. Since you want to include 1, you would pass [-1, (1+1)), i.e. [-1, 2). Now you get -1, 0, 1.
You want to do the same thing, but with doubles:
Borrowing from this answer:
#include <cfloat>   // DBL_MAX
#include <cmath>    // std::nextafter
#include <iostream>
#include <random>

int main()
{
    const double start = -1.0;
    const double stop = 1.0;

    std::random_device rd;
    std::mt19937 gen(rd());

    // Note: uniform_real_distribution does [start, stop),
    // but we want to do [start, stop].
    // Pass the next largest value instead.
    std::uniform_real_distribution<> dis(start, std::nextafter(stop, DBL_MAX));

    for (auto i = 0; i < 100; ++i)
    {
        std::cout << dis(gen) << "\n";
    }
    std::cout << std::endl;
}
That is, find the next largest double value after the one you want, and pass that as the end value instead.
How efficient is it to make a temporary uniform random distribution each time round a loop?
std::uniform_real_distribution's objects are lightweight, so it's not a problem to construct them each time inside the loop.
Sometimes the hidden internal state of a distribution is important, but not in this case: the reset() function does nothing in all popular STL implementations:

void
reset() { }
That is not true, for example, for std::normal_distribution:

void
reset()
{ _M_saved_available = false; }
In a C++ string, why is the behavior after the last character different when accessed by index versus at()?
Shouldn't both behave the same way, i.e. shouldn't both accessing by index and at() either give abnormal termination or exit normally?
No, they should not have the same behaviour. The behaviour is different intentionally. If it wasn't, then there would only be a need for one of them to exist.
The at member function performs bounds checks. Any access outside the bounds of the container results in an exception. This is the same as the at member function of std::array or std::vector, for example. Note that an uncaught throw will cause the program to be terminated.
The subscript operator does not perform any out-of-bounds checks. Prior to C++11, any access to elements at indices > size() had undefined behaviour. Under no circumstance is the subscript operator guaranteed to throw an exception. This is the same as the subscript operator of an array, std::array, or std::vector, for example.
Since C++11, the behaviour of the subscript operator of std::string was changed such that reading the element at index == size() (i.e. one past the last element) is well defined and returns a null terminator. Only modifying the object through the returned reference has undefined behaviour. Reading other indices outside the bounds still has undefined behaviour.
I do not know for a fact the rationale for not making the corresponding change to at to allow access to the null terminator, but I suspect it was considered a backwards-incompatible change: making UB well defined is always backwards compatible, while ceasing to throw an exception is not. Another possible reason is that it would have opened a route to UB (if the null terminator is modified), and the design of at is to keep it free from UB.
Getting strange characters in fields with output from json c++
You should save the result in a std::string and then call c_str() on that string to get a C-string. If you chain those calls and save the pointer immediately, or only call asCString(), the string object holding the memory the C-string points at will already have been destroyed, and you will invoke undefined behavior in your code, which is not what you want.
E.g.:

std::string runtime = result.get("Runtime", "NULL").asString();
runtimeInput->value(runtime.c_str());