Generate Random Numbers with Fixed Mean and Sd

Generate random numbers with fixed mean and sd

Since you asked for a one-liner:

rnorm2 <- function(n,mean,sd) { mean+sd*scale(rnorm(n)) }
r <- rnorm2(100,4,1)
mean(r) ## 4
sd(r) ## 1

Generate positive random numbers with fixed mean and SD

It really depends on the underlying distribution.

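With R = normrnd(mu,sigma) you can generate normally distributed random numbers with specified mean and standard deviation.

R = lognrnd(mu,sigma) generates lognormally distributed random numbers. Note that here mu and sigma are the mean and standard deviation of the logarithm of the values, not of R itself.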

You can also take a look at this where you can specify the distribution.

Just make sure that you check if it really is positive, e.g.

mu = 10;
sigma = 1;
R = normrnd(mu,sigma,1,500);
while any(R <= 0)   % redraw non-positive values until none remain
    R(R <= 0) = normrnd(mu, sigma, 1, nnz(R <= 0));
end
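For the lognormal option, the desired mean and SD first have to be converted into the parameters of the underlying normal. The following is a minimal Python sketch of that conversion, using only the standard library; the helper name lognormal_params and the target values 10 and 1 are assumptions chosen for illustration.

```python
import math
import random
import statistics

def lognormal_params(mean, sd):
    """Return (mu, sigma) of the underlying normal for a lognormal
    distribution whose own mean and sd are the given values."""
    sigma2 = math.log(1.0 + (sd / mean) ** 2)
    mu = math.log(mean) - sigma2 / 2.0
    return mu, math.sqrt(sigma2)

mu, sigma = lognormal_params(10.0, 1.0)
rng = random.Random(1)  # fixed seed, only for reproducibility
sample = [rng.lognormvariate(mu, sigma) for _ in range(50000)]

print(min(sample) > 0)  # lognormal draws are always positive
print(statistics.mean(sample), statistics.stdev(sample))  # close to 10 and 1
```

Unlike resampling a normal, this keeps every draw positive by construction, at the cost of a right-skewed shape.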

Javascript: Generate random numbers with fixed mean and standard deviation

How can I create a list of random numbers with a given mean and standard deviation (sd) in JavaScript?

This appears to be a question about randomly creating a list of numbers that has an exactly specified mean and an exactly specified standard deviation (and not a question about drawing random numbers from a specific probability distribution with a given mean and sd).

A straightforward solution is to draw a list of random numbers, then to shift and scale this list to have the desired mean and sd, as described in this answer from stats.stackexchange.

Say, we generate the following 5 random numbers between 1 and 10:

4.527991433628388
6.3254986488276055
5.123502737960912
7.3331068522336125
9.069573681037484

This list has a mean of 6.475934670737601 and sd of 1.8102412442104023.

Then, we transform each number in the list like this:

newNum = newSD * (oldNum - oldMean) / oldSD + newMean

By setting the new mean to 5 and new sd to 2, we get the following transformed list:

2.847863379160965
4.83379450402964
3.505799227476338
5.947025358346529
7.865517530986525

Computing the mean and sd of this list confirms that they are indeed 5 and 2.
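The arithmetic above can be checked with a short Python sketch. It assumes the sample standard deviation (n - 1 denominator), which matches the sd quoted for the original list; the function name force_descriptives is only illustrative.

```python
import statistics

def force_descriptives(values, new_mean, new_sd):
    """Shift and scale values so they have exactly new_mean and new_sd."""
    old_mean = statistics.mean(values)
    old_sd = statistics.stdev(values)  # sample sd (n - 1 denominator)
    return [new_sd * (v - old_mean) / old_sd + new_mean for v in values]

# the five numbers from the example above
original = [4.527991433628388, 6.3254986488276055, 5.123502737960912,
            7.3331068522336125, 9.069573681037484]

transformed = force_descriptives(original, 5, 2)
print(transformed)
# mean and sd of the transformed list (5 and 2 up to rounding error)
print(statistics.mean(transformed), statistics.stdev(transformed))
```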

Below is code demonstrating this approach in JavaScript:

// create a list of 5 random numbers between 1 and 10
var list = randomList(5, 1, 10);

// transform the list to have an exact mean of 5 and sd of 2
var newList = forceDescriptives(list, 5, 2);

// display the transformed list and descriptive statistics (mean and sd)
console.log('Transformed random list', newList, descriptives(newList));

// display the original list and descriptive statistics (mean and sd)
console.log('Original random list', list, descriptives(list));


/* demo functions */

function randomList(n, a, b) {
  // create a list of n numbers between a and b
  var list = [],
    i;
  for (i = 0; i < n; i++) {
    list[i] = Math.random() * (b - a) + a;
  }
  return list;
}

function descriptives(list) {
  // compute mean, sd and the interval range: [min, max]
  var mean,
    sd,
    i,
    len = list.length,
    sum,
    a = Infinity,
    b = -a;
  for (sum = i = 0; i < len; i++) {
    sum += list[i];
    a = Math.min(a, list[i]);
    b = Math.max(b, list[i]);
  }
  mean = sum / len;
  for (sum = i = 0; i < len; i++) {
    sum += (list[i] - mean) * (list[i] - mean);
  }
  // sample sd (n - 1 denominator)
  sd = Math.sqrt(sum / (len - 1));
  return {
    mean: mean,
    sd: sd,
    range: [a, b]
  };
}

function forceDescriptives(list, mean, sd) {
  // transform a list to have an exact mean and sd
  var oldDescriptives = descriptives(list),
    oldMean = oldDescriptives.mean,
    oldSD = oldDescriptives.sd,
    newList = [],
    len = list.length,
    i;
  for (i = 0; i < len; i++) {
    newList[i] = sd * (list[i] - oldMean) / oldSD + mean;
  }
  return newList;
}

Is there a way in Java to generate random numbers following fixed mean and standard deviation?

The nextGaussian() method returns random numbers with a mean of 0 and a standard deviation of 1.

This means that numbers returned by nextGaussian() will tend to "cluster" around 0, and that approximately 68% of values will be between -1 and 1. Based on the values returned by nextGaussian(), you can scale and shift them to get other normal distributions:

  • to change the mean (average) of the distribution, add the required value;

  • to change the standard deviation, multiply the value.

Examples:
to generate values with an average of 500 and a standard deviation of 100:

double val = r.nextGaussian() * 100 + 500;

to generate values with an average of 30.5 and a standard deviation of 2.5:

double val = r.nextGaussian() * 2.5 + 30.5;

With this, about 68% of values will be between 28 and 33. Since 99.7% of values lie within three standard deviations of the mean, nearly all of the generated monkey heights will be between 23 and 38.
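The one-sigma and three-sigma proportions can be checked empirically. The sketch below does so in Python rather than Java, using random.gauss(0, 1) as a stand-in for nextGaussian(); the seed and sample size are arbitrary assumptions.

```python
import random

rng = random.Random(42)  # fixed seed, only for reproducibility
mean, sd, n = 30.5, 2.5, 100000

# scale a standard normal draw by sd, then shift it by mean
values = [rng.gauss(0.0, 1.0) * sd + mean for _ in range(n)]

within_1sd = sum(mean - sd <= v <= mean + sd for v in values) / n
within_3sd = sum(mean - 3 * sd <= v <= mean + 3 * sd for v in values) / n

print(within_1sd)  # roughly 0.68 (values between 28 and 33)
print(within_3sd)  # roughly 0.997 (values between 23 and 38)
```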

Generating random numbers with predefined mean, std, min and max

You need to choose a probability distribution according to your needs. There are a number of continuous distributions with bounded intervals. For example, you can pick the (scaled) beta distribution and compute the parameters α and β to fit your mean and standard deviation:

import numpy as np
import scipy.stats
import matplotlib.pyplot as plt

def my_distribution(min_val, max_val, mean, std):
    scale = max_val - min_val
    location = min_val
    # Mean and variance of the unscaled beta distribution on [0, 1]
    unscaled_mean = (mean - min_val) / scale
    unscaled_var = (std / scale) ** 2
    # Computation of alpha and beta can be derived from mean and variance formulas
    t = unscaled_mean / (1 - unscaled_mean)
    beta = ((t / unscaled_var) - (t * t) - (2 * t) - 1) / ((t * t * t) + (3 * t * t) + (3 * t) + 1)
    alpha = beta * t
    # Not all parameters may produce a valid distribution
    if alpha <= 0 or beta <= 0:
        raise ValueError('Cannot create distribution for the given parameters.')
    # Make scaled beta distribution with computed parameters
    return scipy.stats.beta(alpha, beta, scale=scale, loc=location)

np.random.seed(100)

min_val = 1.5
max_val = 35
mean = 9.87
std = 3.1
my_dist = my_distribution(min_val, max_val, mean, std)
# Plot distribution PDF
x = np.linspace(min_val, max_val, 100)
plt.plot(x, my_dist.pdf(x))
# Stats
print('mean:', my_dist.mean(), 'std:', my_dist.std())
# Get a large sample to check bounds
sample = my_dist.rvs(size=100000)
print('min:', sample.min(), 'max:', sample.max())

Output:

mean: 9.87 std: 3.100000000000001
min: 1.9290674232087306 max: 25.03903889816994

[Figure: probability density function of the resulting distribution]

Not every possible combination of bounds, mean and standard deviation will produce a valid distribution in this case, and the beta distribution has some particular properties that you may or may not desire. There are infinitely many distributions that match a given set of bounds, mean and standard deviation, each with different qualities (skew, kurtosis, modality, ...). You need to decide which distribution is best for your case.
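The α and β computation can also be written in the more common method-of-moments form, which is algebraically equivalent to the t-based expression in my_distribution. The following is a sketch using only the standard library; the helper name beta_params is an assumption for illustration, and random.betavariate stands in for scipy.stats.beta.

```python
import random

def beta_params(min_val, max_val, mean, std):
    """Method-of-moments alpha and beta for a beta distribution
    rescaled to [min_val, max_val]."""
    scale = max_val - min_val
    m = (mean - min_val) / scale   # mean on the unit interval
    v = (std / scale) ** 2         # variance on the unit interval
    common = m * (1.0 - m) / v - 1.0
    if common <= 0:
        # happens when the mean is outside the bounds or std is too large
        raise ValueError('Cannot create distribution for the given parameters.')
    return m * common, (1.0 - m) * common

alpha, beta = beta_params(1.5, 35, 9.87, 3.1)
print(alpha, beta)

# one draw, rescaled from [0, 1] to [min_val, max_val]
rng = random.Random(0)  # fixed seed, only for reproducibility
draw = 1.5 + (35 - 1.5) * rng.betavariate(alpha, beta)
print(draw)
```

The check on common makes the validity condition explicit: the variance on the unit interval must be smaller than m * (1 - m).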

Generate two sequences of random numbers with fixed mean and sd, with ordering constraint

As worded, this is not mathematically possible. Let D = S - T. Your constraint that T <= S means S - T = D >= 0. Since S and T are normally distributed (and assuming they are independent or jointly normal), so is D, because linear combinations of normals are normal. You can't have a normal distribution with a finite lower bound. Consequently, you can't simultaneously meet requirements of normality for S and T and meet your constraint.
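To get a feel for how often the constraint would fail, the probability that D < 0 can be computed directly. This sketch assumes S and T are independent, and the means and standard deviations are illustrative values only.

```python
import math
from statistics import NormalDist

# illustrative parameters (assumed): T ~ N(10, 3), S ~ N(15, 5), independent
mu_t, sd_t = 10.0, 3.0
mu_s, sd_s = 15.0, 5.0

# D = S - T is normal; means subtract and variances add
d = NormalDist(mu_s - mu_t, math.sqrt(sd_s ** 2 + sd_t ** 2))

p_violation = d.cdf(0.0)  # P(D < 0), i.e. P(S < T)
print(p_violation)        # about 0.2 for these values
```

Whatever parameters are chosen, this probability is strictly positive, which is the point of the argument above.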

You can construct non-normal solutions by generating T with any distribution of your choice, generating D independently with a strictly positive distribution (such as gamma, Weibull, uniform(0,?), truncated normal,...), and creating S = T + D. Since expectations and variances of independent random variables sum, you can get your desired mean and s.d. for both T and S by tuning the parameterization of D appropriately. The results can even look pretty bell-shaped, but strictly speaking won't be normal.

Since variances of independent random variables are additive and must be positive, S = T + D only works if the variance of S is larger than the variance of T. The more general solution is to generate whichever of S and T has the smaller variance. If it's T, add D to get S. If it's S, subtract D to get T.

Since you said in comments that approximations are okay, here's an example. Suppose you want the smaller distribution to have μ_smaller = 10 and σ_smaller = 3, and the larger to have μ_larger = 15 and σ_larger = 5. Then the difference between them should have a strictly positive distribution with μ_delta = 5 and σ_delta = 4 (σ_larger = sqrt(3^2 + 4^2) = 5). I chose a gamma distribution for delta, parameterized to have the desired mean and standard deviation. Here it is in Python:

import random

alpha = 25.0 / 16.0
beta = 16.0 / 5.0
for _ in range(100000):
    smaller = random.gauss(10, 3)
    delta = random.gammavariate(alpha, beta)
    print(smaller, delta, smaller + delta)

I saved the results to a file, and imported them into JMP. Here's a snapshot of my analysis:

[Figure: descriptive statistics of the Python program's output]

As you can see, smaller and larger have the desired means and standard deviations. You can also confirm that all of the deltas are positive, so larger is always greater than smaller. Finally, the normal q-q plot above larger's histogram shows that the result, while unimodal and roughly bell-shaped, is not normal because the plotted points don't fall along a straight line.
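The values alpha = 25/16 and beta = 16/5 come from matching the gamma distribution's mean (shape × scale) and variance (shape × scale²) to the targets. A sketch of that calculation follows, with a quick check that does not need JMP; the helper name gamma_params, the seed and the sample size are assumptions.

```python
import random
import statistics

def gamma_params(mean, sd):
    """Shape and scale of a gamma distribution with the given mean and sd."""
    shape = (mean / sd) ** 2
    scale = sd ** 2 / mean
    return shape, scale

shape, scale = gamma_params(5.0, 4.0)  # 25/16 and 16/5
print(shape, scale)

rng = random.Random(7)  # fixed seed, only for reproducibility
smaller = [rng.gauss(10, 3) for _ in range(50000)]
larger = [s + rng.gammavariate(shape, scale) for s in smaller]

print(statistics.mean(larger), statistics.stdev(larger))  # close to 15 and 5
print(all(l >= s for s, l in zip(smaller, larger)))       # ordering holds
```

In Python's random.gammavariate(alpha, beta), the second argument is the scale, so the mean is alpha * beta.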


Another answer has proposed matching the two distributions by generating a single random uniform and using it as the input for inversion with both CDFs:

q = random()
t = inverseCDF(q, mu_T, sd_T)
s = inverseCDF(q, mu_S, sd_S)

This is a well-known correlation induction strategy called "Common Random Numbers" in which q is the same quantile being used to generate both distributions via inversion. With symmetric distributions, such as the normal, this produces a correlation of 1 between T and S. A correlation of 1 tells us that (either) one is a linear transformation of the other.

In fact, there's a simpler way to accomplish this for normals without having to do two inversions. Generate one of T or S, by whatever mechanism you wish—inversion, polar method, Ziggurat method, etc. Then use the standard transformation to convert it to a standard normal, and from there to the other normal. If we let T be the normal that we generate directly, then

S = (σS / σT) * (T - μT) + μS.

We would like to have T <= S for all possible quantiles to meet the objectives of the original problem. So under what circumstances can we have S < T? Since S is a function of T, that implies

(σS / σT) * (T - μT) + μS < T

which after some algebra becomes

T * (σS - σT) / σS < μT - μS * (σT / σS).

This reduces to 3 cases.

  1. σS = σT: In this case, T gets eliminated and the originally desired outcome of T <= S is achieved as long as μT <= μS.

  2. σS > σT: In this case, T > S when T < (μT * σS / (σS - σT)) - (μS * σT / (σS - σT)).

  3. σS < σT: T > S when T > (μT * σS / (σS - σT)) - (μS * σT / (σS - σT)) because of the sign flip induced in the result in #2 by having (σS - σT) < 0.

Bottom line - the only case in which the correlation induction scheme works is when the variances of the two distributions are equal. Unequal variances will result in outcomes where T > S.
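The crossover point from cases 2 and 3 can be checked numerically. The sketch below uses statistics.NormalDist for the inversion; the parameters (T with mean 0 and sd 1, S with mean 1 and sd 2) and the function name crossover are assumptions chosen for illustration.

```python
from statistics import NormalDist

def crossover(mu_t, sd_t, mu_s, sd_s):
    """Value of T at which S equals T under common random numbers
    (only defined when sd_s != sd_t)."""
    return (mu_t * sd_s - mu_s * sd_t) / (sd_s - sd_t)

t_dist = NormalDist(0.0, 1.0)  # "lower" distribution T
s_dist = NormalDist(1.0, 2.0)  # "upper" distribution S, with the larger sd

t_star = crossover(0.0, 1.0, 1.0, 2.0)
print(t_star)  # -1.0: any T below this yields S < T

# the same quantile q drives both inversions
for q in (0.01, 0.10, 0.50, 0.90):
    t = t_dist.inv_cdf(q)
    s = s_dist.inv_cdf(q)
    print(q, t, s, "violates T <= S" if s < t else "ok")
```

With these parameters the low quantiles (0.01 and 0.10) violate the ordering, while the median and upper quantiles do not.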

The following picture may give some intuition. The red curve is a standard normal with mean 0 and standard deviation 1. The green curve is a normal with mean 1 and standard deviation 2. We can see that because the green curve is wider, there is some quantile below which it produces smaller outcomes than the red. If T, the "lower" distribution, had the larger variability there would be some quantile above which it would produce larger outcomes.

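[Figure: density curves of a normal with mean 0 and standard deviation 1 (red) and a normal with mean 1 and standard deviation 2 (green)]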


