Shuffle Output of Find with Fixed Seed

You can define a function that uses openssl to generate a deterministic stream of pseudo-random bytes from a seed, like this:

get_fixed_random()
{
    openssl enc -aes-256-ctr -pass pass:"$1" -nosalt </dev/zero 2>/dev/null
}

Load the function into your environment:

. /file-containing/get_fixed_random

Then run find and pipe its output to shuf, using the function to feed the --random-source option:

find . -name '*.wav' | shuf --random-source=<(get_fixed_random 55)

NB: 55 is just the seed. Change it to get a different (but still reproducible) ordering.
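
For comparison, the same idea can be sketched in Python, the language used by the later answers on this page. This is not a drop-in replacement for the find | shuf pipeline and will not reproduce shuf's ordering. The function name seeded_find and its parameters are made up for illustration. The file list is sorted first because directory traversal order is not guaranteed to be stable.

```python
from pathlib import Path
from random import Random


def seeded_find(root, pattern, seed):
    """Return files under root matching pattern, shuffled reproducibly.

    Hypothetical helper: a rough analogue of `find ... | shuf` with a fixed seed.
    """
    # Sort first so the shuffle always starts from the same order.
    paths = sorted(Path(root).rglob(pattern))
    # A private Random instance leaves the global generator untouched.
    Random(seed).shuffle(paths)
    return paths


if __name__ == "__main__":
    for path in seeded_find(".", "*.wav", 55):
        print(path)
```

As with the shell version, changing the seed argument changes the ordering, and repeating the call with the same seed and the same set of files repeats the ordering.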

np.random.shuffle() with seed giving different shuffles every time

Recreate the labels variable before each shuffle and the result will be the same every time. In your code you create labels only once and then run the shuffling code several times, so each run shuffles the already-shuffled result of the previous run.

np.random.shuffle modifies your list in-place. That also explains why a return value is not necessary.
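
A minimal sketch of that behaviour, assuming NumPy is installed. The array below is an invented stand-in for the real labels.

```python
import numpy as np

SEED = 0
original = np.arange(10)  # stand-in for the real labels

# np.random.shuffle works in place and returns None.
labels = original.copy()
np.random.seed(SEED)
result = np.random.shuffle(labels)
print(result)  # None
first = labels.copy()

# Fresh copy + same seed: the same order again.
labels = original.copy()
np.random.seed(SEED)
np.random.shuffle(labels)
print(np.array_equal(first, labels))  # True
```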

How can I get same result with same seed set at random.shuffle()

Just putting @Tim Peters' comment into an answer. You need to reset the list each time, since random.shuffle is destructive:

import random
SEED = 448

original_list = ['list', 'elements', 'go', 'here']

# Reseed and start from a fresh copy before every shuffle.
random.seed(SEED)
my_list = original_list[:]
random.shuffle(my_list)
print("RUN1:", my_list)

random.seed(SEED)
my_list = original_list[:]
random.shuffle(my_list)
print("RUN2:", my_list)

# Output (both runs always match; the exact order can differ between Python versions):
# RUN1: ['here', 'go', 'list', 'elements']
# RUN2: ['here', 'go', 'list', 'elements']
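
If you would rather not copy the list by hand, one alternative (not part of the answer above) is to sample the whole list from a dedicated Random instance. random.sample returns a new list and leaves the original untouched. The helper name shuffled is an assumption made for this sketch.

```python
from random import Random


def shuffled(items, seed):
    """Return a reproducibly shuffled copy of items (hypothetical helper)."""
    # sample() with k == len(items) yields a full permutation as a new list.
    return Random(seed).sample(items, k=len(items))


original_list = ['list', 'elements', 'go', 'here']
run1 = shuffled(original_list, 448)
run2 = shuffled(original_list, 448)
print(run1 == run2)   # True
print(original_list)  # ['list', 'elements', 'go', 'here']
```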

Shuffling numbers in bash using seed

The GNU implementation of shuf has a --random-source option. Passing it the name of a file with known contents makes the output reproducible.

See the Random sources documentation in the GNU coreutils manual, which contains the following sample implementation:

get_seeded_random()
{
    seed="$1"
    openssl enc -aes-256-ctr -pass pass:"$seed" -nosalt \
        </dev/zero 2>/dev/null
}

shuf -i1-100 --random-source=<(get_seeded_random 42)

To load a result into a bash array in a manner that doesn't rely on string-splitting (and thus the current value of IFS), your implementation may instead look like:

# with bash 4.0 or newer
readarray -t array < <(shuf -i1-100 --random-source=<(get_seeded_random 42))

# or, supporting bash 3.x as well
IFS=$'\n' read -r -d '' -a array \
< <(shuf -i1-100 --random-source=<(get_seeded_random 42) && printf '\0')

Python shuffle(): Granularity of its seed numbers / shuffle() result diversity

You are passing in a function that returns a fixed number:

shuffle(num_list, lambda: seed)

Here seed is one of your floating point values. That's very different from the default random() function; you are returning the same number repeatedly, forever. From the documentation:

The optional argument random is a 0-argument function returning a random float in [0.0, 1.0); by default, this is the function random().

You have effectively reproduced the Dilbert accounting department random number generator, which keeps returning the same number.

When you pass in an alternative random() function as the second argument, the value it returns is used to pick what preceding index to swap the 'current' index with (starting from the end); the source code that is run essentially does this:

x = list_to_shuffle
for i in reversed(range(1, len(x))):
    # pick an element in x[:i+1] with which to exchange x[i]
    j = int(random() * (i+1))
    x[i], x[j] = x[j], x[i]

So your fixed number would always pick the same relative index to swap with. For small enough differences in that fixed value the rounding down to the nearest integer would result in the exact same indices being used to swap with.

This is what happens for 0.5 and 0.55, for example; in both cases the indices picked are (5, 4, 4, 3, 3, 2, 2, 1, 1), which is not much of a 'random' shuffle. The same goes for 0 and 0.05, where you swap everything with index 0, and for 0.9 and 0.95, where you swap each index with itself.
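
To see this concretely, the following sketch applies the index calculation from the loop above with a constant in place of random() and records which index each step would swap with, for a ten-element list. The function name swap_indices is invented for this illustration; it is not part of the random module.

```python
def swap_indices(fixed_value, length):
    """List the index j chosen at each step when random() always
    returns fixed_value (illustrative helper, not a library function)."""
    return [int(fixed_value * (i + 1)) for i in reversed(range(1, length))]


print(swap_indices(0.5, 10))   # [5, 4, 4, 3, 3, 2, 2, 1, 1]
print(swap_indices(0.55, 10))  # [5, 4, 4, 3, 3, 2, 2, 1, 1]
print(swap_indices(0.0, 10))   # [0, 0, 0, 0, 0, 0, 0, 0, 0]
print(swap_indices(0.9, 10))   # [9, 8, 7, 6, 5, 4, 3, 2, 1]
```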

If you wanted to test how seeding works, create an instance of the random.Random() class with your seed and call shuffle() on that object:

from random import Random

seed_list = [
    0.0, 0.05, 0.1, 0.15, 0.2, 0.25, 0.3, 0.35,
    0.4, 0.45, 0.5, 0.55, 0.6, 0.65, 0.7, 0.75,
    0.8, 0.85, 0.9, 0.95
]

last_list = ten_digits = list(range(10))

for seed in seed_list:
    num_list = ten_digits[:]
    Random(seed).shuffle(num_list)
    print("Seed {}:\t {} {}".format(seed, num_list, num_list==last_list))
    last_list = num_list

which outputs:

Seed 0.0: [7, 8, 1, 5, 3, 4, 2, 0, 9, 6] False
Seed 0.05: [3, 8, 5, 4, 2, 1, 9, 7, 0, 6] False
Seed 0.1: [0, 4, 8, 7, 1, 9, 5, 6, 2, 3] False
Seed 0.15: [6, 1, 8, 7, 9, 5, 2, 4, 3, 0] False
Seed 0.2: [9, 6, 8, 2, 7, 4, 5, 0, 1, 3] False
Seed 0.25: [2, 8, 0, 3, 1, 6, 5, 9, 7, 4] False
Seed 0.3: [7, 4, 5, 1, 2, 3, 8, 9, 6, 0] False
Seed 0.35: [0, 7, 6, 2, 8, 3, 9, 5, 1, 4] False
Seed 0.4: [3, 5, 7, 1, 9, 4, 6, 0, 8, 2] False
Seed 0.45: [4, 3, 6, 8, 1, 7, 5, 2, 9, 0] False
Seed 0.5: [8, 9, 3, 5, 0, 6, 1, 2, 7, 4] False
Seed 0.55: [3, 0, 4, 6, 2, 8, 7, 1, 9, 5] False
Seed 0.6: [3, 4, 7, 2, 9, 1, 6, 5, 8, 0] False
Seed 0.65: [9, 1, 8, 2, 4, 0, 7, 3, 6, 5] False
Seed 0.7: [1, 6, 2, 4, 8, 5, 7, 9, 3, 0] False
Seed 0.75: [8, 3, 6, 1, 9, 0, 4, 5, 7, 2] False
Seed 0.8: [4, 7, 5, 2, 0, 3, 8, 1, 9, 6] False
Seed 0.85: [2, 4, 6, 5, 7, 8, 0, 3, 9, 1] False
Seed 0.9: [3, 6, 5, 0, 8, 9, 1, 4, 7, 2] False
Seed 0.95: [1, 5, 2, 6, 4, 9, 3, 8, 0, 7] False

Or you could just call random.seed() before each test, passing in the seed value, but this changes the global Random() instance, which affects other modules using it too.
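
The difference between the two approaches can be checked with a small sketch: a private Random instance does not disturb the module-level generator, whereas random.seed() resets it for every caller. The seed values below are arbitrary.

```python
import random
from random import Random

random.seed(123)
expected = random.random()

# Shuffling with a private instance leaves the global state alone.
random.seed(123)
deck = list(range(10))
Random(0.5).shuffle(deck)
untouched = random.random()
print(untouched == expected)  # True

# Calling random.seed() again rewinds the shared generator.
random.random()
random.seed(123)
print(random.random() == expected)  # True
```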

That second argument to random.shuffle() should really just be forgotten about; you never need it. It was only there in the first revision of the function as a performance improvement, to ensure that in a tight loop a local name was used instead of a global. But because it was added to the function signature without a leading underscore, it became part of the public API by accident. There is no real use case that requires it, and the parameter was deprecated in Python 3.9 and removed in Python 3.11.


