Shuffle Multiple Files in Same Order

How to shuffle multiple files and save different files?

Here’s a simple script that does what you want. Specify all the input
files on the command line. It assumes all of the files have the same
number of lines.

First it creates a list of numbers and shuffles it. Then it combines
those numbers with each input file, sorts that, and removes the numbers.
Thus, each input file is shuffled in the same order.

#!/bin/bash

# Temp file to hold shuffled order
shuffile=$(mktemp)

# Create shuffled order: zero-padded line numbers, so a plain sort works
lines=$(wc -l < "$1")
digits=$(printf "%d" $lines | wc -c)
fmt=$(printf "%%0%d.0f" $digits)
seq -f "$fmt" $lines | shuf > "$shuffile"

# Shuffle each file in same way
for fname in "$@"; do
    paste "$shuffile" "$fname" | sort | cut -f 2- > "$fname.shuf"
done

# Clean up
rm "$shuffile"
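
The same idea can be sketched in Python: draw one shuffled list of keys, attach it to each file's lines, sort by key, and drop the key. The names `shuffle_by_keys` and `shuffle_files` and the `seed` parameter are illustrative and not part of the script above. As in the script, every file is assumed to have the same number of lines.

```python
import random


def shuffle_by_keys(columns, seed=None):
    """Reorder every list in `columns` with one shared random permutation."""
    if not columns:
        return []
    n = len(columns[0])
    if any(len(col) != n for col in columns):
        raise ValueError("all inputs must have the same number of lines")
    keys = list(range(n))
    random.Random(seed).shuffle(keys)  # the shuffled list of numbers
    # Pair each key with a line, sort on the key alone, then drop the key.
    return [
        [line for _, line in sorted(zip(keys, col), key=lambda kv: kv[0])]
        for col in columns
    ]


def shuffle_files(paths, seed=None):
    """Write a shuffled copy of each file to '<path>.shuf'."""
    columns = []
    for path in paths:
        with open(path) as f:
            columns.append(f.read().splitlines())
    for path, col in zip(paths, shuffle_by_keys(columns, seed)):
        with open(path + ".shuf", "w") as f:
            f.writelines(line + "\n" for line in col)
```

Calling `shuffle_files(["test.en", "test.de"])` would write `test.en.shuf` and `test.de.shuf`, with line i of one file still matching line i of the other.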

Shuffle two lists at once with same order

You can do it as:

import random

a = ['a', 'b', 'c']
b = [1, 2, 3]

c = list(zip(a, b))

random.shuffle(c)

a, b = zip(*c)

print(a)
print(b)

[OUTPUT] (one possible result; the order is random)
('a', 'c', 'b')
(1, 3, 2)

Of course, this was an example with simpler lists, but the adaptation will be the same for your case.
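
As a sketch of that adaptation, the zip/shuffle/unzip steps can be wrapped in a helper that accepts any number of lists. The name `shuffle_together` and its `seed` parameter are made up for this illustration. It returns lists rather than the tuples that `zip(*c)` produces.

```python
import random


def shuffle_together(*lists, seed=None):
    """Shuffle several equal-length lists with one shared order."""
    if not lists:
        return []
    if len({len(lst) for lst in lists}) != 1:
        raise ValueError("all lists must have the same length")
    rows = list(zip(*lists))            # one tuple per position
    random.Random(seed).shuffle(rows)   # shuffle whole rows, keeping them intact
    if not rows:
        return [[] for _ in lists]
    return [list(col) for col in zip(*rows)]  # back to one list per input


a = ['a', 'b', 'c']
b = [1, 2, 3]
c = [True, False, True]
a2, b2, c2 = shuffle_together(a, b, c, seed=42)

# Elements that shared an index before the shuffle still share one after it.
assert sorted(zip(a2, b2, c2)) == sorted(zip(a, b, c))
```

Passing a fixed `seed` makes the order repeatable between runs. Leaving it as `None` gives a different order each time.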

Shuffling pairs of lines in two text files


TL;DR

  • paste to join the two files line by line into a single tab-separated file
  • shuf to shuffle the lines of that file
  • cut to split the tab-separated columns back into separate files

Paste

$ cat test.en 
a b c
d e f
g h i

$ cat test.de
1 2 3
4 5 6
7 8 9

$ paste test.en test.de > test.en-de

$ cat test.en-de
a b c 1 2 3
d e f 4 5 6
g h i 7 8 9

Shuffle

$ shuf test.en-de > test.en-de.shuf

$ cat test.en-de.shuf
d e f 4 5 6
a b c 1 2 3
g h i 7 8 9

Cut

$ cut -f1 test.en-de.shuf > test.en-de.shuf.en
$ cut -f2 test.en-de.shuf > test.en-de.shuf.de

$ cat test.en-de.shuf.en
d e f
a b c
g h i

$ cat test.en-de.shuf.de
4 5 6
1 2 3
7 8 9

Randomly shuffle data and labels from different files in the same order

Generate a random permutation of indices with np.random.permutation and use it to index both arrays, data and classes:

import numpy as np

# data and classes must be NumPy arrays of the same length
idx = np.random.permutation(len(data))
x, y = data[idx], classes[idx]
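
A minimal, self-contained sketch of the same indexing trick follows. The sample arrays are invented for illustration, and the seeded `np.random.default_rng` generator is swapped in for `np.random.permutation` only so the shuffle is repeatable.

```python
import numpy as np

# Hypothetical sample data: row i of `data` belongs to label classes[i].
data = np.arange(10).reshape(5, 2)      # rows [0, 1], [2, 3], ..., [8, 9]
classes = data[:, 0] // 2               # labels 0..4, derived from each row

rng = np.random.default_rng(0)          # fixed seed, for repeatable shuffles
idx = rng.permutation(len(data))        # one permutation of the row indices
x, y = data[idx], classes[idx]          # apply it to both arrays

# Each shuffled row still sits next to its original label.
assert np.array_equal(x[:, 0] // 2, y)
```

Because both arrays are indexed with the same `idx`, the pairing between rows and labels survives whatever order the permutation produces.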

