How to Randomize (Or Permute) a Dataframe Rowwise and Columnwise

R: Shuffle dataframe columnwise

You might want to just sample the column-names. Something like:

names(df) <- names(df)[sample(ncol(df))]

How to shuffle a dataframe column wise, but independent of rows?

Something like:

t(apply(df1, 1, function(x) { sample(x, length(x)) } ))

This will give you the result in matrix form. If you have factors, a mix of numeric and characters etc, be aware that this will coerce everything to character.

reshuffle the sequence of rows in data frame

If you want to shuffle the order of the rows but keep the values within each row together, you can just sample the row indices.

df <- data.frame(x=1:8, y=1:8, z=1:8)
df[sample(1:nrow(df)),]

which returns the same eight rows in a random order.

If the values should instead be shuffled independently within each column, you can do something like

lapply(df, function(x) { sample(x)})

which results in a list with one shuffled vector per column:

$x
[1] 3 1 4 6 5 2 8 7

$y
[1] 2 5 6 3 4 8 7 1

$z
[1] 6 1 8 3 2 7 4 5
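For comparison, the same idea (shuffling each column independently) can be sketched in pandas. This is only an illustration under assumptions: the frame mirrors the `x`, `y`, `z` example above, and the seeded `rng` generator is introduced here purely for reproducibility.

```python
import numpy as np
import pandas as pd

# Seeded generator: an assumption added here so the result is repeatable.
rng = np.random.default_rng(42)

df = pd.DataFrame({"x": range(1, 9), "y": range(1, 9), "z": range(1, 9)})

# Permute every column on its own, so values no longer stay aligned by row.
shuffled = pd.DataFrame(
    {col: rng.permutation(df[col].to_numpy()) for col in df.columns},
    index=df.index,
)
print(shuffled)
```

Each column keeps its own set of values; only the positions within the column change.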

How to permute a dataframe columnwise with paired columns in R?

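We sample a permutation of the row indices and use it as the row index when reassigning 'C2' and 'C3'. Because both columns are reordered with the same permutation, each (C2, C3) pair stays together while 'C1' and 'C4' keep their original order.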

i1 <- sample(seq_len(nrow(df1)))
df1[c("C2", "C3")] <- df1[i1, c("C2", "C3")]

-output

df1
#   C1 C2 C3 C4
#R1  a  0 27  8
#R2  b  1 15  5
#R3  c  1 39  2
#R4  d  0 30  1
#R5  e  1 10  4

data

df1 <- structure(list(C1 = c("a", "b", "c", "d", "e"), C2 = c(1L, 0L, 
1L, 0L, 1L), C3 = c(15L, 30L, 10L, 27L, 39L), C4 = c(8L, 5L, 
2L, 1L, 4L)), class = "data.frame", row.names = c("R1", "R2", 
"R3", "R4", "R5"))
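A rough pandas counterpart of the paired-column permutation, for readers working in Python. It is a sketch, not the answer's own method: the frame copies the `df1` data above, and `paired`, `permuted`, and the `random_state` seed are names and values chosen here for illustration.

```python
import pandas as pd

df1 = pd.DataFrame(
    {
        "C1": list("abcde"),
        "C2": [1, 0, 1, 0, 1],
        "C3": [15, 30, 10, 27, 39],
        "C4": [8, 5, 2, 1, 4],
    },
    index=["R1", "R2", "R3", "R4", "R5"],
)

paired = ["C2", "C3"]  # columns that must move together
permuted = df1.copy()

# Shuffle the rows of the paired block only. Converting to a NumPy array
# drops the index, so pandas assigns by position instead of realigning labels.
permuted[paired] = df1[paired].sample(frac=1, random_state=0).to_numpy()
print(permuted)
```

Without `.to_numpy()`, pandas would align the shuffled rows back to their original labels and the assignment would change nothing.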

Randomly change the order of rows in a data frame

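We can use sample. The line below shuffles the values of a single column (here `name`); to reorder whole rows, index the data frame itself with `df[sample(nrow(df)), ]`.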

df$name[sample(nrow(df))]

Shuffle DataFrame rows

The idiomatic way to do this with Pandas is to use the .sample method of your data frame to sample all rows without replacement:

df.sample(frac=1)

The frac keyword argument specifies the fraction of rows to return in the random sample, so frac=1 means to return all rows (in random order).
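A small sketch of `.sample(frac=1)` in use. The example frame and the `random_state` seed are assumptions added here; `random_state` and `axis` are existing parameters of `DataFrame.sample`, used to make the shuffle reproducible and to shuffle column order instead of row order.

```python
import pandas as pd

df = pd.DataFrame({"x": range(8), "y": list("abcdefgh")})

# All rows, in random order; the seed makes the order repeatable.
rows_shuffled = df.sample(frac=1, random_state=0)

# Same call with the same seed gives the same order again.
same_again = df.sample(frac=1, random_state=0)

# axis=1 samples columns instead of rows, i.e. shuffles the column order.
cols_shuffled = df.sample(frac=1, axis=1, random_state=0)

print(rows_shuffled)
print(list(cols_shuffled.columns))
```

Note that the original index labels travel with the rows, which is why the shuffled frame shows them out of order.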

Note:
If you wish to shuffle your dataframe in-place and reset the index, you could do e.g.

df = df.sample(frac=1).reset_index(drop=True)

Here, specifying drop=True prevents .reset_index from creating a column containing the old index entries.
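To make the effect of `drop=True` concrete, here is a minimal sketch with a made-up frame whose index uses letter labels (the data and the seed are assumptions for illustration only).

```python
import pandas as pd

df = pd.DataFrame({"x": [10, 20, 30, 40, 50]}, index=list("abcde"))
shuffled = df.sample(frac=1, random_state=1)

# drop=True: discard the old labels and renumber 0..n-1.
renumbered = shuffled.reset_index(drop=True)

# Default (drop=False): the old labels are kept in a new column named "index".
with_old_index = shuffled.reset_index()

print(renumbered)
print(with_old_index)
```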

Follow-up note: Strictly speaking, the above operation is not in-place: df.sample returns a new object, so id(df_old) is not the same as id(df_new). However, once the name df is rebound to the shuffled frame, nothing refers to the original any more and Python can release it, so memory use after the line is roughly what it was before. A line-by-line memory profiler, which reports memory after each line has run rather than the peak during it, shows this:

$ python3 -m memory_profiler .\test.py
Filename: .\test.py

Line #    Mem usage    Increment   Line Contents
================================================
     5     68.5 MiB     68.5 MiB   @profile
     6                             def shuffle():
     7    847.8 MiB    779.3 MiB       df = pd.DataFrame(np.random.randn(100, 1000000))
     8    847.9 MiB      0.1 MiB       df = df.sample(frac=1).reset_index(drop=True)
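A quick way to see what the rebinding does, without a profiler: the sketch below (frame size and seed are arbitrary choices made here) checks that `.sample` hands back a different object and leaves the original frame untouched until the name is reassigned.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"x": np.arange(1000)})

shuffled = df.sample(frac=1, random_state=0)

# .sample returns a new DataFrame rather than modifying df.
is_new_object = shuffled is not df

# The original rows are still in their original order.
original_untouched = df["x"].tolist() == list(range(1000))

print(is_new_object, original_untouched)
```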

Randomizing/Shuffling rows in a dataframe in pandas

Edit: I misunderstood the question, which was just to shuffle rows and not all the table (right?)

I don't think a dataframe makes much sense here, because once values move between columns the column names become meaningless. So you can just use a 2D numpy array:

In [1]: A
Out[1]: 
array([[11, 'Blue', 'Mon'],
       [8, 'Red', 'Tues'],
       [10, 'Green', 'Wed'],
       [15, 'Yellow', 'Thurs'],
       [11, 'Black', 'Fri']], dtype=object)

In [2]: _ = [np.random.shuffle(i) for i in A] # shuffles each row in place; np.random.shuffle returns None

In [3]: A
Out[3]: 
array([['Mon', 11, 'Blue'],
       [8, 'Tues', 'Red'],
       ['Wed', 10, 'Green'],
       ['Thurs', 15, 'Yellow'],
       [11, 'Black', 'Fri']], dtype=object)

And if you want to keep a dataframe:

In [4]: pd.DataFrame(A, columns=data.columns)
Out[4]: 
  Number  color     day
0    Mon     11    Blue
1      8   Tues     Red
2    Wed     10   Green
3  Thurs     15  Yellow
4     11  Black     Fri
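The per-row shuffle above can also be written without a Python-level loop. The following is a sketch under assumptions: the numeric frame is invented for the example, and the approach (sorting random keys with `argsort` and applying the result with `np.take_along_axis`) is a swapped-in technique, not the one used in the answer.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # seed chosen only for reproducibility

data = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6], "c": [7, 8, 9]})
values = data.to_numpy()

# Sorting a row of random keys yields a random permutation of column
# positions; doing it along axis=1 gives an independent one per row.
order = rng.random(values.shape).argsort(axis=1)

shuffled = pd.DataFrame(
    np.take_along_axis(values, order, axis=1),
    columns=data.columns,
    index=data.index,
)
print(shuffled)
```

Every row keeps its own values, but their assignment to columns is randomized, so the column names lose their meaning just as in the numpy example.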

Here is a function that shuffles all the values of the table, across both rows and columns:

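import numpy as np
import pandas as pd

def shuffle(df):
    """Return a new frame with every cell value moved to a random position."""
    col = df.columns
    val = df.values
    shape = val.shape
    # flatten() returns a 1-D copy, so the input frame is left unchanged
    val_flat = val.flatten()
    # shuffle all cell values together, in place on the copy
    np.random.shuffle(val_flat)
    # restore the original shape and column labels
    return pd.DataFrame(val_flat.reshape(shape),columns=col)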

In [2]: data
Out[2]: 
   Number   color    day
0      11    Blue    Mon
1       8     Red   Tues
2      10   Green    Wed
3      15  Yellow  Thurs
4      11   Black    Fri

In [3]: shuffle(data)
Out[3]: 
  Number  color     day
0    Fri    Wed  Yellow
1  Thurs  Black     Red
2  Green   Blue      11
3     11      8      10
4    Mon   Tues      15
