How to Split a Data Frame

Split a data frame into six equal parts based on number of rows without knowing the number of rows - pandas

You can use np.array_split():

import numpy as np

dfs = np.array_split(df, 6)

# Write each chunk to its own file: df1.csv, df2.csv, ...
for index, chunk in enumerate(dfs):
    chunk.to_csv(f'df{index+1}.csv')

>>> print(dfs)

[   ID Job  Salary
0   1   A     100
1   2   B     200
2   3   B      20,

   ID Job  Salary
3   4   C     150
4   5   A     500
5   6   A     600,

   ID Job  Salary
6   7   A     200
7   8   B     150,

   ID Job  Salary
8   9   C     110
9  10   B     200,

    ID Job  Salary
10  11   B     220
11  12   A     150,

    ID Job  Salary
12  13   C      20
13  14   B      50]
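
If you would rather not pass the DataFrame itself to np.array_split (some recent pandas releases reportedly emit a deprecation warning on that path), one option is to split the row positions and select each chunk with iloc. A minimal sketch, assuming a 14-row frame modelled on the output above; the helper name split_rows is hypothetical:

```python
import numpy as np
import pandas as pd


def split_rows(frame, n_parts):
    """Split `frame` into `n_parts` row-wise chunks of near-equal size."""
    # Split the positions 0..len-1 rather than the frame itself.
    positions = np.array_split(np.arange(len(frame)), n_parts)
    return [frame.iloc[idx] for idx in positions]


# Example data modelled on the output above (14 rows).
df = pd.DataFrame({
    "ID": range(1, 15),
    "Job": list("ABBCAAABCBBACB"),
    "Salary": [100, 200, 20, 150, 500, 600, 200, 150, 110, 200, 220, 150, 20, 50],
})

parts = split_rows(df, 6)
print([len(p) for p in parts])  # [3, 3, 2, 2, 2, 2]
```

The chunk sizes match np.array_split(df, 6): the first two chunks absorb the two leftover rows.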

Split a large pandas dataframe

Use np.array_split:

Docstring:
Split an array into multiple sub-arrays.

Please refer to the ``split`` documentation. The only difference
between these functions is that ``array_split`` allows
`indices_or_sections` to be an integer that does *not* equally
divide the axis.

In [1]: import pandas as pd
   ...: from numpy.random import randn

In [2]: df = pd.DataFrame({'A' : ['foo', 'bar', 'foo', 'bar',
   ...:                            'foo', 'bar', 'foo', 'foo'],
   ...:                    'B' : ['one', 'one', 'two', 'three',
   ...:                            'two', 'two', 'one', 'three'],
   ...:                    'C' : randn(8), 'D' : randn(8)})

In [3]: print(df)
     A      B         C         D
0  foo    one -0.174067 -0.608579
1  bar    one -0.860386 -1.210518
2  foo    two  0.614102  1.689837
3  bar  three -0.284792 -1.071160
4  foo    two  0.843610  0.803712
5  bar    two -1.514722  0.870861
6  foo    one  0.131529 -0.968151
7  foo  three -1.002946 -0.257468

In [4]: import numpy as np
In [5]: np.array_split(df, 3)
Out[5]:
[     A    B         C         D
0  foo  one -0.174067 -0.608579
1  bar  one -0.860386 -1.210518
2  foo  two  0.614102  1.689837,
      A      B         C         D
3  bar  three -0.284792 -1.071160
4  foo    two  0.843610  0.803712
5  bar    two -1.514722  0.870861,
      A      B         C         D
6  foo    one  0.131529 -0.968151
7  foo  three -1.002946 -0.257468]
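
The difference the docstring describes is easiest to see on a plain array. A small sketch, using an 8-element array chosen to mirror the 8-row frame above:

```python
import numpy as np

arr = np.arange(8)

# array_split tolerates a section count that does not divide the length:
# 8 elements into 3 parts gives sizes 3, 3, 2.
chunks = np.array_split(arr, 3)
sizes = [len(c) for c in chunks]
print(sizes)  # [3, 3, 2]

# split insists on an exact division and raises ValueError otherwise.
try:
    np.split(arr, 3)
    split_failed = False
except ValueError:
    split_failed = True
print(split_failed)  # True
```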

Pandas Split Dataframe into two Dataframes at a specific row

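Use iloc. The slices below act on the second axis, so they split the frame at column position 72; for a split at row 72, slice the first axis instead (datasX.iloc[:72] and datasX.iloc[72:]).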

df1 = datasX.iloc[:, :72]
df2 = datasX.iloc[:, 72:]

(iloc docs)
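
To see row and column splits side by side, here is a small sketch on a hypothetical 5x4 frame standing in for datasX (the data and the split_at value are made up for illustration):

```python
import pandas as pd

# Hypothetical 5x4 frame standing in for datasX.
datasX = pd.DataFrame(
    [[r * 10 + c for c in range(4)] for r in range(5)],
    columns=["a", "b", "c", "d"],
)

split_at = 2  # hypothetical split position

# Split by rows: rows 0..1 and rows 2..4.
top = datasX.iloc[:split_at]
bottom = datasX.iloc[split_at:]

# Split by columns: columns a..b and columns c..d.
left = datasX.iloc[:, :split_at]
right = datasX.iloc[:, split_at:]

print(top.shape, bottom.shape)  # (2, 4) (3, 4)
print(left.shape, right.shape)  # (5, 2) (5, 2)
```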

How to split data frame into x and y

The correct way to slice is x = train.iloc[:, 0:2], which selects every row and the columns at positions 0 and 1 (the end of the slice is exclusive).
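
As a rough illustration, assuming the target sits in the third column (the source does not say where y lives, and the column names and values here are hypothetical):

```python
import pandas as pd

# Hypothetical training frame: two feature columns followed by a target.
train = pd.DataFrame({
    "height": [1.6, 1.7, 1.8],
    "weight": [55, 70, 82],
    "label": [0, 1, 1],
})

x = train.iloc[:, 0:2]  # all rows, columns at positions 0 and 1
y = train.iloc[:, 2]    # all rows, column at position 2, as a Series

print(list(x.columns))  # ['height', 'weight']
print(y.tolist())       # [0, 1, 1]
```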

Split a pandas dataframe into two dataframes efficiently based on some condition

IIUC, use boolean selection:

m = df.score > 15

# ~m is the complement of the mask, i.e. score <= 15
Lessthan15 = df[~m]
Morethan15 = df[m]
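
A small runnable sketch of the same idea, assuming a frame with a numeric score column (the data is made up). Note that a row whose score is exactly 15 lands in the complement:

```python
import pandas as pd

# Hypothetical data with a `score` column.
df = pd.DataFrame({"name": ["a", "b", "c", "d"], "score": [10, 15, 16, 30]})

m = df["score"] > 15

more_than_15 = df[m]   # rows where score > 15
at_most_15 = df[~m]    # the complement: score <= 15

print(more_than_15["name"].tolist())  # ['c', 'd']
print(at_most_15["name"].tolist())    # ['a', 'b']
```

The mask is computed once and reused, so the comparison is not evaluated twice.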


Split a dataframe into smaller dataframes in R using dplyr

We can use gl() to build the grouping column inside group_split(). Here each group holds 59 consecutive rows, with any remainder in the last group:

library(dplyr)

df1 %>%
    group_split(grp = as.integer(gl(n(), 59, n())), .keep = FALSE)

