Split a data frame into six equal parts based on number of rows without knowing the number of rows - pandas
You can use np.array_split():
import numpy as np

dfs = np.array_split(df, 6)
# Write each part to df1.csv ... df6.csv; `part` avoids shadowing `df`.
for index, part in enumerate(dfs):
    part.to_csv(f'df{index+1}.csv')
>>> print(dfs)
[ ID Job Salary
0 1 A 100
1 2 B 200
2 3 B 20,
ID Job Salary
3 4 C 150
4 5 A 500
5 6 A 600,
ID Job Salary
6 7 A 200
7 8 B 150,
ID Job Salary
8 9 C 110
9 10 B 200,
ID Job Salary
10 11 B 220
11 12 A 150,
ID Job Salary
12 13 C 20
13 14 B 50]
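The output above shows 14 rows split into parts of 3, 3, 2, 2, 2 and 2 rows. As a rough sketch of the same idea using only iloc, the helper below (split_rows is a hypothetical name, not a pandas or NumPy function) gives the first few parts one extra row when the row count does not divide evenly. The 14-row frame is made up for illustration.

```python
import pandas as pd

def split_rows(df, n_parts):
    """Split df into n_parts consecutive row blocks whose sizes differ by at most one."""
    base, extra = divmod(len(df), n_parts)
    parts, start = [], 0
    for i in range(n_parts):
        # The first `extra` parts each take one additional row.
        stop = start + base + (1 if i < extra else 0)
        parts.append(df.iloc[start:stop])
        start = stop
    return parts

# Illustrative 14-row frame (values are made up for this sketch).
df = pd.DataFrame({"ID": range(1, 15)})
parts = split_rows(df, 6)
sizes = [len(p) for p in parts]
print(sizes)
```

Because the function only reads len(df), the number of rows does not need to be known in advance.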
Split a large pandas dataframe
Use np.array_split:
Docstring:
Split an array into multiple sub-arrays.
Please refer to the ``split`` documentation. The only difference
between these functions is that ``array_split`` allows
`indices_or_sections` to be an integer that does *not* equally
divide the axis.
In [1]: import pandas as pd
   ...: from numpy.random import randn
In [2]: df = pd.DataFrame({'A' : ['foo', 'bar', 'foo', 'bar',
...: 'foo', 'bar', 'foo', 'foo'],
...: 'B' : ['one', 'one', 'two', 'three',
...: 'two', 'two', 'one', 'three'],
...: 'C' : randn(8), 'D' : randn(8)})
In [3]: print(df)
A B C D
0 foo one -0.174067 -0.608579
1 bar one -0.860386 -1.210518
2 foo two 0.614102 1.689837
3 bar three -0.284792 -1.071160
4 foo two 0.843610 0.803712
5 bar two -1.514722 0.870861
6 foo one 0.131529 -0.968151
7 foo three -1.002946 -0.257468
In [4]: import numpy as np
In [5]: np.array_split(df, 3)
Out[5]:
[ A B C D
0 foo one -0.174067 -0.608579
1 bar one -0.860386 -1.210518
2 foo two 0.614102 1.689837,
A B C D
3 bar three -0.284792 -1.071160
4 foo two 0.843610 0.803712
5 bar two -1.514722 0.870861,
A B C D
6 foo one 0.131529 -0.968151
7 foo three -1.002946 -0.257468]
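For a large frame it can be more natural to fix the number of rows per piece rather than the number of pieces. A minimal sketch of that variant, assuming a hypothetical generator chunk_rows and a made-up 8-row frame, might look like this:

```python
import pandas as pd

def chunk_rows(df, chunk_size):
    """Yield consecutive row blocks of at most chunk_size rows each."""
    for start in range(0, len(df), chunk_size):
        # iloc slicing past the end is safe; the last block is simply shorter.
        yield df.iloc[start:start + chunk_size]

# Illustrative frame (values are made up for this sketch).
df = pd.DataFrame({"A": list("abcdefgh"), "C": range(8)})
chunks = list(chunk_rows(df, 3))
chunk_lengths = [len(c) for c in chunks]
print(chunk_lengths)
```

Using a generator means only one block needs to be processed at a time, although each block is still a slice of the frame already held in memory.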
Pandas Split Dataframe into two Dataframes at a specific row
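Use iloc. Note that the slice below splits by column position (the first 72 columns versus the rest); to split at row 72 instead, slice the first axis: datasX.iloc[:72] and datasX.iloc[72:].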
df1 = datasX.iloc[:, :72]
df2 = datasX.iloc[:, 72:]
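The difference between splitting at a row position and at a column position comes down to which axis of iloc is sliced. A small sketch on a made-up 4x3 frame (the names df and split_at are illustrative only):

```python
import pandas as pd

# Illustrative frame: 4 rows, 3 columns (values are made up).
df = pd.DataFrame({"a": [1, 2, 3, 4], "b": [5, 6, 7, 8], "c": [9, 10, 11, 12]})
split_at = 2

# Split at a row position: slice the first axis.
top, bottom = df.iloc[:split_at], df.iloc[split_at:]

# Split at a column position: slice the second axis, keeping all rows.
left, right = df.iloc[:, :split_at], df.iloc[:, split_at:]

print(top.shape, bottom.shape, left.shape, right.shape)
```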
How to split data frame into x and y
The correct way to slice is x = train.iloc[:, 0:2], which selects all rows and the first two columns.
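As a sketch of the full x/y split, assuming the target sits in the third column (the frame train and its column names f1, f2 and label are made up here, not taken from the original question):

```python
import pandas as pd

# Illustrative training frame: two feature columns and one target column.
train = pd.DataFrame({
    "f1": [0.1, 0.2, 0.3],
    "f2": [1.0, 2.0, 3.0],
    "label": [0, 1, 0],
})

x = train.iloc[:, 0:2]  # all rows, columns at positions 0 and 1 (a DataFrame)
y = train.iloc[:, 2]    # all rows, column at position 2 (a Series)

print(list(x.columns), y.name)
```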
Split a pandas dataframe into two dataframes efficiently based on some condition
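IIUC, use boolean selection:
# The mask is computed once and reused for both halves.
m = df.score > 15
Lessthan15 = df[~m]   # rows where score is not above 15 (includes exactly 15)
Morethan15 = df[m]    # rows where score is above 15
Morethan15
Lessthan15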
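A self-contained sketch of the same mask-based split on made-up data (the column values and the variable names more_than_15 and at_most_15 are illustrative) shows that the two halves together cover every row exactly once:

```python
import pandas as pd

# Illustrative scores (values are made up for this sketch).
df = pd.DataFrame({"name": ["a", "b", "c", "d"], "score": [10, 15, 16, 30]})

m = df["score"] > 15      # boolean Series, True where score exceeds 15
more_than_15 = df[m]      # rows where the mask is True
at_most_15 = df[~m]       # rows where the mask is False

print(more_than_15["name"].tolist(), at_most_15["name"].tolist())
```

Note that a row with a missing score compares as False, so it would land in the ~m half.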
Split a dataframe into smaller dataframes in R using dplyr
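We may use gl to create the grouping column in group_split. Here gl(n(), 59, n()) generates factor levels that each repeat 59 times, truncated to the number of rows, so the data frame is split into consecutive chunks of 59 rows (the last chunk may be shorter).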
library(dplyr)
df1 %>%
group_split(grp = as.integer(gl(n(), 59, n())), .keep = FALSE)