Pandas every nth row
I'd use iloc, which takes a row/column slice, both based on integer position and following normal Python syntax. If you want every 5th row:
df.iloc[::5, :]
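As a minimal sketch of how the slice behaves, the example below uses a small hypothetical frame (the column name `value` is an assumption for illustration). It also shows that a start offset can be given if the first selected row should not be row 0.

```python
import pandas as pd

# Hypothetical 12-row frame; "value" simply mirrors the row position.
df = pd.DataFrame({"value": range(12)})

every_5th = df.iloc[::5, :]    # positions 0, 5, 10
offset_5th = df.iloc[4::5, :]  # positions 4, 9 (start at the 5th row)

print(every_5th["value"].tolist())   # [0, 5, 10]
print(offset_5th["value"].tolist())  # [4, 9]
```

Because iloc is purely positional, the result does not depend on what the index labels are.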
How to extract every nth row from dataframe?
You are close. With the default RangeIndex, compare the remainder with 1:
df1 = df[df.index % 100 == 1]
Solution with general index:
df1 = df[np.arange(len(df)) % 100 == 1]
If you also want to omit rows 1 and 101:
df2 = df[(df.index % 100 == 1) & (df.index > 200)]
And the same with a general index:
a = np.arange(len(df))
df2 = df[(a % 100 == 1) & (a > 200)]
Sample:
np.random.seed(100)
df = pd.DataFrame(np.random.randint(10, size=(1000,3)), columns=list('ABC'))
#print (df)
a = np.arange(len(df))
df2 = df[(a % 100 == 1) & (a > 200)]
print (df2)
A B C
201 4 4 4
301 1 3 2
401 0 3 5
501 5 8 4
601 3 7 9
701 5 5 7
801 4 1 0
901 4 7 6
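The difference between the index-based mask and the position-based mask only shows up when the index is not the default RangeIndex. The sketch below uses a hypothetical frame whose integer labels start at 1050 (an assumption chosen purely to make the two results differ).

```python
import numpy as np
import pandas as pd

# Hypothetical frame: 300 rows with index labels 1050..1349 instead of 0..299.
df_shifted = pd.DataFrame({"A": range(300)}, index=range(1050, 1350))

# Mask built from the index labels: keeps labels whose remainder is 1.
by_label = df_shifted[df_shifted.index % 100 == 1]

# Mask built from row positions: keeps the 2nd, 102nd and 202nd rows.
by_position = df_shifted[np.arange(len(df_shifted)) % 100 == 1]

print(by_label.index.tolist())     # [1101, 1201, 1301]
print(by_position.index.tolist())  # [1051, 1151, 1251]
```

If "every nth row" means row position rather than label value, the np.arange version is the safer choice.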
Select nth rows every nth element in Python dataframe
You could use str.startswith() for this:
df = df[(df['Date'].str.startswith('Ene')) | (df['Date'].str.startswith('Feb'))]
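To make the filter above concrete, here is a small sketch on made-up data. The `Date` strings and the `Value` column are assumptions for illustration only; the original question's data is not shown.

```python
import pandas as pd

# Hypothetical data: month abbreviations at the start of a string "Date" column.
df = pd.DataFrame({
    "Date": ["Ene 2020", "Feb 2020", "Mar 2020", "Ene 2021", "Abr 2021"],
    "Value": [1, 2, 3, 4, 5],
})

# Combine two boolean masks with | to keep rows starting with either prefix.
mask = df["Date"].str.startswith("Ene") | df["Date"].str.startswith("Feb")
selected = df[mask]

print(selected["Value"].tolist())  # [1, 2, 4]
```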
Select every other nth row of data frame and add to a list of data frames in R
Use split with 1:5 to create data frames with a 5-row interval.
split(df, 1:5)
output
$`1`
X1 X2 X3 X4 X5
1 1 0 0 1.501990 0
6 6 0 0 2.186790 0
11 11 0 0 2.190029 0
16 16 0 0 1.842470 0
$`2`
X1 X2 X3 X4 X5
2 2 0 0 1.883904 0
7 7 0 0 1.269592 0
12 12 0 0 0.000000 0
17 17 0 0 1.937999 0
$`3`
X1 X2 X3 X4 X5
3 3 0 0 1.333195 0
8 8 0 0 1.458405 0
13 13 0 0 1.460534 0
18 18 0 0 0.000000 0
$`4`
X1 X2 X3 X4 X5
4 4 0 0 0.000000 0
9 9 0 0 1.816493 0
14 14 0 0 1.470776 0
19 19 0 0 1.649926 0
$`5`
X1 X2 X3 X4 X5
5 5 0 0 2.136760 0
10 10 0 0 0.000000 0
15 15 0 0 1.675406 0
20 20 0 0 2.067902 0
An alternative with dplyr::group_split is:
group_split(df, rep(1:5, nrow(df)/5), .keep = F)
data
df <- structure(list(X1 = 1:20, X2 = c(0L, 0L, 0L, 0L, 0L, 0L, 0L,
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), X3 = c(0L,
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
0L, 0L, 0L), X4 = c(1.50199, 1.883904, 1.333195, 0, 2.13676,
2.18679, 1.269592, 1.458405, 1.816493, 0, 2.190029, 0, 1.460534,
1.470776, 1.675406, 1.84247, 1.937999, 0, 1.649926, 2.067902),
X5 = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
0L, 0L, 0L, 0L, 0L, 0L, 0L)), class = "data.frame", row.names = c(NA,
-20L))
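For readers working in pandas rather than R, a rough analogue of split(df, 1:5) can be sketched with positional slicing. This is not the answer's method, only an equivalent idea under the assumption that the frame has 20 rows with X1 running from 1 to 20, as in the data above.

```python
import pandas as pd

# Only the X1 column from the R data is reproduced here.
df = pd.DataFrame({"X1": range(1, 21)})

n = 5
# Part i holds rows i, i + n, i + 2n, ... (same grouping as split(df, 1:5)).
parts = [df.iloc[i::n] for i in range(n)]

print(parts[0]["X1"].tolist())  # [1, 6, 11, 16]
print(parts[4]["X1"].tolist())  # [5, 10, 15, 20]
```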
python pandas how to get data every n and every nth rows?
Use a generator with iloc to select the desired rows:
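def rows_generator(df):
    i = 0
    # Yield overlapping 3-row windows, moving down one row at a time
    while (i + 3) <= df.shape[0]:
        yield df.iloc[i:(i + 3):1, :]
        i += 1

i = 1
for window in rows_generator(df):  # separate name so df is not overwritten
    print(f'Time #{i}')
    print(window)
    i += 1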
Example output:
Time #1
Group Cat Value
0 Group1 Cat1 1230
1 Group2 Cat2 4019
2 Group3 Cat3 9491
Time #2
Group Cat Value
1 Group2 Cat2 4019
2 Group3 Cat3 9491
3 Group4 Cat4 9588
Time #3
Group Cat Value
2 Group3 Cat3 9491
3 Group4 Cat4 9588
4 Group5 Cat5 6402
Time #4
Group Cat Value
3 Group4 Cat4 9588
4 Group5 Cat5 6402
5 Group6 Cat 1923
Time #5
Group Cat Value
4 Group5 Cat5 6402
5 Group6 Cat 1923
6 Group7 Cat7 492
Time #6
Group Cat Value
5 Group6 Cat 1923
6 Group7 Cat7 492
7 Group8 Cat8 8589
Time #7
Group Cat Value
6 Group7 Cat7 492
7 Group8 Cat8 8589
8 Group9 Cat9 8582
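The same idea can be sketched with the window size and step as parameters. The function name `sliding_windows` and its arguments are hypothetical; only the Value column from the example output is reproduced.

```python
import pandas as pd

def sliding_windows(frame, size=3, step=1):
    """Yield consecutive windows of `size` rows, advancing `step` rows each time."""
    for start in range(0, len(frame) - size + 1, step):
        yield frame.iloc[start:start + size]

# Values taken from the example output above.
df = pd.DataFrame({"Value": [1230, 4019, 9491, 9588, 6402, 1923, 492, 8589, 8582]})

windows = list(sliding_windows(df, size=3, step=1))
print(len(windows))                    # 7
print(windows[0]["Value"].tolist())    # [1230, 4019, 9491]

# With step equal to size, the windows no longer overlap.
blocks = list(sliding_windows(df, size=3, step=3))
print(len(blocks))                     # 3
```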
How do you sample every nth row within a range in a pandas dataframe?
First, we can create a test dataframe:
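import numpy as np
import pandas as pd
# pandas.util.testing was removed in pandas 2.0, so build the test frame directly
tdf = pd.DataFrame(np.random.randn(30, 4), columns=list('ABCD'))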
Then we can slice it in the following way:
tdf[start_index:end_index:step_size]
So getting every other row from position 10 up to (but not including) 20 would look like this:
tdf[10:20:2]
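A short sketch of the range-plus-step slice, written with iloc so that it is explicitly positional. The single column `A` is an assumption for illustration.

```python
import pandas as pd

# Hypothetical 30-row frame; "A" mirrors the row position.
tdf = pd.DataFrame({"A": range(30)})

start_index, end_index, step_size = 10, 20, 2
subset = tdf.iloc[start_index:end_index:step_size]

print(subset["A"].tolist())  # [10, 12, 14, 16, 18]
```

The end position is excluded, just as with ordinary Python slices.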
Slicing Pandas DataFrame every nth row
You can do it with a for loop:
for i in range(-(-len(df) // 5)):  # ceiling division ensures all rows are captured
    # iloc excludes the end position, so consecutive chunks do not overlap
    df.iloc[i*5:(i+1)*5, :].to_csv('Stored_files_' + str(i) + '.csv')
So in the first iteration, rows 0 to 4 are stored with the name "Stored_files_0.csv".
In the second iteration, rows 5 to 9 are stored with the name "Stored_files_1.csv".
And so on...
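If the chunks are needed in memory rather than on disk, the loop can be sketched as a generator. The name `iter_chunks` is hypothetical, and the 12-row frame is an assumption chosen so that the last chunk is shorter than the others.

```python
import pandas as pd

def iter_chunks(frame, chunk_size=5):
    """Yield consecutive, non-overlapping blocks of at most `chunk_size` rows."""
    for start in range(0, len(frame), chunk_size):
        yield frame.iloc[start:start + chunk_size]

df = pd.DataFrame({"A": range(12)})

sizes = [len(chunk) for chunk in iter_chunks(df)]
print(sizes)  # [5, 5, 2]
```

Each yielded chunk could then be passed to to_csv, as in the loop above.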