append specific amount of empty rows to pandas dataframe
You can use df.reindex to achieve this goal.
df.reindex(list(range(0, 10))).reset_index(drop=True)
cow shark pudle
0 2.0 2.0 10.0
1 4.0 0.0 2.0
2 8.0 0.0 1.0
3 NaN NaN NaN
4 NaN NaN NaN
5 NaN NaN NaN
6 NaN NaN NaN
7 NaN NaN NaN
8 NaN NaN NaN
9 NaN NaN NaN
The list of labels you pass to df.reindex determines the total number of rows in the new DataFrame. So if your DataFrame has 3 rows labelled 0 to 2, passing a list of 10 labels (0 through 9) will add 7 new rows.
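As a minimal sketch of the reindex approach, the helper below pads a frame up to a requested row count. The sample data is reconstructed from the output shown above, and the function name pad_rows is hypothetical, not part of pandas.

```python
import pandas as pd

# Sample data reconstructed from the output above (an assumption).
df = pd.DataFrame({"cow": [2, 4, 8], "shark": [2, 0, 0], "pudle": [10, 2, 1]})


def pad_rows(frame: pd.DataFrame, total_rows: int) -> pd.DataFrame:
    """Return a copy of `frame` padded with all-NaN rows up to `total_rows`."""
    # Reset first so the labels are 0..n-1 whatever the original index was.
    frame = frame.reset_index(drop=True)
    # Labels that do not exist yet become new rows filled with NaN.
    return frame.reindex(list(range(total_rows)))


padded = pad_rows(df, 10)
print(padded)
```

Note that if total_rows is smaller than the current length, reindex drops the extra rows instead of adding any.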
Add n empty rows in a dataframe
You can use merge:
pd.DataFrame({'depth':depth}).merge(df1,how='left')
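The one-liner above does not show what depth and df1 contain, so here is a small sketch with made-up data. It assumes df1 has a depth column and that depth is the full list of values the result should cover; both are illustrative guesses.

```python
import pandas as pd

# Hypothetical data: measurements exist only at some depths.
df1 = pd.DataFrame({"depth": [1, 3, 4], "value": [0.5, 0.7, 0.9]})
# Hypothetical full list of depths the result should contain.
depth = [1, 2, 3, 4, 5]

# A left merge keeps every depth; depths missing from df1 get NaN values.
result = pd.DataFrame({"depth": depth}).merge(df1, how="left")
print(result)
```

With no on argument, merge joins on the columns the two frames share, which is only depth here.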
How can I add an empty row before a definite row in Python DataFrame?
Create a DataFrame that has all null values, with index labels based on your condition (this assumes df has a non-duplicated index). Then concat and sort_index, which will place the missing row before the matching row (because we concat df to empty). Then reset_index to remove the duplicate index labels.
import pandas as pd
empty = pd.DataFrame(columns=df.columns, index=df[df.Price.eq(27000)].index)
df = pd.concat([empty, df]).sort_index().reset_index(drop=True)
# Brand Price
#0 Honda Civic 22000
#1 Toyota Corolla 25000
#2 NaN NaN
#3 Ford Focus 27000
#4 Audi A4 35000
This will add a blank row before every row whose Price is 27000:
cars = {'Brand': ['Honda Civic','Toyota Corolla','Ford Focus','Audi A4','Jeep'],
'Price': [22000,25000,27000,35000,27000]}
df = pd.DataFrame(cars, columns = ['Brand', 'Price'])
empty = pd.DataFrame(columns=df.columns, index=df[df.Price.eq(27000)].index)
df = pd.concat([empty, df]).sort_index().reset_index(drop=True)
# Brand Price
#0 Honda Civic 22000
#1 Toyota Corolla 25000
#2 NaN NaN
#3 Ford Focus 27000
#4 Audi A4 35000
#5 NaN NaN
#6 Jeep 27000
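The same idea can be wrapped in a reusable function. This is a sketch rather than the answer's own code: the name insert_blank_before is hypothetical, and it requests a stable sort explicitly because the default quicksort in sort_index is not guaranteed to keep the blank row ahead of the original row with the same label.

```python
import pandas as pd


def insert_blank_before(frame: pd.DataFrame, mask: pd.Series) -> pd.DataFrame:
    """Insert an all-NaN row before every row where `mask` is True."""
    # A fresh 0..n-1 index guarantees unique labels to sort on.
    frame = frame.reset_index(drop=True)
    empty = pd.DataFrame(columns=frame.columns, index=frame.index[mask.to_numpy()])
    # Stable sort: ties keep concat order, so each blank row comes first.
    combined = pd.concat([empty, frame]).sort_index(kind="mergesort")
    return combined.reset_index(drop=True)


cars = pd.DataFrame({
    "Brand": ["Honda Civic", "Toyota Corolla", "Ford Focus", "Audi A4", "Jeep"],
    "Price": [22000, 25000, 27000, 35000, 27000],
})
out = insert_blank_before(cars, cars["Price"].eq(27000))
print(out)
```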
Most elegant ways to add a few empty rows into a data frame in R?
One way to do this is to put the empty_row_ids inside a data frame filled with zeroes and then use bind_rows() in a dplyr pipeline to add the rows. Here is the code:
library(dplyr)
#Data
df <- data.frame(x=1:100,y=1:100)
empty_row_ids <- c(5,10)
#Create data for rows
dfindex <- data.frame(id=empty_row_ids,x=0,y=0)
#Now bind
df2 <- df %>% mutate(id=1:n()) %>%
bind_rows(dfindex) %>%
arrange(id) %>% select(-id)
Output (some rows):
x y
1 1 1
2 2 2
3 3 3
4 4 4
5 5 5
6 0 0
7 6 6
8 7 7
9 8 8
10 9 9
11 10 10
12 0 0
13 11 11
14 12 12
15 13 13
If you plan to export the result elsewhere to format your tables, it would be better to use NA instead of zero, as @MrFlick said.
Add empty rows at specific positions of dataframe
Do it all at once, no need for looping. Make a sequence of row numbers, add the new rows in, sort, then replace the duplicated row numbers with NA:
s <- sort(c(seq_len(nrow(df)), rows))
out <- df[s,]
out[duplicated(s),] <- NA
# var1 var2
#1 1 9
#1.1 NA NA
#2 2 8
#3 3 7
#3.1 NA NA
#4 4 6
#5 5 5
#5.1 NA NA
#6 6 4
#7 7 3
#8 8 2
#9 9 1
This will be much more efficient than looping or loop-like code, for even moderately sized data:
df <- df[rep(1:9,1e4),]
rows <- seq(1,9e4,100)
system.time({
s <- sort(c(seq_len(nrow(df)), rows))
out <- df[s,]
out[duplicated(s),] <- NA
})
# user system elapsed
# 0.01 0.00 0.02
df <- df[rep(1:9,1e4),]
rows <- seq(1,9e4,100)
system.time({
Reduce(function(x, y) tibble::add_row(x, .after = y), rev(rows), init = df)
})
# user system elapsed
# 26.03 0.00 26.03
df <- df[rep(1:9,1e4),]
rows <- seq(1,9e4,100)
system.time({
for (i in rev(rows)) {
df <- tibble::add_row(df, .after = i)
}
})
# user system elapsed
# 25.05 0.00 25.04
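For readers working in pandas rather than R, the same sort-and-blank idea can be sketched as follows. This is a translation under assumptions, not part of the original answer: the data mirrors the var1/var2 output above, and rows holds 0-based positions instead of R's 1-based row numbers.

```python
import numpy as np
import pandas as pd

# Hypothetical data mirroring the R example: var1 = 1..9, var2 = 9..1.
df = pd.DataFrame({"var1": range(1, 10), "var2": range(9, 0, -1)})
# 0-based positions after which a blank row should appear (assumption).
rows = np.array([0, 2, 4])

# Repeat the chosen positions, then sort so each repeat follows its original.
s = np.sort(np.concatenate([np.arange(len(df)), rows]))
# Cast to float so the blanked rows can hold NaN.
out = df.iloc[s].reset_index(drop=True).astype("float64")
# The second occurrence of each position is the inserted row: blank it out.
is_dup = pd.Series(s).duplicated().to_numpy()
out.loc[is_dup, :] = np.nan
print(out)
```

As in the R version, all insertions happen in one vectorised pass instead of one add-row call per position.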
Adding a blank row after a specific data in a dataframe column
There are probably other (faster?) ways to do this. Here is an attempt.
First create a sample dataframe to use for this example. We assume that the index defines the groups that need to be split:
import pandas as pd

users = {'user_id': ['A','A','A','A', 'B','B','B'],
'status': ['S1', 'S2', 'S1', 'S3', 'S1', 'S2', 'S1'],
'value': [100, 30, 100, 20, 50, 30, 60 ],
}
df1 = pd.DataFrame(users, columns = ['user_id', 'status', 'value'])
df1.set_index('user_id', drop=True, inplace=True)
Here is the output:
user_id | status | value |
---|---|---|
A | S1 | 100 |
A | S2 | 30 |
A | S1 | 100 |
A | S3 | 20 |
B | S1 | 50 |
B | S2 | 30 |
B | S1 | 60 |
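The rest of this answer is not shown here, so the following is only one possible way to finish the job, not necessarily the author's. It assumes the goal is a single blank row between consecutive index groups, and the empty-string label for the blank row is an arbitrary choice.

```python
import numpy as np
import pandas as pd

users = {
    "user_id": ["A", "A", "A", "A", "B", "B", "B"],
    "status": ["S1", "S2", "S1", "S3", "S1", "S2", "S1"],
    "value": [100, 30, 100, 20, 50, 30, 60],
}
df1 = pd.DataFrame(users).set_index("user_id")

pieces = []
# sort=False keeps the groups in their order of first appearance.
for _, group in df1.groupby(level=0, sort=False):
    pieces.append(group)
    # One all-NaN row, labelled with an empty string, as a separator.
    pieces.append(pd.DataFrame(np.nan, index=[""], columns=df1.columns))

# Drop the trailing separator so the frame does not end on a blank row.
result = pd.concat(pieces[:-1])
print(result)
```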
Add multiple empty rows at beginning of a populated Dataframe with Python
I'm not sure why you would want to do this but I did it by splitting up the original dataframe into a dataframe with a row of the column names and a separate dataframe of the data. I then created a dataframe of nans to be the blank rows and joined the 3 together. You will need to import numpy for this.
To simplify the code, I created a variable no_cols for the number of columns in the dataframe and no_empty_rows for how many empty rows to add:
no_cols = len(df.columns)
no_empty_rows = 6
Then I turned the columns into their own dataframe, with 1 row which is the column names, and headers as np.nan:
cols = pd.DataFrame([df.columns], columns = [np.nan]*no_cols)
NaN NaN NaN NaN
0 Col1 col2 col3 col4
Next I renamed the columns in the original dataframe to nan:
df.columns = [np.nan]*no_cols
NaN NaN NaN NaN
0 One Two Three four
1 2 4 5 8
Then I created a new dataframe of nans, with 6 blank rows (this can be changed):
df_empty_rows = (pd.DataFrame(data=[[np.nan]*no_cols]*no_empty_rows,
columns=[np.nan]*no_cols,
index=[np.nan]*no_empty_rows))
NaN NaN NaN NaN
NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN
You can then join all 3 together. First I put the columns and data of df back together and reset their index, then concatenated that onto df_empty_rows (DataFrame.append was removed in pandas 2.0, so pd.concat is used here):
df_out = pd.concat([df_empty_rows, pd.concat([cols, df]).reset_index(drop=True)])
NaN NaN NaN NaN
NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN
0.0 Col1 col2 col3 col4
1.0 One Two Three four
2.0 2 4 5 8
Full code:
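import numpy as np
import pandas as pd

no_cols = len(df.columns)
no_empty_rows = 6
# One-row frame holding the column names, with NaN as the header labels
cols = pd.DataFrame([df.columns], columns=[np.nan]*no_cols)
df.columns = [np.nan]*no_cols
df_empty_rows = (pd.DataFrame(data=[[np.nan]*no_cols]*no_empty_rows,
columns=[np.nan]*no_cols,
index=[np.nan]*no_empty_rows))
# DataFrame.append was removed in pandas 2.0; pd.concat replaces it
df_out = pd.concat([df_empty_rows, pd.concat([cols, df]).reset_index(drop=True)])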
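For a version that runs on its own, here is a sketch of a variation on the approach above. It swaps the NaN column labels for plain positional integer labels, which avoids concatenating frames with duplicate NaN headers. The sample frame is reconstructed from the printed output and is an assumption.

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the output above (an assumption).
df = pd.DataFrame({
    "Col1": ["One", 2],
    "col2": ["Two", 4],
    "col3": ["Three", 5],
    "col4": ["four", 8],
})
no_empty_rows = 6
no_cols = df.shape[1]

# Column names become an ordinary data row with positional labels 0..n-1.
header = pd.DataFrame([df.columns.tolist()])
# The data itself, also relabelled positionally.
body = pd.DataFrame(df.to_numpy())
# The blank rows that go on top.
blank = pd.DataFrame(np.nan, index=range(no_empty_rows), columns=range(no_cols))

df_out = pd.concat([blank, header, body], ignore_index=True)
print(df_out)
```

When writing df_out to a file, passing header=False and index=False to to_csv or to_excel keeps the positional labels out of the output.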