Add N Empty Rows in a Dataframe

append specific amount of empty rows to pandas dataframe

You can use df.reindex to achieve this goal.

df.reindex(list(range(0, 10))).reset_index(drop=True)

cow shark pudle
0 2.0 2.0 10.0
1 4.0 0.0 2.0
2 8.0 0.0 1.0
3 NaN NaN NaN
4 NaN NaN NaN
5 NaN NaN NaN
6 NaN NaN NaN
7 NaN NaN NaN
8 NaN NaN NaN
9 NaN NaN NaN

The list you pass to df.reindex determines the total number of rows in the new DataFrame. So if your DataFrame has 3 rows (with the default 0, 1, 2 index), passing a list of 10 labels (0 through 9) adds 7 new rows.
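As a minimal sketch of the same idea, the helper below computes the target length from the current one, so you only say how many rows to add. The name append_empty_rows is hypothetical (it is not a pandas function), and the helper resets the index first, which assumes you do not need the original labels.

```python
import pandas as pd

def append_empty_rows(df, n):
    """Return a copy of df with n all-NaN rows appended.

    Hypothetical helper: the original index labels are discarded.
    """
    df = df.reset_index(drop=True)          # labels become 0 .. len(df) - 1
    return df.reindex(range(len(df) + n))   # labels past the end become NaN rows

df = pd.DataFrame({"cow": [2, 4, 8], "shark": [2, 0, 0], "pudle": [10, 2, 1]})
out = append_empty_rows(df, 7)
print(out)
```

Integer columns come back as floats because NaN cannot be stored in a plain integer column.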

Add n empty rows in a dataframe

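You can use merge. Left-merge a frame that holds every key you want onto your data; keys missing from df1 become empty rows: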

pd.DataFrame({'depth':depth}).merge(df1,how='left')
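The one-liner above does not show how depth and df1 are defined, so here is a small self-contained sketch with made-up data. The column name value and all numbers are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical data: measurements exist only for depths 1, 3 and 4
df1 = pd.DataFrame({"depth": [1, 3, 4], "value": [10.0, 30.0, 40.0]})

# Every depth that should appear in the result
depth = list(range(1, 6))

# The left merge keeps all rows of the left frame, in order;
# depths that are absent from df1 get NaN in the other columns
out = pd.DataFrame({"depth": depth}).merge(df1, how="left")
print(out)
```

With no on argument, merge joins on the columns the two frames have in common, which is depth here.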

How can I add an empty row before a definite row in Python DataFrame?

Create an all-null DataFrame whose index labels are those of the rows that match your condition (this assumes df has a non-duplicated index). Then concat it with df and call sort_index; each empty row lands before its match because empty comes first in the concat. Finally, reset_index removes the duplicate index labels.

import pandas as pd

empty = pd.DataFrame(columns=df.columns, index=df[df.Price.eq(27000)].index)
# mergesort is stable, so each empty row stays ahead of the row sharing its label
df = pd.concat([empty, df]).sort_index(kind='mergesort').reset_index(drop=True)
# Brand Price
#0 Honda Civic 22000
#1 Toyota Corolla 25000
#2 NaN NaN
#3 Ford Focus 27000
#4 Audi A4 35000

This adds a blank row before every row whose Price is 27000:

cars = {'Brand': ['Honda Civic','Toyota Corolla','Ford Focus','Audi A4','Jeep'],
'Price': [22000,25000,27000,35000,27000]}
df = pd.DataFrame(cars, columns = ['Brand', 'Price'])

empty = pd.DataFrame(columns=df.columns, index=df[df.Price.eq(27000)].index)
df = pd.concat([empty, df]).sort_index(kind='mergesort').reset_index(drop=True)
# Brand Price
#0 Honda Civic 22000
#1 Toyota Corolla 25000
#2 NaN NaN
#3 Ford Focus 27000
#4 Audi A4 35000
#5 NaN NaN
#6 Jeep 27000
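A variation that avoids relying on sort order among duplicate labels is sketched below. Instead of concat plus sort_index, it gives each blank row a fractional label just below its match and lets reindex create the row. This is a swapped-in technique, not the method of the answer above; insert_blank_before is a hypothetical helper name.

```python
import pandas as pd

def insert_blank_before(df, column, value):
    """Insert an all-NaN row before every row where df[column] == value.

    Hypothetical helper; the original index labels are discarded.
    """
    df = df.reset_index(drop=True)               # unique labels 0 .. n-1
    hits = df.index[df[column].eq(value)]        # labels of the matching rows
    df.index = df.index.astype(float)
    # a label of i - 0.5 sorts directly before row i
    labels = sorted(list(df.index) + [i - 0.5 for i in hits])
    return df.reindex(labels).reset_index(drop=True)

cars = {'Brand': ['Honda Civic', 'Toyota Corolla', 'Ford Focus', 'Audi A4', 'Jeep'],
        'Price': [22000, 25000, 27000, 35000, 27000]}
df = pd.DataFrame(cars, columns=['Brand', 'Price'])
out = insert_blank_before(df, 'Price', 27000)
print(out)
```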

Most elegant ways to add a few empty rows into a data frame in R?

One way to do this is to put empty_row_ids into a data frame alongside zeroes for the other columns, then use bind_rows() in a dplyr pipeline to add those rows. Here is the code:

library(dplyr)
#Data
df <- data.frame(x=1:100,y=1:100)
empty_row_ids <- c(5,10)
#Create data for rows
dfindex <- data.frame(id=empty_row_ids,x=0,y=0)
#Now bind
df2 <- df %>% mutate(id=1:n()) %>%
bind_rows(dfindex) %>%
arrange(id) %>% select(-id)

Output (some rows):

      x   y
1 1 1
2 2 2
3 3 3
4 4 4
5 5 5
6 0 0
7 6 6
8 7 7
9 8 8
10 9 9
11 10 10
12 0 0
13 11 11
14 12 12
15 13 13

If you plan to export the result elsewhere to format your tables, it would be better to use NA instead of zero, as @MrFlick said.

Add empty rows at specific positions of dataframe

Do it all at once, no need for looping. Make a sequence of row numbers, add the new rows in, sort, then replace the duplicated row numbers with NA:

s <- sort(c(seq_len(nrow(df)), rows))
out <- df[s,]
out[duplicated(s),] <- NA

# var1 var2
#1 1 9
#1.1 NA NA
#2 2 8
#3 3 7
#3.1 NA NA
#4 4 6
#5 5 5
#5.1 NA NA
#6 6 4
#7 7 3
#8 8 2
#9 9 1

This will be much more efficient than looping or loop-like code, for even moderately sized data:

df <- df[rep(1:9,1e4),]
rows <- seq(1,9e4,100)

system.time({
s <- sort(c(seq_len(nrow(df)), rows))
out <- df[s,]
out[duplicated(s),] <- NA
})
# user system elapsed
# 0.01 0.00 0.02

df <- df[rep(1:9,1e4),]
rows <- seq(1,9e4,100)

system.time({
Reduce(function(x, y) tibble::add_row(x, .after = y), rev(rows), init = df)
})
# user system elapsed
# 26.03 0.00 26.03

df <- df[rep(1:9,1e4),]
rows <- seq(1,9e4,100)

system.time({
for (i in rev(rows)) {
df <- tibble::add_row(df, .after = i)
}
})
# user system elapsed
# 25.05 0.00 25.04
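For readers working in pandas rather than R, the vectorised idea from the first snippet of this answer (build a sorted sequence of row positions, then blank the duplicated ones) can be sketched roughly as follows. This is an analogue, not a translation of the R code: insert_blank_after is a hypothetical helper, and the sample data and rows = [1, 3, 5] are inferred from the output shown earlier, so treat them as assumptions.

```python
import numpy as np
import pandas as pd

def insert_blank_after(df, rows):
    """Insert an all-NaN row after each 1-based row number in rows."""
    extra = np.asarray(rows, dtype=int) - 1                  # to 0-based positions
    pos = np.sort(np.concatenate([np.arange(len(df)), extra]), kind="stable")
    dup = pd.Series(pos).duplicated().to_numpy()             # second copy of a position
    out = df.iloc[pos].reset_index(drop=True)
    # keep only first occurrences, then let reindex recreate the gaps as NaN rows
    return out[~dup].reindex(out.index)

df = pd.DataFrame({"var1": range(1, 10), "var2": range(9, 0, -1)})
out = insert_blank_after(df, [1, 3, 5])
print(out)
```

Unlike the R result, the row labels here are a plain 0-based range rather than 1, 1.1, 2 and so on.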

Adding a blank row after a specific data in a dataframe column

There are probably other (faster?) ways to do this. Here is an attempt.

First create a sample dataframe to use for this example. We assume that the index defines the groups that need to be split:

users = {'user_id': ['A','A','A','A', 'B','B','B'],
'status': ['S1', 'S2', 'S1', 'S3', 'S1', 'S2', 'S1'],
'value': [100, 30, 100, 20, 50, 30, 60 ],
}

df1 = pd.DataFrame(users, columns = ['user_id', 'status', 'value'])
df1.set_index('user_id', drop=True, inplace=True)

Here is the output:

user_id status value
A       S1     100
A       S2     30
A       S1     100
A       S3     20
B       S1     50
B       S2     30
B       S1     60
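The rest of this answer is not shown here, so the following is only one possible way to continue, not necessarily the original author's method. As a sketch, it shifts each group's row positions down by the number of groups that precede it and lets reindex fill the resulting gaps with blank rows. It assumes the rows of each user_id are contiguous, as in the sample data.

```python
import numpy as np
import pandas as pd

users = {'user_id': ['A', 'A', 'A', 'A', 'B', 'B', 'B'],
         'status': ['S1', 'S2', 'S1', 'S3', 'S1', 'S2', 'S1'],
         'value': [100, 30, 100, 20, 50, 30, 60]}
df1 = pd.DataFrame(users, columns=['user_id', 'status', 'value'])
df1.set_index('user_id', drop=True, inplace=True)

flat = df1.reset_index()
# 0 for rows of the first group, 1 for the second, and so on
group_no = flat['user_id'].ne(flat['user_id'].shift()).cumsum().to_numpy() - 1
# move every row down by the number of blank rows that must precede it
flat.index = np.arange(len(flat)) + group_no
# labels skipped by the shift become all-NaN rows
out = flat.reindex(range(len(flat) + group_no.max()))
print(out)
```

This leaves one blank row between groups and none after the last group.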

Add multiple empty rows at beginning of a populated Dataframe with Python

I'm not sure why you would want to do this but I did it by splitting up the original dataframe into a dataframe with a row of the column names and a separate dataframe of the data. I then created a dataframe of nans to be the blank rows and joined the 3 together. You will need to import numpy for this.

I created a variable no_cols to be the number of columns in the dataframe and no_empty_rows to be how many empty rows to simplify code:

no_cols = len(df.columns)
no_empty_rows = 6

Then I turned the columns into their own dataframe, with 1 row which is the column names, and headers as np.nan:

cols = pd.DataFrame([df.columns], columns = [np.nan]*no_cols)

NaN NaN NaN NaN
0 Col1 col2 col3 col4

Next I renamed the columns in the original dataframe to nan:

df.columns = [np.nan]*no_cols

NaN NaN NaN NaN
0 One Two Three four
1 2 4 5 8

Then I created a new dataframe of nans, with 6 blank rows (this can be changed):

df_empty_rows = (pd.DataFrame(data=[[np.nan]*no_cols]*no_empty_rows,
columns=[np.nan]*no_cols,
index=[np.nan]*no_empty_rows))

NaN NaN NaN NaN
NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN

You can then join all three together. First I put the columns and data of df back together and reset their index, then placed that below df_empty_rows. DataFrame.append was removed in pandas 2.0, so pd.concat is used here:

df_out = pd.concat([df_empty_rows, pd.concat([cols, df]).reset_index(drop=True)])

NaN NaN NaN NaN
NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN
0.0 Col1 col2 col3 col4
1.0 One Two Three four
2.0 2 4 5 8

Full code:

import numpy as np
import pandas as pd

no_cols = len(df.columns)
no_empty_rows = 6
cols = pd.DataFrame([df.columns], columns=[np.nan]*no_cols)
df.columns = [np.nan]*no_cols
df_empty_rows = (pd.DataFrame(data=[[np.nan]*no_cols]*no_empty_rows,
                              columns=[np.nan]*no_cols,
                              index=[np.nan]*no_empty_rows))
# DataFrame.append was removed in pandas 2.0; pd.concat is its replacement
df_out = pd.concat([df_empty_rows, pd.concat([cols, df]).reset_index(drop=True)])
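If the column names can stay as ordinary headers instead of being moved into the data, a shorter route is possible. The sketch below reindexes onto a range that starts at a negative label, so the missing labels become blank rows at the top. prepend_empty_rows is a hypothetical helper name, and the sample frame is reconstructed from the printed output above, so treat it as an assumption.

```python
import pandas as pd

def prepend_empty_rows(df, n):
    """Return a copy of df with n all-NaN rows placed before the data."""
    df = df.reset_index(drop=True)                   # labels 0 .. len(df) - 1
    # labels -n .. -1 do not exist yet, so reindex creates them as NaN rows
    return df.reindex(range(-n, len(df))).reset_index(drop=True)

df = pd.DataFrame({"Col1": ["One", 2], "col2": ["Two", 4],
                   "col3": ["Three", 5], "col4": ["four", 8]})
out = prepend_empty_rows(df, 6)
print(out)
```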

