Insert a blank row after each group of data
First transform all columns to character vectors:
df_new <- as.data.frame(lapply(df, as.character), stringsAsFactors = FALSE)
Then you can create the output you're looking for:
head(do.call(rbind, by(df_new, df$group, rbind, "")), -1 )
# group xvalue yvalue
# a.1 a 16 1
# a.2
# b.2 b 17 2
# b.3 b 18 3
# b.31
# c.4 c 19 4
# c.5 c 20 5
# c.6 c 21 6
# c.41
# d.7 d 22 7
# d.8 d 23 8
# d.9 d 24 9
# d.10 d 25 10
How to automatically insert a blank row after a group of data
This does exactly what you are asking, checks the rows, and inserts a blank empty row at each change in column A:
sub AddBlankRows()
'
dim iRow as integer, iCol as integer
dim oRng as range
set oRng=range("a1")
irow=oRng.row
icol=oRng.column
do
'
if cells(irow+1, iCol)<>cells(irow,iCol) then
cells(irow+1,iCol).entirerow.insert shift:=xldown
irow=irow+2
else
irow=irow+1
end if
'
loop while not cells (irow,iCol).text=""
'
end sub
I hope that gets you started, let us know!
Philip
Insert blank row between groups and maintain original order
The issue is that after the column 'month' got changed to an ordered
factor, ""
is not specified as one of the levels
. So, naturally any value that is not a level
is treated as missing value and hence we get the NA
. Correction can be done at the earlier step by including the ""
as one of the levels
df$month <- ordered(df$month, levels = c("2", "5", "8", "11", "annual", ""))
NOTE: The order
of the ""
is not clear. So, it is specified as the last level
Add a blank row after each unique value of a column in a dataframe in R
You can use group_split
to split your data by group into a list of data frames. Then map a function over each list element and stack their output back into a data frame using map_dfr
.
library(dplyr)
library(tibble)
library(purrr)
data %>%
group_split(study) %>%
map_dfr(~ add_row(.x, .after = Inf))
Output
study year
<dbl> <dbl>
1 1 2001
2 1 2001
3 NA NA
4 2 2002
5 NA NA
6 3 2003
7 NA NA
Adding a blank row after a specific data in a dataframe column
There are probably other (faster?) ways to do this. Here an attempt.
First create a sample dataframe to use for this example. We assume that the index defines the groups that need to be split:
users = {'user_id': ['A','A','A','A', 'B','B','B'],
'status': ['S1', 'S2', 'S1', 'S3', 'S1', 'S2', 'S1'],
'value': [100, 30, 100, 20, 50, 30, 60 ],
}
df1 = pd.DataFrame(users, columns = ['user_id', 'status', 'value'])
df1.set_index('user_id', drop=True, inplace=True)
Here the output:
user_id | status | value |
---|---|---|
A | S1 | 100 |
A | S2 | 30 |
A | S1 | 100 |
A | S3 | 20 |
B | S1 | 50 |
B | S2 | 30 |
B | S1 | 60 |
How can I insert blank rows every 3 existing rows in a data frame?
If the indices of a data frame contain NA
, then the output will have NA
rows. So my goal is to create a vector like 1 2 3 NA 4 5 6 NA ...
and set it as the indices of mi_df
.
cut <- rep(1:(nrow(mi_df)/3), each = 3)
mi_df[sapply(split(1:nrow(mi_df), cut), c, NA), ]
# ID X Y
# 1 1 a 1
# 2 1 a 3
# 3 1 a 5
# NA NA <NA> NA
# 4 2 b 7
# 5 2 b 9
# 6 2 b 11
# NA.1 NA <NA> NA
# 7 3 c 13
# 8 3 c 15
# 9 3 c 17
# NA.2 NA <NA> NA
If nrow(mi_df)
is not a multiple of 3, then the following is a general solution:
# Version 1
cut <- rep(1:ceiling(nrow(mi_df)/3), each = 3, len = nrow(mi_df))
mi_df[Reduce(c, lapply(split(1:nrow(mi_df), cut), c, NA)), ]
# Version 2
cut <- rep(1:ceiling(nrow(mi_df)/3), each = 3, len = nrow(mi_df))
mi_df[Reduce(function(x, y) c(x, NA, y), split(1:nrow(mi_df), cut)), ]
Don't mind the NA
in the output because some functions which write data to an excel file have an optional argument controls if NA
values are converted to strings or be empty. E.g.
library(openxlsx)
write.xlsx(df, "test.xlsx", keepNA = FALSE) # defaults to FALSE
Insert 2 Blank Rows In DF by Group
You can do:
num_empty_rows = 2
df = (df.groupby('Col1',as_index=False).apply(lambda g: g.append(
pd.DataFrame(data=[['']*len(df.columns)]*num_empty_rows,
columns=df.columns))).reset_index(drop=True).iloc[:-num_empty_rows])
As you can see, after each group df is appended by a dataframe to accommodate num_empty_rows
and then at the end reset_index is performed. The last iloc[:-num_empty_rows] is optional i.e. to remove empty rows at the end.
Example input:
df = pd.DataFrame({'Col1': ['A', 'A', 'A', 'B', 'C'],
'Col2':['s','s','b','b','l'],
'Col3':['b','j','d','a','k'],
'Col4':['d','k','q','d','p']
})
Output:
Col1 Col2 Col3 Col4
0 A s b d
1 A s j k
2 A b d q
3
4
5 B b a d
6
7
8 C l k p
Related Topics
Create Counter of Consecutive Runs of a Certain Value
How to Split a Data Frame into Multiple Dataframes with Each Two Columns as a New Dataframe
How to Extract Certain Columns from a List of Data Frames
Unlist a Data Frame by Rows, Not Columns
How to Collapse Many Records into One While Removing Na Values
Convert a Character Vector of Mixed Numbers, Fractions, and Integers to Numeric
Combining 'Expression()' with '\N'
Dt: Dynamically Change Column Values Based on Selectinput from Another Column in R Shiny App
Creating a Density Histogram in Ggplot2
Filling Missing Dates in a Grouped Time Series - a Tidyverse-Way
Add Text to Horizontal Barplot in R, Y-Axis at Different Scale
Insert a Blank Row After Each Group of Data
How to Sort All Dataframes in a List of Dataframes on the Same Column
Sort a String of Comma-Separated Items Alphabetically
How to Use Data.Table Within Functions and Loops