Convert Data Frame Common Rows to Columns


# Sample data: ids 1 to 10 repeated for each of three groups
a <- rep(1:10, 3)
b <- rep(c("aa", "bb", "cc"), each = 10)
set.seed(123)
c <- sample(seq(from = 20, to = 50, by = 5), size = 30, replace = TRUE)
d <- data.frame(a, b, c)
# Reshape to wide format: one row per value of 'a', one 'c' column per level of 'b'
e <- reshape(d, idvar = "a", timevar = "b", direction = "wide")
e
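
reshape() prefixes each new column with the name of the value column, so the result above has columns a, c.aa, c.bb and c.cc. The following is a minimal sketch of stripping that prefix, assuming you want the levels of 'b' as plain column names; the sub() pattern is an illustrative choice, not part of the original answer.

```r
# Same sample data as above
a <- rep(1:10, 3)
b <- rep(c("aa", "bb", "cc"), each = 10)
set.seed(123)
c <- sample(seq(from = 20, to = 50, by = 5), size = 30, replace = TRUE)
d <- data.frame(a, b, c)

e <- reshape(d, idvar = "a", timevar = "b", direction = "wide")
dim(e)  # 10 rows, 4 columns

# Drop the "c." prefix that reshape() adds to the widened columns
names(e) <- sub("^c\\.", "", names(e))
names(e)
```

Each of the columns aa, bb and cc then holds the ten 'c' values of the corresponding group, in the order of 'a'.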

Convert rows into columns according to the date they have in common in R

Here is a tidyverse option. Create a grouping column from a running count of the rows where 'Column1' contains the string 'Station Name:', then add a new column holding the first value of 'Column2' in each group ('A', 'B', 'C'). Remove the first two rows of each group, since they are headers (slice), rename 'Column1' to 'Date', and reshape to wide format with pivot_wider. If needed, arrange the rows by 'Date' in ascending order.

library(dplyr)
library(tidyr)
library(stringr)
library(lubridate)
df %>%
  group_by(grp = cumsum(str_detect(Column1, 'Station Name:'))) %>%
  mutate(nm1 = first(Column2)) %>%
  slice(-(1:2)) %>%
  ungroup() %>%
  rename(Date = Column1) %>%
  type.convert(as.is = TRUE) %>%
  select(-grp) %>%
  pivot_wider(names_from = nm1, values_from = Column2) %>%
  arrange(dmy(Date))

Output:

# A tibble: 7 x 4
#   Date           A     B     C
#   <chr>      <dbl> <dbl> <dbl>
# 1 01/01/1999 NA    NA    12.5
# 2 02/01/1999 NA    NA     8.39
# 3 01/01/2000  2.9   1.19 NA
# 4 02/01/2000  2.42  1.16 NA
# 5 01/10/2009 NA    NA     6.48
# 6 07/03/2010  2.06  1.13 NA
# 7 31/12/2020  1.92  1.08  9.87

Or in base R with split/Reduce/merge

out <- type.convert(
  Reduce(function(...) merge(..., by = 'Date', all = TRUE),
         lapply(split(df, cumsum(grepl('Station Name:', df$Column1))),
                function(x) setNames(x, c("Date", x$Column2[1]))[-(1:2), ])),
  as.is = TRUE)
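
The input df is not shown in this answer, so here is a self-contained sketch of the base R approach on a small made-up data frame. The layout is an assumption based on the description above: each station block starts with a 'Station Name:' row followed by one more header row, and all values (including the "Date"/"Value" header labels) are hypothetical.

```r
# Hypothetical input: two station blocks, each with two header rows
df <- data.frame(
  Column1 = c("Station Name:", "Date", "01/01/2000", "02/01/2000",
              "Station Name:", "Date", "01/01/2000", "03/01/2000"),
  Column2 = c("A", "Value", "2.9", "2.42",
              "B", "Value", "1.19", "1.16"),
  stringsAsFactors = FALSE
)

# A running count of 'Station Name:' rows gives one group id per block
grp <- cumsum(grepl("Station Name:", df$Column1))

# Name each block's columns after its station, then drop its two header rows
blocks <- lapply(split(df, grp), function(x) {
  setNames(x, c("Date", x$Column2[1]))[-(1:2), ]
})

# Full outer join of all blocks on Date, then convert text columns to numbers
out <- type.convert(
  Reduce(function(x, y) merge(x, y, by = "Date", all = TRUE), blocks),
  as.is = TRUE
)
out
```

With all = TRUE, a date that appears in only one block is kept and the other station's column is filled with NA for that row.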

How to convert pandas data frame rows into columns

What you want is called a pivot:

df.pivot(index='ChannelPartnerID', columns='Brand', values='Sales').fillna(0).add_suffix('_Sales')

output:

Brand             B1_Sales  B2_Sales  B3_Sales  B4_Sales  B5_Sales
ChannelPartnerID
10000                29630     38573      1530     21793      7155
10001                26477     42158         0         0     14612
10002                 6649         0         0      6468         0

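NB. df.pivot(*df) is a positional shortcut for df.pivot(index='ChannelPartnerID', columns='Brand', values='Sales') that relies on the column order of df. pandas 2.0 and later accept only keyword arguments for pivot, so the explicit form is the safer choice.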

Converting rows to columns for a dataframe in R

Try

library(reshape2)
df
       Date Time Object_Name Object_Value
1 7/28/2017 8:00          A1        58.56
2 7/28/2017 8:00          A2        51.66
3 7/28/2017 8:30          A1        60.20
4 7/28/2017 8:30          A2        65.20

# Naming value.var explicitly avoids the "Using ... as value column" message
dcast(df, Date + Time ~ Object_Name, value.var = "Object_Value")

       Date Time    A1    A2
1 7/28/2017 8:00 58.56 51.66
2 7/28/2017 8:30 60.20 65.20

Alternatively, with tidyr (spread() still works, but it has been superseded by pivot_wider()):

library(tidyr)
spread(df, Object_Name, Object_Value)
       Date Time    A1    A2
1 7/28/2017 8:00 58.56 51.66
2 7/28/2017 8:30 60.20 65.20

To address the comment, the above works well if you have unique cases. Consider for instance the following:

df
       Date Time Object_Name Object_Value
1 7/28/2017 8:00          A1        58.56
2 7/28/2017 8:00          A1        50.00
3 7/28/2017 8:00          A2        51.66
4 7/28/2017 8:30          A1        60.20
5 7/28/2017 8:30          A2        65.20

Look at the first two rows: for the same Date, Time and Object_Name there are two values. dcast cannot tell which one to keep, so it counts the values instead and reports: Aggregation function missing: defaulting to length. We can handle this by specifying the aggregation function. For instance, let's take the mean of these values:

dcast(df, Date + Time ~ Object_Name, fun.aggregate = mean)
       Date Time    A1    A2
1 7/28/2017 8:00 54.28 51.66
2 7/28/2017 8:30 60.20 65.20
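
The same two steps (collapse the duplicates, then widen) can also be sketched in base R without reshape2. This is only an illustration using the data shown above; the intermediate names means and wide are made up for the example.

```r
df <- data.frame(
  Date = "7/28/2017",
  Time = c("8:00", "8:00", "8:00", "8:30", "8:30"),
  Object_Name = c("A1", "A1", "A2", "A1", "A2"),
  Object_Value = c(58.56, 50.00, 51.66, 60.20, 65.20),
  stringsAsFactors = FALSE
)

# Step 1: collapse duplicates to one mean per Date/Time/Object_Name
means <- aggregate(Object_Value ~ Date + Time + Object_Name,
                   data = df, FUN = mean)

# Step 2: spread Object_Name into columns
wide <- reshape(means, idvar = c("Date", "Time"),
                timevar = "Object_Name", direction = "wide")

# Drop the "Object_Value." prefix that reshape() adds
names(wide) <- sub("^Object_Value\\.", "", names(wide))
wide
```

Doing the aggregation as its own step makes the choice of summary function explicit, which is the same decision fun.aggregate expresses in dcast.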

R - Convert and transpose data to columns by group

We can use tidyr::spread

library(tidyverse)
df %>%
  group_by(a) %>%
  mutate(n = 1:n()) %>%
  spread(a, b) %>%
  select(-n)
## A tibble: 5 x 3
#   Group1 Group2 Group3
#   <fct>  <fct>  <fct>
# 1 Item1  Item4  Item9
# 2 Item2  Item5  NA
# 3 Item3  Item6  NA
# 4 NA     Item7  NA
# 5 NA     Item8  NA

Or if you prefer "--" instead of NA you can do (thanks @AntoniosK)

df %>%
  group_by(a) %>%
  mutate(n = 1:n()) %>%
  spread(a, b) %>%
  select(-n) %>%
  mutate_all(~ifelse(is.na(.), "--", as.character(.)))
## A tibble: 5 x 3
#   Group1 Group2 Group3
#   <chr>  <chr>  <chr>
# 1 Item1  Item4  Item9
# 2 Item2  Item5  --
# 3 Item3  Item6  --
# 4 --     Item7  --
# 5 --     Item8  --

Or use the fill argument of tidyr::spread:

df %>%
  mutate_if(is.factor, as.character) %>%
  group_by(a) %>%
  mutate(n = 1:n()) %>%
  spread(a, b, fill = "--") %>%
  select(-n)

giving the same result.


Sample data

a <- c("Group1", "Group1", "Group1", "Group2", "Group2", "Group2", "Group2", "Group2", "Group3")
b <- c("Item1", "Item2", "Item3", "Item4", "Item5", "Item6", "Item7", "Item8", "Item9")
df <- data.frame(a = a, b = b)
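
For comparison, here is a base R sketch of the same reshaping that needs no packages. It assumes the goal is one column per group, padded to the length of the longest group; the helper names items, n_max, padded and wide are illustrative only.

```r
a <- c("Group1", "Group1", "Group1", "Group2", "Group2", "Group2", "Group2", "Group2", "Group3")
b <- paste0("Item", 1:9)
df <- data.frame(a = a, b = b, stringsAsFactors = FALSE)

# One character vector of items per group
items <- split(df$b, df$a)

# Pad every vector with NA up to the length of the longest group
n_max <- max(lengths(items))
padded <- lapply(items, function(x) {
  length(x) <- n_max
  x
})

wide <- as.data.frame(padded, stringsAsFactors = FALSE)

# Optional: use "--" instead of NA as the placeholder
wide[is.na(wide)] <- "--"
wide
```

Skipping the last assignment leaves the padding as NA, which is usually easier to work with in later computations.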

How to find common rows between two data frames?

You can use the following code:

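# Named df1 and df2 so that the base function c() is not masked
df1 <- data.frame(A = c(4, 6, 7), B = c(5, 9, 8), C = c("T", "T", "F"))
df2 <- data.frame(A = c(6, 7, 3), B = c(9, 8, 3), C = c("T", "F", "F"))

# Inner join on all three columns keeps only the rows present in both
merge(df1, df2, by = c("A", "B", "C"))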

Output:

  A B C
1 6 9 T
2 7 8 F
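
merge() returns the matches sorted by the key columns and with fresh row names. If you would rather keep the rows of the first data frame in their original order, one possible sketch is to compare row keys built with paste(). The names df1, df2, key1, key2 and common are illustrative, and the "\r" separator is an assumption: it must not occur in the data.

```r
df1 <- data.frame(A = c(4, 6, 7), B = c(5, 9, 8), C = c("T", "T", "F"),
                  stringsAsFactors = FALSE)
df2 <- data.frame(A = c(6, 7, 3), B = c(9, 8, 3), C = c("T", "F", "F"),
                  stringsAsFactors = FALSE)

# Build one text key per row by pasting all columns together
key1 <- do.call(paste, c(df1, sep = "\r"))
key2 <- do.call(paste, c(df2, sep = "\r"))

# Keep the rows of df1 whose key also occurs in df2
common <- df1[key1 %in% key2, ]
common
```

Because the comparison goes through text, this is best suited to small tables with simple column types; merge() remains the more general tool.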

