Convert Data Frame Common Rows to Columns

a <- c(rep(1:10, 3))
b <- c(rep("aa", 10), rep("bb", 10), rep("cc", 10))
set.seed(123)
c <- sample(seq(from = 20, to = 50, by = 5), size = 30, replace = TRUE)
d <- data.frame(a, b, c)
# Reshape from long to wide: one row per value of 'a', one column per level of 'b'
e <- reshape(d, idvar = 'a', timevar = 'b', direction = 'wide')
e
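
To make the shape of the result easier to see, here is a minimal sketch on a small deterministic data frame (the values are made up for illustration, not taken from the question). It shows that reshape() prefixes the new columns with the name of the value column, and one possible way to strip that prefix.

```r
# Small stand-in for the data above; values are illustrative only
d <- data.frame(
  a = rep(1:3, times = 2),
  b = rep(c("aa", "bb"), each = 3),
  c = c(20, 25, 30, 35, 40, 45)
)

# One row per value of 'a', one column per level of 'b'
e <- reshape(d, idvar = "a", timevar = "b", direction = "wide")

# reshape() names the new columns c.aa and c.bb;
# drop the "c." prefix if plain group names are preferred
names(e) <- sub("^c\\.", "", names(e))
rownames(e) <- NULL
e
```

Stripping the prefix is optional; it is only convenient when there is a single value column, as here.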

Convert rows into columns according to the date that they have in common in R

Here is a tidyverse option. Group the rows by a running count of the 'Station Name:' marker in 'Column1', store the first value of 'Column2' in each group ('A', 'B', 'C') as a new column, and drop the first two rows of each group, which are headers (slice). Then rename 'Column1' to 'Date' and reshape to wide format with pivot_wider. If needed, sort the rows by 'Date' in ascending order.

library(dplyr)
library(tidyr)
library(stringr)
library(lubridate)
df %>%
  group_by(grp = cumsum(str_detect(Column1, 'Station Name:'))) %>% 
  mutate(nm1 = first(Column2)) %>%
  slice(-(1:2)) %>% 
  ungroup %>%
  rename(Date = Column1) %>%
  type.convert(as.is = TRUE) %>% 
  select(-grp) %>%
  pivot_wider(names_from = nm1, values_from = Column2) %>%       
  arrange(dmy(Date))

Output:

# A tibble: 7 x 4
#  Date           A     B     C
#  <chr>      <dbl> <dbl> <dbl>
#1 01/01/1999 NA    NA    12.5 
#2 02/01/1999 NA    NA     8.39
#3 01/01/2000  2.9   1.19 NA   
#4 02/01/2000  2.42  1.16 NA   
#5 01/10/2009 NA    NA     6.48
#6 07/03/2010  2.06  1.13 NA   
#7 31/12/2020  1.92  1.08  9.87

Or in base R with split/Reduce/merge

out <- type.convert(Reduce(function(...) merge(..., by = 'Date', all = TRUE), 
   lapply(split(df, cumsum(grepl('Station Name:', df$Column1))), 
       function(x) setNames(x, c("Date", x$Column2[1]))[-(1:2),])),
       as.is = TRUE)
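
The question's input is not shown here, so the following is a sketch of the base R approach on a hypothetical df whose layout is inferred from the code: each station block starts with a 'Station Name:' row, followed by one header row and then the date/value rows. It unpacks the one-liner into steps and adds the chronological sort that the tidyverse version does with arrange(dmy(Date)).

```r
# Hypothetical input; the layout is an assumption inferred from the answer
df <- data.frame(
  Column1 = c("Station Name:", "Date", "01/01/2000", "02/01/2000",
              "Station Name:", "Date", "01/01/2000", "01/01/1999"),
  Column2 = c("A", "Value", "2.9", "2.42",
              "B", "Value", "1.19", "12.5"),
  stringsAsFactors = FALSE
)

# Split into one block per station
blocks <- split(df, cumsum(grepl("Station Name:", df$Column1)))

# Name the value column after the station and drop the two header rows
blocks <- lapply(blocks, function(x) {
  setNames(x, c("Date", x$Column2[1]))[-(1:2), ]
})

# Full outer join of all blocks on Date, then convert text to numbers
out <- Reduce(function(x, y) merge(x, y, by = "Date", all = TRUE), blocks)
out <- type.convert(out, as.is = TRUE)

# Sort chronologically; the dates are day/month/year strings
out <- out[order(as.Date(out$Date, format = "%d/%m/%Y")), ]
rownames(out) <- NULL
out
```

Dates that occur for only one station get NA in the other station's column, as in the tidyverse output above.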

How to convert pandas data frame rows into columns

What you want is called a pivot:

df.pivot(index='ChannelPartnerID', columns='Brand', values='Sales').fillna(0).add_suffix('_Sales')

output:

Brand             B1_Sales  B2_Sales  B3_Sales  B4_Sales  B5_Sales
ChannelPartnerID                                                  
10000                29630     38573      1530     21793      7155
10001                26477     42158         0         0     14612
10002                 6649         0         0      6468         0

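NB. With pandas versions before 2.0, df.pivot(*df) is a shortcut for df.pivot(index='ChannelPartnerID', columns='Brand', values='Sales'): unpacking the frame passes its three column names positionally. From pandas 2.0 on, pivot accepts keyword arguments only, so the explicit form is required.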

Converting rows to columns for a dataframe in R

Try

library(reshape2)
df
       Date Time Object_Name Object_Value
1 7/28/2017 8:00          A1        58.56
2 7/28/2017 8:00          A2        51.66
3 7/28/2017 8:30          A1        60.20
4 7/28/2017 8:30          A2        65.20

dcast(df, Date + Time ~ Object_Name)

       Date Time    A1    A2
1 7/28/2017 8:00 58.56 51.66
2 7/28/2017 8:30 60.20 65.20

Alternatively, with tidyr (spread still works but has been superseded by pivot_wider):

library(tidyr)
spread(df, Object_Name, Object_Value)
       Date Time    A1    A2
1 7/28/2017 8:00 58.56 51.66
2 7/28/2017 8:30 60.20 65.20

To address the comment: the above works well when each combination of Date, Time and Object_Name occurs only once. Consider for instance the following:

df
       Date Time Object_Name Object_Value
1 7/28/2017 8:00          A1        58.56
2 7/28/2017 8:00          A1        50.00
3 7/28/2017 8:00          A2        51.66
4 7/28/2017 8:30          A1        60.20
5 7/28/2017 8:30          A2        65.20

In the first two rows, the same Date, Time and Object_Name appear with two different values. dcast cannot tell which value to keep, so it falls back to counting the rows and reports: Aggregation function missing: defaulting to length. Specifying an aggregation function resolves this. For instance, to take the mean of the duplicated values:

dcast(df, Date + Time ~ Object_Name, fun.aggregate = mean)
       Date Time    A1    A2
1 7/28/2017 8:00 54.28 51.66
2 7/28/2017 8:30 60.20 65.20

R - Convert and transpose data to columns by group

We can use tidyr::spread

library(tidyverse)
df %>% group_by(a) %>% mutate(n = 1:n()) %>% spread(a, b) %>% select(-n)
## A tibble: 5 x 3
#  Group1 Group2 Group3
#  <fct>  <fct>  <fct>
#1 Item1  Item4  Item9
#2 Item2  Item5  NA
#3 Item3  Item6  NA
#4 NA     Item7  NA
#5 NA     Item8  NA

Or if you prefer "--" instead of NA you can do (thanks @AntoniosK)

df %>%
    group_by(a) %>%
    mutate(n = 1:n()) %>%
    spread(a, b) %>%
    select(-n) %>%
    mutate_all(~ifelse(is.na(.), "--", as.character(.)))
## A tibble: 5 x 3
#  Group1 Group2 Group3
#  <chr>  <chr>  <chr>
#1 Item1  Item4  Item9
#2 Item2  Item5  --
#3 Item3  Item6  --
#4 --     Item7  --
#5 --     Item8  --

Or use the fill argument of tidyr::spread

df %>%
    mutate_if(is.factor, as.character) %>%
    group_by(a) %>%
    mutate(n = 1:n()) %>%
    spread(a, b, fill = "--") %>%
    select(-n)

giving the same result.

Sample data

a <- c("Group1", "Group1", "Group1", "Group2", "Group2", "Group2", "Group2", "Group2", "Group3")
b <- c("Item1", "Item2", "Item3", "Item4", "Item5", "Item6", "Item7", "Item8", "Item9")
df <- data.frame(a = a, b = b)
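
For comparison, here is a sketch of the same reshaping without tidyr. This is a different technique from the answer above: it splits b by group and pads each piece with NA to the length of the longest group. The name wide is only illustrative.

```r
# Sample data from above
a <- c("Group1", "Group1", "Group1", "Group2", "Group2", "Group2", "Group2", "Group2", "Group3")
b <- c("Item1", "Item2", "Item3", "Item4", "Item5", "Item6", "Item7", "Item8", "Item9")
df <- data.frame(a = a, b = b)

# One character vector of items per group
cols <- split(as.character(df$b), df$a)

# Pad every vector with NA up to the longest group, then bind as columns
n <- max(lengths(cols))
wide <- as.data.frame(lapply(cols, `length<-`, n), stringsAsFactors = FALSE)

# Optional: show "--" instead of NA
wide[is.na(wide)] <- "--"
wide
```

This relies on the items within each group already being in the desired order, as they are in the sample data.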

How to find common rows between two data frames?

You can use the following code:

df1 <- data.frame(A = c(4, 6, 7), B = c(5, 9, 8), C = c("T", "T", "F"))
df2 <- data.frame(A = c(6, 7, 3), B = c(9, 8, 3), C = c("T", "F", "F"))

merge(df1, df2, by = c("A", "B", "C"))

Output:

  A B C
1 6 9 T
2 7 8 F
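
Two variations may be useful, sketched here with the same values under the illustrative names x and y. First, merge() joins on all shared column names by default, so by can be omitted when every column should match. Second, if a logical flag per row is wanted rather than the merged result, the rows can be compared as pasted strings.

```r
# Same values as above; x and y are illustrative names
x <- data.frame(A = c(4, 6, 7), B = c(5, 9, 8), C = c("T", "T", "F"))
y <- data.frame(A = c(6, 7, 3), B = c(9, 8, 3), C = c("T", "F", "F"))

# 'by' defaults to the columns the two data frames have in common
common <- merge(x, y)

# Build one key string per row and test membership;
# this keeps the original row order and row names of x
key_x <- do.call(paste, c(x, sep = "\t"))
key_y <- do.call(paste, c(y, sep = "\t"))
in_y <- key_x %in% key_y
x[in_y, ]
```

The string comparison assumes the separator does not occur inside the data and that matching values print identically in both data frames.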
