Reshaping Data.Frame from Wide to Long Format

reshape() takes a while to get used to, just like melt/cast. Here is a solution using reshape(), assuming your data frame is called d:

reshape(d,
        direction = "long",
        varying = list(names(d)[3:7]),
        v.names = "Value",
        idvar = c("Code", "Country"),
        timevar = "Year",
        times = 1950:1954)
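
To see what this call produces, here is a minimal sketch run on made-up data. The two countries and all numbers are hypothetical; only the column layout (Code, Country, then one column per year from 1950 to 1954) follows what the call above assumes.

```r
# Hypothetical wide data: one row per country, one column per year.
# check.names = FALSE keeps the numeric column names as they are.
d <- data.frame(
  Code    = c("AFG", "ALB"),
  Country = c("Afghanistan", "Albania"),
  `1950`  = c(20249, 8097),
  `1951`  = c(21352, 8986),
  `1952`  = c(22532, 10058),
  `1953`  = c(23557, 11123),
  `1954`  = c(24555, 12246),
  check.names = FALSE
)

long <- reshape(d,
                direction = "long",
                varying   = list(names(d)[3:7]),  # the five year columns
                v.names   = "Value",              # name of the stacked value column
                idvar     = c("Code", "Country"),
                timevar   = "Year",
                times     = 1950:1954)

# Sort by country, then year, and drop reshape()'s composite row names
long <- long[order(long$Code, long$Year), ]
rownames(long) <- NULL
long
```

The result has one row per country and year, with the columns Code, Country, Year and Value.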

reshape dataframe from wide to long in R

Using data.table:

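library(data.table)
setDT(mydata)
# Melt the fixed_* and current_* columns in parallel into two value columns;
# value.name names those two columns explicitly
result <- melt(mydata, id.vars = c('id', 'name'),
               measure.vars = patterns('fixed_', 'current_'),
               value.name = c('fixed', 'current'),
               variable.name = 'year')
# Recover the years from the fixed_* column names, in column order
years <- as.numeric(gsub('.+_(\\d+)', '\\1', grep('fixed', names(mydata), value = TRUE)))
# 'year' is a factor (1, 2, ...), so it indexes into 'years' by position
result[, year := years[year]]
# Renumber id within each name
result[, id := seq(.N), by = .(name)]
result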
##    id name year fixed current
## 1:  1    A 2020  2300    3000
## 2:  2    A 2019  2100    3100
## 3:  3    A 2018  2600    3200
## 4:  4    A 2017  2600    3300
## 5:  5    A 2016  1900    3400

This should be very fast, though your data set is small enough that speed hardly matters.

Note that this assumes the fixed_* and current_* columns appear in the same order and cover the same years. So if fixed_2020 is the first fixed_* column, current_2020 must be the first current_* column, and so on. Otherwise, the year column will be correct for fixed but not for current.
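
One way to check that assumption before melting is to compare the year suffixes of the two column groups. This is only a sketch in base R; the cols vector is a hypothetical stand-in for names(mydata).

```r
# Hypothetical column names standing in for names(mydata)
cols <- c("id", "name", "fixed_2020", "fixed_2019", "current_2020", "current_2019")

# Year suffixes in the order each group of columns appears
fixed_years   <- sub("^fixed_",   "", grep("^fixed_",   cols, value = TRUE))
current_years <- sub("^current_", "", grep("^current_", cols, value = TRUE))

# TRUE only if both groups list the same years in the same order
same_order <- identical(fixed_years, current_years)
stopifnot(same_order)
```

If stopifnot() fails, reorder the columns (for example with data.table's setcolorder()) before calling melt().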

Reshaping long format dataframe to wide format according to the value of the elements in columns

Reshape the dataframe using pivot, then subtract each value from 5 with rsub and add the prefix n to the column names:

df.pivot(index='group', columns='ID', values='rank').rsub(5).add_prefix('n')


ID      n1   n2   n3   n4
group
1      3.0  2.0  4.0  1.0
2      3.0  4.0  NaN  NaN
3      4.0  1.0  2.0  3.0
4      4.0  NaN  NaN  NaN
5      2.0  3.0  4.0  NaN
6      4.0  NaN  NaN  NaN

Reshaping wide dataframe to long format

Does this work for you?

library(dplyr)
library(tidyr)

structure(list(name = structure(1:3, .Label = c("A", "B", "C"
), class = "factor"), other_info = structure(1:3, .Label = c("Info1",
"Info2", "Info3"), class = "factor"), revenues_2015 = structure(c(1L,
3L, 2L), .Label = c("1", "11", "6"), class = "factor"), ebitda_2015 = structure(c(2L,
3L, 1L), .Label = c("12", "2", "7"), class = "factor"), ebitda_2016 = structure(c(2L,
3L, 1L), .Label = c("13", "3", "8"), class = "factor"), revenues_2016 = structure(c(2L,
3L, 1L), .Label = c("14", "4", "9"), class = "factor"), other_2017 = structure(c(3L,
1L, 2L), .Label = c("10", "15", "5"), class = "factor")), class = "data.frame", row.names = c(NA,
-3L)) %>%
  pivot_longer(revenues_2015:other_2017, names_pattern = "(.+)_(\\d{4})", names_to = c("metric", "year"))

Reshape wide dataframe of unequal size to long format

You could use

library(dplyr)
library(tidyr)

df %>%
  pivot_longer(-value, values_to = "word") %>%
  drop_na(word) %>%
  select(word, value)

This returns

# A tibble: 5 x 2
  word             value
  <chr>            <dbl>
1 abmachen           0.4
2 abgemacht          0.4
3 abmachst           0.4
4 Aktualisierung     0.2
5 Aktualisierungen   0.2
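
For a self-contained run, the sketch below builds an input that is consistent with the output shown. The original df is not given, so the column names w1 to w3 and the NA padding are assumptions.

```r
library(dplyr)
library(tidyr)

# Hypothetical input: the column names w1..w3 are assumptions. Rows hold
# different numbers of words, so the shorter row is padded with NA.
df <- data.frame(
  w1    = c("abmachen", "Aktualisierung"),
  w2    = c("abgemacht", "Aktualisierungen"),
  w3    = c("abmachst", NA),
  value = c(0.4, 0.2)
)

out <- df %>%
  pivot_longer(-value, values_to = "word") %>%  # one row per (value, word) cell
  drop_na(word) %>%                             # discard the NA padding
  select(word, value)
out
```

drop_na(word) is what handles the unequal sizes: every row is first expanded to the same number of cells, and the empty ones are then removed.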

