Merge 2 columns into one in dataframe
An alternative is the unite function in tidyr:
library(tidyr)
df = data.frame(year=2009:2013, customerID=20227:20231) # using akrun's data
unite(df, newcol, c(year, customerID), remove=FALSE)
# newcol year customerID
#1 2009_20227 2009 20227
#2 2010_20228 2010 20228
#3 2011_20229 2011 20229
#4 2012_20230 2012 20230
#5 2013_20231 2013 20231
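For comparison, the same combined column can be built without tidyr. The following is a minimal base R sketch, assuming the same data as above; the name df_base is only illustrative and not part of the original answer.

```r
# Same data as in the unite() example; df_base is an illustrative name
df_base <- data.frame(year = 2009:2013, customerID = 20227:20231)

# paste() joins the two columns element-wise; "_" matches unite()'s default separator
df_base$newcol <- paste(df_base$year, df_base$customerID, sep = "_")

# Reorder so the new column comes first, as in the unite() output
df_base <- df_base[c("newcol", "year", "customerID")]
df_base
```

Unlike unite with remove = FALSE, nothing is dropped here by default, so the original columns stay unless you remove them yourself.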
Combine the contents of two columns into one column using R
We can use pivot_longer, which is the more general option because it can also reshape based on name patterns and across multiple sets of columns. Note that pivot_longer succeeds the reshape2 function melt, with enhanced capabilities and bug fixes.
library(dplyr)
library(tidyr)
pivot_longer(df1, cols = time1:time2, values_to = 'time') %>%
select(-name)
Output:
# A tibble: 6 x 2
# id time
# <dbl> <dbl>
#1 1 10
#2 1 15
#3 1 20
#4 1 25
#5 1 30
#6 1 35
Or using base R with stack:
transform(stack(df1[-1])[1], id = rep(df1$id, 2))[2:1]
Or use data.frame with unlist:
data.frame(id = df1$id, value = unlist(df1[-1], use.names = FALSE))
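One difference worth noting: unlist stacks the values column by column, while the pivot_longer output above is ordered row by row. The sketch below illustrates this in base R. The df1 defined here is a hypothetical stand-in (the original df1 is not shown in the answer), and the names col_wise and row_wise are illustrative.

```r
# Hypothetical stand-in for df1; the original data is not shown in the answer
df1 <- data.frame(id = c(1, 1, 1),
                  time1 = c(10, 20, 30),
                  time2 = c(15, 25, 35))

# unlist() appends time2 after time1 (column-wise order); id is recycled
col_wise <- data.frame(id = df1$id,
                       value = unlist(df1[-1], use.names = FALSE))

# Transposing first reads the values row by row instead
row_wise <- data.frame(id = rep(df1$id, each = 2),
                       time = as.vector(t(as.matrix(df1[-1]))))

col_wise$value  # 10 20 30 15 25 35
row_wise$time   # 10 15 20 25 30 35
```

If the row order matters for later steps, pick the variant that matches, or sort afterwards.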
How can I combine multiple columns into one in an R dataset?
A solution using tidyverse. dat4 is the final output.
library(tidyverse)
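# Add a row ID so the reshaped rows can be matched back to the original rows
dat2 <- dat %>%
  mutate(ID = 1:n())
# Long format: keep only the indicator columns that are set to 1
dat3 <- dat2 %>%
  pivot_longer(a:f, names_to = "value", values_to = "number") %>%
  filter(number == 1) %>%
  select(-number)
# Join back (on all shared columns) and label rows with no indicator set as "none"
dat4 <- dat2 %>%
  left_join(dat3) %>%
  select(-ID, -c(a:f)) %>%
  replace_na(list(value = "none"))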
dat4
# age gender race insured value
# 1 13 Female white 0 none
# 2 12 Female white 1 none
# 3 19 Male other 0 f
# 4 19 Female white 0 b
# 5 13 Female white 0 a
# 6 13 Female white 0 b
# 7 13 Female white 0 f
DATA
dat <- read.table(text = " age gender a b c d e f race insured
1 13 Female 0 0 0 0 0 0 white 0
2 12 Female 0 0 0 0 0 0 white 1
3 19 Male 0 0 0 0 0 1 other 0
4 19 Female 0 1 0 0 0 0 white 0
5 13 Female 1 1 0 0 0 1 white 0",
header = TRUE)
How can I combine several columns into one variable, tacking each onto the end of the other and grouping by values in an ID variable?
Set the cols and values_to arguments of pivot_longer() correctly. cols = ... defines the columns you take the values from. values_to = ... defines the name of the new column that receives those values. Your attempt was close: pivot_longer also always returns a column (called name by default) holding the names of the source columns, unless you use more involved options, so drop it with select() if you don't need it.
library(tidyverse)
df = data.frame(
a = c("string1","string2"),
b= c("string11","string12"),
c = c("string21", "string22"),
ID = c("1111","2222")
)
df %>%
pivot_longer(cols = names(df)[1:3],
values_to = "newvar") %>%
select(newvar, ID)
Output:
# A tibble: 6 x 2
newvar ID
<chr> <chr>
1 string1 1111
2 string11 1111
3 string21 1111
4 string2 2222
5 string12 2222
6 string22 2222
combine two similar columns in r
You can use coalesce here, which finds the first non-missing value at each position.
library(dplyr)
gadd.us %>% mutate(w1iq = coalesce(w1iq, wasiIQw1))
This selects values from w1iq if present; if w1iq is NA, it selects the value from wasiIQw1. You can switch the positions of w1iq and wasiIQw1 if you want to give priority to wasiIQw1.
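To make the "first non-missing value" rule concrete, here is a small base R sketch of the same logic for two columns, using ifelse instead of coalesce. The data frame gadd_demo and its values are made up for illustration; only the column names come from the question.

```r
# Hypothetical data; only the column names are taken from the question
gadd_demo <- data.frame(w1iq     = c(100, NA, 95, NA),
                        wasiIQw1 = c(NA, 110, 90, NA))

# Take w1iq where it is present, otherwise fall back to wasiIQw1
gadd_demo$w1iq_filled <- ifelse(is.na(gadd_demo$w1iq),
                                gadd_demo$wasiIQw1,
                                gadd_demo$w1iq)

gadd_demo$w1iq_filled  # 100 110 95 NA
```

Row 3 keeps 95 because w1iq takes priority, and row 4 stays NA because both columns are missing there.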
How to combine two columns into one in R, so that each value in the second column becomes every other value in the first column?
I'm not fully sure what your logic is for ymin/ymax, but this is the general idea; run it line by line to see what's happening.
library(tidyverse)
percent_car %>%
  # reshape everything except position into key/value pairs
  pivot_longer(cols = -position, names_to = "key", values_to = "value") %>%
  mutate(
    yes_no = str_extract(key, "yes|no"),
    key = str_remove_all(key, "yes|no|_")
  ) %>%
  pivot_wider(names_from = key, values_from = value) %>%
  arrange(position) %>%
  mutate(
    ymax = if_else(yes_no == "yes", perc, 1),
    ymin = if_else(yes_no == "yes", 0, 1 - perc)
  )
case_when will be your friend if the if_else needs to be nested.
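As a rough illustration, the ymin/ymax rules from the if_else calls can be written with case_when like this. The data frame demo and its values are hypothetical, since percent_car is not shown in the question.

```r
library(dplyr)

# Hypothetical data standing in for the reshaped percent_car
demo <- data.frame(yes_no = c("yes", "no", "yes"),
                   perc   = c(0.25, 0.75, 0.6))

demo <- demo %>%
  mutate(
    # Conditions are checked in order; TRUE acts as the catch-all branch
    ymax = case_when(yes_no == "yes" ~ perc,
                     TRUE            ~ 1),
    ymin = case_when(yes_no == "yes" ~ 0,
                     TRUE            ~ 1 - perc)
  )
demo
```

With more than two categories, each extra condition becomes one more line rather than another level of nesting.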
How to combine two columns of different datasets in R?
You could use rbind after ensuring that they both have the same names:
C <- rbind(setNames(A, 'X'), setNames(B, 'X'))
Another way is to concatenate the two:
C <- data.frame(X = c(A$X1, B$term))
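A quick way to check that both approaches agree is to run them on small stand-in data. The contents of A and B below are made up for illustration; only the column names X1 and term come from the answer, and C_rbind and C_concat are illustrative names.

```r
# Hypothetical single-column data frames; values are illustrative
A <- data.frame(X1 = 1:3)
B <- data.frame(term = 4:6)

# Approach 1: give both the same column name, then stack them
C_rbind <- rbind(setNames(A, "X"), setNames(B, "X"))

# Approach 2: concatenate the underlying vectors
C_concat <- data.frame(X = c(A$X1, B$term))

identical(C_rbind$X, C_concat$X)
```

The rbind version generalises more easily if A and B later gain additional matching columns.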
R Merging multiple columns into one depending on if the cell is empty
Try replacing the "" in your columns with NA and your code should work.
library(dplyr)
df <- data.frame(
  ResponseID = 1:6,
  ZER_Condition = c("Low", "Med", NA, NA, NA, NA),
  LOW_Condition = c(NA, NA, "High", "Low", NA, NA),
  MED_Condition = c(NA, NA, NA, NA, "High", NA),
  HIG_Condition = c(NA, NA, NA, NA, NA, "Low")
)
# coalesce() takes the first non-NA value across the four columns, row by row
df %>%
  mutate(Merged_Condition = coalesce(ZER_Condition, LOW_Condition, MED_Condition, HIG_Condition)) %>%
  select(ResponseID, Merged_Condition)
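The replacement step itself might look like the following base R sketch, assuming the empty cells are stored as empty strings. The data frame raw is a hypothetical, shortened version of the data with "" instead of NA.

```r
# Hypothetical raw data where empty cells are "" rather than NA
raw <- data.frame(
  ResponseID    = 1:3,
  ZER_Condition = c("Low", "", ""),
  LOW_Condition = c("", "High", ""),
  stringsAsFactors = FALSE
)

# Turn "" into NA in every character column; other columns are left untouched
raw[] <- lapply(raw, function(col) {
  if (is.character(col)) col[col %in% ""] <- NA_character_
  col
})
raw
```

After this step the columns contain real NA values, which is what coalesce checks for.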
Combining two columns in order to get one column in R
One solution is using dplyr's coalesce function:
lebanon$test <- dplyr::coalesce(lebanon$income_under_median, lebanon$income_above_median)
or, within a pipeline
library(dplyr)
lebanon %>%
mutate(test = coalesce(income_under_median, income_above_median))
Output
# income_under_median income_above_median test
# 1 <NA> 2.501.000 - 3.000.000 2.501.000 - 3.000.000
# 2 751.000 - 1.000.000 <NA> 751.000 - 1.000.000
# 3 751.000 - 1.000.000 <NA> 751.000 - 1.000.000
# 4 Below 451.000 <NA> Below 451.000
# 5 <NA> Below 1.501.000 Below 1.501.000
# 6 <NA> Below 1.501.000 Below 1.501.000
# 7 <NA> 2.001.000 - 2.500.000 2.001.000 - 2.500.000
# 8 <NA> 1.501.000 - 2.000.000 1.501.000 - 2.000.000
# 9 451.000 - 750.000 <NA> 451.000 - 750.000
# 10 <NA> 3.001.000 - 4.000.000 3.001.000 - 4.000.000