sum two columns in R
The sum function adds all the numbers together to produce a single number, not a vector (or at least not a vector of length greater than 1).
It looks as though at least one of your columns is a factor. You can convert them into numeric vectors, but check the result of the conversion first:
head(as.numeric(data$col1)) # make sure this gives you the right output
And if that looks right, do
data$col1 <- as.numeric(data$col1)
data$col2 <- as.numeric(data$col2)
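You might have to convert them into characters first, because as.numeric() applied to a factor returns its internal integer codes rather than the values you see printed. In that case, do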
data$col1 <- as.numeric(as.character(data$col1))
data$col2 <- as.numeric(as.character(data$col2))
It's hard to tell which you should do without being able to see your data.
Once the columns are numeric, you just have to do
data$col3 <- data$col1 + data$col2
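To make the factor pitfall concrete, here is a minimal sketch with a made-up data frame (the values are assumptions for illustration, not the asker's data). It shows what as.numeric() returns for a factor, why going through as.character() recovers the printed values, and how + differs from sum().

```r
# Hypothetical data: both columns stored as factors, as can happen after an import
data <- data.frame(
  col1 = factor(c("10", "20", "30")),
  col2 = factor(c("1.5", "2.5", "3.5"))
)

# as.numeric() on a factor returns the level codes, not the printed values
codes <- as.numeric(data$col1)                    # 1 2 3

# Going through as.character() recovers the actual numbers
data$col1 <- as.numeric(as.character(data$col1))  # 10 20 30
data$col2 <- as.numeric(as.character(data$col2))  # 1.5 2.5 3.5

# `+` is vectorised and adds row by row; sum() collapses everything to one number
data$col3 <- data$col1 + data$col2                # 11.5 22.5 33.5
total <- sum(data$col1, data$col2)                # 67.5
```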
summing multiple columns in an R data-frame quickly
Here's an alternative approach using tidyverse:
library(tidyverse)
# input columns of interest
cols = c("mpg", "cyl", "disp", "hp", "drat")
mtcars %>%
  group_by(id = row_number()) %>%    # for each row
  nest(cols) %>%                     # nest selected columns
  mutate(SUM = map_dbl(data, sum))   # calculate the sum of those columns
# # A tibble: 32 x 3
# id data SUM
# <int> <list> <dbl>
# 1 1 <tibble [1 x 5]> 301.
# 2 2 <tibble [1 x 5]> 301.
# 3 3 <tibble [1 x 5]> 232.
# 4 4 <tibble [1 x 5]> 398.
# 5 5 <tibble [1 x 5]> 565.
# 6 6 <tibble [1 x 5]> 357.
# 7 7 <tibble [1 x 5]> 631.
# 8 8 <tibble [1 x 5]> 241.
# 9 9 <tibble [1 x 5]> 267.
# 10 10 <tibble [1 x 5]> 320.
# # ... with 22 more rows
The output here is a data frame containing the row id (id), the data used at each row (data) and the calculated sum (SUM).
You can get a vector of the calculated SUM if you add ... %>% pull(SUM).
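For comparison, the same row totals can be computed in base R with rowSums(). This is a different technique from the nest/map approach above, sketched here only as a cross-check; the names row_totals and mtcars_sum are made up for the example.

```r
# Same columns of interest as above (mtcars ships with R)
cols <- c("mpg", "cyl", "disp", "hp", "drat")

# rowSums() adds across the selected columns for every row
# and returns a vector named by row
row_totals <- rowSums(mtcars[cols])

# Attach the totals as a new column on a copy of the data
mtcars_sum <- mtcars
mtcars_sum$SUM <- row_totals

head(mtcars_sum$SUM, 3)
```

The first value is 21 + 6 + 160 + 110 + 3.9 = 300.9, which agrees with the rounded 301. in the tibble output above.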
How to sum multiple columns in two data frames in r
Here's a base R option:
tmp <- cbind(df1, df2)
data.frame(sapply(split.default(tmp, names(tmp)), rowSums))
# V1 V2 V3 V4 V5
#1 4 8 5 5 4
#2 6 10 7 7 0
Data:
df1 <- structure(list(V1 = 2:3, V2 = 4:5, V3 = c(5L, 7L)),
class = "data.frame", row.names = c(NA, -2L))
df2 <- structure(list(V1 = 2:3, V5 = c(4L, 0L), V2 = 4:5, V4 = c(5L,
7L)), class = "data.frame", row.names = c(NA, -2L))
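The one-liner packs three steps together. The sketch below unpacks them on the same example data so each intermediate result can be inspected; the variable names groups and result are introduced here for illustration.

```r
# Same example data as above, built with data.frame() for brevity
df1 <- data.frame(V1 = 2:3, V2 = 4:5, V3 = c(5L, 7L))
df2 <- data.frame(V1 = 2:3, V5 = c(4L, 0L), V2 = 4:5, V4 = c(5L, 7L))

# Step 1: bind side by side; duplicated column names (V1, V2) are kept
tmp <- cbind(df1, df2)
names(tmp)      # "V1" "V2" "V3" "V1" "V5" "V2" "V4"

# Step 2: split the columns (not the rows) into groups that share a name
groups <- split.default(tmp, names(tmp))
names(groups)   # "V1" "V2" "V3" "V4" "V5"

# Step 3: sum row-wise inside each group, then reassemble into a data frame
result <- data.frame(sapply(groups, rowSums))
```

Columns that appear in only one data frame (V3, V4, V5) form a group of one, so rowSums() returns them unchanged.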
Conditional cumulative sum from two columns
You were essentially heading in the right direction. Since you provide an .init value to accumulate, the resulting vector is of size n+1, with the first value being .init. You have to remove the first value to get a vector that fits your column size.
Then, if you want NAs on the remaining values, here's a way to do it. Also, since the "starting row" is the third, .init has to be set to 8.
library(dplyr)
library(purrr)
df %>%
  mutate(test =
    ifelse(source == "B", accumulate(add, .init = 8, ~.x + .y)[-1], NA))
# A tibble: 6 x 4
source value add test
<chr> <dbl> <dbl> <dbl>
1 A 5 1 NA
2 A 10 1 NA
3 B NA 1 11
4 B NA 2 13
5 B NA 3 16
6 C 20 4 NA
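To see why the first element has to be dropped, here is a small sketch that uses base R's Reduce() with accumulate = TRUE in place of purrr's accumulate(). That is a substitution for illustration only; both prepend the initial value to the result. The vectors add and src are copied from the example output above.

```r
# The add and source columns from the example above
add <- c(1, 1, 1, 2, 3, 4)
src <- c("A", "A", "B", "B", "B", "C")

# With an initial value, the accumulated result has one extra element:
# the initial value itself comes first
running <- Reduce(`+`, add, accumulate = TRUE, init = 8)
length(running)          # 7, one more than length(add)

# Dropping the first element realigns the result with the rows
aligned <- running[-1]   # 9 10 11 13 16 20

# Keep the running total only where the source is "B"
test <- ifelse(src == "B", aligned, NA)   # NA NA 11 13 16 NA
```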
join and sum columns together R
Using dplyr you can do this:
library(dplyr)
df %>%
  rowwise() %>%
  mutate(f_new = sum(f, f2, na.rm = TRUE))
# A tibble: 6 x 5
# ca f f2 f3 f_new
# <fct> <dbl> <dbl> <dbl> <dbl>
#1 a 3 NA 3 3
#2 b 4 5 0 9
#3 a 0 6 6 6
#4 c NA 1 3 1
#5 b 3 9 0 12
#6 b 4 7 8 11
This method ignores NA values when computing the sum (na.rm = TRUE) and retains any NA values in the original columns.
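A base R sketch of the same calculation, using rowSums() instead of rowwise(), may be useful for comparison. The data frame is reconstructed from the output shown above; all_na is a made-up name used to show one edge case.

```r
# Data reconstructed from the output shown above
df <- data.frame(
  ca = factor(c("a", "b", "a", "c", "b", "b")),
  f  = c(3, 4, 0, NA, 3, 4),
  f2 = c(NA, 5, 6, 1, 9, 7),
  f3 = c(3, 0, 6, 3, 0, 8)
)

# rowSums() is vectorised over rows; na.rm = TRUE skips missing values
df$f_new <- rowSums(df[c("f", "f2")], na.rm = TRUE)   # 3 9 6 1 12 11

# Edge case: a row where every selected value is NA sums to 0, not NA,
# just as sum(NA, NA, na.rm = TRUE) does
all_na <- rowSums(data.frame(f = NA_real_, f2 = NA_real_), na.rm = TRUE)  # 0
```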
R: data.table group and sum two columns
From your comments it seems like you are getting the desired result but in scientific notation. Try rounding to 3 decimals if you want to see it as you say:
DT[, lapply(.SD, function(x) round(sum(x), 3)), by = c("PARK", "WTG")]
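If the concern is only how the numbers are displayed, base R's printing controls are another option. The sketch below uses made-up values (they are assumptions, not the asker's data) to contrast round(), which changes the stored value, with format() and options(scipen = ...), which only affect how a number is shown.

```r
# Hypothetical total large enough that R may print it in scientific notation
total <- sum(c(5e9, 5e9))                     # 1e+10

# format() with scientific = FALSE returns a fixed-notation string
fixed <- format(total, scientific = FALSE)    # "10000000000"

# round() changes the stored value; here it trims a sum to 3 decimals
rounded <- round(sum(c(0.1234567, 0.2)), 3)   # 0.323

# options(scipen = ...) biases printing towards fixed notation for the session
old <- options(scipen = 999)
print(total)
options(old)   # restore the previous setting
```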