Calculating Standard Deviation Across Rows

Calculating standard deviation of each row

You can use the apply and transform functions:

set.seed(007)
X <- data.frame(matrix(sample(c(10:20, NA), 100, replace=TRUE), ncol=10))
transform(X, SD=apply(X,1, sd, na.rm = TRUE))
   X1 X2 X3 X4 X5 X6 X7 X8 X9 X10       SD
1  NA 12 17 18 19 16 12 13 20  14 3.041381
2  14 12 13 13 14 18 16 17 20  10 3.020302
3  11 19 NA 12 19 19 19 20 12  20 3.865805
4  10 11 20 12 15 17 18 17 18  12 3.496029
5  12 15 NA 14 20 18 16 11 14  18 2.958040
6  19 11 10 20 13 14 17 16 10  16 3.596294
7  14 16 17 15 10 11 15 15 11  16 2.449490
8  NA 10 15 19 19 12 15 15 19  14 3.201562
9  11 NA NA 20 20 14 14 17 14  19 3.356763
10 15 13 14 15 NA 13 15 NA 15  12 1.195229

From ?apply you can see that apply has a ... argument, which passes optional arguments on to FUN. In this case you can use na.rm = TRUE so that sd omits NA values.
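As a minimal sketch of that pass-through (the small matrix m is made up for illustration and is not part of the original answer), compare the call with and without na.rm:

```r
# Hypothetical 2 x 3 matrix with one missing value in the first row
m <- matrix(c(1, 2, NA,
              4, 5, 6), nrow = 2, byrow = TRUE)

# Without na.rm, any row that contains an NA yields NA
sd_default <- apply(m, 1, sd)

# na.rm = TRUE is not an argument of apply itself;
# it travels through ... to sd()
sd_na_rm <- apply(m, 1, sd, na.rm = TRUE)

print(sd_default)
print(sd_na_rm)
```

Any other argument of FUN can be forwarded the same way.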

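Using rowSds from the matrixStats package also requires setting na.rm = TRUE to omit NA values. rowSds expects a matrix (recent matrixStats releases no longer accept a data frame), so convert X with as.matrix first:

library(matrixStats)
transform(X, SD = rowSds(as.matrix(X), na.rm = TRUE)) # same result as before.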
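If installing a package is not an option, the same row-wise result can be sketched in vectorized base R. The helper name row_sd and the test matrix below are hypothetical, introduced only for this illustration:

```r
# Hypothetical helper: row-wise sample standard deviation, ignoring NAs
row_sd <- function(x) {
  x <- as.matrix(x)
  n <- rowSums(!is.na(x))            # non-missing values per row
  mu <- rowMeans(x, na.rm = TRUE)    # row means
  # x - mu recycles mu down the columns, so each row is centred on its own mean
  sqrt(rowSums((x - mu)^2, na.rm = TRUE) / (n - 1))
}

# Made-up data with scattered NAs
m <- matrix(c(10, 12, NA, 15,
              11, NA, 14, 20,
              13, 13, 16, 18), nrow = 3, byrow = TRUE)

vectorised <- row_sd(m)
reference  <- apply(m, 1, sd, na.rm = TRUE)
print(cbind(vectorised, reference))
```

One difference to be aware of: for a row with fewer than two non-missing values this sketch returns NaN, whereas sd returns NA.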

Calculating standard deviation across rows

Try this, using rowSds from the matrixStats package:

library(dplyr)
library(matrixStats)

columns <- c('colB', 'colC', 'colD')

df %>%
  mutate(Mean = rowMeans(.[columns]), stdev = rowSds(as.matrix(.[columns])))

Returns

   colA colB colC colD     Mean    stdev
1 SampA   21   15   10 15.33333 5.507571
2 SampB   20   14   22 18.66667 4.163332
3 SampC   30   12   18 20.00000 9.165151

Your data

colA <- c("SampA", "SampB", "SampC")
colB <- c(21, 20, 30)
colC <- c(15, 14, 12)
colD <- c(10, 22, 18)
df <- data.frame(colA, colB, colC, colD)
df
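As a cross-check that needs no extra packages, the same two columns can be computed in base R. This is only a sketch of an alternative, not the approach the answer above uses; it rebuilds the sample data so that it runs on its own:

```r
# Same sample data as above
df <- data.frame(
  colA = c("SampA", "SampB", "SampC"),
  colB = c(21, 20, 30),
  colC = c(15, 14, 12),
  colD = c(10, 22, 18)
)

columns <- c("colB", "colC", "colD")

# Row-wise mean and standard deviation over the selected columns only
df$Mean  <- rowMeans(df[columns])
df$stdev <- unname(apply(df[columns], 1, sd))

print(df)
```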

R Standard Deviation Across Rows

This should do the trick.

library(dplyr)
iris %>% mutate(stDev = apply(.[1:4], 1, sd))

Calculate standard deviation across multiple rows grouped by ID

You can use pivot_longer() to stack y1 to y3 and then calculate the sd.

library(dplyr)
library(tidyr)

df %>%
  pivot_longer(y1:y3) %>%
  group_by(ID) %>%
  summarise(sd = sd(value))

# # A tibble: 3 x 2
#   ID       sd
#   <chr> <dbl>
# 1 a      2.96
# 2 b      1.91
# 3 c      2.39
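The stack-then-summarise idea can also be sketched in base R. The question's data is not shown here, so the df below is invented for illustration; only the column names ID and y1 to y3 are taken from the code above:

```r
# Hypothetical data: two rows per ID, three value columns
df <- data.frame(
  ID = c("a", "a", "b", "b"),
  y1 = c(1, 2, 3, 4),
  y2 = c(2, 4, 6, 8),
  y3 = c(3, 6, 9, 12)
)

value_cols <- c("y1", "y2", "y3")

# Stack the value columns into one long column (the role of pivot_longer).
# unlist() goes column by column, so the IDs are repeated once per column.
long <- data.frame(
  ID    = rep(df$ID, times = length(value_cols)),
  value = unlist(df[value_cols], use.names = FALSE)
)

# One standard deviation per ID (the role of group_by + summarise)
sd_by_id <- aggregate(value ~ ID, data = long, FUN = sd)
names(sd_by_id)[2] <- "sd"
print(sd_by_id)
```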

R Standard deviation across columns and rows by id

You can try:

library(dplyr)
df %>%
  group_by(id) %>%
  # cur_data() is deprecated since dplyr 1.1.0; pick(col2:col4) is the replacement there
  mutate(SD = sd(unlist(select(cur_data(), col2:col4))))

#      id  col1  col2  col3  col4 col5     SD
#   <int> <int> <int> <int> <int> <chr> <dbl>
# 1     1     4     3     5     4 A      2.12
# 2     1     3     5     4     9 Z      2.12
# 3     1     5     8     3     4 H      2.12
# 4     2     6     9     2     1 B      3.41
# 5     2     4     9     5     4 K      3.41
# 6     3     2     1     7     5 J      2.62
# 7     3     5     8     4     3 B      2.62
# 8     3     6     4     3     9 C      2.62
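To check those numbers without dplyr, here is a base R sketch of the same pooling. The data frame is rebuilt from the printed output above; splitting with split() and looking the result up by id is my own substitution for the grouped mutate, not the answer's method:

```r
# Data reconstructed from the output shown above
df <- data.frame(
  id   = c(1L, 1L, 1L, 2L, 2L, 3L, 3L, 3L),
  col1 = c(4L, 3L, 5L, 6L, 4L, 2L, 5L, 6L),
  col2 = c(3L, 5L, 8L, 9L, 9L, 1L, 8L, 4L),
  col3 = c(5L, 4L, 3L, 2L, 5L, 7L, 4L, 3L),
  col4 = c(4L, 9L, 4L, 1L, 4L, 5L, 3L, 9L),
  col5 = c("A", "Z", "H", "B", "K", "J", "B", "C")
)

# Pool every value of col2..col4 within an id and take one sd per id
sd_by_id <- sapply(
  split(df[c("col2", "col3", "col4")], df$id),
  function(block) sd(unlist(block))
)

# Broadcast the per-id value back onto each row, as mutate() does
df$SD <- unname(sd_by_id[as.character(df$id)])
print(df)
```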

How to calculate standard deviation per row?

apply lets you apply a function to all rows of your data:

apply(values_for_all, 1, sd, na.rm = TRUE)

To compute the standard deviation for each column instead, replace the 1 with 2.
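A small sketch of the difference between the two margins, using a made-up 2 x 3 matrix rather than values_for_all:

```r
# Hypothetical matrix; filled column by column,
# so the rows are (1, 3, 5) and (2, 4, 6)
m <- matrix(1:6, nrow = 2)

row_sds <- apply(m, 1, sd)  # MARGIN = 1: one value per row (length 2)
col_sds <- apply(m, 2, sd)  # MARGIN = 2: one value per column (length 3)

print(row_sds)
print(col_sds)
```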

Is there a way to calculate standard deviation for each cell across multiple data frames in R?

Assuming sampledData is a list of numeric data frames with identical dimensions, simplify it to a 3-D array, then take the sd at each row/column position across the data frames:

data.frame(apply(sapply(sampledData, as.matrix, simplify="array"), c(1,2), sd))
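To make the intermediate shapes visible, here is the same one-liner unpacked on a made-up list of three 2 x 2 data frames. The contents of sampledData are an assumption; the original data is not shown:

```r
# Hypothetical input: three data frames with identical shape
sampledData <- list(
  data.frame(a = c(1, 2), b = c(3, 4)),
  data.frame(a = c(2, 4), b = c(6, 8)),
  data.frame(a = c(3, 6), b = c(9, 12))
)

# Step 1: stack the matrices into a 2 x 2 x 3 array
arr <- sapply(sampledData, as.matrix, simplify = "array")
print(dim(arr))

# Step 2: for each (row, column) cell, take sd over the third dimension
cell_sd <- data.frame(apply(arr, c(1, 2), sd))
print(cell_sd)
```

Each cell of cell_sd is the standard deviation of the three values found at that position, for example sd(c(1, 2, 3)) for the first row of column a.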

