## How to calculate row mean from selected columns

Subset each row to the values that exceed that row's mean across the selected columns `w`, then take the mean of what remains.

    w <- c("01-01-2018", "02-01-2018", "03-01-2018")  ## define columns
    apply(data[, w], 1, function(x) mean(x[x > mean(x)]))
    # [1]  3.40  2.75  4.90 -0.10  1.15

Another way is to `replace` the data points that don't exceed the row means with `NA` before calling `rowMeans`. This is about **30 times faster**.

    ## restrict to the columns in `w` so any other columns are ignored
    rowMeans(replace(data[, w], data[, w] <= rowMeans(data[, w]), NA), na.rm = TRUE)
    # [1]  3.40  2.75  4.90 -0.10  1.15

*Data:*

    data <- structure(list(`01-01-2018` = c(1.2, 3.1, 0.7, -0.3, 2),
                           `02-01-2018` = c(-0.1, 2.4, 4.9, -3.3, -2.7),
                           `03-01-2018` = c(3.4, -2.6, -1.8, 0.1, 0.3)),
                      class = "data.frame", row.names = c(NA, -5L))
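As a quick check that the two approaches agree, here is a minimal sketch using the data above. The `label` column is a hypothetical addition, included only to show why restricting the calculation to the columns in `w` matters, and the names `sel`, `via_apply` and `via_replace` are illustrative.

```r
# Same values as the data above, plus a hypothetical non-numeric column.
data <- data.frame(
  `01-01-2018` = c(1.2, 3.1, 0.7, -0.3, 2),
  `02-01-2018` = c(-0.1, 2.4, 4.9, -3.3, -2.7),
  `03-01-2018` = c(3.4, -2.6, -1.8, 0.1, 0.3),
  label = letters[1:5],          # hypothetical extra column
  check.names = FALSE            # keep the date-like column names as they are
)

w <- c("01-01-2018", "02-01-2018", "03-01-2018")
sel <- as.matrix(data[, w])      # numeric matrix of the selected columns only

# Approach 1: per-row subsetting with apply()
via_apply <- apply(sel, 1, function(x) mean(x[x > mean(x)]))

# Approach 2: blank out values at or below the row mean, then rowMeans().
# The row-mean vector is recycled down each column, so row i is compared
# with the mean of row i.
via_replace <- rowMeans(replace(sel, sel <= rowMeans(sel), NA), na.rm = TRUE)

all.equal(unname(via_apply), unname(via_replace))
```

Note that a row in which every selected value is equal has no value above its mean, so both approaches return `NaN` for it.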

## R tidy row means from subset of columns

In my previous version I thought that `rowMeans` was the bottleneck, but what actually slows the calculation down is the use of `select`. It is better to stick with the `grep` family:

`df %>% mutate(A = rowMeans(.[, grepl("^A", names(.))]))`
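The same `grepl`-based column selection also works without dplyr. The following is a base R sketch rather than the answer's own code; the data frame and its column names (`A1`, `A2`, `B1`) are hypothetical.

```r
# Hypothetical data: two columns starting with "A" and one that does not.
df <- data.frame(A1 = c(1, 2), A2 = c(3, 4), B1 = c(10, 20))

# Logical vector marking the columns whose names start with "A"
a_cols <- grepl("^A", names(df))

# drop = FALSE keeps a data frame even if only one column matches,
# which rowMeans() needs (it fails on a plain vector).
df$A <- rowMeans(df[, a_cols, drop = FALSE])

df$A  # row means of A1 and A2: 2 and 3
```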

## Calculate row means on subset of columns

Create a new data.frame that takes the first column of `DF` as a column called `ID`, calculates the mean of all the other fields in each row, and puts the result in a column called `Means`:

    data.frame(ID=DF[,1], Means=rowMeans(DF[,-1]))
      ID    Means
    1  A 3.666667
    2  B 4.333333
    3  C 3.333333
    4  D 4.666667
    5  E 4.333333
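The input `DF` is not shown above, so here is a self-contained sketch with a hypothetical `DF` (the values and the column names `x`, `y`, `z` are made up and will not reproduce the output above). It also shows selecting the columns by name instead of dropping the first one by position.

```r
# Hypothetical input: an ID column followed by numeric columns.
DF <- data.frame(
  ID = c("A", "B", "C"),
  x  = c(1, 2, 3),
  y  = c(4, 6, 8),
  z  = c(7, 7, 7)
)

# Mean of every column except the first
res_all <- data.frame(ID = DF[, 1], Means = rowMeans(DF[, -1]))

# Mean of a named subset of columns only
res_xz <- data.frame(ID = DF$ID, Means = rowMeans(DF[, c("x", "z")]))

res_all$Means  # 4, 5, 6
res_xz$Means   # 4, 4.5, 5
```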

## How to calculate rowMeans of columns with similar colnames in r?

We can iterate over the unique column names, subset the matching columns from the original data frame, and take `rowMeans` of each subset.

    sapply(c("A", "B"), function(x) rowMeans(df[,colnames(df) == x]))
    #     A    B
    #[1,] 2 6.67
    #[2,] 3 7.00
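A self-contained sketch of the same idea follows. The data are hypothetical (the original `df` is not shown), and two small changes are assumptions on my part: the group names are taken from `unique(names(df))` instead of being typed out, and `drop = FALSE` guards against a name that occurs only once.

```r
# Hypothetical data frame with repeated column names.
df <- setNames(
  data.frame(c(1, 2), c(3, 4), c(5, 6), c(7, 8)),
  c("A", "A", "B", "B")
)

# One column of row means per distinct column name
group_means <- sapply(unique(names(df)), function(nm) {
  rowMeans(df[, names(df) == nm, drop = FALSE])
})

group_means[, "A"]  # 2 and 3
group_means[, "B"]  # 6 and 7
```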

## Issue with calculating row mean in data table for selected columns in R

OK, so you're doing a couple of things wrong. First, `rowMeans` can't evaluate a character vector; if you want to select columns with one, you must use `.SD` and pass the character vector to `.SDcols`. Second, you're trying to calculate a row aggregation and group at the same time, which I don't think makes much sense. Third, even if your expression didn't throw an error, you are assigning the result back to `Table`, which would destroy your original data (if you want to add a new column, use `:=` to add it by reference).

What you want to do is calculate the row means of your selected columns, which you can do like this:

    Table[, AvgGM := rowMeans(.SD), .SDcols = sel_cols_GM]
    Table[, AvgPM := rowMeans(.SD), .SDcols = sel_cols_PM]

This means: create these new columns as the row means of my subset of data (`.SD`), which refers to these columns (`.SDcols`).
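To try this end to end, here is a minimal sketch that assumes the data.table package is installed. The table contents and the column names `GM1`, `GM2`, `PM1`, `PM2` are hypothetical stand-ins for the question's data, which is not shown.

```r
library(data.table)

# Hypothetical table; only the column-name vectors mirror the answer above.
Table <- data.table(
  id  = 1:3,
  GM1 = c(1, 2, 3),
  GM2 = c(3, 4, 5),
  PM1 = c(10, 20, 30),
  PM2 = c(20, 40, 60)
)

sel_cols_GM <- c("GM1", "GM2")
sel_cols_PM <- c("PM1", "PM2")

# := adds the columns by reference, so Table keeps all its original columns.
Table[, AvgGM := rowMeans(.SD), .SDcols = sel_cols_GM]
Table[, AvgPM := rowMeans(.SD), .SDcols = sel_cols_PM]

Table$AvgGM  # 2, 3, 4
Table$AvgPM  # 15, 30, 45
```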

## calculating row means for only rows that have more than one data point in R

You could write a function that applies a mean to a row based on some condition. In your example: if there are two or more valid measurements, calculate the mean.

    a <- c(1,0,NA,1,NA,0,1,0,NA,0,NA)
    b <- c(1,0,NA,1,0,1,1,1,NA,0,1)
    c <- c(1,NA,NA,0,NA,0,1,1,1,0,0)
    mydata <- data.frame(a,b,c)

Reading functions is best done from the inside out. This one takes a vector `x` and counts how many of its values are *not* NA. When `sum` adds up the TRUE/FALSE values, it first converts them to 1 and 0, respectively. The function then tests whether more than one value (so two or more) is not NA.

    conditionalMean <- function(x) {
      if (sum(!is.na(x)) > 1) {
        mean(x, na.rm = TRUE)
      } else {
        NA
      }
    }

We apply this function to your `data.frame` row-wise, as denoted by `MARGIN = 1`. If you had a function that worked column-wise, you would use `MARGIN = 2`. You can try it out: compare `apply(mydata, MARGIN = 2, FUN = mean, na.rm = TRUE)` and `colMeans(mydata, na.rm = TRUE)`.

    apply(mydata, MARGIN = 1, FUN = conditionalMean)
     [1] 1.0000000 0.0000000        NA 0.6666667        NA 0.3333333 1.0000000
     [8] 0.6666667        NA 0.0000000 0.5000000
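For larger data, the same condition can be expressed without `apply`. This is a vectorised sketch of an alternative, not the answer's own method: it counts the valid values per row with `rowSums` and keeps the `rowMeans` result only where that count exceeds one. The names `n_valid` and `vectorised` are illustrative.

```r
a <- c(1, 0, NA, 1, NA, 0, 1, 0, NA, 0, NA)
b <- c(1, 0, NA, 1, 0, 1, 1, 1, NA, 0, 1)
c <- c(1, NA, NA, 0, NA, 0, 1, 1, 1, 0, 0)
mydata <- data.frame(a, b, c)

# Number of non-NA values in each row
n_valid <- rowSums(!is.na(mydata))

# Keep the row mean only where at least two values are present
vectorised <- ifelse(n_valid > 1, rowMeans(mydata, na.rm = TRUE), NA)

unname(vectorised)
```

Rows with fewer than two valid values come out as `NA`, matching the `apply` result above.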

## Calculate row means on specific columns

You can use `aggregate` to compute the mean of `Reading` within each `Sample`:

`aggregate(Reading~Sample,data=yourdata, mean)`
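A minimal sketch of what that call does, using hypothetical long-format data (the question's `yourdata` is not shown, so the values here are made up). The formula `Reading ~ Sample` reads as "summarise `Reading`, grouped by `Sample`".

```r
# Hypothetical long-format data: several readings per sample.
yourdata <- data.frame(
  Sample  = c("s1", "s1", "s2", "s2", "s2"),
  Reading = c(1, 3, 2, 4, 6)
)

# One row per Sample, holding the mean of its Reading values
sample_means <- aggregate(Reading ~ Sample, data = yourdata, FUN = mean)

sample_means$Reading  # 2 for s1, 4 for s2
```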

## Aggregate the mean 2 columns to become one column for each row

You can use `apply`:

Data:

    df <- data.frame(
      Time = c(-200, -1.34, 0.536),
      "1a" = c(-0.02, -0.003, 0.057),
      "1b" = c(-0.006, -0.04, 0.0235)
    )

Solution:

`df$mean <- apply(df[-1], 1, mean)`

Result:

    df
          Time    X1a     X1b     mean
    1 -200.000 -0.020 -0.0060 -0.01300
    2   -1.340 -0.003 -0.0400 -0.02150
    3    0.536  0.057  0.0235  0.04025

Alternatively, as suggested by @jay.sf, use `rowMeans`, which is faster in terms of execution:

    rowMeans(df[2:3])
    [1] -0.01300 -0.02150  0.04025
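Note that `data.frame()` renamed the columns `1a` and `1b` to `X1a` and `X1b` in the result above. If the original names should be kept, one option is `check.names = FALSE`, as in this sketch. Selecting the columns by name is a choice made here, so that a later-added column such as `mean` is never included in the average by accident.

```r
df <- data.frame(
  Time = c(-200, -1.34, 0.536),
  "1a" = c(-0.02, -0.003, 0.057),
  "1b" = c(-0.006, -0.04, 0.0235),
  check.names = FALSE   # keep "1a" and "1b" instead of "X1a" and "X1b"
)

# Select the two measurement columns by name
df$mean <- rowMeans(df[, c("1a", "1b")])

names(df)  # "Time" "1a" "1b" "mean"
```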

## Calculate row means on subset of columns selected via external rank

For the first question, you could get the mean of the first two non-NA values per row using `apply`:

`df$BestAvg = apply(df,1,function(x) mean(x[!is.na(x)][1:2]))`

In the case that the ranking of coders is actually `CoderD > CoderB > CoderC > CoderA`:

    r = c("CoderD", "CoderB", "CoderC", "CoderA")
    df$BestAvg2 = apply(df,1,function(x) mean(x[r][!is.na(x[r])][1:2]))

This returns:

       CoderA CoderB CoderC CoderD BestAvg BestAvg2
    1       2      1     NA      1     1.5      1.0
    2       1      3      3     NA     2.0      3.0
    3      NA     NA      4      5     4.5      4.5
    4       7      6      7      6     6.5      6.0
    5       3      3      4      2     3.0      2.5
    6       2      2     NA     NA     2.0      2.0
    7       2     NA      2      1     2.0      1.5
    8       5      3     NA      4     4.0      3.5
    9       7      7      6     NA     7.0      6.5
    10      1     NA      3      4     2.0      3.5
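The following sketch rebuilds the input from the table above and wraps the logic in a small helper. The helper name `mean_first_valid` is hypothetical, and using `head()` instead of `[1:2]` is a deliberate variation: with `[1:2]`, a row that has only one non-NA value yields `NA`, whereas `head()` averages whatever is available (an all-NA row gives `NaN`).

```r
# Input values taken from the table above.
df <- data.frame(
  CoderA = c(2, 1, NA, 7, 3, 2, 2, 5, 7, 1),
  CoderB = c(1, 3, NA, 6, 3, 2, NA, 3, 7, NA),
  CoderC = c(NA, 3, 4, 7, 4, NA, 2, NA, 6, 3),
  CoderD = c(1, NA, 5, 6, 2, NA, 1, 4, NA, 4)
)

# Hypothetical helper: mean of the first n non-NA values of x
mean_first_valid <- function(x, n = 2) mean(head(x[!is.na(x)], n))

coders <- c("CoderA", "CoderB", "CoderC", "CoderD")  # default column order
r      <- c("CoderD", "CoderB", "CoderC", "CoderA")  # external ranking

# Reordering the columns before apply() makes "first" follow the ranking.
df$BestAvg  <- apply(df[coders], 1, mean_first_valid)
df$BestAvg2 <- apply(df[r], 1, mean_first_valid)
```

Selecting `df[coders]` and `df[r]` explicitly also keeps the newly added `BestAvg` column out of the second calculation.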
