Dividing each cell in a data set by the column sum in R
Given this:
> d = data.frame(sample=c("a2","a3"),a=c(1,5),b=c(4,5),c=c(6,4))
> d
sample a b c
1 a2 1 4 6
2 a3 5 5 4
You can replace every column other than the first by applying over the rest:
> d[,-1] = apply(d[,-1],2,function(x){x/sum(x)})
> d
sample a b c
1 a2 0.1666667 0.4444444 0.6
2 a3 0.8333333 0.5555556 0.4
If you don't want d being stomped on, make a copy beforehand.
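A minimal sketch of the copy-first variant, reusing the d from above. The name d_prop is an assumption introduced here for illustration only.

```r
# Rebuild the example data frame from above
d <- data.frame(sample = c("a2", "a3"), a = c(1, 5), b = c(4, 5), c = c(6, 4))

# d_prop is a hypothetical name for the copy; d itself stays unchanged
d_prop <- d
d_prop[, -1] <- apply(d_prop[, -1], 2, function(x) x / sum(x))

# Each numeric column of the copy should now sum to 1
colSums(d_prop[, -1])
```

The original d still holds the raw values, so you can compare the two or redo the calculation later.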
Summing rows of specific columns then dividing by the sum
Specify the columns by subsetting Df, then set the margin to 1 so the function runs over rows. apply() returns each row's result as a column, so transpose the result with t():
Df_new = t(apply(Df[,c(1:3)], 1, \(x) x/sum(x)))
lose draw win
[1,] 0.5000 0.1428571 0.3571429
[2,] 0.0625 0.1250000 0.8125000
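The original Df is not shown, so the sketch below assumes a small data frame whose values reproduce the proportions printed above. It also compares the apply() approach with plain division by rowSums(), which should give the same result.

```r
# Hypothetical input chosen so the proportions match the output above
Df <- data.frame(lose = c(7, 1), draw = c(2, 2), win = c(5, 13))

# apply() over rows returns results as columns, hence the t()
by_apply <- t(apply(Df[, 1:3], 1, function(x) x / sum(x)))

# Vectorised alternative: divide the data frame by its row sums
by_rowsums <- as.matrix(Df[, 1:3] / rowSums(Df[, 1:3]))

# Both approaches should agree, and every row should sum to 1
all.equal(by_apply, by_rowsums, check.attributes = FALSE)
rowSums(by_rowsums)
```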
Dividing columns by colSums in R
See ?sweep, e.g.:
> sweep(m,2,colSums(m),`/`)
[,1] [,2] [,3]
[1,] 0.08333333 0.1333333 0.1666667
[2,] 0.33333333 0.3333333 0.3333333
[3,] 0.58333333 0.5333333 0.5000000
Or you can transpose the matrix so that colSums(m) gets recycled correctly. Don't forget to transpose again afterwards, like this:
> t(t(m)/colSums(m))
[,1] [,2] [,3]
[1,] 0.08333333 0.1333333 0.1666667
[2,] 0.33333333 0.3333333 0.3333333
[3,] 0.58333333 0.5333333 0.5000000
Or you can use the function prop.table() to do basically the same:
> prop.table(m,2)
[,1] [,2] [,3]
[1,] 0.08333333 0.1333333 0.1666667
[2,] 0.33333333 0.3333333 0.3333333
[3,] 0.58333333 0.5333333 0.5000000
The time differences are rather small. The sweep() function and the t() trick are the most flexible solutions; prop.table() is only for this particular case.
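The matrix m is not defined in the text above. As a rough sketch, the matrix below is an assumption that happens to reproduce the printed proportions. It checks that the three approaches agree, then shows one way sweep() extends to another margin and operator.

```r
# m is not shown in the source; this matrix is an assumption that
# reproduces the printed proportions (columns sum to 12, 15, 18)
m <- matrix(1:9, nrow = 3, byrow = TRUE)

by_sweep <- sweep(m, 2, colSums(m), `/`)
by_t     <- t(t(m) / colSums(m))
by_prop  <- prop.table(m, 2)

# All three should give the same column proportions
all.equal(by_sweep, by_t)
all.equal(by_sweep, by_prop)

# sweep() also handles other margins and operators, e.g. centring rows
centred <- sweep(m, 1, rowMeans(m), `-`)
rowSums(centred)
```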
Dividing cell with sum of every nth cell in same column in R
You can achieve your "dream dataframe" with:
library(dplyr)
df %>%
group_by(Country) %>%
mutate(across(LT5F:Y9t14T, prop.table)) %>%
ungroup
# Country LT5F LT5M LT5T Y9t14F Y9t14M Y9t14T
# <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
# 1 AL 0.4 0.357 0.375 0.333 0.0909 0.2
# 2 AL 0.2 0.214 0.208 0.222 0.455 0.35
# 3 AL 0.1 0.286 0.208 0.111 0.273 0.2
# 4 AL 0.3 0.143 0.208 0.333 0.182 0.25
# 5 FR 0.25 0.2 0.222 0.263 0.25 0.257
# 6 FR 0.125 0.1 0.111 0.158 0.375 0.257
# 7 FR 0.5 0 0.222 0.368 0.0625 0.229
# 8 FR 0.125 0.7 0.444 0.211 0.312 0.257
# 9 UK 0.286 0.5 0.385 0.231 0.214 0.222
#10 UK 0.143 0.333 0.231 0.231 0.286 0.259
#11 UK 0.286 0.167 0.231 0.154 0.286 0.222
#12 UK 0.286 0 0.154 0.385 0.214 0.296
If you have NAs, you can use:
library(dplyr)
df %>%
group_by(Country) %>%
mutate(across(LT5F:Y9t14T, ~./sum(., na.rm = TRUE))) %>%
ungroup
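For comparison, here is a base R sketch of the same group-wise division using ave() instead of dplyr. The data frame toy and its values are made up for illustration; only the column names echo the example above.

```r
# Hypothetical data: two countries, one count column with a missing value
toy <- data.frame(Country = c("AL", "AL", "AL", "FR", "FR"),
                  LT5F    = c(4, NA, 6, 1, 3))

# Divide each value by its group's sum, ignoring NAs in the sum
toy$share <- ave(toy$LT5F, toy$Country,
                 FUN = function(v) v / sum(v, na.rm = TRUE))
toy
```

The NA cell stays NA in the result, while the other cells in its group are divided by the sum of the non-missing values.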
Divide each cell of large matrix by sum of its row
You could do this using apply, but scale in this case makes things even simpler. Assuming you want to divide columns by their sums:
set.seed(0)
relative_abundance <- matrix(sample(1:10, 360*375, TRUE), nrow= 375)
freqs <- scale(relative_abundance, center = FALSE,
scale = colSums(relative_abundance))
The matrix is too big to output here, but here's how it should look:
> head(freqs[, 1:5])
[,1] [,2] [,3] [,4] [,5]
[1,] 0.004409603 0.0014231499 0.003439803 0.004052685 0.0024026910
[2,] 0.001469868 0.0023719165 0.002457002 0.005065856 0.0004805382
[3,] 0.001959824 0.0018975332 0.004914005 0.001519757 0.0043248438
[4,] 0.002939735 0.0042694497 0.002948403 0.002532928 0.0009610764
[5,] 0.004899559 0.0009487666 0.000982801 0.001519757 0.0028832292
[6,] 0.001469868 0.0023719165 0.002457002 0.002026342 0.0009610764
And a sanity check:
> head(colSums(freqs))
[1] 1 1 1 1 1 1
Using apply:
freqs2 <- apply(relative_abundance, 2, function(i) i/sum(i))
This has the advantage of being easily changed to run by rows, but the results will be joined as columns anyway, so you'd have to transpose the result.
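A sketch of the row-wise version, on a smaller made-up matrix (counts is a hypothetical name). It also shows a shortcut that relies on R recycling a vector of length nrow down each column.

```r
set.seed(0)
# Smaller hypothetical matrix than the one above, just for illustration
counts <- matrix(sample(1:10, 4 * 5, TRUE), nrow = 4)

# apply() by rows returns each row's result as a column, so transpose back
row_freqs <- t(apply(counts, 1, function(i) i / sum(i)))

# Equivalent shortcut: a length-nrow vector recycles down each column
row_freqs2 <- counts / rowSums(counts)

# The two results should match, and every row should sum to 1
all.equal(row_freqs, row_freqs2)
rowSums(row_freqs)
```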
Column sum in mutate function in R
You are very close.
Is this what you want?
library(dplyr)
# mutate() keeps one row per input row; summarise() is meant for one row per group
head(iris[1:4]) %>% mutate(across(.cols = c(1:4), .fns = function(x) {x/sum(x)}))
Output:
Sepal.Length Sepal.Width Petal.Length Petal.Width
1 0.1717172 0.1724138 0.1609195 0.1428571
2 0.1649832 0.1477833 0.1609195 0.1428571
3 0.1582492 0.1576355 0.1494253 0.1428571
4 0.1548822 0.1527094 0.1724138 0.1428571
5 0.1683502 0.1773399 0.1609195 0.1428571
6 0.1818182 0.1921182 0.1954023 0.2857143
How do I Divide Values in a Column by the Value in the Last Cell?
You can try
df$CFn <- with(df,CumFreq/sum(Freq))
or
df$CFn <- with(df,CumFreq/tail(CumFreq,1))
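The two lines above give the same result only if CumFreq is the running total of Freq. The sketch below makes that assumption explicit with a made-up frequency table.

```r
# Hypothetical frequency table; CumFreq is the running total of Freq
df <- data.frame(Freq = c(2, 3, 5))
df$CumFreq <- cumsum(df$Freq)

# The last cumulative value equals the total, so both forms agree
by_sum  <- with(df, CumFreq / sum(Freq))
by_tail <- with(df, CumFreq / tail(CumFreq, 1))

all.equal(by_sum, by_tail)
by_tail
```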
Divide row value by aggregated sum in R data.frame
There are various ways of solving this, here's one
with(dat, ave(y, x, FUN = function(x) x/sum(x)))
## [1] 0.3750000 0.6666667 0.4444444 0.5555556 0.3333333 0.6250000
Here's another possibility
library(data.table)
setDT(dat)[, z := y/sum(y), by = x]
dat
# x y z
# 1: 1 3 0.3750000
# 2: 2 4 0.6666667
# 3: 3 4 0.4444444
# 4: 3 5 0.5555556
# 5: 2 2 0.3333333
# 6: 1 5 0.6250000
Here's a third one
library(dplyr)
dat %>%
group_by(x) %>%
mutate(z = y/sum(y))
# Source: local data frame [6 x 3]
# Groups: x
#
# x y z
# 1 1 3 0.3750000
# 2 2 4 0.6666667
# 3 3 4 0.4444444
# 4 3 5 0.5555556
# 5 2 2 0.3333333
# 6 1 5 0.6250000