R: How to rescale my matrix by column
Something like
commercialMedExp0A <- t(apply(states0CommercialA, 1, function(x){ x * commercialMedExp}))
should work so long as the number of columns in states0CommercialA is the same as the length of commercialMedExp. If it is not, you would have to subset the data. For example, if the disease states are in columns 13 through 18:
commercialMedExp0A <- t(apply(states0CommercialA[,c(13:18)], 1, function(x){ x * commercialMedExp}))
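As a possible alternative to the transpose-and-apply pattern, base R's sweep can multiply each column by the matching element of a vector. The sketch below uses a small made-up matrix m and vector w (hypothetical stand-ins for states0CommercialA and commercialMedExp) and checks that both approaches give the same result.

```r
# Hypothetical stand-ins for states0CommercialA and commercialMedExp
m <- matrix(1:6, nrow = 2)          # 2 rows, 3 columns
w <- c(10, 100, 1000)               # one multiplier per column

# The condition the answer relies on: one multiplier per column
stopifnot(ncol(m) == length(w))

by_apply <- t(apply(m, 1, function(x) x * w))  # pattern from the answer
by_sweep <- sweep(m, 2, w, "*")                # multiply column j by w[j]

all.equal(by_apply, by_sweep)
```

sweep avoids the transpose and states the margin explicitly, which may make the intent easier to read.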
Scaling a numeric matrix in R with values 0 to 1
Try the following, which seems simple enough:
## Data to make a minimal reproducible example
m <- matrix(rnorm(9), ncol=3)
## Rescale each column to range between 0 and 1
apply(m, MARGIN = 2, FUN = function(X) (X - min(X))/diff(range(X)))
# [,1] [,2] [,3]
# [1,] 0.0000000 0.0000000 0.5220198
# [2,] 0.6239273 1.0000000 0.0000000
# [3,] 1.0000000 0.9253893 1.0000000
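One caveat with the one-liner above: a constant column has a range of zero, so the division yields NaN. The following is a minimal sketch of a guarded version. The helper name rescale01 and the choice to map constant columns to 0 are assumptions made for illustration, not part of the original answer.

```r
# Hypothetical helper: min-max rescale one vector to [0, 1]
rescale01 <- function(x) {
  rng <- range(x, na.rm = TRUE)
  if (rng[1] == rng[2]) {
    # Constant column: the range is zero, so return 0 instead of NaN
    # (an assumed convention; NA entries stay NA)
    return(ifelse(is.na(x), NA_real_, 0))
  }
  (x - rng[1]) / (rng[2] - rng[1])
}

# Made-up matrix with a constant column and a missing value
m2 <- cbind(a = c(1, 2, 3), b = c(5, 5, 5), c = c(NA, 0, 10))
res <- apply(m2, 2, rescale01)
res
```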
How to scale a matrix by group?
We can use data.table. Convert the 'data.frame' to a 'data.table' with setDT(my.df), group by 'sex', select the columns of interest in .SDcols, loop through those columns with lapply(.SD, ...), apply scale to each, and convert the result to a vector. (The scale function outputs a matrix with some attributes, which will create problems if we don't convert it to a vector.)
library(data.table)
setDT(my.df)[, c('x', 'y', 'z') := lapply(.SD, function(x)
as.vector(scale(x))) , by = sex, .SDcols= x:z]
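To see why the as.vector step matters, the sketch below inspects what scale returns for a plain vector, then does the same grouped scaling in base R with ave. This is a swapped-in base R technique shown for comparison, not the data.table method above, and the vectors x and sex are made-up data.

```r
# Made-up data: one numeric variable and a grouping variable
x   <- c(1, 2, 3, 10, 20, 30)
sex <- c("F", "F", "F", "M", "M", "M")

s <- scale(x)
is.matrix(s)            # scale() returns a one-column matrix, not a vector
names(attributes(s))    # includes "dim", "scaled:center" and "scaled:scale"

# Base R sketch of grouped scaling: ave() applies FUN within each group
z <- ave(x, sex, FUN = function(v) as.vector(scale(v)))
z
```

Within each group the values end up with mean 0 and standard deviation 1, which is the same result the data.table call produces per column.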
Scaling a Single Column Vector of a Matrix in R
You can assign the result back to the column that is being changed:
A[,1] <- A[,1]*10
A
# [,1] [,2]
#[1,] 10 2
#[2,] 30 4
data
A <- matrix(c(1,3,2,4), ncol=2)
How can I rescale every column in my data frame to a 0-100 scale? (in R)
Using scale, if dat is the name of your data frame:
## for one column
dat$a <- scale(dat$a, center = FALSE, scale = max(dat$a, na.rm = TRUE)/100)
## for every column of your data frame
dat <- data.frame(lapply(dat, function(x) scale(x, center = FALSE, scale = max(x, na.rm = TRUE)/100)))
For a simple case like this, you could also write your own function.
fn <- function(x) x * 100/max(x, na.rm = TRUE)
fn(c(0,1,0))
# [1] 0 100 0
## to one column
dat$a <- fn(dat$a)
## to all columns of your data frame
dat <- data.frame(lapply(dat, fn))
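Note that fn maps the column maximum to 100 but maps the minimum to 0 only when the minimum is already 0. If a strict 0-100 range is wanted, a min-max variant is one option. The sketch below compares the two on a made-up data frame; the names toy and fn_minmax are hypothetical.

```r
# Made-up data frame for illustration
toy <- data.frame(a = c(2, 4, 6), b = c(10, 20, 50))

# From the answer: the maximum becomes 100, other values scale proportionally
fn <- function(x) x * 100 / max(x, na.rm = TRUE)

# Assumed variant: the minimum becomes 0 and the maximum becomes 100
fn_minmax <- function(x) {
  (x - min(x, na.rm = TRUE)) / diff(range(x, na.rm = TRUE)) * 100
}

by_max    <- sapply(toy, fn)
by_minmax <- sapply(toy, fn_minmax)
by_max
by_minmax
```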
Normalizing columns of matrix between -1 and 1
How about rescaling the matrix x at the end of your own function?
normalize <- function(x) {
  x <- sweep(x, 2, apply(x, 2, min))
  x <- sweep(x, 2, apply(x, 2, max), "/")
  2*x - 1
}
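A quick way to check the function is to apply it to a random matrix and look at the per-column range, which should be -1 and 1 for every column. The sketch below repeats the function so that it runs on its own; the test matrix is made up.

```r
normalize <- function(x) {
  x <- sweep(x, 2, apply(x, 2, min))        # subtract each column's minimum
  x <- sweep(x, 2, apply(x, 2, max), "/")   # divide by the shifted column maximum
  2 * x - 1                                 # map [0, 1] onto [-1, 1]
}

set.seed(1)
m <- matrix(rnorm(12), ncol = 3)   # made-up 4 x 3 matrix

col_ranges <- apply(normalize(m), 2, range)   # row 1: minima, row 2: maxima
col_ranges
```

As with any min-max scaling, a constant column would produce a division by zero here.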
Column rescaling for a very large sparse matrix in R
This is what we can do, assuming A is a dgCMatrix:
A@x <- A@x / rep.int(colSums(A), diff(A@p))
This requires some understanding of the dgCMatrix class.
@x stores the non-zero matrix values in a packed 1D array;
@p stores the cumulative number of non-zero elements by column, hence diff(A@p) gives the number of non-zero elements in each column.
We repeat each element of colSums(A) by the number of non-zero elements in that column, then divide A@x by this vector. In the end, A@x holds the rescaled values. In this way, column rescaling is done in a sparse manner.
Example:
library(Matrix)
set.seed(2); A <- Matrix(rbinom(100,10,0.05), nrow = 10)
#10 x 10 sparse Matrix of class "dgCMatrix"
# [1,] . . 1 . 2 . 1 . . 2
# [2,] 1 . . . . . 1 . 1 .
# [3,] . 1 1 1 . 1 1 . . .
# [4,] . . . 1 . 2 . . . .
# [5,] 2 . . . 2 . 1 . . .
# [6,] 2 1 . 1 1 1 . 1 1 .
# [7,] . 2 . 1 2 1 . . 2 .
# [8,] 1 . . . . 3 . 1 . .
# [9,] . . 2 1 . 1 . . 1 .
#[10,] . . . . 1 1 . . . .
diff(A@p) ## number of non-zeros per column
# [1] 4 3 3 5 5 7 4 2 4 1
colSums(A) ## column sums
# [1] 6 4 4 5 8 10 4 2 5 2
A@x <- A@x / rep.int(colSums(A), diff(A@p)) ## sparse column rescaling
#10 x 10 sparse Matrix of class "dgCMatrix"
# [1,] . . 0.25 . 0.250 . 0.25 . . 1
# [2,] 0.1666667 . . . . . 0.25 . 0.2 .
# [3,] . 0.25 0.25 0.2 . 0.1 0.25 . . .
# [4,] . . . 0.2 . 0.2 . . . .
# [5,] 0.3333333 . . . 0.250 . 0.25 . . .
# [6,] 0.3333333 0.25 . 0.2 0.125 0.1 . 0.5 0.2 .
# [7,] . 0.50 . 0.2 0.250 0.1 . . 0.4 .
# [8,] 0.1666667 . . . . 0.3 . 0.5 . .
# [9,] . . 0.50 0.2 . 0.1 . . 0.2 .
#[10,] . . . . 0.125 0.1 . . . .
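To sanity-check the slot-based rescaling, it can be compared against a dense computation on a matrix small enough to convert. The sketch below does that on a made-up 3 x 3 sparse matrix with no all-zero columns, and also tries multiplication by a Diagonal matrix as one more possible route (a swapped-in alternative, not something the answer proposes).

```r
library(Matrix)

# Made-up sparse matrix (dgCMatrix) with no all-zero columns
A <- sparseMatrix(i = c(1, 3, 2, 3), j = c(1, 1, 2, 3),
                  x = c(1, 3, 2, 5), dims = c(3, 3))

# Dense reference: divide each column by its sum
dense_expected <- sweep(as.matrix(A), 2, colSums(A), "/")

# Slot-based rescaling from the answer, applied to a copy
B <- A
B@x <- B@x / rep.int(colSums(B), diff(B@p))

# Alternative sketch: right-multiply by a diagonal matrix of reciprocals
C <- A %*% Diagonal(x = 1 / colSums(A))

all.equal(as.matrix(B), dense_expected)
all.equal(as.matrix(C), dense_expected)
colSums(B)   # every column should now sum to 1
```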
@thelatemail mentioned another method, by first converting the dgCMatrix to a dgTMatrix:
AA <- as(A, "dgTMatrix")
A@x <- A@x / colSums(A)[AA@j + 1L]
For the dgTMatrix class there is no @p but @j, giving the column index (0-based) of the non-zero matrix elements.
Standardize data columns in R
I have to assume you meant to say that you wanted a mean of 0 and a standard deviation of 1. If your data is in a data frame and all the columns are numeric, you can simply call the scale function on the data to do what you want.
dat <- data.frame(x = rnorm(10, 30, .2), y = runif(10, 3, 5))
scaled.dat <- scale(dat)
# check that we get mean of 0 and sd of 1
colMeans(scaled.dat) # faster version of apply(scaled.dat, 2, mean)
apply(scaled.dat, 2, sd)
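The matrix returned by scale also carries the centering and scaling values as attributes, which can be useful for undoing the transformation or applying the same one to new data. The sketch below reads them back and restores the original values; the seed and the name restored are assumptions for illustration.

```r
set.seed(42)
dat <- data.frame(x = rnorm(10, 30, .2), y = runif(10, 3, 5))
scaled.dat <- scale(dat)

ctr <- attr(scaled.dat, "scaled:center")   # column means used for centering
scl <- attr(scaled.dat, "scaled:scale")    # column standard deviations used for scaling

# Undo the standardization: multiply by the sd, then add the mean back
restored <- sweep(sweep(scaled.dat, 2, scl, "*"), 2, ctr, "+")

all.equal(as.matrix(dat), restored, check.attributes = FALSE)
```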
Using built-in functions is classy.