R: How to Rescale My Matrix by Column

Something like

commercialMedExp0A <- t(apply(states0CommercialA, 1, function(x){ x * commercialMedExp}))

should work so long as the number of columns in states0CommercialA equals the length of commercialMedExp. If it does not, you would have to subset the data first. For example, if the disease states are in columns 13 through 18:

    commercialMedExp0A <- t(apply(states0CommercialA[,c(13:18)], 1, function(x){ x * commercialMedExp}))
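The row-wise apply() above can also be written with sweep() or with plain recycling. The following is a minimal sketch on made-up data; `states` and `costs` are hypothetical stand-ins for states0CommercialA and commercialMedExp, not the asker's actual objects.

```r
## Hypothetical stand-in data: 4 rows, 3 columns, one multiplier per column
states <- matrix(1:12, nrow = 4)
costs  <- c(10, 100, 1000)

## The approach from the answer: multiply each row by the vector, then transpose back
via_apply <- t(apply(states, 1, function(x) x * costs))

## sweep() applies the vector across columns (MARGIN = 2) directly
via_sweep <- sweep(states, 2, costs, "*")

## Recycling runs down columns, so transpose, multiply, transpose back
via_recycle <- t(t(states) * costs)

## All three give the same matrix
all.equal(via_apply, via_sweep)
all.equal(via_apply, via_recycle)
```

sweep() states the intent (one factor per column) most directly, and it checks that the length of the vector is compatible with the margin.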

Scaling a numeric matrix in R with values 0 to 1

Try the following, which seems simple enough:

## Data to make a minimal reproducible example
m <- matrix(rnorm(9), ncol=3)

## Rescale each column to range between 0 and 1
apply(m, MARGIN = 2, FUN = function(X) (X - min(X))/diff(range(X)))
#           [,1]      [,2]      [,3]
# [1,] 0.0000000 0.0000000 0.5220198
# [2,] 0.6239273 1.0000000 0.0000000
# [3,] 1.0000000 0.9253893 1.0000000
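The one-liner above divides by zero when a column is constant, and it returns all NA for a column that contains any NA. A hedged sketch of a slightly more defensive helper follows. The name `rescale01` and the choice to map a constant column to 0 are assumptions for illustration, not part of the original answer.

```r
## Hypothetical helper: min-max rescale one vector to [0, 1]
rescale01 <- function(x) {
  rng     <- range(x, na.rm = TRUE)   # ignore NAs when finding min and max
  span    <- rng[2] - rng[1]
  shifted <- x - rng[1]
  ## Assumption: a constant column maps to 0 instead of NaN
  if (span == 0) shifted else shifted / span
}

set.seed(1)
m <- matrix(rnorm(9), ncol = 3)
m <- cbind(m, 5)                      # add a constant column to show the guard

m01 <- apply(m, 2, rescale01)
apply(m01, 2, range)                  # first three columns span 0..1, last is 0..0
```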

How to scale a matrix by group?

We can use data.table. Convert the 'data.frame' to a 'data.table' with setDT(my.df). Then, grouped by 'sex' and with the columns of interest selected in .SDcols, loop through the columns with lapply(.SD, ...), apply scale to each, and convert the result to a vector. (The scale function returns a matrix with some attributes, which causes problems if we don't convert it to a vector.)

library(data.table)
setDT(my.df)[, c('x', 'y', 'z') := lapply(.SD, function(x) as.vector(scale(x))),
             by = sex, .SDcols = x:z]
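For comparison, the same per-group scaling can be sketched in base R with ave(). The data frame below is made up for illustration; only the column names sex, x, y and z are taken from the answer above.

```r
## Hypothetical data with the column names used in the answer
my.df <- data.frame(
  sex = rep(c("F", "M"), each = 4),
  x   = c(1, 2, 3, 4, 10, 20, 30, 40),
  y   = c(5, 7, 9, 11, 2, 4, 6, 8),
  z   = c(3, 1, 4, 1, 5, 9, 2, 6)
)

cols <- c("x", "y", "z")

## ave() applies FUN within each level of sex and returns values in the original row order
my.df[cols] <- lapply(my.df[cols], function(col)
  ave(col, my.df$sex, FUN = function(v) as.vector(scale(v))))

## Check: within each group, every column now has mean 0 and sd 1
aggregate(my.df[cols], by = my.df["sex"], FUN = mean)
aggregate(my.df[cols], by = my.df["sex"], FUN = sd)
```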

Scaling a Single Column Vector of a Matrix in R

You can assign the result back to the column that is being changed:

A[,1] <- A[,1]*10
A
#      [,1] [,2]
# [1,]   10    2
# [2,]   30    4

data

A <- matrix(c(1,3,2,4), ncol=2)

How can I rescale every column in my data frame to a 0-100 scale? (in R)

Using scale, if dat is the name of your data frame:

## for one column
dat$a <- scale(dat$a, center = FALSE, scale = max(dat$a, na.rm = TRUE)/100)
## for every column of your data frame
dat <- data.frame(lapply(dat, function(x) scale(x, center = FALSE, scale = max(x, na.rm = TRUE)/100)))

For a simple case like this, you could also write your own function.

fn <- function(x) x * 100/max(x, na.rm = TRUE)
fn(c(0,1,0))
# [1] 0 100 0
## to one column
dat$a <- fn(dat$a)
## to all columns of your data frame
dat <- data.frame(lapply(dat, fn))
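Dividing by the maximum only puts the smallest value at 0 when the column already starts at 0. If the columns can start elsewhere (or contain negative values), a min-max variant may be closer to what is wanted. This is a sketch; `to_0_100` and the example data are hypothetical names, not from the original answer.

```r
## Hypothetical helper: map min -> 0 and max -> 100
to_0_100 <- function(x) {
  rng <- range(x, na.rm = TRUE)
  100 * (x - rng[1]) / (rng[2] - rng[1])
}

## Made-up data: column a starts at 5, column b contains negatives
dat <- data.frame(a = c(5, 10, 15), b = c(-2, 0, 2))

## dat[] <- keeps the data frame structure and column names
dat[] <- lapply(dat, to_0_100)
sapply(dat, range)   # each column now runs from 0 to 100
```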

Normalizing columns of matrix between -1 and 1

How about rescaling the matrix x at the end of your own function?

normalize <- function(x) { 
x <- sweep(x, 2, apply(x, 2, min))
x <- sweep(x, 2, apply(x, 2, max), "/")
2*x - 1
}
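A quick way to convince yourself that this works is to run it on random data and inspect the per-column range. The sketch below repeats the function so that it is self-contained; the test matrix is made up.

```r
normalize <- function(x) {
  x <- sweep(x, 2, apply(x, 2, min))       # subtract each column's minimum
  x <- sweep(x, 2, apply(x, 2, max), "/")  # divide by the (shifted) maximum, i.e. the range
  2 * x - 1                                # map [0, 1] onto [-1, 1]
}

set.seed(42)
x <- matrix(rnorm(12), ncol = 3)

x_norm <- normalize(x)
apply(x_norm, 2, range)   # every column has minimum -1 and maximum 1
```

Note that the second sweep() divides by the maximum of the already shifted columns, which equals max minus min of the original data.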

Column rescaling for a very large sparse matrix in R

This is what we can do, assuming A is a dgCMatrix:

A@x <- A@x / rep.int(colSums(A), diff(A@p))

This requires some understanding of dgCMatrix class.

  1. @x stores the non-zero matrix values in a packed 1D array;
  2. @p stores the cumulative number of non-zero elements by column, hence diff(A@p) gives the number of non-zero elements for each column.

We repeat each element of colSums(A) by the number of non-zero elements in that column, then divide A@x by this vector. In the end, A@x is overwritten with the rescaled values, so the column rescaling is done without ever leaving the sparse representation.


Example:

library(Matrix)
set.seed(2); A <- Matrix(rbinom(100,10,0.05), nrow = 10)

#10 x 10 sparse Matrix of class "dgCMatrix"

# [1,] . . 1 . 2 . 1 . . 2
# [2,] 1 . . . . . 1 . 1 .
# [3,] . 1 1 1 . 1 1 . . .
# [4,] . . . 1 . 2 . . . .
# [5,] 2 . . . 2 . 1 . . .
# [6,] 2 1 . 1 1 1 . 1 1 .
# [7,] . 2 . 1 2 1 . . 2 .
# [8,] 1 . . . . 3 . 1 . .
# [9,] . . 2 1 . 1 . . 1 .
#[10,] . . . . 1 1 . . . .

diff(A@p) ## number of non-zeros per column
# [1] 4 3 3 5 5 7 4 2 4 1

colSums(A) ## column sums
# [1] 6 4 4 5 8 10 4 2 5 2

A@x <- A@x / rep.int(colSums(A), diff(A@p)) ## sparse column rescaling

#10 x 10 sparse Matrix of class "dgCMatrix"

# [1,] . . 0.25 . 0.250 . 0.25 . . 1
# [2,] 0.1666667 . . . . . 0.25 . 0.2 .
# [3,] . 0.25 0.25 0.2 . 0.1 0.25 . . .
# [4,] . . . 0.2 . 0.2 . . . .
# [5,] 0.3333333 . . . 0.250 . 0.25 . . .
# [6,] 0.3333333 0.25 . 0.2 0.125 0.1 . 0.5 0.2 .
# [7,] . 0.50 . 0.2 0.250 0.1 . . 0.4 .
# [8,] 0.1666667 . . . . 0.3 . 0.5 . .
# [9,] . . 0.50 0.2 . 0.1 . . 0.2 .
#[10,] . . . . 0.125 0.1 . . . .

@thelatemail mentioned another method, by first converting dgCMatrix to dgTMatrix:

AA <- as(A, "dgTMatrix")
A@x <- A@x / colSums(A)[AA@j + 1L]

The dgTMatrix class has no @p slot but a @j slot instead, which gives the column index (0-based) of each non-zero matrix element.
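To check the slot-level update against something easier to reason about, one can compare it with a dense sweep() and with multiplication by a diagonal matrix. This is a sketch on a tiny made-up matrix; it assumes the Matrix package is installed and that no column sums to zero.

```r
library(Matrix)

## Small made-up sparse matrix (sparseMatrix() returns a dgCMatrix here)
A <- sparseMatrix(i = c(1, 3, 2, 3), j = c(1, 1, 2, 3),
                  x = c(1, 3, 2, 5), dims = c(3, 3))

## Dense reference: divide each column by its sum
dense_expected <- sweep(as.matrix(A), 2, colSums(A), "/")

## Slot-level update from the answer, done on a copy
B <- A
B@x <- B@x / rep.int(colSums(B), diff(B@p))

## Alternative: right-multiply by a sparse diagonal matrix of reciprocals
C <- A %*% Diagonal(x = 1 / colSums(A))

all.equal(as.matrix(B), dense_expected, check.attributes = FALSE)
all.equal(as.matrix(C), dense_expected, check.attributes = FALSE)
```

The diagonal-matrix form avoids touching slots directly, at the cost of one sparse matrix product.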

Standardize data columns in R

I have to assume you meant to say that you wanted a mean of 0 and a standard deviation of 1. If your data is in a dataframe and all the columns are numeric you can simply call the scale function on the data to do what you want.

dat <- data.frame(x = rnorm(10, 30, .2), y = runif(10, 3, 5))
scaled.dat <- scale(dat)

# check that we get mean of 0 and sd of 1
colMeans(scaled.dat) # faster version of apply(scaled.dat, 2, mean)
apply(scaled.dat, 2, sd)
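Note that scale() returns a matrix, not a data frame, and it keeps the original means and standard deviations as attributes. A short sketch of how to get a data frame back and how to undo the scaling follows; the variable names `scaled`, `scaled.df` and `restored` are illustrative.

```r
set.seed(123)
dat <- data.frame(x = rnorm(10, 30, .2), y = runif(10, 3, 5))

scaled <- scale(dat)

## The centering and scaling values used are stored as attributes
centers <- attr(scaled, "scaled:center")   # original column means
scales  <- attr(scaled, "scaled:scale")    # original column standard deviations

## Convert back to a data frame if a matrix is not what you want
scaled.df <- as.data.frame(scaled)

## Undo the standardization: multiply by the sd, then add the mean
restored <- sweep(sweep(scaled, 2, scales, "*"), 2, centers, "+")
all.equal(restored, as.matrix(dat), check.attributes = FALSE)
```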

Using built-in functions is classy.


