Scale Only Certain Columns R

Scale only certain columns R

We can do this with lapply. Subset the columns of interest, loop through them with lapply, and assign the output back to the same subset of the data. We wrap the call in c because the output of scale is a matrix with a single column; c (or as.vector) converts it to a plain vector.

df[c(3, 6)] <- lapply(df[c(3, 6)], function(x) c(scale(x)))
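To see the effect on a concrete case, here is a minimal sketch on a made-up data frame. The column names and values are invented for illustration and are not from the question.

```r
# Toy data: hypothetical columns, invented for illustration
df <- data.frame(
  id  = 1:5,
  grp = c("a", "b", "a", "b", "a"),
  v3  = c(10, 20, 30, 40, 50),
  v4  = c(1, 1, 2, 3, 5),
  v5  = c(0.5, 0.1, 0.9, 0.3, 0.7),
  v6  = c(100, 300, 200, 500, 400)
)

cols <- c(3, 6)
# c() drops the one-column matrix returned by scale() to a plain vector
df[cols] <- lapply(df[cols], function(x) c(scale(x)))

# The scaled columns should have mean ~0 and sd 1; the others are unchanged
sapply(df[cols], mean)
sapply(df[cols], sd)
```

Selecting by name instead of position (for example cols <- c("v3", "v6")) works the same way and is less fragile if the column order changes.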

Another option is mutate_at from dplyr:

library(dplyr)
# funs() is deprecated since dplyr 0.8.0; a formula lambda does the same job
df %>%
   mutate_at(c(3, 6), ~ c(scale(.)))

Scaling data in R ignoring specific columns

You can do a partial assignment, which scales every column except the first:

trouble[, -c(1)] <- scale(trouble[, -c(1)])
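A small self-contained sketch of the same idea follows. The data frame here is hypothetical (the original trouble data is not shown), with an identifier in the first column that should stay untouched.

```r
# Hypothetical data: first column is an identifier to leave untouched
trouble <- data.frame(
  id = c("a", "b", "c", "d"),
  x  = c(1, 2, 3, 4),
  y  = c(10, 30, 20, 40)
)

# scale() returns a matrix; assigning it into the column subset
# keeps trouble as a data frame
trouble[, -1] <- scale(trouble[, -1])

str(trouble)
colMeans(trouble[, -1])
```

Note that this relies on every remaining column being numeric; scale() stops with an error if the subset contains a character column.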

selective scaling function in r using a different data frame to scale

One way with base R. Comments in the code. Thanks, Nelson, for the data +1

df <- read.table(text="color weight height length estimate
    1    red     10     66     40        5
    2    red     12     60     41        7
    3 yellow     12     67     48        9
    4   blue     15     55     36       10
    5 yellow     21     54     48        7
    6    red     12     54     43        5
    7    red     11     38     36        6", header = TRUE)

scale_df <- read.table(text=" color weight height length estimate
    1    red     11     55     41        7
    2    red     13     67     39        9
    3 yellow     12     67     46       11
    4   blue     16      8     37        5
    5 yellow     23     10     47        9
    6    red     17     11     41       10
    7    red     16     13     37       13", header = TRUE)

## add reference and scaling df as arguments
scale2sd <- function(ref, scale_by, variable) {
  ((ref[[variable]]) - mean(scale_by[[variable]], na.rm = TRUE)) / (2 * sd(scale_by[[variable]], na.rm = TRUE))
}
predictors <- c("color", "weight", "height", "length")
## this is to get all numeric columns that are part of your predictor variables
df_to_scale <- Filter(is.numeric, df[predictors])
## create a named vector. This is a bit awkward but it makes it easier to select
## the corresponding items in the two data frames, 
## and then replace the original columns 
num_vars <- setNames(names(df_to_scale), names(df_to_scale))                      

## this is the actual scaling job - 
## use the named vector for looping over the selected columns 
## then assign it back to the selected columns
df[num_vars] <- lapply(num_vars, function(x) scale2sd(df, scale_df, x))

df
#>    color      weight     height      length estimate
#> 1    red -0.67259271 0.58130793 -0.14222363        5
#> 2    red -0.42479540 0.47561558 -0.01777795        7
#> 3 yellow -0.42479540 0.59892332  0.85334176        9
#> 4   blue -0.05309942 0.38753862 -0.64000632       10
#> 5 yellow  0.69029252 0.36992323  0.85334176        7
#> 6    red -0.42479540 0.36992323  0.23111339        5
#> 7    red -0.54869405 0.08807696 -0.64000632        6
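As a cross-check, the same numbers can be obtained from scale alone, since its center and scale arguments also accept numeric vectors with one entry per column. The sketch below re-creates only the numeric predictor columns of df and scale_df so that it runs on its own; the names df_num and ref_num are made up for this example.

```r
# Numeric predictor columns copied from df and scale_df above
df_num <- data.frame(
  weight = c(10, 12, 12, 15, 21, 12, 11),
  height = c(66, 60, 67, 55, 54, 54, 38),
  length = c(40, 41, 48, 36, 48, 43, 36)
)
ref_num <- data.frame(
  weight = c(11, 13, 12, 16, 23, 17, 16),
  height = c(55, 67, 67, 8, 10, 11, 13),
  length = c(41, 39, 46, 37, 47, 41, 37)
)

# Center by the reference means, divide by twice the reference sds
scaled <- scale(
  df_num,
  center = colMeans(ref_num),
  scale  = 2 * sapply(ref_num, sd)
)

round(scaled, 4)
```

The result is a matrix, so it would still need to be assigned back into the data frame columns if the original layout (including color and estimate) is to be kept.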

Standardize data columns in R

I have to assume you meant that you want a mean of 0 and a standard deviation of 1. If your data is in a data frame and all the columns are numeric, you can simply call the scale function on the data to do what you want.

dat <- data.frame(x = rnorm(10, 30, .2), y = runif(10, 3, 5))
scaled.dat <- scale(dat)

# check that we get mean of 0 and sd of 1
colMeans(scaled.dat)  # faster version of apply(scaled.dat, 2, mean)
apply(scaled.dat, 2, sd)
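If not all columns are numeric, one possible variation is to pick out the numeric ones first and scale only those. The data frame below is hypothetical and only serves to illustrate that case.

```r
# Hypothetical mixed data frame: one character column, two numeric ones
dat <- data.frame(
  id = c("a", "b", "c", "d", "e"),
  x  = c(29.8, 30.1, 30.0, 30.3, 29.9),
  y  = c(3.2, 4.8, 4.1, 3.6, 4.4)
)

# Logical index of the numeric columns
is_num <- vapply(dat, is.numeric, logical(1))

# Scale just those columns; the character column is left as-is
dat[is_num] <- scale(dat[is_num])

colMeans(dat[is_num])
apply(dat[is_num], 2, sd)
```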

Using built-in functions is classy.

scale columns based on vector of column names

library(tidyverse)
set.seed(123)
dat <-   
  data.frame(year_ref = 2000:2004,
             www_val1 = sample(5),
             www_val2 = sample(5),
             www_val3 = sample(5),
             sat_val1 = sample(5),
             sat_val2 = sample(5),
             sat_val3 = sample(5),
             ds_val1 = sample(5),
             ds_val2 = sample(5),
             ds_val3 = sample(5))
var_names <- c("ds", "sat")
dat %>% 
  dplyr::mutate_at(vars(starts_with(var_names)), ~scale(., center = TRUE, scale = TRUE))
#   year_ref www_val1 www_val2 www_val3   sat_val1   sat_val2   sat_val3    ds_val1    ds_val2    ds_val3
# 1     2000        3        3        1  0.0000000 -0.6324555 -1.2649111  0.6324555  0.6324555  0.0000000
# 2     2001        5        5        3 -1.2649111  0.0000000  0.0000000 -0.6324555 -1.2649111  1.2649111
# 3     2002        2        2        2  0.6324555  0.6324555  0.6324555  0.0000000  1.2649111 -0.6324555
# 4     2003        4        4        5 -0.6324555 -1.2649111 -0.6324555  1.2649111 -0.6324555 -1.2649111
# 5     2004        1        1        4  1.2649111  1.2649111  1.2649111 -1.2649111  0.0000000  0.6324555
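For comparison, a rough base R sketch of the same selection is shown below. It assumes the prefixes contain no regular-expression metacharacters, and it uses a reduced, hypothetical version of dat so that it runs without the tidyverse.

```r
# Toy data with hypothetical values; column names follow the pattern above
dat <- data.frame(
  year_ref = 2000:2004,
  www_val1 = c(3, 5, 2, 4, 1),
  sat_val1 = c(3, 1, 4, 2, 5),
  ds_val1  = c(4, 2, 3, 5, 1)
)
var_names <- c("ds", "sat")

# Build one anchored regex such as "^(ds|sat)" from the prefixes
pattern <- paste0("^(", paste(var_names, collapse = "|"), ")")
cols <- grep(pattern, names(dat), value = TRUE)

# Scale only the matching columns, dropping the matrix wrapper with c()
dat[cols] <- lapply(dat[cols], function(x) c(scale(x)))
dat
```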

How to scale segments of a column in an R data frame?

Apply the same function (scale) by group.

In base R

df$z <- with(df, ave(x, y, FUN = scale))
df

#    x y        z
#1   1 A -1.26491
#2   2 A -0.63246
#3   3 A  0.00000
#4   4 A  0.63246
#5   5 A  1.26491
#6  20 B -1.33242
#7  22 B -0.59219
#8  24 B  0.14805
#9  25 B  0.51816
#10 27 B  1.25840
#11 12 C -0.83028
#12 13 C -0.36901
#13 12 C -0.83028
#14 15 C  0.55352
#15 17 C  1.47605
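The input data frame is not shown in the answer, so the sketch below rebuilds it from the x and y columns printed above and repeats the ave call. Wrapping scale in c() is an addition here, so that each group yields a plain vector.

```r
# Data reconstructed from the x and y columns printed above
df <- data.frame(
  x = c(1, 2, 3, 4, 5, 20, 22, 24, 25, 27, 12, 13, 12, 15, 17),
  y = rep(c("A", "B", "C"), each = 5)
)

# ave() applies FUN within each level of y and returns values in row order
df$z <- with(df, ave(x, y, FUN = function(v) c(scale(v))))

# Each group should now have mean ~0 and sd 1
tapply(df$z, df$y, mean)
tapply(df$z, df$y, sd)
```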

Using dplyr

library(dplyr)
# c() turns the one-column matrix from scale() into a plain vector
df %>% group_by(y) %>% mutate(z = c(scale(x)))

Or data.table

library(data.table)
# as.vector() drops the matrix dimensions before assigning by reference
setDT(df)[, z := as.vector(scale(x)), by = y]

Rescaled certain columns to specific mean and standard deviation in R

Please try to post a valid reprex next time. This will save others the trouble of having to manually reproduce your input data. Also, it is not immediately clear how your first code chunk referring to a df with columns v1 - v5 relates to the subsequent code chunk referring to df$mother.iq.
The help file for psych::rescale() specifically states that the input, x, should be a matrix or data frame. I suspect this is why the output you get is not what you were expecting.
While you can use psych::rescale(), a more flexible alternative may be to forgo the additional dependency on the {psych} package altogether and simply rescale the columns manually as required. The two approaches are illustrated in the reprex below:

# load libraries
library(tidyverse)

# define data as per OP
df <- data.frame(
          v1 = c(65L, 98L, 85L, 83L, 115L, 98L),
          v2 = c(1L, 1L, 1L, 1L, 1L, 0L),
          v3 = c(121.12, 89.36, 115.44, 99.45, 92.75, 107.9),
          v4 = c(4L, 4L, 4L, 3L, 4L, 1L),
          v5 = c(27L, 25L, 27L, 25L, 27L, 18L)
)

# rescale via psych::rescale using entire data frame
df %>% psych::rescale(mean = 100, sd = 15)
#>          v1        v2        v3        v4        v5
#> 1  77.38682 106.12372 119.90143 108.25723 109.31746
#> 2 106.46091 106.12372  82.24089 108.25723 100.71673
#> 3  95.00748 106.12372 113.16617 108.25723 109.31746
#> 4  93.24541 106.12372  94.20546  95.87139 100.71673
#> 5 121.43847 106.12372  86.26070 108.25723 109.31746
#> 6 106.46091  69.38138 104.22535  71.09970  70.61416

# if you only want to do this for specific columns, do it manually by targeting
# columns using dplyr::mutate_at(), an anonymous function, and scale (from base
# R):
df %>% 
  mutate_at(vars(v4, v5), function(x) scale(x)*15 + 100)
#>    v1 v2     v3        v4        v5
#> 1  65  1 121.12 108.25723 109.31746
#> 2  98  1  89.36 108.25723 100.71673
#> 3  85  1 115.44 108.25723 109.31746
#> 4  83  1  99.45  95.87139 100.71673
#> 5 115  1  92.75 108.25723 109.31746
#> 6  98  0 107.90  71.09970  70.61416
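The manual approach boils down to standardizing first and then multiplying by the target standard deviation and adding the target mean. A minimal base R sketch, using a hypothetical helper named rescale_to and the v4 values from above:

```r
# Hypothetical helper: standardize x, then stretch and shift to the target
rescale_to <- function(x, target_mean = 100, target_sd = 15) {
  c(scale(x)) * target_sd + target_mean
}

v4 <- c(4, 4, 4, 3, 4, 1)
out <- rescale_to(v4)

round(out, 5)
mean(out)  # should equal the target mean
sd(out)    # should equal the target sd
```

Because the helper works on a single vector, it can be passed to lapply or mutate_at for whichever columns need rescaling.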
