Dealing with spaces and weird characters in column names with dplyr::rename()
To refer to variables that contain non-standard characters or start with a number, wrap the name in backticks, e.g., `Instruction..Mode!`.
How to deal with nonstandard column names (white space, punctuation, starts with numbers)
You may select the variable by using backticks (`).
select(df, `a a`)
# a a
# 1 1
# 2 2
# 3 3
However, if your main objective is to rename the column, you may use rename from the plyr package, in which you can use both "" and ``.
rename(df, replace = c("a a" = "a"))
rename(df, replace = c(`a a` = "a"))
Or in base R:
names(df)[names(df) == "a a"] <- "a"
For a more thorough description of the use of various quotes, see ?Quotes. The 'Names and Identifiers' section is especially relevant here:
"other [syntactically invalid] names can be used provided they are quoted. The preferred quote is the backtick".
See also ?make.names about valid names.
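As a small illustration of the points above, the following base R sketch shows backtick access next to what make.names would turn the names into. The data frame, its column names, and its values are invented for the example.

```r
# Hypothetical data frame; check.names = FALSE keeps the invalid names as-is
df <- data.frame(`a a` = 1:3,
                 `80% height` = c(816, 1642.4, 1472),
                 check.names = FALSE)

# Backticks let $ refer to a syntactically invalid name
spaced <- df$`a a`

# [[ ]] takes an ordinary string, so no backticks are needed here
height80 <- df[["80% height"]]

# make.names() returns a syntactically valid version of each name
valid <- make.names(names(df))
print(valid)
```

Spaces and other invalid characters become dots, and a name that starts with a digit gets an "X" prefix.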
See also this post about renaming in dplyr
How to fix spaces in column names of a data.frame (remove spaces, inject dots)?
UPDATE Aug 2022:
df %>% rename_with(make.names)
The old code (as of Jan 2021, and it still works) was a brief dplyr solution. Note that the %<>% operator comes from magrittr, so that package must be attached:
df %<>% dplyr::rename_all(make.names)
credit goes to commenter.
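For comparison, here is a base R sketch of the same idea that needs no packages at all. The data frame is hypothetical, and unique = TRUE is an addition not present in the answers above: it keeps names distinct when two of them would otherwise collapse to the same valid name.

```r
# Hypothetical data frame with names that collide once they are made valid
df <- data.frame(`a a` = 1:2, `a.a` = 3:4, `2nd col` = 5:6,
                 check.names = FALSE)

# Apply make.names() to all column names in base R; unique = TRUE
# disambiguates names that would otherwise end up identical
names(df) <- make.names(names(df), unique = TRUE)
print(names(df))
```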
Select columns with spaced heading in R
We can use backquotes to select those unusual names, i.e., column names that don't start with letters:
subset(df, select = c(height, `80% height`))
-output
# height 80% height
#1 1020 816.0
#2 2053 1642.4
#3 1840 1472.0
#4 3301 2640.8
#5 2094 1675.2
Also, with dplyr there is no need to specify df twice. We can use the select function from dplyr:
library(dplyr)
df %>%
select(height, `80% height`)
-output
# height 80% height
#1 1020 816.0
#2 2053 1642.4
#3 1840 1472.0
#4 3301 2640.8
#5 2094 1675.2
It may also be better to remove the spaces and prepend a letter to column names that start with numbers. clean_names from janitor does this:
library(janitor)
df %>%
clean_names()
compute sum for space string column
Does this work?
df %>% group_by(`a 1`) %>% summarise(tx = sum(`t t`))
renaming columns in R with `-` symbol
This can be done using rename. You just have to put the column names with special characters inside backticks (`):
temp <- temp %>% dplyr::rename(`Re-ply` = re_ply,
total_id = total_ID,
`Re-ask` = re_ask)
names(temp)
[1] "Re-ply" "total_id" "Re-ask"
R: Import CSV with column names that contain spaces
Unless you specify check.names=FALSE, R will convert column names that are not valid variable names (e.g. contain spaces or special characters or start with numbers) into valid variable names, e.g. by replacing spaces with dots. Try names(s_data). If you do use check.names=FALSE, then use single back-quotes (`) to surround the names.
I would also recommend using rename from the reshape package (or, these days, dplyr::rename).
s_data <- read.csv2( file=f_name )
library(reshape)
# reshape::rename() takes one named vector of the form c(old_name = "new_name")
s_df <- rename(s_data, c(ID="scada_id",
PlantNo="plant",DateTime="date",Main.status="main_code",
Additional.status="seco_code",MainStatustext="main_text",
AddStatustext="seco_test",Duration="duration"))
For what it's worth, the tidyverse tools (i.e. readr::read_csv) have the opposite default; they don't transform the column names to make them legal R symbols unless you explicitly request it.
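To see the effect of check.names directly, here is a minimal sketch that reads the same CSV text both ways. The header names and values are made up, and read.csv with the text argument stands in for reading a real file.

```r
# Hypothetical CSV content with spaces in the header row
csv_text <- "scada id,main code\n1,10\n2,20"

# Default (check.names = TRUE): names are converted to valid R names
checked <- read.csv(text = csv_text)

# check.names = FALSE: names are kept exactly as they appear in the file
raw <- read.csv(text = csv_text, check.names = FALSE)

print(names(checked))
print(names(raw))

# With the original names kept, backticks are needed for $ access
ids <- raw$`scada id`
```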
Rename special character / symbols from all variables in the dataset
If you are here looking for a solution to this question, I have two options.
First, using dplyr:
ds <- ds %>% setNames(tolower(gsub("\\.","",names(.)))) %>%
setNames(tolower(gsub("\\_","",names(.)))) %>%
setNames(tolower(gsub("ç","c",names(.)))) %>%
setNames(tolower(gsub("ã","a",names(.))))
Second, using the janitor package
library(janitor)
ds <- ds %>% clean_names()
This community is a great place to find answers to our questions and I hope my answer could help you.