dplyr: nonstandard column names (white space, punctuation, starts with numbers)
You may select the variable by using backticks (`).
select(df, `a a`)
# a a
# 1 1
# 2 2
# 3 3
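For a self-contained check, the sketch below builds a small data frame with a column called `a a` (the data are made up for illustration) and reads the column back three ways. Note that data.frame() needs check.names = FALSE here; otherwise it would rewrite the name to a.a.

```r
library(dplyr)

# Illustrative data; check.names = FALSE keeps the space in the column name
df <- data.frame(`a a` = 1:3, b = 4:6, check.names = FALSE)

picked    <- select(df, `a a`)  # backticks inside a dplyr verb
by_dollar <- df$`a a`           # backticks after $
by_string <- df[["a a"]]        # a plain string works with [[ ]]
```

Backticks are needed wherever R expects a bare name; where a string is accepted, as with [[ ]], ordinary quotes are enough.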
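However, if your main objective is to rename the column, you may use rename from the plyr package, which accepts both double quotes ("") and backticks (``). Calling it as plyr::rename avoids a clash with dplyr::rename when both packages are loaded.
plyr::rename(df, replace = c("a a" = "a"))
plyr::rename(df, replace = c(`a a` = "a"))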
Or in base R:
names(df)[names(df) == "a a"] <- "a"
For a more thorough description of the use of the various quotes, see ?Quotes. The 'Names and Identifiers' section is especially relevant here:
"other [syntactically invalid] names can be used provided they are quoted. The preferred quote is the backtick".
See also ?make.names about valid names.
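As a quick illustration of what counts as a valid name, make.names() can be run on a few of the column names used on this page. A name that comes back unchanged is syntactically valid; one that is altered needs backticks. This is only a sketch; the vector nms is made up for the example.

```r
# Column names taken from the examples on this page
nms <- c("a a", "80% height", "1a", "height")

valid <- make.names(nms)
# valid is c("a.a", "X80..height", "X1a", "height")

# A name needs backticks when make.names() had to change it
needs_backticks <- nms != valid
```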
See also this post about renaming in dplyr
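In dplyr itself, rename() takes arguments of the form new_name = old_name, and the old name can be written in backticks. A minimal sketch with made-up data:

```r
library(dplyr)

# Illustrative data frame with a space in its only column name
df <- data.frame(`a a` = 1:3, check.names = FALSE)

# new name on the left, old (backticked) name on the right
renamed <- rename(df, a = `a a`)
```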
Select columns with spaced heading in R
We can use backquotes to select columns with unusual names, i.e. names that don't start with a letter or that contain spaces or punctuation:
subset(df, select = c(height, `80% height`))
-output
# height 80% height
#1 1020 816.0
#2 2053 1642.4
#3 1840 1472.0
#4 3301 2640.8
#5 2094 1675.2
Also, with dplyr there is no need to specify df twice. We can pipe df into the select function from dplyr:
library(dplyr)
df %>%
select(height, `80% height`)
-output
# height 80% height
#1 1020 816.0
#2 2053 1642.4
#3 1840 1472.0
#4 3301 2640.8
#5 2094 1675.2
It may also be better to remove the spaces and prepend a letter to those column names that start with numbers. clean_names from janitor does this:
library(janitor)
df %>%
clean_names()
Dealing with spaces and weird characters in column names with dplyr::rename()
To refer to variables that contain non-standard characters or start with a number, wrap the name in backticks, e.g., `Instruction..Mode!`
R dplyr filter column with column name that starts with number
You can use backticks to refer to variables with non-standard names. This works whether they are columns of a data frame or not.
For this specific case
df %>% dplyr::filter(`1a`) # note that == TRUE is never needed
Or generally,
`2b` = 1:5
mean(`2b`)
# [1] 3
Of course you shouldn't make a bad habit of this - use standard names whenever possible.
As mentioned in the comments, the ?Quotes documentation is helpful. It states (in the Names and Identifiers section):
Almost always, other names can be used provided they are quoted. The preferred quote is the backtick (`), and deparse will normally use it, but under many circumstances single or double quotes can be used (as a character constant will often be converted to a name). One place where backticks may be essential is to delimit variable names in formulae: see formula.
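To make the point about formulae concrete, here is a small sketch that fits a linear model on a column called `80% height`. The data frame is rebuilt from the height values shown earlier on this page, with the second column computed as 0.8 times height; that construction is an assumption made for the example.

```r
# Data rebuilt for illustration; the second column is 0.8 * height
df <- data.frame(height = c(1020, 2053, 1840, 3301, 2094))
df[["80% height"]] <- df$height * 0.8

# Without the backticks this formula would not parse
fit <- lm(`80% height` ~ height, data = df)
slope <- coef(fit)[["height"]]
```

Because the response is an exact multiple of the predictor, the fitted slope is 0.8 up to floating-point error.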
use dplyr to combine columns of data.frame when column names are not known
With a little trial and error:
colNames_as_symbols <- syms(names(myTibble))
transmute(myTibble, concat = paste(!!!colNames_as_symbols, sep = '.'))
Here was the hint that put me on to the solution, from the documentation for !!!:
The big-bang operator !!! forces-splice a list of objects. The elements of the list are spliced in place, meaning that they each become one single argument.
vars <- syms(c("height", "mass"))
# Force-splicing is equivalent to supplying the elements separately:
starwars %>% select(!!!vars)
starwars %>% select(height, mass)
In fact, the entire documentation entitled "Force parts of an expression" is fascinating reading. It can be accessed by issuing ?qq_show
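The same approach also copes with column names that are not syntactically valid, because syms() builds symbols directly from the strings and no backticks have to be typed. A minimal sketch, with a made-up tibble standing in for myTibble:

```r
library(dplyr)

# Made-up tibble whose column names are not syntactically valid
myTibble <- tibble(`first col` = c("a", "b"), `2nd` = c(1, 2))

# One symbol per column, whatever the columns are called
colNames_as_symbols <- syms(names(myTibble))

# Splice the symbols into paste() as separate arguments
out <- transmute(myTibble, concat = paste(!!!colNames_as_symbols, sep = "."))
```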
Renaming dataframe column names which contain a space
You can use the dplyr function rename_with() to rename all columns that match a certain condition (in this case that it contains a space). In this example I replace the space in the column name with an underscore:
library(dplyr)
df <- data.frame(a = 1:2,
b = LETTERS[1:2],
c = 101:102)
names(df) <- c("a", "b b", "c e f")
df %>%
rename_with(~ gsub(" ","_", .x), contains(" "))
Replace underscore with white space in column names of datatable
You can use str_replace_all from stringr (str_replace would change only the first underscore in each name):
names(f) <- stringr::str_replace_all(names(f), "_", " ")
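In base R, gsub() does a similar job and replaces every underscore in each name; fixed = TRUE treats "_" as a literal string instead of a regular expression. A short sketch, where the data frame f is made up for the example:

```r
# Illustrative data frame with underscores in its column names
f <- data.frame(first_col_name = 1:2, second_col = 3:4)

# Replace all underscores in every column name with a space
names(f) <- gsub("_", " ", names(f), fixed = TRUE)
```

After this, the columns have to be referred to with backticks, e.g. f$`second col`.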
compute sum for space string column
Does this work?
df %>% group_by(`a 1`) %>% summarise(tx = sum(`t t`))
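To try the line above without the original data, here is a small reproducible sketch; the data frame and its values are invented for illustration.

```r
library(dplyr)

# Made-up data with spaces in both column names
df <- data.frame(`a 1` = c("x", "x", "y"),
                 `t t` = c(1, 2, 3),
                 check.names = FALSE)

# Group by one backticked column and sum the other
res <- df %>%
  group_by(`a 1`) %>%
  summarise(tx = sum(`t t`))
```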