How to deal with spaces in column names?
This is a "bug" in the package ggplot2
that comes from the fact that the function as.data.frame()
in the internal ggplot2 function quoted_df
converts the names to syntactically valid names. These syntactically valid names cannot be found in the original dataframe, hence the error.
To remind you :
syntactically valid names consists of letters, numbers and the dot or
underline characters, and start with a letter or the dot (but the dot
cannot be followed by a number)
There's a reason for that. There's also a reason why ggplot allows you to set labels using labs
, eg using the following dummy dataset with valid names:
X <-data.frame(
PonOAC = rep(c('a','b','c','d'),2),
AgeGroup = rep(c("over 80",'under 80'),each=4),
NumberofPractices = rpois(8,70)
)
You can use labs at the end to make this code work
ggplot(X, aes(x=PonOAC,y=NumberofPractices, fill=AgeGroup)) +
geom_bar() +
facet_grid(AgeGroup~ .) +
labs(x="% on OAC", y="Number of Practices",fill = "Age Group")
To produce
How to select a column name with a space in MySQL
Generally the first step is to not do that in the first place, but if this is already done, then you need to resort to properly quoting your column names:
SELECT `Business Name` FROM annoying_table
Usually these sorts of things are created by people who have used something like Microsoft Access and always use a GUI to do their thing.
Renaming dataframe column names which contain a space
You can use the dplyr function rename_with() to rename all columns that match a certain condition (in this case that it contains a space). In this example I replace the space in the column name with an underscore:
library(dplyr)
df <- data.frame(a = 1:2,
b = LETTERS[1:2],
c = 101:102)
names(df) <- c("a", "b b", "c e f")
df %>%
rename_with(~ gsub(" ","_", .x), contains(" "))
How do you deal with blank spaces in column names in SQL Server?
select [Response Status Code], [Client Response Status Code]
from TC_Sessions (NOLOCK)
WHERE StartDate BETWEEN '05-15-2012' AND '06-01-2012'
AND SupplyID = 3367
Wrap the names in square brackets.
It is , however, best to avoid spaces in names if possible. It just creates more work for you down the road...
How do you handle column names having spaces in them when using pd.read_clipboard?
What I do in this situation is that I make all my columns two or more spaces apart, then I use sep='\s\s+' for my delimiter, this way when I do have column headings with a single space such as, Col #3 above it treats it as one column.
A B Col #3
NaN NaN NaN
-0.041158 -0.161571 0.329038
0.238156 0.525878 0.110370
0.606738 0.854177 -0.095147
0.200166 0.385453 0.166235
df = pd.read_clipboard(sep='\s\s+')
You do get this warning, but you can ignore it since it as done it right. Or you could put the engine='python'
if your OCD gets the best of you. :)
C:\Program
Files\Anaconda3\lib\site-packages\pandas\io\clipboards.py:63:
ParserWarning: Falling back to the 'python' engine because the 'c'
engine does not support regex separators (separators > 1 char and
different from '\s+' are interpreted as regex); you can avoid this
warning by specifying engine='python'. return
read_table(StringIO(text), sep=sep, **kwargs)
print(df)
A B Col #3
0 NaN NaN NaN
1 -0.041158 -0.161571 0.329038
2 0.238156 0.525878 0.110370
3 0.606738 0.854177 -0.095147
4 0.200166 0.385453 0.166235
How to fix spaces in column names of a data.frame (remove spaces, inject dots)?
UDPDATE 2022 Aug:
df %>% rename_with(make.names)
OLD code was: (still works though)
as of Jan 2021: drplyr solution that is brief and uses no extra libraries is
df %<>% dplyr::rename_all(make.names)
credit goes to commenter.
How to use $ in R to select a column with space in its colname?
By using `` as in
df$`Avg Estimates`
Select columns with spaced heading in R
We can use backquotes to select those unusual names i.e. column names that doesn't start with letters
subset(df, select = c(height, `80% height`))
-output
# height 80% height
#1 1020 816.0
#2 2053 1642.4
#3 1840 1472.0
#4 3301 2640.8
#5 2094 1675.2
Also, the dplyr
use with specifying df
twice is not needed. We can have select
function from dplyr
library(dplyr)
df %>%
select(height, `80% height`)
-output
# height 80% height
#1 1020 816.0
#2 2053 1642.4
#3 1840 1472.0
#4 3301 2640.8
#5 2094 1675.2
It may be also better to remove spaces and append a letter for those column names that start with numbers. clean_names
from janitor
does
library(janitor)
df %>%
clean_names()
Related Topics
Can't Loop with R's Leaflet Package to Produce Multiple Maps
R Data.Table Apply Function to Rows Using Columns as Arguments
Saving and Loading a Model in R
Round a Posix Date (Posixct) with Base R Functionality
R: Extracting "Clean" Utf-8 Text from a Web Page Scraped with Rcurl
How to Combine Ggplot and Dplyr into a Function
Maps, Ggplot2, Fill by State Is Missing Certain Areas on the Map
Change the Position of the Strip Label in Ggplot from the Top to the Bottom
Selecting a Subset of Columns in a Data.Table
Find Out the Number of Days of a Month in R
In Ggplot2, Coord_Flip and Free Scales Don't Work Together
Align Multiple Ggplot2 Plots with Grid