Remove dots from column names
One straightforward way is to use gsub() to remove the periods from the column names:
> names(mydf)
[1] "JulDay" "i.46.j.8.k.1" "i.47.j.8.k.1" "i.48.j.8.k.1" "i.46.j.8.k.2"
[6] "i.47.j.8.k.2" "i.48.j.8.k.2" "i.46.j.8.k.3" "i.47.j.8.k.3" "i.48.j.8.k.3"
[11] "i.46.j.8.k.4" "i.47.j.8.k.4" "i.48.j.8.k.4"
> names(mydf) <- gsub("\\.", "", names(mydf))
> names(mydf)
[1] "JulDay" "i46j8k1" "i47j8k1" "i48j8k1" "i46j8k2" "i47j8k2" "i48j8k2" "i46j8k3"
[9] "i47j8k3" "i48j8k3" "i46j8k4" "i47j8k4" "i48j8k4"
How to remove unwanted dots from strings in pandas column?
Try removing every run of digits together with any dots that precede it:
df["parts"] = df["parts"].str.replace(r"\.*\d+", "", regex=True)
print(df)
Prints:
parts
0 mouse.pad.v
1 key.board.c
2 pen.color.r
Input dataframe:
parts
0 mouse.pad.v.1.2
1 key.board.1.0.c30
2 pen.color.4.32.r
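For a self-contained version of the snippet above, here is a minimal sketch that builds the input dataframe shown and applies the same pattern. The variable name result is only illustrative and not part of the original answer.

```python
import pandas as pd

# Input dataframe from the question
df = pd.DataFrame(
    {"parts": ["mouse.pad.v.1.2", "key.board.1.0.c30", "pen.color.4.32.r"]}
)

# \.* matches any run of literal dots (possibly empty) and \d+ the digits
# that follow; every such match is replaced with an empty string.
df["parts"] = df["parts"].str.replace(r"\.*\d+", "", regex=True)

result = df["parts"].tolist()  # illustrative name
print(result)  # ['mouse.pad.v', 'key.board.c', 'pen.color.r']
```

Note that the pattern also strips digits that are not preceded by a dot, which is why "c30" becomes "c".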
How to remove '.' from column names in a dataframe?
1) sqldf can deal with names having dots in them if you quote the names:
library(sqldf)
d0 <- read.csv(text = "A.B,C.D\n1,2")
sqldf('select "A.B", "C.D" from d0')
giving:
A.B C.D
1 1 2
2) When reading the data using read.table or read.csv, use the check.names = FALSE argument.
Compare:
Lines <- "A B,C D
1,2
3,4"
read.csv(text = Lines)
## A.B C.D
## 1 1 2
## 2 3 4
read.csv(text = Lines, check.names = FALSE)
## A B C D
## 1 1 2
## 2 3 4
However, in this example the names still have to be quoted in sqldf because they contain embedded spaces.
3) To simply remove the periods, if DF is a data frame:
names(DF) <- gsub(".", "", names(DF), fixed = TRUE)
or it might be nicer to convert the periods to underscores so that it is reversible:
names(DF) <- gsub(".", "_", names(DF), fixed = TRUE)
This last line could be alternatively done like this:
names(DF) <- chartr(".", "_", names(DF))
Remove dots from data column
You can use the following. First, create a function that will be used for the replacement:
repl = function(x)setNames(c("","."),c(".",","))[x]
This function takes either "." or "," and returns "" or "." respectively.
Now use this function to do the replacement:
stringr::str_replace_all(as.character(df[,3]), "[.](?!\\d+$)|,", repl)
[1] "3812819062.06" "4039362599.36" "3652885587.18" "3460247960.02" "3465677403.12" "3131903622.55"
[7] "3204983361.46" "3811786009.24" "3180864095.05" "3352535553.88" "5214148756.95" "4491795201.50"
[13] "4333557619.30" "4808488277.86" "4039347179.81" "3867676530.69" "6356164873.94" "3961793391.19"
[19] "3797656130.81" "4709949715.37" "4047436592.12" "3923484635.28" "4821729985.03" "5024757038.22"
You can then do the rest yourself, e.g. calling as.numeric() on the result.
To do this in base R:
sub(',', '.', gsub('[.](?!\\d+$)', '', as.character(df[,3]), perl = TRUE))
Or, if you know the exact number of . and , in your data, you could do:
a = as.character(df[,3])
regmatches(a, gregexpr('[.](?!\\d+$)|,', df[,3], perl = TRUE)) = list(c("","","","."))
a
Dataframe: How to remove dot in a string
# The following code should work:
df.NACE_code = df.NACE_code.astype(str)
# regex=False treats '.' as a literal dot rather than the regex wildcard
df.NACE_code = df.NACE_code.str.replace('.', '', regex=False)
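The reason to be explicit about the dot is that an unescaped "." is a wildcard when the pattern is treated as a regular expression, and the default for the regex argument of str.replace has changed between pandas versions (to the best of my knowledge it became False in pandas 2.0). A minimal sketch of the two unambiguous spellings follows; the sample codes are hypothetical values, not data from the question.

```python
import pandas as pd

# Hypothetical NACE-style codes, used only for illustration
codes = pd.Series(["01.11", "10.5", "62.01"]).astype(str)

# Option 1: treat the pattern as a plain string
literal = codes.str.replace(".", "", regex=False).tolist()

# Option 2: keep regex matching but escape the dot
escaped = codes.str.replace(r"\.", "", regex=True).tolist()

print(literal)  # ['0111', '105', '6201']
print(escaped)  # ['0111', '105', '6201']
```

Either form gives the same result regardless of the installed pandas version, which is why passing regex explicitly is the safer habit.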
How to fix spaces in column names of a data.frame (remove spaces, inject dots)?
UPDATE Aug 2022:
df %>% rename_with(make.names)
The old code below still works.
As of Jan 2021, a brief dplyr solution (note that the %<>% operator comes from magrittr) is:
df %<>% dplyr::rename_all(make.names)
Credit goes to a commenter.
Replace dot in a specific column
gsub("\\.","",CAR)
[1] "BMW3" "FERRARI12" "TOYOTA58"