Remove Dots from Column Names

Remove dots from column names

One straightforward way is to use gsub to remove the periods from the column names:

> names(mydf)
[1] "JulDay" "i.46.j.8.k.1" "i.47.j.8.k.1" "i.48.j.8.k.1" "i.46.j.8.k.2"
[6] "i.47.j.8.k.2" "i.48.j.8.k.2" "i.46.j.8.k.3" "i.47.j.8.k.3" "i.48.j.8.k.3"
[11] "i.46.j.8.k.4" "i.47.j.8.k.4" "i.48.j.8.k.4"
> names(mydf) <- gsub("\\.", "", names(mydf))
> names(mydf)
[1] "JulDay" "i46j8k1" "i47j8k1" "i48j8k1" "i46j8k2" "i47j8k2" "i48j8k2" "i46j8k3"
[9] "i47j8k3" "i48j8k3" "i46j8k4" "i47j8k4" "i48j8k4"
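One thing to watch for when stripping dots is that two distinct names can collapse into the same string. The following is a minimal plain-Python sketch of the same idea (not R); the helper name `strip_dots` is hypothetical and is only meant to illustrate a uniqueness check after the removal.

```python
from collections import Counter

def strip_dots(names):
    """Return names with every literal '.' removed; raise if that creates duplicates."""
    cleaned = [name.replace(".", "") for name in names]
    dupes = [n for n, count in Counter(cleaned).items() if count > 1]
    if dupes:
        raise ValueError(f"removing dots creates duplicate names: {dupes}")
    return cleaned

# A few of the column names from the example above.
names = ["JulDay", "i.46.j.8.k.1", "i.47.j.8.k.1", "i.48.j.8.k.1"]
print(strip_dots(names))
```

With names such as "a.b" and "ab" in the same list, the helper raises instead of silently producing two identical columns.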

How to remove unwanted dots from strings in pandas column?

Given this input dataframe:

               parts
0    mouse.pad.v.1.2
1  key.board.1.0.c30
2   pen.color.4.32.r

remove every run of digits together with any dots directly in front of it:

df["parts"] = df["parts"].str.replace(r"\.*\d+", "", regex=True)
print(df)

Prints:

         parts
0  mouse.pad.v
1  key.board.c
2  pen.color.r
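To see what the pattern `\.*\d+` actually matches, here is a small sketch using only the standard `re` module on the same three strings. Note that it also strips digits that are not preceded by a dot (the `30` in `c30`). The stricter pattern `STRICT` is a hypothetical alternative, not part of the original answer, for the case where only dot-prefixed numbers should go.

```python
import re

# Pattern from the answer: zero or more dots followed by one or more digits.
PATTERN = re.compile(r"\.*\d+")
# Hypothetical stricter variant: only runs of ".<digits>" are removed.
STRICT = re.compile(r"(?:\.\d+)+")

parts = ["mouse.pad.v.1.2", "key.board.1.0.c30", "pen.color.4.32.r"]

cleaned = [PATTERN.sub("", p) for p in parts]
strict = [STRICT.sub("", p) for p in parts]

print(cleaned)
print(strict)
```

The two results differ only for the second string, where the stricter variant keeps `c30` intact.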

How to remove '.' from column names in a dataframe?

1) sqldf can deal with names having dots in them if you quote the names:

library(sqldf)
d0 <- read.csv(text = "A.B,C.D\n1,2")
sqldf('select "A.B", "C.D" from d0')

giving:

  A.B C.D
1   1   2

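2) When reading the data with read.table or read.csv, pass check.names = FALSE so that R does not convert the column names into syntactically valid ones (which is where the dots come from).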

Compare:

Lines <- "A B,C D
1,2
3,4"
read.csv(text = Lines)
##   A.B C.D
## 1   1   2
## 2   3   4
read.csv(text = Lines, check.names = FALSE)
##   A B C D
## 1   1   2
## 2   3   4

However, in this example the names still contain embedded spaces, so they would still have to be quoted in sqldf.

3) To simply remove the periods, if DF is a data frame:

names(DF) <- gsub(".", "", names(DF), fixed = TRUE)

or it might be nicer to convert the periods to underscores so that it is reversible:

names(DF) <- gsub(".", "_", names(DF), fixed = TRUE)

This last line could be alternatively done like this:

names(DF) <- chartr(".", "_", names(DF))
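The reversibility claim can be illustrated with a short plain-Python sketch (not R). `str.maketrans` plays a role similar to `chartr` here, mapping one character to another. The sample names are taken from the examples on this page, and the round trip only works under the assumption that no original name already contains an underscore.

```python
names = ["A.B", "C.D", "i.46.j.8.k.1"]

# One-to-one character maps, similar in spirit to chartr(".", "_", x).
to_underscore = str.maketrans(".", "_")
from_underscore = str.maketrans("_", ".")

converted = [n.translate(to_underscore) for n in names]
restored = [n.translate(from_underscore) for n in converted]

print(converted)
print(restored == names)  # True only because no original name contained "_"
```

If a name such as "my_col.1" were present, the reverse mapping would turn it into "my.col.1", so the conversion would no longer be lossless.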

Remove dots from data column

You can use the following approach. First, create a function that will be used for the replacement:

repl <- function(x) setNames(c("", "."), c(".", ","))[x]

This function takes either "." or "," and returns "" or "." respectively.

Now use this function as the replacement:

stringr::str_replace_all(as.character(df[,3]), "[.](?!\\d+$)|,", repl)

[1] "3812819062.06" "4039362599.36" "3652885587.18" "3460247960.02" "3465677403.12" "3131903622.55"
[7] "3204983361.46" "3811786009.24" "3180864095.05" "3352535553.88" "5214148756.95" "4491795201.50"
[13] "4333557619.30" "4808488277.86" "4039347179.81" "3867676530.69" "6356164873.94" "3961793391.19"
[19] "3797656130.81" "4709949715.37" "4047436592.12" "3923484635.28" "4821729985.03" "5024757038.22"

Of course, you can do the rest yourself, e.g. by calling as.numeric() on the result.

To do this in base R:

sub(',', '.', gsub('[.](?!\\d+$)', '', as.character(df[, 3]), perl = TRUE))

Or, if you know the exact number of . and , characters in each value (here three dots followed by one comma), you could do:

a <- as.character(df[, 3])
regmatches(a, gregexpr('[.](?!\\d+$)|,', a, perl = TRUE)) <- list(c("", "", "", "."))
a
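The regex `[.](?!\d+$)|,` matches every dot that is not the last separator before the trailing digits, plus every comma. A minimal Python sketch of the same replacement logic follows. The input strings are an assumption (the original data is not shown), chosen so that dots act as thousands separators and the comma as the decimal mark. The helper name `normalize_number` is hypothetical.

```python
import re

# Same pattern as above: a dot not followed by "digits until end of string", or a comma.
SEPARATORS = re.compile(r"[.](?!\d+$)|,")

def normalize_number(text):
    """Drop thousands dots and turn the decimal comma into a dot."""
    def repl(match):
        # "." -> "" (thousands separator), "," -> "." (decimal mark)
        return "" if match.group() == "." else "."
    return SEPARATORS.sub(repl, text)

raw = ["3.812.819.062,06", "4.491.795.201,50"]  # assumed input format
normalized = [normalize_number(v) for v in raw]

print(normalized)
print([float(v) for v in normalized])
```

Because of the lookahead, a value that already uses a dot as the decimal mark, such as "3.812.819.062.06", would keep its final dot and end up in the same normalized form.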

Dataframe: How to remove dot in a string

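# Convert the column to strings, then remove literal dots.
# regex=False keeps "." from being interpreted as a regex wildcard.
df["NACE_code"] = df["NACE_code"].astype(str)
df["NACE_code"] = df["NACE_code"].str.replace(".", "", regex=False)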
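The reason to be careful here is that an unescaped `.` is a wildcard when the pattern is read as a regular expression, and whether pandas `str.replace` does so depends on its `regex` argument, whose default has differed between pandas versions. The sketch below uses only the standard library to show the difference. The sample value `"62.01"` is a hypothetical NACE-style code, not data from the question.

```python
import re

code = "62.01"  # hypothetical NACE-style code

# As a regex, "." matches every character, so everything is removed.
as_regex = re.sub(".", "", code)

# As a literal string, only the dot itself is removed.
as_literal = code.replace(".", "")

# Escaping the dot gives the literal behaviour with a regex engine.
escaped = re.sub(re.escape("."), "", code)

print(repr(as_regex), as_literal, escaped)
```

Passing `regex=False` explicitly (or escaping the dot as `r"\."` with `regex=True`) makes the intent unambiguous regardless of the installed version.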

How to fix spaces in column names of a data.frame (remove spaces, inject dots)?

UPDATE (August 2022):

df %>% rename_with(make.names)

The older code (from January 2021) still works. It is a brief dplyr solution, although the %<>% operator comes from magrittr, which has to be attached separately:

df %<>% dplyr::rename_all(make.names)

Credit goes to a commenter.

Replace dot in a specific column

gsub("\\.","",CAR)
[1] "BMW3" "FERRARI12" "TOYOTA58"

