Remove a Character from the Entire Data Frame

Removing a character from entire data frame

You can use DataFrame.replace with regex=True. To limit the replacement to certain columns, select that subset of columns first:

import pandas as pd

df = pd.DataFrame({'A':[1,2,3],
                   'B':[4,5,6],
                   'C':['f;','d:','sda;sd'],
                   'D':['s','d;','d;p'],
                   'E':[5,3,6],
                   'F':[7,4,3]})

print (df)
   A  B       C    D  E  F
0  1  4      f;    s  5  7
1  2  5      d:   d;  3  4
2  3  6  sda;sd  d;p  6  3

cols_to_check = ['C','D', 'E']

print (df[cols_to_check])
        C    D  E
0      f;    s  5
1      d:   d;  3
2  sda;sd  d;p  6

df[cols_to_check] = df[cols_to_check].replace({';':''}, regex=True)
print (df)
   A  B      C   D  E  F
0  1  4      f   s  5  7
1  2  5     d:   d  3  4
2  3  6  sdasd  dp  6  3
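
If the character to remove is itself a regex metacharacter (for example . or a bracket), the pattern needs escaping, otherwise regex=True will interpret it. A minimal sketch of one way to handle that, assuming pandas is installed; remove_char is a hypothetical helper name, not a pandas function:

```python
import re
import pandas as pd

def remove_char(df, char, columns=None):
    """Return a copy of df with every literal occurrence of char removed.

    Hypothetical helper for illustration only; not part of pandas.
    """
    out = df.copy()
    # With no column list, work on the whole frame; non-string values are left as they are.
    target = out if columns is None else out[columns]
    # re.escape makes characters such as "." or "(" match literally.
    cleaned = target.replace(to_replace=re.escape(char), value="", regex=True)
    if columns is None:
        return cleaned
    out[columns] = cleaned
    return out

df = pd.DataFrame({"A": [1, 2, 3],
                   "C": ["f;", "d:", "sda;sd"],
                   "D": ["s", "d;", "d;p"]})
cleaned = remove_char(df, ";")
dots = remove_char(pd.DataFrame({"x": ["a.b", "c"]}), ".")
print(cleaned)
print(dots)
```

Passing columns=['C', 'D'] would restrict the replacement to those columns, in the same spirit as cols_to_check above.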

remove a character from the entire data frame

I would use lapply to loop over the columns and then replace the " using gsub.

df1[] <- lapply(df1, gsub, pattern='"', replacement='')
df1
#  ID name value1 value2
#1  1    x  a,b,c      x
#2  2    y    d,r      z

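If needed, the column classes can then be converted with type.convert (as.is = TRUE keeps strings as character instead of turning them into factors):

df1[] <- lapply(df1, type.convert, as.is = TRUE)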

data

df1 <-  structure(list(ID = c("\"1", "\"2"), name = c("x", "y"),
value1 = c("a,\"b,\"c", 
"d,\"r\""), value2 = c("x\"", "z\"")), .Names = c("ID", "name", 
"value1", "value2"), class = "data.frame", row.names = c(NA, -2L))

How to remove a character from some rows in a dataframe column?

Another way would be to use numpy.where and evaluate your conditions using str.startswith and str.endswith:

import numpy as np

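s = df['Price']
p = s.str
# Escape the dot and anchor the patterns so only a leading '.' or a trailing '.T' is removed
df['Price'] = np.where(p.startswith('.'), p.replace(r'^\.', '', regex=True),
                       np.where(p.endswith('.T'), p.replace(r'\.T$', '', regex=True), s))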

This checks whether each value in df['Price'] starts with a . or ends with .T and removes that prefix or suffix; all other values are left unchanged.

            Brand  Price
0     Honda Civic  22000
1  Toyota Corolla  25000
2      Ford Focus  27000
3         Audi A4    TPX
4          Suzuki   NKM1
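
Since both cases only strip a fixed prefix or suffix, the same cleanup can probably be done with a single anchored regular expression instead of nested np.where calls. A sketch under that assumption; the input values below are made up for illustration, because the original input is not shown in the answer:

```python
import pandas as pd

# Hypothetical input; the question's original data is not shown here.
df = pd.DataFrame({"Brand": ["Honda Civic", "Audi A4", "Suzuki"],
                   "Price": ["22000", ".TPX", "NKM1.T"]})

# ^\. matches a leading dot and \.T$ matches a trailing ".T".
# The backslash makes the dot literal instead of "any character".
df["Price"] = df["Price"].str.replace(r"^\.|\.T$", "", regex=True)
print(df)
```

One difference from the np.where version: a value that both starts with . and ends with .T would have both parts removed here.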

How to remove part of characters in data frame column

There are multiple ways of doing this:

Use as.numeric on the column of your choice. Converting to numeric drops the leading zeros.

raw$Zipcode <- as.numeric(raw$Zipcode)

If you want the column to stay a character vector, you can use the stringr package.

library(stringr)
raw$Zipcode <- str_replace(raw$Zipcode, "^0+" ,"")

stringr also has a function called str_remove.

raw$Zipcode <- str_remove(raw$Zipcode, "^0+")

You can also use sub from base R.

raw$Zipcode <- sub("^0+", "", raw$Zipcode)

If you want to remove exactly n leading zeroes rather than all of them, replace + with {n} in the pattern.

For instance, to remove two 0's, use sub("^0{2}", "", raw$Zipcode).
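
For comparison, the difference between the + and {n} quantifiers can be sketched in Python with the standard re module. The zip code values here are invented for the example:

```python
import re

# Made-up sample values, not taken from the question.
zipcodes = ["00501", "02134", "90210", "0007"]

# ^0+ removes every leading zero.
all_leading = [re.sub(r"^0+", "", z) for z in zipcodes]

# ^0{2} removes exactly two leading zeros, and only when both are present.
two_leading = [re.sub(r"^0{2}", "", z) for z in zipcodes]

print(all_leading)
print(two_leading)
```

Note that "02134" is untouched by ^0{2} because it has only one leading zero, while "0007" keeps one of its three.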

How to remove special characters from rows in pandas dataframe

Here is a different approach using regex. It deletes anything between brackets:

import re
import pandas as pd
df = {'LGA': ['Alpine (S)', 'Ararat (RC)', 'Bass Coast (S)']  }
df = pd.DataFrame(df)
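# Raw string avoids invalid-escape warnings; .*? is non-greedy, so each bracketed group is matched separately
df['LGA'] = [re.sub(r"[\(\[].*?[\)\]]", "", x).strip() for x in df['LGA']] # delete anything between brackets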
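
The same pattern can also be applied through the .str accessor instead of a list comprehension. A sketch, assuming the column holds only strings:

```python
import pandas as pd

df = pd.DataFrame({"LGA": ["Alpine (S)", "Ararat (RC)", "Bass Coast (S)"]})

# Same bracket pattern, applied column-wise; regex=True is passed explicitly
# because the default differs between pandas versions.
df["LGA"] = df["LGA"].str.replace(r"[\(\[].*?[\)\]]", "", regex=True).str.strip()
print(df)
```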

Remove unwanted parts from strings in a column

data['result'] = data['result'].map(lambda x: x.lstrip('+-').rstrip('aAbBcC'))
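
It may help to see what lstrip and rstrip do with their argument: they treat it as a set of characters to strip, not as a prefix or suffix string. A small plain-Python sketch with invented sample values:

```python
# Invented sample values for illustration.
values = ["+3a", "-17B", "+-42cC", "abc"]

# lstrip removes any run of '+' or '-' from the left;
# rstrip removes any run of the letters a, A, b, B, c, C from the right.
cleaned = [v.lstrip("+-").rstrip("aAbBcC") for v in values]
print(cleaned)
```

The last value shows the caveat: "abc" consists only of characters in the strip set, so it becomes an empty string.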

Pandas: Remove all characters before a specific character in a dataframe column

Using str.replace:

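# regex=True is required in recent pandas versions, where str.replace defaults to literal matching
df["address"] = df["address"].str.replace(r'^[^,]*,\s*', '', regex=True)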

The pattern ^[^,]*,\s* matches everything from the start of the string up to and including the first comma, plus any whitespace that follows it.
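
A quick way to check the pattern is to run it on a couple of sample values. The addresses below are hypothetical:

```python
import pandas as pd

# Hypothetical addresses, used only to exercise the pattern.
df = pd.DataFrame({"address": ["Flat 2, 10 High Street, London",
                               "No comma here"]})

# Remove everything up to and including the first comma and the spaces after it.
df["address"] = df["address"].str.replace(r"^[^,]*,\s*", "", regex=True)
print(df["address"].tolist())
```

Only the first comma is affected, and a value without any comma is left unchanged.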

Removing special character from dataframe

That looks like a tuple to me, so give .str[0] a shot:

df['IP_ADDRESS'] = df['IP_ADDRESS'].str[0]
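
To illustrate why .str[0] helps here: on a column whose cells are tuples (or lists), the .str accessor indexes into each cell. A sketch that assumes every cell is a one-element tuple, with made-up addresses:

```python
import pandas as pd

# Assumed shape of the data: each cell is a one-element tuple.
df = pd.DataFrame({"IP_ADDRESS": [("10.0.0.1",), ("192.168.1.5",)]})

# .str[0] takes the first element of every tuple.
df["IP_ADDRESS"] = df["IP_ADDRESS"].str[0]
print(df["IP_ADDRESS"].tolist())
```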

R Remove string characters from a range of rows in a column

To strip the unwanted part of each string and then filter, one option is trimws. By default it trims whitespace from both ends of the string (use the which argument to trim only the left or the right side; the default is 'both'). Here the whitespace argument is given a regex that matches zero or more uppercase letters followed by zero or more spaces ([A-Z]*\\s*), so those characters are trimmed instead. Then filter keeps only the rows where Date is not blank:

library(dplyr)
df %>% 
  mutate(Date = trimws(Date, whitespace = "[A-Z]*\\s*")) %>% 
  filter(nzchar(Date))

-output

        Date Date_Approved
1  1/27/2020     1/28/2020
2  1/29/2020     1/30/2020
3  1/30/2020     1/31/2020
4   2/1/2020      2/2/2020
5   2/9/2020     2/10/2020
6  2/15/2020     2/16/2020
7  2/16/2020     2/17/2020
8  2/17/2020     2/19/2020
9  2/18/2020     2/20/2020
10 2/22/2020     2/23/2020
11 2/25/2020     2/26/2020
12 2/28/2020     2/29/2020

remove character for all column names in a data frame

If we only need to remove 'v', matching one or more digits (\\d+) at the end ($) is not needed, since the expected output also removes the 'v' from the first column, 'q_ve5':

library(dplyr)
library(stringr)
df %>% 
    rename_with(~ str_remove(., "v"), everything())
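
For pandas users, a roughly equivalent rename can be sketched as follows. Only 'q_ve5' comes from the answer above; the other column names are hypothetical. Like str_remove, the replace call is limited to the first 'v' in each name:

```python
import pandas as pd

# 'q_ve5' is from the answer; the other column names are hypothetical.
df = pd.DataFrame({"q_ve5": [1], "a_v12": [2], "b_v3": [3]})

# str.replace with count=1 removes only the first 'v', mirroring str_remove.
df = df.rename(columns=lambda name: name.replace("v", "", 1))
print(list(df.columns))
```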