Replace Column If Equal to a Specific Value

You can use the following awk:

awk -F, '{ $4 = ($4 == "N/A" ? -1 : $4) } 1' OFS=, test.csv
  • We set the input and output field separators (-F and OFS) to , so that the delimiters in your CSV file are preserved.
  • We check the fourth field: if it equals "N/A" we assign it the value -1; otherwise we keep its value as is.
  • The 1 at the end is an always-true pattern, so awk prints every line, whether or not its 4th column was modified.
  • ($4=="N/A"?-1:$4) is a ternary expression that tests the condition $4=="N/A". If it is true, the part after ? assigns -1; if it is false, the part after : keeps the field as is.

Test run on sample file:

$ cat file
a,b,c,d,e,f
1,2,3,4,5,6
44,2,1,N/A,4,5
24,sdf,sdf,4,2,254,5
a,f,f,N/A,f,4


$ awk -F, '{ $4 = ($4 == "N/A" ? -1 : $4) } 1' OFS=, file
a,b,c,d,e,f
1,2,3,4,5,6
44,2,1,-1,4,5
24,sdf,sdf,4,2,254,5
a,f,f,-1,f,4
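
For comparison, the same fourth-field replacement can be sketched in Python with the standard csv module. This is only an illustration under assumptions: the function name replace_field and its parameters are hypothetical, and the sample rows are taken from the test file above.

```python
import csv
import io

def replace_field(text, index=3, old="N/A", new="-1"):
    """Replace `old` with `new` in column `index` (0-based) of CSV text."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in csv.reader(io.StringIO(text)):
        # Only touch rows that actually have the column and hold the old value.
        if len(row) > index and row[index] == old:
            row[index] = new
        writer.writerow(row)
    return out.getvalue()

sample = "a,b,c,d,e,f\n44,2,1,N/A,4,5\n24,sdf,sdf,4,2,254,5\n"
print(replace_field(sample), end="")
```

As in the awk version, rows whose fourth field is not "N/A" pass through unchanged.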

Replace column value based on value in other column

You can use loc to specify where you want to replace, and pass the replaced series to the assignment:

df.loc[df['Stage']=='X', 'Area'] = df['Area'].replace('Q','P')

Output:

   ID Area Stage
0   1    P     X
1   2    P     X
2   3    P     X
3   4    Q     Y
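
A self-contained sketch of the same assignment follows. The input frame is an assumption (the original data is not shown here); it is chosen so that the result matches the output above.

```python
import pandas as pd

# Hypothetical input, chosen to reproduce the output shown above.
df = pd.DataFrame({
    "ID": [1, 2, 3, 4],
    "Area": ["Q", "Q", "P", "Q"],
    "Stage": ["X", "X", "X", "Y"],
})

# Rows where Stage is 'X' receive the replaced values; pandas aligns the
# right-hand Series on the index, so other rows are left untouched.
mask = df["Stage"] == "X"
df.loc[mask, "Area"] = df["Area"].replace("Q", "P")
print(df)
```

The last row keeps its Q because its Stage is Y, so the mask excludes it.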

Replace column values conditional on another column being equal to a vector

We can create the logical vector with %in%, use it to subset 'value', and assign 0 to the selected elements

df$value[df$digit %in% v] <- 0
df
#   name value
# 1    1     0
# 2    2     1
# 3    3     0
# 4    1     0

Or another option is

df$value <- df$value * (!df$digit %in% v)

Or using dplyr

library(dplyr)
df %>%
  mutate(value = replace(value, digit %in% v, 0))

Or with data.table

library(data.table)
setDT(df)[name %chin% v, value := 0]
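
For readers working in pandas rather than R, the same membership test might be sketched with isin. The data below is hypothetical; the names digit, value and v simply mirror the ones used in the R code.

```python
import pandas as pd

# Hypothetical data; column names mirror the R snippets above.
df = pd.DataFrame({"digit": [1, 2, 3, 1], "value": [1, 1, 1, 1]})
v = [1, 3]

# isin builds the boolean mask that %in% builds in R.
df.loc[df["digit"].isin(v), "value"] = 0
print(df["value"].tolist())  # [0, 1, 0, 0]
```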

conditional based find and replace values in fields of csv file

This awk should do:

cat file
2159,23,45,45,13.512034,78.226233

awk -F, -v OFS="," '$1==2159 && $5==13.512034 {$5="13.49694"} $1==2159 && $6==78.226233 {$6="78.22772"} 1' file
2159,23,45,45,13.49694,78.22772

Note that $1 ~ /^2159/ tests whether the field starts with 2159, not whether it equals 2159. For an exact match use $1==2159 or $1 ~ /^2159$/
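
The difference between a prefix match and an exact match can be illustrated in Python with the re module. The helper names below are hypothetical and only serve the illustration.

```python
import re

def starts_with_2159(field):
    # Anchored at the start only, like awk's $1 ~ /^2159/
    return re.match(r"2159", field) is not None

def equals_2159(field):
    # Must match the whole field, like awk's $1 ~ /^2159$/
    return re.fullmatch(r"2159", field) is not None

print(starts_with_2159("21590"), equals_2159("21590"))  # True False
print(starts_with_2159("2159"), equals_2159("2159"))    # True True
```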

Replace value in column B using dictionary if column A is equal to specific value

You can first build the replaced values, then use Series.mask to apply them only to the rows where pos is 2 or 3.

rep = df['result'].replace(replaceValues)
df['result'] = df['result'].mask(df['pos'].isin([2,3]), rep)
print(df)


    pos result
0     1     AA
1     1     AB
2     1     BB
3     2      C
4     2     CA
5     2     AC
6     3      A
7     3      D
8     3      C
9     4     DD
10    4     AB
11    4     BA
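
A minimal runnable sketch of the mask approach is shown below. The frame and the mapping replace_values are hypothetical stand-ins, since the original data and dictionary are not reproduced here.

```python
import pandas as pd

# Hypothetical data and mapping, for illustration only.
df = pd.DataFrame({"pos": [1, 2, 3, 4], "result": ["AA", "AA", "BB", "AA"]})
replace_values = {"AA": "A", "BB": "B"}

# Replace everywhere first, then keep the replaced value only where pos is 2 or 3.
rep = df["result"].replace(replace_values)
df["result"] = df["result"].mask(df["pos"].isin([2, 3]), rep)
print(df["result"].tolist())  # ['AA', 'A', 'B', 'AA']
```

Series.mask swaps in the values from rep wherever the condition is True and keeps the original values elsewhere.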

Efficiently replace values from a column to another column Pandas DataFrame

Using np.where is faster. Using a similar pattern as you used with replace:

df['col1'] = np.where(df['col1'] == 0, df['col2'], df['col1'])
df['col1'] = np.where(df['col1'] == 0, df['col3'], df['col1'])

However, using a nested np.where is slightly faster:

df['col1'] = np.where(df['col1'] == 0,
                      np.where(df['col2'] == 0, df['col3'], df['col2']),
                      df['col1'])
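
To see what the nested np.where does, here is a small self-contained sketch. The three-row frame is hypothetical and is not the data from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical data: zeros in col1 fall back to col2, then to col3.
df = pd.DataFrame({"col1": [0, 0, 5], "col2": [1, 0, 7], "col3": [9, 3, 8]})

df["col1"] = np.where(df["col1"] == 0,
                      np.where(df["col2"] == 0, df["col3"], df["col2"]),
                      df["col1"])
print(df["col1"].tolist())  # [1, 3, 5]
```

Row 0 takes col2 because col1 is 0, row 1 falls through to col3 because col2 is also 0, and row 2 keeps its own value.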

Timings

Using the following setup to produce a larger sample DataFrame and timing functions:

import numpy as np
import pandas as pd

df = pd.concat([df]*10**4, ignore_index=True)

def root_nested(df):
    df['col1'] = np.where(df['col1'] == 0, np.where(df['col2'] == 0, df['col3'], df['col2']), df['col1'])
    return df

def root_split(df):
    df['col1'] = np.where(df['col1'] == 0, df['col2'], df['col1'])
    df['col1'] = np.where(df['col1'] == 0, df['col3'], df['col1'])
    return df

def pir2(df):
    df['col1'] = df.where(df.ne(0), np.nan).bfill(axis=1).col1.fillna(0)
    return df

def pir2_2(df):
    slc = (df.values != 0).argmax(axis=1)
    return df.values[np.arange(slc.shape[0]), slc]

def andrew(df):
    df.col1[df.col1 == 0] = df.col2
    df.col1[df.col1 == 0] = df.col3
    return df

def pablo(df):
    df['col1'] = df['col1'].replace(0,df['col2'])
    df['col1'] = df['col1'].replace(0,df['col3'])
    return df

I get the following timings:

%timeit root_nested(df.copy())
100 loops, best of 3: 2.25 ms per loop

%timeit root_split(df.copy())
100 loops, best of 3: 2.62 ms per loop

%timeit pir2(df.copy())
100 loops, best of 3: 6.25 ms per loop

%timeit pir2_2(df.copy())
1 loop, best of 3: 2.4 ms per loop

%timeit andrew(df.copy())
100 loops, best of 3: 8.55 ms per loop

I tried timing your method, but it's been running for multiple minutes without completing. As a comparison, timing your method on just the 6 row example DataFrame (not the much larger one tested above) took 12.8 ms.

Change one value based on another value in pandas

One option is to use Python's slicing and indexing features to logically evaluate the places where your condition holds and overwrite the data there.

Assuming you can load your data directly into pandas with pandas.read_csv then the following code might be helpful for you.

import pandas
df = pandas.read_csv("test.csv")
df.loc[df.ID == 103, 'FirstName'] = "Matt"
df.loc[df.ID == 103, 'LastName'] = "Jones"

As mentioned in the comments, you can also do the assignment to both columns in one shot:

df.loc[df.ID == 103, ['FirstName', 'LastName']] = 'Matt', 'Jones'
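
A self-contained sketch of the one-shot assignment follows. It reads a small in-memory CSV instead of test.csv; the sample rows are hypothetical.

```python
import io
import pandas as pd

# Hypothetical stand-in for the contents of test.csv.
csv_text = "ID,FirstName,LastName\n101,Ann,Lee\n103,Mat,Jone\n"
df = pd.read_csv(io.StringIO(csv_text))

# One boolean row selector, two target columns, one value per column.
df.loc[df["ID"] == 103, ["FirstName", "LastName"]] = ["Matt", "Jones"]
print(df)
```

Only the row with ID 103 changes; the other row is left as it was.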

Note that you'll need pandas version 0.11 or newer to use loc for overwrite assignment. On older versions such as 0.8, chained assignment (shown next) is the correct way to do it, which is why it is useful to know about even though it should be avoided in modern versions of pandas.


Another way to do it is to use what is called chained assignment. Its behavior is less predictable, because the first indexing step may return a copy rather than a view, so the assignment can silently fail to modify df. It is explicitly discouraged in the docs and is not considered the best solution, but it is useful to know about:

import pandas
df = pandas.read_csv("test.csv")
df['FirstName'][df.ID == 103] = "Matt"
df['LastName'][df.ID == 103] = "Jones"

EXCEL - Replace function | Replace value from one column to replace with other column value

EDIT: In an English-locale Excel, replace ; with , in the formula.

You can use SEARCH to get the position of the $ and REPLACE to overwrite everything from that position onward.

For example with schema name in column A and schema value in column B -> result in column C.

To find the $, use SEARCH("$";A2), which returns its position.
Then you can count the number of characters to replace, starting at the "$", by subtracting that position from the length of the schema name given by LEN() and adding 1 so that the last character is included.

Then you can combine everything :

=REPLACE(A2;SEARCH("$";A2);LEN(A2)-SEARCH("$";A2)+1;B2)
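
The logic of the formula can be sketched in Python to make the arithmetic easier to follow. The function name and the sample strings are hypothetical; the point is only that everything from the first "$" to the end of the name is replaced by the value.

```python
def replace_from_dollar(name, value):
    """Mimic REPLACE(A2; SEARCH("$";A2); LEN(A2)-SEARCH("$";A2)+1; B2)."""
    pos = name.find("$")  # 0-based here; Excel's SEARCH is 1-based
    if pos == -1:
        # Excel would return an error when "$" is not found.
        raise ValueError("no '$' in name")
    # Keep the text before "$" and append the replacement value.
    return name[:pos] + value

print(replace_from_dollar("schema_$old", "new"))  # schema_new
```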

Result in my Excel: [screenshot: Excel example]


