Replace Missing Values with a Value from Another Column

Manually replace missing value in a column based on another column

Does this work? I wasn't sure whether mark is NA for every row where geo == 'ny' in your data, so I added & is.na(mark) so that existing values are left alone.

library(dplyr)
df %>% mutate(mark = case_when(geo == 'ny' & is.na(mark) ~ 'toyota', TRUE ~ mark))
# A tibble: 5 x 3
geo mark value
<chr> <chr> <dbl>
1 texas nissan 2
2 texas nissan 78
3 ny toyota 65
4 ny toyota 15
5 ca audi 22

Replace missing values with values from another column in Julia Dataframe

You can use coalesce:

julia> df = DataFrame(x = [0, missing, 2], y=[2, 4, 6])
3×2 DataFrame
Row │ x y
│ Int64? Int64
─────┼────────────────
1 │ 0 2
2 │ missing 4
3 │ 2 6

julia> df.x .= coalesce.(df.x, df.y)
3-element Array{Union{Missing, Int64},1}:
0
4
2

julia> df
3×2 DataFrame
Row │ x y
│ Int64? Int64
─────┼───────────────
1 │ 0 2
2 │ 4 4
3 │ 2 6

Or, if you prefer piping-friendly functions:

julia> df = DataFrame(x = [0, missing, 2], y=[2, 4, 6])
3×2 DataFrame
Row │ x y
│ Int64? Int64
─────┼────────────────
1 │ 0 2
2 │ missing 4
3 │ 2 6

julia> transform!(df, [:x, :y] => ByRow(coalesce) => :x)
3×2 DataFrame
Row │ x y
│ Int64 Int64
─────┼──────────────
1 │ 0 2
2 │ 4 4
3 │ 2 6

And this does the same thing without requiring you to remember coalesce:

julia> df = DataFrame(x = [0, missing, 2], y=[2, 4, 6])
3×2 DataFrame
Row │ x y
│ Int64? Int64
─────┼────────────────
1 │ 0 2
2 │ missing 4
3 │ 2 6

julia> transform!(df, [:x, :y] => ByRow((x,y) -> ismissing(x) ? y : x) => :x)
3×2 DataFrame
Row │ x y
│ Int64 Int64
─────┼──────────────
1 │ 0 2
2 │ 4 4
3 │ 2 6

How to fill missing values relative to a value from another column

I'm not sure this is the best way, but here is one way to do it:

age_series = df['Age'].copy()
df.loc[(df['Country'] == 'China') & (df['Age'].isnull()), 'Age'] = age_series.mean()
df.loc[(df['Country'] == 'USA') & (df['Age'].isnull()), 'Age'] = age_series.median()

Note that I copied the Age column beforehand so that the median is computed from the original Age series, not from the series after the China rows have been filled with the mean. This is the final result:

    Country     Age
0 USA 33.500000
1 EU 15.000000
2 China 35.000000
3 USA 45.000000
4 EU 30.000000
5 China 40.583333
6 USA 28.000000
7 EU 26.000000
8 China 78.000000
9 USA 65.000000
10 EU 53.000000
11 China 66.000000
12 USA 32.000000
13 EU NaN
14 China 14.000000
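
As a variation on the approach above (a sketch, not the answer's own code), both statistics can be computed before any filling, which removes the need for a separate copy. The sample data and the name fill_values are made up for illustration; only the column names come from the answer.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column names follow the answer above.
df = pd.DataFrame({
    "Country": ["USA", "EU", "China", "USA", "EU", "China"],
    "Age": [np.nan, 15.0, 35.0, 45.0, np.nan, np.nan],
})

# Compute both statistics up front from the original, unfilled column.
fill_values = {"China": df["Age"].mean(), "USA": df["Age"].median()}

# Map each row's country to its fill value (NaN where no rule exists),
# then use that value only where Age is missing.
df["Age"] = df["Age"].fillna(df["Country"].map(fill_values))

print(df)
```

Countries without an entry in fill_values (EU here) keep their NaN, which matches row 13 in the output above.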

Replace missing values with a value from another column

ifelse(test, yes, no) is a handy function to do just that, and it can be used on vectors. Using your last data.frame:

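s <- data.frame(ID = c(191, 282, 202, 210),
                Group = c("", "A", "", "B"),
                Group2 = c("D", "G", "G", "D"),
                stringsAsFactors = FALSE)  # before R 4.0 strings default to factors, which ifelse would turn into integer codes

s$Group <- ifelse(test = s$Group != "", yes = s$Group, no = s$Group2)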

The first argument is the test. For each value in the vector, if the test is true, then it will take the value in yes, otherwise it will take the value in no.
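
For readers working in pandas rather than R, the same element-wise test/yes/no pattern can be sketched with numpy.where. This is an analogue in a different language, not part of the original answer; the data mirrors the data.frame above.

```python
import numpy as np
import pandas as pd

# Same data as the R data.frame above.
s = pd.DataFrame({
    "ID": [191, 282, 202, 210],
    "Group": ["", "A", "", "B"],
    "Group2": ["D", "G", "G", "D"],
})

# np.where(test, yes, no): take Group where it is non-empty, else Group2.
s["Group"] = np.where(s["Group"] != "", s["Group"], s["Group2"])

print(s)
```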

Python Pandas replace NaN in one column with value from corresponding row of second column

Assuming your DataFrame is in df:

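# Assign the result back: fillna(..., inplace=True) on a selected column may not update df in recent pandas versions
df['Temp_Rating'] = df['Temp_Rating'].fillna(df['Farheit'])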
del df['Farheit']
df.columns = 'File heat Observations'.split()

First replace any NaN values in Temp_Rating with the corresponding value of df.Farheit. Then delete the 'Farheit' column. Finally, rename the columns.
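
The three steps can be tried on a small made-up frame. The data values and the original column order are assumptions for illustration; only the column names Temp_Rating and Farheit and the final names come from the answer.

```python
import numpy as np
import pandas as pd

# Hypothetical data; the original question's values are not shown here.
df = pd.DataFrame({
    "File": ["a", "b", "c"],
    "Temp_Rating": [1.0, np.nan, 3.0],
    "Farheit": [10.0, 20.0, 30.0],
    "Observations": [5, 6, 7],
})

# Step 1: fill gaps in Temp_Rating from the same row of Farheit.
df["Temp_Rating"] = df["Temp_Rating"].fillna(df["Farheit"])

# Step 2: drop the helper column.
df = df.drop(columns="Farheit")

# Step 3: rename the remaining columns.
df.columns = "File heat Observations".split()

print(df)
```

Only the second row of heat changes (to 20.0), because fillna aligns the two columns on the row index and touches nothing that already has a value.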

How do I replace missing value with values from another column in R

You could do the following:

df$var1[is.na(df$var1)] <- df$var2[is.na(df$var1)]

Replace a value NA with the value from another column in R

Perhaps the easiest to read/understand answer in R lexicon is to use ifelse. So borrowing Richard's dataframe we could do:

df <- structure(list(A = c(56L, NA, NA, 67L, NA),
B = c(75L, 45L, 77L, 41L, 65L),
Year = c(1921L, 1921L, 1922L, 1923L, 1923L)),.Names = c("A",
"B", "Year"), class = "data.frame", row.names = c(NA, -5L))
df$A <- ifelse(is.na(df$A), df$B, df$A)

Replace missing values from another column - pandas

Assuming you are using pandas, you could do this:

for col in ['col1', 'col2']:
    df[col] = df[col].fillna(df[col+'_suffix'])

A more generic version:

for col in df.columns:
    if col+'_suffix' in df.columns:
        df[col] = df[col].fillna(df[col+'_suffix'])
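
A minimal runnable sketch of the generic version, on a made-up frame in which col1 has a fallback column and col2 does not. The data and the variable name fallback are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: 'col1' has a 'col1_suffix' fallback, 'col2' has none.
df = pd.DataFrame({
    "col1": [1.0, np.nan, 3.0],
    "col1_suffix": [10.0, 20.0, 30.0],
    "col2": [np.nan, 5.0, 6.0],
})

for col in list(df.columns):
    fallback = col + "_suffix"
    # Only fill columns that actually have a matching fallback column.
    if fallback in df.columns:
        df[col] = df[col].fillna(df[fallback])

print(df)
```

Here col1 is filled from col1_suffix, while col2 keeps its NaN because no col2_suffix column exists.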

Replace values of rows with missing values by values of another row

In general, if you find yourself looping over a data frame, there is probably a more efficient solution: either use vectorised functions, as Jonathan does in his answer, or use dplyr as follows.

We check whether A is NA: if so, c is set to B; otherwise it is set to A.

library(dplyr)
dat %>% mutate(c = if_else(is.na(A), B, A))
       A      B      c
1 13 A 1 15 A 2 13 A 1
2 15 A 2 15 A 2 15 A 2
3   <NA> 15 A 8 15 A 8
4 10 B 3 15 A 2 10 B 3
5   <NA> 15 A 5 15 A 5

Power Query Replace null values with values from another column

I'm sure there must be some way to do this with the ReplaceValue function, but I think it might be easier to do the following:

1: Create a new column with the definition NewData6 = if [Data.Column6] = null then [Data.Column7] else [Data.Column6]
2: Do the same for column 8: NewData8 = if [Data.Column8] = null then [Data.Column7] else [Data.Column8]
3: Delete Data.Column6, Data.Column7 and Data.Column8
4: Rename the newly made columns if necessary.

You can do these steps either in the Advanced Editor or with the Custom Column button on the Add Column tab.


