Replace Row Value with Empty String If Duplicate

Replace row value with empty string if duplicate

Often, this type of transformation is better done at the application layer, because the result-set is not "SQL-ish". That is, the ordering is important for understanding the rows.
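For comparison, here is a minimal sketch of the application-layer approach in Python. It assumes the rows arrive already sorted by product code as `(code, color)` tuples; the function name `blank_repeated_keys` and the sample data are illustrative and not part of the original question.

```python
from itertools import groupby
from operator import itemgetter


def blank_repeated_keys(rows, blank=""):
    """Keep the key only on the first row of each run of equal keys.

    `rows` is an iterable of (key, value) tuples that is assumed to be
    sorted by key already, as an ORDER BY would deliver them.
    """
    result = []
    for key, group in groupby(rows, key=itemgetter(0)):
        for position, (_, value) in enumerate(group):
            # Show the key on the first row of the run, blank it afterwards.
            result.append((key if position == 0 else blank, value))
    return result


# Hypothetical sample data, sorted by product code.
rows = [("P1", "Red"), ("P1", "Blue"), ("P2", "Green"), ("P2", "Black"), ("P3", "White")]
for code, color in blank_repeated_keys(rows):
    print(f"{code:<4}{color}")
```

Because the blanking happens after the rows are fetched, the query itself can stay a plain `select ... order by`.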

But, you can do this as:

select (case when row_number() over (partition by p.ProductCode order by p.Color) = 1
             then p.ProductCode
        end) as ProductCode,
       p.Color
from Product p
-- sort by the table column, not the output alias, which is NULL on repeated rows
order by p.ProductCode, p.Color;

SQL Query replace duplicates with NULL or empty string ORACLE

You can use the LAG() window function:

select 
nullif(Month, lag(Month) over (order by null)) Month,
Product
from tablename

Results:

> MONTH | PRODUCT
> :---- | :-----------------------------
> March | ENVOY & External Keyboard (22)
>       | ENVOY & External Keyboard (22)
>       | ENVOY & External Keyboard (22)
>       | ENVOY & External Keyboard (22)
>       | ENVOY & External Keyboard (22)
>       | ENVOY & External Keyboard (22)
> April | ENVOY & External Keyboard (22)
>       | ENVOY & External Keyboard (22)
>       | ENVOY & External Keyboard (22)
>       | ENVOY & External Keyboard (22)
>       | ENVOY & External Keyboard (22)
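The same `NULLIF(..., LAG(...))` pattern can be tried locally with Python's built-in `sqlite3` module. This is only a sketch: it assumes a SQLite build with window-function support (3.25 or later), and the `sales` table, its `id` column and the sample rows are made up for illustration. Note that `order by null` leaves the row order unspecified, so the sketch orders by an explicit column instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (id INTEGER PRIMARY KEY, month TEXT, product TEXT)")
conn.executemany(
    "INSERT INTO sales (id, month, product) VALUES (?, ?, ?)",
    [(1, "March", "A"), (2, "March", "B"), (3, "April", "C"), (4, "April", "D")],
)

# LAG(month) is the previous row's month; NULLIF returns NULL when both match.
query = """
    SELECT NULLIF(month, LAG(month) OVER (ORDER BY id)) AS month,
           product
    FROM sales
    ORDER BY id
"""
result = conn.execute(query).fetchall()
for month, product in result:
    print(month, product)
conn.close()
```

On the first row `LAG` returns NULL, and `NULLIF` with a NULL second argument returns its first argument, so the first month is kept.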

Replace duplicated values in a column with blank

You can use duplicated like this:

df$E[duplicated(df$E)] <- ""

> df
  A B C D E
1 1 2 5 6 7
2 1 3 6 5  
3 1 4 7 4  
4 2 1 3 3 6
5 2 2 4 5  
6 3 1 2 2 5
7 3 2 1 3  

data

df <- read.table(text="   A  B  C  D  E
1 2 5 6 7
1 3 6 5 7
1 4 7 4 7
2 1 3 3 6
2 2 4 5 6
3 1 2 2 5
3 2 1 3 5",header=TRUE,stringsAsFactors=FALSE)
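To make the behaviour of `duplicated` concrete, here is a rough Python sketch of the same idea: a value is flagged once it has already been seen earlier in the column. The helper name `duplicated` is borrowed from R for readability; this is an illustration, not R's implementation.

```python
def duplicated(values):
    """Return a list of booleans, True where a value has been seen earlier."""
    seen = set()
    flags = []
    for value in values:
        flags.append(value in seen)
        seen.add(value)
    return flags


column_e = [7, 7, 7, 6, 6, 5, 5]  # column E from the data above
blanked = ["" if dup else value for value, dup in zip(column_e, duplicated(column_e))]
print(blanked)  # [7, '', '', 6, '', 5, '']

# Non-adjacent repeats are flagged too, unlike a compare-with-previous-row test.
print(duplicated([7, 6, 7]))  # [False, False, True]
```

This means the approach blanks every later occurrence in the column, whether or not the repeats sit on adjacent rows.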

Replace the duplicate data in a row with NA except the first

Using apply with MARGIN = 1 (row-wise), we can replace the duplicated values in each row with NA.

t(apply(df, 1, function(x) replace(x, duplicated(x), NA)))

#      [,1] [,2] [,3]
# [1,]    2    4   NA
# [2,]    1    5   NA
# [3,]    3   NA    4

Replace all duplicate rows with NaN or blank

Use Series.where:

m = df['Decision'].ne(df['Decision'].shift()) 
df['Decision'] = df['Decision'].where(m, '')
print(df)
   price1  price2 Decision
0      50      50      NaN
1     100     200      buy
2      70     140         
3     150     200         
4     150      50     sell
5      60      20         
6      30      70      buy
7      60     100         

Or:

import numpy as np

m = df['Decision'].ne(df['Decision'].shift())
df['Decision'] = np.where(m, df['Decision'], '')

Pandas groupby and replace duplicates with empty string

You can build a mask that is True wherever a value differs from the value in the row above, then use where to replace everything else with a blank string:

df[['one','two']] = df[['one','two']].where(df[['one', 'two']].apply(lambda x: x != x.shift()), '')

>>> df
  one two letter
0   1   a      a
1              b
2              c
3       b      a
4   2   a      a
5              b
6       b      a
7              b

Some explanation:

The mask looks like this:

>>> df[['one', 'two']].apply(lambda x: x != x.shift())
     one    two
0   True   True
1  False  False
2  False  False
3  False   True
4   True   True
5  False  False
6  False   True
7  False  False

where keeps the values at the positions where the mask is True and replaces all the others with ''.

R - find all duplicates in row and replace

You can try this:

as.data.frame(t(apply(df, 1, function(x) {x[x %in% x[duplicated(x)]] <- ''; x})))

to get

  X1 X2 X3 X4 X5
x  1  2        4
y        2  3  4

If you want to retain the integer type for each column, try this:

as.data.frame(t(apply(df, 1, function(x) {x[x %in% x[duplicated(x)]] <- NA; x})))

to get

  X1 X2 X3 X4 X5
x  1  2 NA NA  4
y NA NA  2  3  4
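Unlike the earlier snippets, this one blanks every occurrence of a repeated value, including the first. A small Python sketch of that rule follows. The input rows are hypothetical, chosen only so that the result has the same pattern as the output above, and `blank_all_duplicates` is an illustrative name.

```python
from collections import Counter


def blank_all_duplicates(row, blank=None):
    """Replace every value that occurs more than once in `row` with `blank`."""
    counts = Counter(row)
    return [blank if counts[value] > 1 else value for value in row]


# Hypothetical input rows (the original input data is not shown).
rows = {"x": [1, 2, 3, 3, 4], "y": [1, 1, 2, 3, 4]}
for name, row in rows.items():
    print(name, blank_all_duplicates(row))
```

Counting occurrences first also copes with rows that contain more than one repeated value, which a plain equality test against a single duplicated value would not.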

