Replace value with the name of its respective column
The code below replaces every "true" value (a character string) with the name of its respective column.
##Replace every "true" value with its respective column name
w <- which(df=="true",arr.ind=TRUE)
df[w] <- names(df)[w[,"col"]]
Replace column values with column name using dplyr's transmute_all
If you want to stick with a dplyr solution, you almost had it already:
library(dplyr)
df <- data_frame(a = c(NA, 1, NA, 1, 1), b = c(1, NA, 1, 1, NA))
df %>%
transmute_all(funs(ifelse(. == 1, deparse(substitute(.)), NA)))
#> # A tibble: 5 x 2
#> a b
#> <chr> <chr>
#> 1 <NA> b
#> 2 a <NA>
#> 3 <NA> b
#> 4 a b
#> 5 a <NA>
Replace value by column name for many columns using R and dplyr
An option is to use tidyr::gather and then summarise using dplyr:
library(dplyr)
library(tidyr)
df %>% gather(feelings, value, -id) %>% #Change to long format
filter(value) %>% #Filter for value which are TRUE
group_by(id) %>%
summarise(feelings= paste0(feelings,collapse=","))
# id feelings
# <chr> <chr>
# 1 a tired
# 2 b excited
# 3 c tired,lonely,excited
Replace specific values in pandas dataframe with the corresponding column name, based on a condition
IIUC, try adding the brackets and 'and', then mask out the 'yes' values and radd the column names:
new_df = (' and (' + df + ')').mask(df.eq('yes'), '').radd(df.columns)
new_df:
column1 column2
0 column1 NaN
1 column1 and (some_string) NaN
2 NaN column2
Breakdown of steps:
new_df = ' and (' + df + ')'
column1 column2
0 and (yes) NaN
1 and (some_string) NaN
2 NaN and (yes)
mask:
new_df = new_df.mask(df.eq('yes'), '')
             column1 column2
0                        NaN
1  and (some_string)     NaN
2                NaN
radd:
new_df = new_df.radd(df.columns)
column1 column2
0 column1 NaN
1 column1 and (some_string) NaN
2 NaN column2
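To run the three steps above end to end, here is a minimal sketch. The input frame is an assumption reconstructed from the printed outputs (the original df is not shown in the answer), so the real data may have looked different.

```python
import numpy as np
import pandas as pd

# Hypothetical input, reconstructed from the outputs shown above.
df = pd.DataFrame({
    "column1": ["yes", "some_string", np.nan],
    "column2": [np.nan, np.nan, "yes"],
})

# Step 1: wrap every non-missing string as " and (<value>)".
wrapped = " and (" + df + ")"

# Step 2: blank out the cells whose original value was "yes".
masked = wrapped.mask(df.eq("yes"), "")

# Step 3: prepend each column's own name (radd computes other + self).
new_df = masked.radd(df.columns)

print(new_df)
```

Missing cells stay missing throughout, because string concatenation with NaN yields NaN.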
How to replace a value in a pandas dataframe with column name based on a condition?
One way could be to use replace and pass in a Series mapping column labels to values (those same labels in this case):
>>> dfz.loc[:, 'A':'D'].replace(1, pd.Series(dfz.columns, dfz.columns))
A B C D
0 A B C D
1 0 0 0 0
2 0 0 0 0
3 A B C D
4 0 0 3 0
5 0 B C 0
To make the change permanent, you'd assign the returned DataFrame back to dfz.loc[:, 'A':'D'].
Solutions aside, it's useful to keep in mind that you may lose a lot of performance benefits when you mix numeric and string types in columns, as pandas is forced to use the generic 'object' dtype to hold the values.
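As a self-contained illustration of the replace approach and of the dtype note above, here is a small sketch. The data is an assumption reconstructed from the printed output, the extra column E is hypothetical, and a plain dict is used in place of the Series (both map a column label to its replacement value).

```python
import pandas as pd

# Hypothetical data, reconstructed from the output shown above,
# plus an extra column "E" that should stay untouched.
dfz = pd.DataFrame({
    "A": [1, 0, 0, 1, 0, 0],
    "B": [1, 0, 0, 1, 0, 1],
    "C": [1, 0, 0, 1, 3, 1],
    "D": [1, 0, 0, 1, 0, 0],
    "E": [1, 1, 1, 1, 1, 1],
})

cols = ["A", "B", "C", "D"]

# Map each column label to itself; replace() then swaps every 1 in
# column "A" for "A", every 1 in column "B" for "B", and so on.
replaced = dfz[cols].replace(1, {c: c for c in cols})

# Write the result back; the four columns now mix ints and strings.
dfz[cols] = replaced

print(dfz)
print(dfz.dtypes)
```

The printed dtypes show the cost mentioned above: A to D are now object columns, while E remains an integer column.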
Changing cell values in data table with column names (R)?
Here's a tidyverse/purrr option:
map2_df(DT, names(DT), ~ replace(.x, .x==1, .y) %>% replace(. == 0, NA))
# A tibble: 5 x 4
names a b c
<chr> <chr> <chr> <chr>
1 n1 NA b c
2 n2 NA NA NA
3 n3 a NA NA
4 n4 a b c
5 n5 NA NA c
Replacing column values based on a corresponding column r
Maybe put your data in long form:
library(data.table)
setDT(df.wide)
dt.long = melt(df.wide, meas=patterns(IM = "^IM", LV = "^LV"))
dt.long[, variable := c("A","B","C")[variable]]
title variable IM LV
1: A A 0.5 0.7
2: B A 0.1 0.0
3: C A 4.6 2.5
4: D A 5.6 5.0
5: A B 0.2 1.0
6: B B 0.4 2.0
7: C B 2.6 4.5
8: D B 2.2 5.0
9: A C 2.0 3.0
10: B C 1.0 2.0
11: C C 3.0 5.0
12: D C 4.0 1.0
From here, it is easy to make the edit:
dt.long[IM < 2.5, LV := 0]
If you want to use tidyr: As far as I know, gather does not support creating two columns when converting to long form. The next generation of the function, pivot_longer, might.
I would suggest continuing to work with the data in long format as long as possible to avoid further fiddling with variable names, but if you need to get back to wide format, there's...
res = dcast(dt.long, title ~ variable, value.var=c("IM", "LV"), sep=".")
title IM.A IM.B IM.C LV.A LV.B LV.C
1: A 0.5 0.2 2 0.0 0.0 0
2: B 0.1 0.4 1 0.0 0.0 0
3: C 4.6 2.6 3 2.5 4.5 5
4: D 5.6 2.2 4 5.0 0.0 1
Further steps are needed if you want the same column order:
setcolorder(res, names(df.wide))
title IM.A LV.A IM.B LV.B IM.C LV.C
1: A 0.5 0.0 0.2 0.0 2 0
2: B 0.1 0.0 0.4 0.0 1 0
3: C 4.6 2.5 2.6 4.5 3 5
4: D 5.6 5.0 2.2 0.0 4 1
Replace value with the average of its column with Pandas
The first thing to recognize is that the columns containing 'x' are not integer columns; pandas reads them with the object dtype.
df = pd.read_csv('file.csv')
df
Col1 Col2
0 1 22
1 2 44
2 3 x
3 4 88
4 5 110
5 6 132
6 7 x
7 8 176
8 9 198
9 10 x
df.dtypes
Col1 int64
Col2 object
dtype: object
In order to get the mean of Col2, it needs to be converted to a numeric value.
df['Col2'] = pd.to_numeric(df['Col2'], errors='coerce').astype('Int64')
df.dtypes
Col1 int64
Col2 Int64
dtype: object
The df now looks like so:
df
Col1 Col2
0 1 22
1 2 44
2 3 <NA>
3 4 88
4 5 110
5 6 132
6 7 <NA>
7 8 176
8 9 198
9 10 <NA>
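Now we can use fillna() with df['Col2'].mean(). Here the mean is exactly 110, so it fits in the Int64 column; a fractional mean would not, and the column would have to stay float instead: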
df['Col2'] = df['Col2'].fillna(df['Col2'].mean())
df
Col1 Col2
0 1 22
1 2 44
2 3 110
3 4 88
4 5 110
5 6 132
6 7 110
7 8 176
8 9 198
9 10 110
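For the case where the mean is not a whole number, a variant that keeps the column as float may be safer. This is a minimal sketch under that assumption; the CSV content is inlined with io.StringIO only so the example runs without file.csv.

```python
import io
import pandas as pd

# Same data as above, inlined so the sketch runs without file.csv.
csv_text = (
    "Col1,Col2\n"
    "1,22\n2,44\n3,x\n4,88\n5,110\n"
    "6,132\n7,x\n8,176\n9,198\n10,x\n"
)
df = pd.read_csv(io.StringIO(csv_text))

# Coerce "x" to NaN; the column becomes float64.
col2 = pd.to_numeric(df["Col2"], errors="coerce")

# Fill the gaps with the mean of the remaining values (NaN is skipped).
df["Col2"] = col2.fillna(col2.mean())

print(df)
```

If integers are required afterwards, the filled column can be rounded and cast once the missing values are gone.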
Replace column values according to corresponding values of other column in Pandas
Use mask to replace all non-missing values, with pop to extract the Data column:
df = pd.DataFrame({
'A':[4,5] + [np.nan] * 4,
'B':[np.nan,np.nan,9,4,np.nan,np.nan],
'C':[np.nan] * 4 + [7,0],
'Data':list('aaabbb')
})
print (df)
A B C Data
0 4.0 NaN NaN a
1 5.0 NaN NaN a
2 NaN 9.0 NaN a
3 NaN 4.0 NaN b
4 NaN NaN 7.0 b
5 NaN NaN 0.0 b
df = df.mask(df.notnull(), df.pop('Data'), axis=0)
print (df)
A B C
0 a NaN NaN
1 a NaN NaN
2 NaN a NaN
3 NaN b NaN
4 NaN NaN b
5 NaN NaN b
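The one-liner above relies on df.notnull() being evaluated before df.pop('Data') removes the column. A slightly more explicit sketch of the same idea, with the pop done first, could look like this (the names labels and result are introduced here for illustration only):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "A": [4, 5] + [np.nan] * 4,
    "B": [np.nan, np.nan, 9, 4, np.nan, np.nan],
    "C": [np.nan] * 4 + [7, 0],
    "Data": list("aaabbb"),
})

# Take the label column out first, so the frame holds only A, B, C.
labels = df.pop("Data")

# Where a cell is not missing, substitute that row's label;
# axis=0 aligns the Series with the row index.
result = df.mask(df.notna(), labels, axis=0)

print(result)
```

Splitting the pop from the mask call makes the order of operations visible and keeps the condition frame the same shape as the frame being masked.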