Delete a column in a data frame within a list
Assuming your list is called myList, something like this should work:
lapply(myList, function(x) { x["ID"] <- NULL; x })
Update
For a more general solution, you can also use something like this:
# Sample data
myList <- list(A = data.frame(ID = c("A", "A"),
                              Test = c(1, 1),
                              Value = 1:2),
               B = data.frame(ID = c("B", "B", "B"),
                              Test = c(1, 3, 5),
                              Value = 1:3))
# Keep just the "ID" and "Value" columns
lapply(myList, function(x) x[(names(x) %in% c("ID", "Value"))])
# Drop the "ID" and "Value" columns
lapply(myList, function(x) x[!(names(x) %in% c("ID", "Value"))])
How to remove specific columns from a list of dataframes when all of these columns do not exist in each dataframe
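One approach is any_of(), which silently skips names that are missing from a given data frame. Note that the call below keeps the listed columns; use select(-any_of(...)) to drop them instead: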
library(tidyverse)
map(dat_list, function(xx) xx %>% select(any_of(c("x", "X4"))))
# [[1]]
# x
# 1 7
# 2 8
# 3 9
# [[2]]
# data frame with 0 columns and 3 rows
# [[3]]
# x X4
# 1 7 10
# 2 8 11
# 3 9 12
Remove columns from data.frame that are of type list
If you need to use logical indexing:
df[,!purrr::map_lgl(df,is.list)] %>%
names()
[1] "CATEGORY" "BIBTEXKEY" "ADDRESS" "ANNOTE" "BOOKTITLE"
[6] "CHAPTER" "CROSSREF" "EDITION" "HOWPUBLISHED" "INSTITUTION"
[11] "JOURNAL" "KEY" "MONTH" "NOTE" "NUMBER"
[16] "ORGANIZATION" "PAGES" "PUBLISHER" "SCHOOL" "SERIES"
[21] "TITLE" "TYPE" "VOLUME" "YEAR" "ISSN"
[26] "DOI" "ISBN" "URL"
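You can also do df %>% select_if(Negate(is.list)). In dplyr 1.0.0 and later, where select_if() is superseded, the equivalent is df %>% select(where(Negate(is.list))).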
Also, as mentioned by @akrun, you can simply use discard from purrr:
purrr::discard(dat, is.list)
Or, as @markus points out, we can use keep and negate:
keep(dat, negate(is.list))
Otherwise, we can unnest:
library(tidyverse)
df %>%
unnest(AUTHOR) %>%
select(-AUTHOR)
Can't change/remove a column in a list of dataframes using lapply and dplyr select?
Both commands work for me:
library(dplyr)
vec1 <- c("a", "b", "c", "d")
val <- c(11, 5443, 552, 9)
vec2 <- c("r", "p", "h", "y")
val <- c(5, 9, 47, 23)
df1 <- data.frame(vec1, val, stringsAsFactors = FALSE)
df2 <- data.frame(vec2, val, stringsAsFactors = FALSE)
L <- list(df1, df2)
L
# [[1]]
# vec1 val
# 1 a 5
# 2 b 9
# 3 c 47
# 4 d 23
#
# [[2]]
# vec2 val
# 1 r 5
# 2 p 9
# 3 h 47
# 4 y 23
lapply(L, function (y) {y$val <- NULL; y})
# [[1]]
# vec1
# 1 a
# 2 b
# 3 c
# 4 d
#
# [[2]]
# vec2
# 1 r
# 2 p
# 3 h
# 4 y
lapply(L, function (y) {select(y, -val)})
# [[1]]
# vec1
# 1 a
# 2 b
# 3 c
# 4 d
#
# [[2]]
# vec2
# 1 r
# 2 p
# 3 h
# 4 y
Removing a column permanently from a data frame in Python
drop() returns a new data frame and leaves the original untouched, so you have to assign the result back to mydf if you want the change to be permanent, i.e. do
mydf = mydf.drop('Z', axis=1)
instead.
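As a minimal sketch of that behaviour (the frame contents and the columns X and Y are made up for illustration; only mydf and 'Z' come from the answer above), the following shows that a bare drop() call leaves mydf unchanged, while reassigning makes the removal stick:

```python
import pandas as pd

# Hypothetical sample frame; only the column name "Z" comes from the answer.
mydf = pd.DataFrame({"X": [1, 2], "Y": [3, 4], "Z": [5, 6]})

# drop() returns a new frame; without assignment the result is discarded.
mydf.drop("Z", axis=1)
cols_after_plain_call = list(mydf.columns)   # "Z" is still present

# Assigning the result back makes the change permanent.
mydf = mydf.drop("Z", axis=1)
cols_after_assign = list(mydf.columns)       # "Z" is gone

print(cols_after_plain_call)
print(cols_after_assign)
```

Passing inplace=True to drop() is another option, though reassignment is usually easier to read and chain.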
Remove dataframe rows containing a specific value from a list
You can approach this in the following steps:
1. Use pd.Series.explode() on each column to expand the lists of strings into multiple rows, so that every row holds only plain strings.
2. Check the exploded dataframe for strings in the to_delete list with .isin().
3. Group by index level 0 (which holds the original row index from before the explode) and aggregate the matches back into one row per original row, using .sum() under groupby().
4. Apply .sum(axis=1) to count the matches in each row.
5. Rows with 0 matches are the ones to retain; comparing the counts with .eq(0) forms a boolean index for them.
6. Finally, use .loc to keep only the rows without a match.
df.loc[df.apply(pd.Series.explode).isin(to_delete).groupby(level=0).sum().sum(axis=1).eq(0)]
Result:
A B C D E
1 string2 string5 [string8] [string13] [string16]
The original dataframe can be built for testing with the following code:
import pandas as pd

data = {'A': ['string1', 'string2', 'string3'],
        'B': ['string4', 'string5', 'string6'],
        'C': [['string7', 'string10'], ['string8'], ['string9']],
        'D': [['string11', 'string 12'], ['string13'], ['string14']],
        'E': [['string15'], ['string16'], ['string17']]}
df = pd.DataFrame(data)
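A more explicit, column-by-column variant of the same explode / isin / groupby idea may be easier to follow. This is only a sketch: the contents of to_delete are not given above, so the two values used here are hypothetical, chosen so that only row 1 survives as in the result shown. The names mask and result are also illustrative.

```python
import pandas as pd

data = {'A': ['string1', 'string2', 'string3'],
        'B': ['string4', 'string5', 'string6'],
        'C': [['string7', 'string10'], ['string8'], ['string9']],
        'D': [['string11', 'string 12'], ['string13'], ['string14']],
        'E': [['string15'], ['string16'], ['string17']]}
df = pd.DataFrame(data)

# Hypothetical list of values to delete (not given in the original answer).
to_delete = ['string10', 'string17']

# mask[i] becomes True once any cell in row i contains a value to delete.
mask = pd.Series(False, index=df.index)
for col in df.columns:
    # Expand list cells into one row per element; scalar cells pass through.
    exploded = df[col].explode()
    # Collapse back to one flag per original row (index level 0).
    hit = exploded.isin(to_delete).groupby(level=0).any()
    mask |= hit

# Keep only the rows without any match.
result = df.loc[~mask]
print(result)
```

Handling one column at a time avoids aligning exploded columns of different lengths against each other, at the cost of an explicit loop.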
tidyverse - delete a column within a nested column/list
The suggestion by @r2evans would work if we remove the group attribute
library(dplyr)
library(purrr)
cor_dat %>%
ungroup %>%
mutate(cor = map(cor, ~ select(.x, -rowname)))
# A tibble: 4 x 2
# grp cor
# <int> <list>
#1 1 <tibble [6 × 6]>
#2 2 <tibble [6 × 6]>
#3 3 <tibble [6 × 6]>
#4 4 <tibble [6 × 6]>
When the group attribute is still present, the same call results in an error:
cor_dat %>%
mutate(cor = map(cor, ~ select(.x, -rowname)))
Error: `mutate()` argument `cor` errored.
ℹ `cor` is `map(cor, ~select(.x, -rowname))`.
ℹ The error occured in row 1.
✖ no applicable method for 'select_' applied to an object of class "character"
Run `rlang::last_error()` to see where the error occurred.
which is consistent with the same behavior if we extract as a column
cor_dat$cor %>%
map(~ .x %>% select(-rowname))
Or, if we want to make it shorter, it can be done within condense itself, because correlate adds a rowname column as per the documentation:
dat %>%
group_by(grp) %>%
condense(cor = correlate(cur_data()) %>%
select(-rowname))
# A tibble: 4 x 2
# Rowwise: grp
# grp cor
# <int> <list>
#1 1 <tibble [6 × 6]>
#2 2 <tibble [6 × 6]>
#3 3 <tibble [6 × 6]>
#4 4 <tibble [6 × 6]>
How to remove a list of columns from pydatatable dataframe?
Removing columns (or rows) from a Frame is easy: take any syntax that you would normally use to select those columns, and put the Python del keyword in front of it.
Thus, if you want to delete columns 'id', 'country', and 'egg', run
>>> del comidas_gen_dt[:, ['id','country','egg']]
>>> comidas_gen_dt
   | veg  fork  beef
-- + ---  ----  ----
 0 |  30     5    90
 1 |  40    10    50
 2 |  10     2    20
 3 |   3     1    NA
 4 |   5     9     4
[5 rows x 3 columns]
If you want to keep the original frame unmodified and get a new frame with some of the columns removed, the easiest way is to first copy the frame and then use the del operation:
>>> DT = comidas_gen_dt.copy()
>>> del DT[:, columns_to_remove]
(Note that .copy() makes a shallow copy, i.e. its cost is typically negligible.)
You can also use the f[:].remove() approach. It's a bit strange that it didn't work the way you've written it, but going from a list of strings to a list of f-symbols is quite straightforward:
from datatable import f

def pydt_remove_cols(DT, *rmcols):
    return DT[:, f[:].remove([f[col] for col in rmcols])]
Here I use the fact that f.A is the same as f["A"], where the inner string "A" might as well be replaced with any variable.