Delete a Column in a Data Frame Within a List

Assuming your list is called myList, something like this should work (lapply returns a new list, so assign the result if you want to keep it):

lapply(myList, function(x) { x["ID"] <- NULL; x })

Update

For a more general solution, you can also use something like this:

# Sample data
myList <- list(A = data.frame(ID = c("A", "A"),
                              Test = c(1, 1),
                              Value = 1:2),
               B = data.frame(ID = c("B", "B", "B"),
                              Test = c(1, 3, 5),
                              Value = 1:3))
# Keep just the "ID" and "Value" columns
lapply(myList, function(x) x[(names(x) %in% c("ID", "Value"))])
# Drop the "ID" and "Value" columns
lapply(myList, function(x) x[!(names(x) %in% c("ID", "Value"))])

How to remove specific columns from a list of dataframes when all of these columns do not exist in each dataframe

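One approach is to use any_of(), which ignores names that are missing from a data frame. The example below keeps the listed columns; select(-any_of(...)) drops them instead: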

library(tidyverse)
map(dat_list, function(xx) xx %>% select(any_of(c("x", "X4"))))
# [[1]]
# x
# 1 7
# 2 8
# 3 9

# [[2]]
# data frame with 0 columns and 3 rows

# [[3]]
# x X4
# 1 7 10
# 2 8 11
# 3 9 12

Remove columns from data.frame that are of type list

If you need to use logical indexing:

df[, !purrr::map_lgl(df, is.list)] %>%
  names()
# [1] "CATEGORY" "BIBTEXKEY" "ADDRESS" "ANNOTE" "BOOKTITLE"
# [6] "CHAPTER" "CROSSREF" "EDITION" "HOWPUBLISHED" "INSTITUTION"
# [11] "JOURNAL" "KEY" "MONTH" "NOTE" "NUMBER"
# [16] "ORGANIZATION" "PAGES" "PUBLISHER" "SCHOOL" "SERIES"
# [21] "TITLE" "TYPE" "VOLUME" "YEAR" "ISSN"
# [26] "DOI" "ISBN" "URL"

You can also do df %>% select_if(Negate(is.list))

Also, as mentioned by @akrun, you can simply use discard from purrr:

purrr::discard(df, is.list)

Or, as @markus points out, we can use keep and negate:

keep(df, negate(is.list))

Otherwise:

We can unnest:

library(tidyverse)
df %>%
  unnest(AUTHOR) %>%
  select(-AUTHOR)

Can't change/remove a column in a list of dataframes using lapply and dplyr select?

Both commands work for me:

library(dplyr)

vec1 <- c("a", "b", "c", "d")
vec2 <- c("r", "p", "h", "y")
# A single val vector is shared by both data frames
val <- c(5, 9, 47, 23)

df1 <- data.frame(vec1, val, stringsAsFactors = FALSE)
df2 <- data.frame(vec2, val, stringsAsFactors = FALSE)

L <- list(df1, df2)

L

# [[1]]
# vec1 val
# 1 a 5
# 2 b 9
# 3 c 47
# 4 d 23
#
# [[2]]
# vec2 val
# 1 r 5
# 2 p 9
# 3 h 47
# 4 y 23

lapply(L, function (y) {y$val <- NULL; y})

# [[1]]
# vec1
# 1 a
# 2 b
# 3 c
# 4 d
#
# [[2]]
# vec2
# 1 r
# 2 p
# 3 h
# 4 y

lapply(L, function (y) {select(y, -val)})

# [[1]]
# vec1
# 1 a
# 2 b
# 3 c
# 4 d
#
# [[2]]
# vec2
# 1 r
# 2 p
# 3 h
# 4 y

Removing a column permanently from a data frame in Python

drop returns a new DataFrame and leaves mydf unchanged, so if you want a permanent change you have to assign the result back to mydf, i.e. do

mydf = mydf.drop('Z', axis=1)

instead.
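
To make the difference concrete, here is a minimal sketch. The sample data and the column names X and Y are made up for illustration; only mydf and the 'Z' column come from the answer above.

```python
import pandas as pd

# Hypothetical sample data; only the 'Z' column name comes from the answer.
mydf = pd.DataFrame({"X": [1, 2], "Y": [3, 4], "Z": [5, 6]})

# Calling drop without assigning discards the result: mydf is untouched.
mydf.drop("Z", axis=1)
still_there = "Z" in mydf.columns

# Assigning the result back makes the removal stick.
mydf = mydf.drop("Z", axis=1)
remaining = list(mydf.columns)

print(still_there)  # True
print(remaining)    # ['X', 'Y']
```

mydf.drop(columns="Z") is an equivalent spelling of the same call, and drop also accepts inplace=True if you prefer to modify mydf directly.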

Remove dataframe rows containing any value from a given list


You can approach this in the following steps:

  1. Use pd.Series.explode() on each column to expand the lists of strings into multiple rows, so that every cell holds a single string.

  2. Check the exploded dataframe for strings in the to_delete list with .isin().

  3. Group by index level 0 (which holds the original row index from before the explode) and use .sum() to collapse the exploded rows back into one row of match counts per original row.

  4. Apply .sum(axis=1) to add up the matches across all columns of each row.

  5. Compare the totals with 0 using .eq(0) to form a boolean index of the rows to retain.

  6. Finally, use .loc with that boolean index to keep only the rows without a match.



df.loc[df.apply(pd.Series.explode).isin(to_delete).groupby(level=0).sum().sum(axis=1).eq(0)]

Result:

         A        B          C           D           E
1 string2 string5 [string8] [string13] [string16]

The original dataframe can be rebuilt for testing with the following code:

import pandas as pd

data = {'A': ['string1', 'string2', 'string3'],
        'B': ['string4', 'string5', 'string6'],
        'C': [['string7', 'string10'], ['string8'], ['string9']],
        'D': [['string11', 'string 12'], ['string13'], ['string14']],
        'E': [['string15'], ['string16'], ['string17']]}

df = pd.DataFrame(data)
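
The same idea can also be spelled out one column at a time, which makes each step easier to inspect. This is only a sketch of a variant, not the one-liner itself: it explodes each column separately and uses .any() instead of summing match counts. The to_delete values here are hypothetical, since the original list is not shown.

```python
import pandas as pd

data = {'A': ['string1', 'string2', 'string3'],
        'B': ['string4', 'string5', 'string6'],
        'C': [['string7', 'string10'], ['string8'], ['string9']],
        'D': [['string11', 'string 12'], ['string13'], ['string14']],
        'E': [['string15'], ['string16'], ['string17']]}
df = pd.DataFrame(data)

# Hypothetical values to remove; chosen so that rows 0 and 2 match.
to_delete = ['string10', 'string6']

# Start with "no match" for every original row.
has_match = pd.Series(False, index=df.index)

for col in df.columns:
    # One list element per row; plain strings pass through unchanged.
    exploded = df[col].explode()
    # True wherever an exploded value is in to_delete.
    matched = exploded.isin(to_delete)
    # Collapse back to one flag per original row and combine with earlier columns.
    has_match = has_match | matched.groupby(level=0).any()

# Keep only the rows where no column matched.
result = df.loc[~has_match]
print(result.index.tolist())  # [1]
```

Working column by column keeps every intermediate Series aligned to the original row index, at the cost of an explicit Python loop over the columns.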

tidyverse - delete a column within a nested column/list

The suggestion by @r2evans works if we remove the group attribute first:

library(dplyr)
library(purrr)
cor_dat %>%
ungroup %>%
mutate(cor = map(cor, ~ select(.x, -rowname)))
# A tibble: 4 x 2
# grp cor
# <int> <list>
#1 1 <tibble [6 × 6]>
#2 2 <tibble [6 × 6]>
#3 3 <tibble [6 × 6]>
#4 4 <tibble [6 × 6]>

When there is a group attribute, it results in an error:

cor_dat %>% 
mutate(cor = map(cor, ~ select(.x, -rowname)))

Error: mutate() argument cor errored.
cor is map(cor, ~select(.x, -rowname)).
ℹ The error occured in row 1.
✖ no applicable method for 'select_' applied to an object of class "character"
Run rlang::last_error() to see where the error occurred.

which is consistent with the same behavior if we extract as a column

cor_dat$cor %>% 
map(~ .x %>% select(-rowname))

Or, to make it shorter, the column can be dropped within condense itself, because correlate adds a rowname column, as per its documentation:

dat %>%
  group_by(grp) %>%
  condense(cor = correlate(cur_data()) %>%
             select(-rowname))
# A tibble: 4 x 2
# Rowwise: grp
# grp cor
# <int> <list>
#1 1 <tibble [6 × 6]>
#2 2 <tibble [6 × 6]>
#3 3 <tibble [6 × 6]>
#4 4 <tibble [6 × 6]>

How to remove a list of columns from pydatatable dataframe?

Removing columns (or rows) from a Frame is easy: take any syntax that you would normally use to select those columns, and put the Python del keyword in front of it.

Thus, if you want to delete columns 'id', 'country', and 'egg', run

>>> del comidas_gen_dt[:, ['id','country','egg']]
>>> comidas_gen_dt
   | veg  fork  beef
-- + ---  ----  ----
 0 |  30     5    90
 1 |  40    10    50
 2 |  10     2    20
 3 |   3     1    NA
 4 |   5     9     4

[5 rows x 3 columns]

If you want to keep the original frame unmodified, and then select a new frame with some of the columns removed, then the easiest way would be to first copy the frame, and then use the del operation:

>>> DT = comidas_gen_dt.copy()
>>> del DT[:, columns_to_remove]

(Note that .copy() makes a shallow copy, so its cost is typically negligible.)

You can also use the f[:].remove() approach. It's a bit strange that it didn't work the way you've written it, but going from a list of strings to a list of f-symbols is quite straightforward:

from datatable import f

def pydt_remove_cols(DT, *rmcols):
    return DT[:, f[:].remove([f[col] for col in rmcols])]

Here I use the fact that f.A is the same as f["A"], where the inner string "A" might as well be replaced with any variable.


