How to Delete a Row from a Data.Frame Without Losing the Attributes

How to delete a row from a data.frame without losing the attributes

If I understand you correctly, you have some data in a data.frame, and the columns of the data.frame have comments associated with them. Perhaps something like the following?

set.seed(1)

mydf<-data.frame(aa=rpois(100,4),bb=sample(LETTERS[1:5],
100,replace=TRUE))

comment(mydf$aa)<-"Don't drop me!"
comment(mydf$bb)<-"Me either!"

So this would give you something like

> str(mydf)
'data.frame': 100 obs. of 2 variables:
$ aa: atomic 3 3 4 7 2 7 7 5 5 1 ...
..- attr(*, "comment")= chr "Don't drop me!"
$ bb: Factor w/ 5 levels "A","B","C","D",..: 4 2 2 5 4 2 1 3 5 3 ...
..- attr(*, "comment")= chr "Me either!"

And when you subset this, the comments are dropped:

> str(mydf[1:2,]) # comment dropped.
'data.frame': 2 obs. of 2 variables:
$ aa: num 3 3
$ bb: Factor w/ 5 levels "A","B","C","D",..: 4 2

To preserve the comments, define the function [.avector, as you did above (from the documentation) then add the appropriate class attributes to each of the columns in your data.frame (EDIT: to keep the factor levels of bb, add "factor" to the class of bb.):

mydf$aa<-structure(mydf$aa, class="avector")
mydf$bb<-structure(mydf$bb, class=c("avector","factor"))

So that the comments are preserved:

> str(mydf[1:2,])
'data.frame': 2 obs. of 2 variables:
$ aa:Class 'avector' atomic [1:2] 3 3
.. ..- attr(*, "comment")= chr "Don't drop me!"
$ bb: Factor w/ 5 levels "A","B","C","D",..: 4 2
..- attr(*, "comment")= chr "Me either!"

EDIT:

If there are many columns in your data.frame that have attributes you want to preserve, you could use lapply (EDITED to include original column class):

mydf2 <- data.frame( lapply( mydf, function(x) {
structure( x, class = c("avector", class(x) ) )
} ) )

However, this drops comments associated with the data.frame itself (such as comment(mydf)<-"I'm a data.frame"), so if you have any, assign them to the new data.frame:

comment(mydf2)<-comment(mydf)

And then you have

> str(mydf2[1:2,])
'data.frame': 2 obs. of 2 variables:
$ aa:Classes 'avector', 'numeric' atomic [1:2] 3 3
.. ..- attr(*, "comment")= chr "Don't drop me!"
$ bb: Factor w/ 5 levels "A","B","C","D",..: 4 2
..- attr(*, "comment")= chr "Me either!"
- attr(*, "comment")= chr "I'm a data.frame"

Remove Rows From Data Frame where a Row matches a String

Just use the == with the negation symbol (!). If dtfm is the name of your data.frame:

dtfm[!dtfm$C == "Foo", ]

Or, to move the negation in the comparison:

dtfm[dtfm$C != "Foo", ]

Or, even shorter using subset():

subset(dtfm, C!="Foo")

Remove rows from pandas DataFrame based on condition

General boolean indexing

df[df['Species'] != 'Cat']
# df[df['Species'].ne('Cat')]

Index Name Species
1 1 Jill Dog
3 3 Harry Dog
4 4 Hannah Dog


df.query

df.query("Species != 'Cat'")

Index Name Species
1 1 Jill Dog
3 3 Harry Dog
4 4 Hannah Dog

For information on the pd.eval() family of functions, their features and use cases, please visit Dynamic Expression Evaluation in pandas using pd.eval().



df.isin

df[~df['Species'].isin(['Cat'])]

Index Name Species
1 1 Jill Dog
3 3 Harry Dog
4 4 Hannah Dog

Drop rows with all zeros in pandas data frame

It turns out this can be nicely expressed in a vectorized fashion:

> df = pd.DataFrame({'a':[0,0,1,1], 'b':[0,1,0,1]})
> df = df[(df.T != 0).any()]
> df
a b
1 0 1
2 1 0
3 1 1

Delete rows containing specific strings in R

This should do the trick:

df[- grep("REVERSE", df$Name),]

Or a safer version would be:

df[!grepl("REVERSE", df$Name),]

how do I remove rows with duplicate values of columns in pandas data frame?

Using drop_duplicates with subset with list of columns to check for duplicates on and keep='first' to keep first of duplicates.

If dataframe is:

df = pd.DataFrame({'Column1': ["'cat'", "'toy'", "'cat'"],
'Column2': ["'bat'", "'flower'", "'bat'"],
'Column3': ["'xyz'", "'abc'", "'lmn'"]})
print(df)

Result:

  Column1   Column2 Column3
0 'cat' 'bat' 'xyz'
1 'toy' 'flower' 'abc'
2 'cat' 'bat' 'lmn'

Then:

result_df = df.drop_duplicates(subset=['Column1', 'Column2'], keep='first')
print(result_df)

Result:

  Column1   Column2 Column3
0 'cat' 'bat' 'xyz'
1 'toy' 'flower' 'abc'

Delete rows if there are null values in a specific column in Pandas dataframe

If the relevant entries in Charge_Per_Line are empty (NaN) when you read into pandas, you can use df.dropna:

df = df.dropna(axis=0, subset=['Charge_Per_Line'])

If the values are genuinely -, then you can replace them with np.nan and then use df.dropna:

import numpy as np

df['Charge_Per_Line'] = df['Charge_Per_Line'].replace('-', np.nan)
df = df.dropna(axis=0, subset=['Charge_Per_Line'])

Delete rows with blank values in one particular column

 df[!(is.na(df$start_pc) | df$start_pc==""), ]

Removing display of row names from data frame

You have successfully removed the row names. The print.data.frame method just shows the row numbers if no row names are present.

df1 <- data.frame(values = rnorm(3), group = letters[1:3],
row.names = paste0("RowName", 1:3))
print(df1)
# values group
#RowName1 -1.469809 a
#RowName2 -1.164943 b
#RowName3 0.899430 c

rownames(df1) <- NULL
print(df1)
# values group
#1 -1.469809 a
#2 -1.164943 b
#3 0.899430 c

You can suppress printing the row names and numbers in print.data.frame with the argument row.names as FALSE.

print(df1, row.names = FALSE)
# values group
# -1.4345829 d
# 0.2182768 e
# -0.2855440 f

Edit: As written in the comments, you want to convert this to HTML. From the xtable and print.xtable documentation, you can see that the argument include.rownames will do the trick.

library("xtable")
print(xtable(df1), type="html", include.rownames = FALSE)
#<!-- html table generated in R 3.1.0 by xtable 1.7-3 package -->
#<!-- Thu Jun 26 12:50:17 2014 -->
#<TABLE border=1>
#<TR> <TH> values </TH> <TH> group </TH> </TR>
#<TR> <TD align="right"> -0.34 </TD> <TD> a </TD> </TR>
#<TR> <TD align="right"> -1.04 </TD> <TD> b </TD> </TR>
#<TR> <TD align="right"> -0.48 </TD> <TD> c </TD> </TR>
#</TABLE>


Related Topics



Leave a reply



Submit