How to delete a row from a data.frame without losing the attributes
If I understand you correctly, you have some data in a data.frame, and the columns of the data.frame have comments associated with them. Perhaps something like the following?
set.seed(1)
mydf<-data.frame(aa=rpois(100,4),bb=sample(LETTERS[1:5],
100,replace=TRUE))
comment(mydf$aa)<-"Don't drop me!"
comment(mydf$bb)<-"Me either!"
So this would give you something like
> str(mydf)
'data.frame': 100 obs. of 2 variables:
$ aa: atomic 3 3 4 7 2 7 7 5 5 1 ...
..- attr(*, "comment")= chr "Don't drop me!"
$ bb: Factor w/ 5 levels "A","B","C","D",..: 4 2 2 5 4 2 1 3 5 3 ...
..- attr(*, "comment")= chr "Me either!"
And when you subset this, the comments are dropped:
> str(mydf[1:2,]) # comment dropped.
'data.frame': 2 obs. of 2 variables:
$ aa: num 3 3
$ bb: Factor w/ 5 levels "A","B","C","D",..: 4 2
To preserve the comments, define the function [.avector, as you did above (from the documentation), then add the appropriate class attributes to each of the columns in your data.frame (EDIT: to keep the factor levels of bb, add "factor" to the class of bb):
mydf$aa<-structure(mydf$aa, class="avector")
mydf$bb<-structure(mydf$bb, class=c("avector","factor"))
So that the comments are preserved:
> str(mydf[1:2,])
'data.frame': 2 obs. of 2 variables:
$ aa:Class 'avector' atomic [1:2] 3 3
.. ..- attr(*, "comment")= chr "Don't drop me!"
$ bb: Factor w/ 5 levels "A","B","C","D",..: 4 2
..- attr(*, "comment")= chr "Me either!"
EDIT:
If there are many columns in your data.frame that have attributes you want to preserve, you could use lapply (EDITED to include the original column class):
mydf2 <- data.frame( lapply( mydf, function(x) {
structure( x, class = c("avector", class(x) ) )
} ) )
However, this drops comments associated with the data.frame itself (such as comment(mydf)<-"I'm a data.frame"), so if you have any, assign them to the new data.frame:
comment(mydf2)<-comment(mydf)
And then you have
> str(mydf2[1:2,])
'data.frame': 2 obs. of 2 variables:
$ aa:Classes 'avector', 'numeric' atomic [1:2] 3 3
.. ..- attr(*, "comment")= chr "Don't drop me!"
$ bb: Factor w/ 5 levels "A","B","C","D",..: 4 2
..- attr(*, "comment")= chr "Me either!"
- attr(*, "comment")= chr "I'm a data.frame"
Remove Rows From Data Frame where a Row matches a String
Just use == with the negation symbol (!). If dtfm is the name of your data.frame:
dtfm[!dtfm$C == "Foo", ]
Or, to move the negation into the comparison:
dtfm[dtfm$C != "Foo", ]
Or, even shorter, using subset():
subset(dtfm, C!="Foo")
Remove rows from pandas DataFrame based on condition
General boolean indexing
df[df['Species'] != 'Cat']
# df[df['Species'].ne('Cat')]
Index Name Species
1 1 Jill Dog
3 3 Harry Dog
4 4 Hannah Dog
df.query
df.query("Species != 'Cat'")
Index Name Species
1 1 Jill Dog
3 3 Harry Dog
4 4 Hannah Dog
For information on the pd.eval() family of functions, their features and use cases, please visit Dynamic Expression Evaluation in pandas using pd.eval().
df.isin
df[~df['Species'].isin(['Cat'])]
Index Name Species
1 1 Jill Dog
3 3 Harry Dog
4 4 Hannah Dog
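The three approaches above can be checked against each other. The following is a minimal sketch, assuming a small frame shaped like the one in the output; the names of the two 'Cat' rows are invented for illustration, since the original input is not shown.

```python
import pandas as pd

# Hypothetical sample data: the 'Cat' rows (and their names) are assumptions,
# only the three 'Dog' rows appear in the output above.
df = pd.DataFrame({
    "Index": [0, 1, 2, 3, 4],
    "Name": ["Tom", "Jill", "Felix", "Harry", "Hannah"],
    "Species": ["Cat", "Dog", "Cat", "Dog", "Dog"],
})

# 1. Boolean mask
by_mask = df[df["Species"] != "Cat"]

# 2. Query string
by_query = df.query("Species != 'Cat'")

# 3. Negated membership test (convenient when excluding several values)
by_isin = df[~df["Species"].isin(["Cat"])]

# All three keep the same rows with their original index labels
print(by_mask.equals(by_query) and by_mask.equals(by_isin))
```

Note that none of these modify df in place; assign the result back (df = df[...]) if you want to keep the filtered frame.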
Drop rows with all zeros in pandas data frame
It turns out this can be nicely expressed in a vectorized fashion:
> df = pd.DataFrame({'a':[0,0,1,1], 'b':[0,1,0,1]})
> df = df[(df.T != 0).any()]
> df
a b
1 0 1
2 1 0
3 1 1
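As a variation on the above, the transpose can be avoided by reducing along the columns with axis=1. This is a sketch of an equivalent spelling, not the answer's original code, using the same example frame:

```python
import pandas as pd

df = pd.DataFrame({"a": [0, 0, 1, 1], "b": [0, 1, 0, 1]})

# Original form: transpose, then any() reduces over what were the columns
via_transpose = df[(df.T != 0).any()]

# Alternative: keep rows where at least one column is non-zero
via_axis = df[(df != 0).any(axis=1)]

print(via_axis)
print(via_transpose.equals(via_axis))
```

Both forms drop only row 0, the single row in which every column is zero.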
Delete rows containing specific strings in R
This should do the trick:
df[- grep("REVERSE", df$Name),]
Or a safer version, which still works when no row matches (with -grep(), an empty match drops every row):
df[!grepl("REVERSE", df$Name),]
How do I remove rows with duplicate values of columns in pandas data frame?
Use drop_duplicates with subset set to the list of columns to check for duplicates on, and keep='first' to keep the first of each set of duplicates.
If the dataframe is:
df = pd.DataFrame({'Column1': ["'cat'", "'toy'", "'cat'"],
'Column2': ["'bat'", "'flower'", "'bat'"],
'Column3': ["'xyz'", "'abc'", "'lmn'"]})
print(df)
Result:
Column1 Column2 Column3
0 'cat' 'bat' 'xyz'
1 'toy' 'flower' 'abc'
2 'cat' 'bat' 'lmn'
Then:
result_df = df.drop_duplicates(subset=['Column1', 'Column2'], keep='first')
print(result_df)
Result:
Column1 Column2 Column3
0 'cat' 'bat' 'xyz'
1 'toy' 'flower' 'abc'
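The keep argument also accepts 'last' and False. The following sketch reuses the same example frame to show how the choice changes which rows survive; the variable names keep_last and keep_none are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'Column1': ["'cat'", "'toy'", "'cat'"],
                   'Column2': ["'bat'", "'flower'", "'bat'"],
                   'Column3': ["'xyz'", "'abc'", "'lmn'"]})

cols = ['Column1', 'Column2']

# keep='last': of the duplicated rows 0 and 2, row 2 is retained
keep_last = df.drop_duplicates(subset=cols, keep='last')

# keep=False: every row that has a duplicate is dropped
keep_none = df.drop_duplicates(subset=cols, keep=False)

print(keep_last)
print(keep_none)
```

In each case the surviving rows keep their original index labels; call reset_index(drop=True) on the result if you want a fresh 0..n-1 index.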
Delete rows if there are null values in a specific column in Pandas dataframe
If the relevant entries in Charge_Per_Line are empty (NaN) when you read into pandas, you can use df.dropna:
df = df.dropna(axis=0, subset=['Charge_Per_Line'])
If the values are genuinely -, then you can replace them with np.nan and then use df.dropna:
import numpy as np
df['Charge_Per_Line'] = df['Charge_Per_Line'].replace('-', np.nan)
df = df.dropna(axis=0, subset=['Charge_Per_Line'])
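To see both cases together, here is a minimal end-to-end sketch. The Customer column and all of the values are invented for illustration (the original data is not shown); only the Charge_Per_Line column name comes from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one row holds a literal '-', another is already NaN
df = pd.DataFrame({
    "Customer": ["A", "B", "C", "D"],
    "Charge_Per_Line": ["12.50", "-", np.nan, "7.25"],
})

# Turn the '-' placeholder into a real missing value
df["Charge_Per_Line"] = df["Charge_Per_Line"].replace("-", np.nan)

# Drop rows where this one column is missing; other columns are not inspected
df = df.dropna(axis=0, subset=["Charge_Per_Line"])

print(df)
```

Only the rows for customers A and D remain, since B held the placeholder and C was missing to begin with.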
Delete rows with blank values in one particular column
df[!(is.na(df$start_pc) | df$start_pc==""), ]
Removing display of row names from data frame
You have successfully removed the row names. The print.data.frame method just shows the row numbers if no row names are present.
df1 <- data.frame(values = rnorm(3), group = letters[1:3],
row.names = paste0("RowName", 1:3))
print(df1)
# values group
#RowName1 -1.469809 a
#RowName2 -1.164943 b
#RowName3 0.899430 c
rownames(df1) <- NULL
print(df1)
# values group
#1 -1.469809 a
#2 -1.164943 b
#3 0.899430 c
You can suppress printing the row names and numbers in print.data.frame by setting the argument row.names to FALSE.
print(df1, row.names = FALSE)
#    values group
# -1.469809     a
# -1.164943     b
#  0.899430     c
Edit: As written in the comments, you want to convert this to HTML. From the xtable and print.xtable documentation, you can see that the argument include.rownames will do the trick.
library("xtable")
print(xtable(df1), type="html", include.rownames = FALSE)
#<!-- html table generated in R 3.1.0 by xtable 1.7-3 package -->
#<!-- Thu Jun 26 12:50:17 2014 -->
#<TABLE border=1>
#<TR> <TH> values </TH> <TH> group </TH> </TR>
#<TR> <TD align="right"> -1.47 </TD> <TD> a </TD> </TR>
#<TR> <TD align="right"> -1.16 </TD> <TD> b </TD> </TR>
#<TR> <TD align="right"> 0.90 </TD> <TD> c </TD> </TR>
#</TABLE>