subset a data frame by row names of different rows
Using dplyr:
library(dplyr)
DF <- data.frame(row.names=c("12a", "22a", "13a"), Name=c("12","22","13"), plot=c(25,18,9))
If you want to filter by the data frame column "Name", then:
DF.new <- DF %>% filter(Name %in% c("12", "16"))
If you want to filter by the actual row.names of the data frame, then:
DF.new <- DF %>% filter(row.names(DF) %in% c("12a","13a"))
Or, using base R:
DF.new <- DF[DF$Name %in% c("12","13"), ]
or
DF.new <- DF[row.names(DF) %in% c("12a","13a"), ]
R: Subset data.frame by exact rownames
Use the match function.
> d[match(c('711', '9', '1'), rownames(d)),]
[1] 2 NA 1
Which is exactly what I need.
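As a rough sketch of why match fits here, the example below compares it with %in% on a made-up data frame (the name d is reused from above, but the column name and values are assumptions). match follows the order of the requested names and marks a missing name with NA, whereas %in% keeps the data frame's own order and skips missing names silently.

```r
# Hypothetical data: one column, character row names
d <- data.frame(value = c(1, 2, 3), row.names = c("1", "711", "71"))

wanted <- c("711", "9", "1")

# %in% keeps the data frame's own row order and silently skips "9"
by_in <- d[rownames(d) %in% wanted, , drop = FALSE]

# match() follows the order of `wanted`; "9" has no match and becomes an NA row
idx <- match(wanted, rownames(d))
by_match <- d[idx, , drop = FALSE]

# Drop the unmatched entries when NA rows are not wanted
by_match_clean <- d[idx[!is.na(idx)], , drop = FALSE]
```

Which of the three results is appropriate depends on whether the requested order and the missing names matter for the task.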
Addition: instead of using data.frame, use tibbles.
From the documentation (https://cran.r-project.org/web/packages/tibble/vignettes/tibble.html):
Tibbles are also stricter with $. Tibbles never do partial matching, and will throw a warning and return NULL if the column does not exist
match row names of two data frames and subset only matching rows in R
You can directly use the rownames from b to subset z.
z[rownames(b),]
# Fox Prox Sox
#ABC 1 2 3
#DEF 1 1 0
#ABD 1 3 0
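If some row names in b might be missing from z, indexing by name returns rows filled with NA for them. A minimal sketch of one way to guard against that, using intersect (the data below are hypothetical stand-ins for z and b, not the data from the question):

```r
# Hypothetical stand-ins for z and b; only the row names matter here
z <- data.frame(Fox = c(1, 1, 1, 0), Prox = c(2, 1, 3, 5),
                row.names = c("ABC", "DEF", "ABD", "XYZ"))
b <- data.frame(Score = c(10, 20, 30),
                row.names = c("DEF", "ABC", "QRS"))

# Row names present in both, in the order they appear in b
common <- intersect(rownames(b), rownames(z))

# Subset z to the shared rows only; "QRS" is skipped instead of becoming an NA row
z_common <- z[common, , drop = FALSE]
```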
How do I subset a data frame by row names that do not meet a condition?
In the example, the 'Year' values for every unique 'Name' are consecutive. So, an easier option would be to group by 'Name' and filter if the number of distinct 'Year' is less than 3 or the number of rows (n()) is less than 3
library(dplyr)
data %>%
group_by(Name) %>%
filter(n_distinct(Year) < 3)
#or the number of rows
# filter(n() < 3)
# A tibble: 4 x 2
# Groups: Name [2]
# Name Year
# <fct> <dbl>
#1 Dex 2000
#2 Dex 2001
#3 Lex 2001
#4 Lex 2002
As a general case, after grouping by 'Name', we take the difference (diff) of adjacent 'Year' values, check whether it is equal to 1 (i.e. a one-year difference), and pass that to run-length encoding (rle). We then filter, keeping only those 'Name' groups whose longest (max) run of consecutive years is shorter than 3
data %>%
group_by(Name) %>%
filter(with(rle(c(TRUE, diff(Year)) == 1), max(lengths[values])) < 3)
# A tibble: 4 x 2
# Groups: Name [2]
# Name Year
# <fct> <dbl>
#1 Dex 2000
#2 Dex 2001
#3 Lex 2001
#4 Lex 2002
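To see what the run-length step computes, here is a sketch on a plain vector (the years values are made up). It also shows a variant based on cumsum, which is not from the answer. In the == 1 form the first year after a gap is itself flagged FALSE, so a run that starts after a gap is counted one year short. This makes no difference when every group's years are consecutive, as in the example.

```r
# Hypothetical years for one 'Name' group, with a gap after 2001
years <- c(2000, 2001, 2003, 2004, 2005)

# Approach from the answer: flag each year that directly follows the previous one
flags <- c(TRUE, diff(years)) == 1
r <- rle(flags)
run_by_flags <- max(r$lengths[r$values])

# Alternative sketch: start a new run id at every gap, then count years per id
run_id <- cumsum(c(TRUE, diff(years) != 1))
run_by_id <- max(rle(run_id)$lengths)
```

Here run_by_flags is 2 while run_by_id is 3, because 2003 to 2005 is three consecutive years but 2003 itself follows a gap.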
subset dataframe based on rownames
Following on from @yeedle's solution, I modified it a little and found this worked for me:
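library(dplyr)
library(tibble)  # provides rownames_to_column()
# keep only the rows of bwenv whose row names also occur in bwsp
bwenv2 <- bwenv %>%
  rownames_to_column("row_names") %>%
  semi_join(rownames_to_column(bwsp, "row_names"), by = "row_names")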
rownames(bwenv2) <- bwenv2$row_names
bwenv2 <- bwenv2 %>% select(-row_names)
bw2015 <- cbind(bwenv2, bwsp)
str(bw2015)
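For comparison, a base R sketch of the same idea, which keeps the row names without a helper column. The data below are hypothetical stand-ins for bwenv and bwsp, and aligning bwsp by row name before cbind is an added assumption about what the combined table should look like.

```r
# Hypothetical stand-ins for bwenv and bwsp
bwenv <- data.frame(temp = c(10, 12, 14, 16),
                    row.names = c("s1", "s2", "s3", "s4"))
bwsp <- data.frame(count = c(5, 7), row.names = c("s2", "s4"))

# Keep rows of bwenv whose row names also occur in bwsp; row names are preserved
bwenv2 <- bwenv[rownames(bwenv) %in% rownames(bwsp), , drop = FALSE]

# Align bwsp to the same row order before binding the columns side by side
bw2015 <- cbind(bwenv2, bwsp[rownames(bwenv2), , drop = FALSE])
```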
Subset Data Frame Rows by value in row.names in R
Extract the part of the group label that you want to split on:
sub('\\d+', '', data$group)
#[1] "ga" "ga" "gb" "gc" "gb"
and use the above in split to divide the data into groups.
new_data <- split(data, sub('\\d+', '', data$group))
new_data
#$ga
# x1 x2 group
#1 3 a ga1
#2 7 b ga2
#$gb
# x1 x2 group
#3 1 c gb1
#5 5 e gb1
#$gc
# x1 x2 group
#4 8 d gc3
It is better to keep the data in a list. However, if you want separate data frames for each group, you can use list2env.
list2env(new_data, .GlobalEnv)
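A short sketch of what keeping the data in a list makes possible: the same operation can be applied to every group at once, and a single group can still be pulled out by name. The data frame is reconstructed from the printed output above. The summaries chosen here (row counts and the mean of x1) are only illustrative.

```r
# Data reconstructed from the printed output above
data <- data.frame(x1 = c(3, 7, 1, 8, 5),
                   x2 = c("a", "b", "c", "d", "e"),
                   group = c("ga1", "ga2", "gb1", "gc3", "gb1"))

new_data <- split(data, sub("\\d+", "", data$group))

# Apply the same summary to every group without creating separate objects
group_sizes <- sapply(new_data, nrow)
group_means <- sapply(new_data, function(g) mean(g$x1))

# Pull out a single group by name when needed
gb <- new_data[["gb"]]
```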
how to select row names and row for any mismatch found in a row data frame
A dplyr option:
library(dplyr)
df %>% group_by(across()) %>% group_split()
#[[1]]
# A tibble: 2 x 4
# V1 V2 V3 V4
# <chr> <chr> <chr> <chr>
#1 L M X V
#2 L M X V
#[[2]]
# A tibble: 2 x 4
# V1 V2 V3 V4
# <chr> <chr> <chr> <chr>
#1 P M X V
#2 P M X V
Subsetting a matrix by row names and column names in R
Try to filter rows and columns in this way:
matrix[rownames(matrix) %in% list_individuals, colnames(matrix) %in% list_individuals]
Only the rows and columns whose names are contained in list_individuals will be retained in the output.
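A small sketch of how this behaves, using a made-up matrix m and a made-up list_individuals. Logical indexing with %in% keeps the matrix's own order and ignores names that are absent. Indexing directly by name follows the order of the vector but fails on absent names, so the vector is filtered with intersect first.

```r
# Hypothetical square matrix with named rows and columns
m <- matrix(1:16, nrow = 4,
            dimnames = list(c("a", "b", "c", "d"), c("a", "b", "c", "d")))
list_individuals <- c("d", "b", "x")  # "x" is not in the matrix

# Logical indexing: keeps the matrix's own order and ignores "x"
sub_lgl <- m[rownames(m) %in% list_individuals,
             colnames(m) %in% list_individuals, drop = FALSE]

# Indexing by name follows the order of the vector, so filter it first;
# m[list_individuals, list_individuals] would stop with "subscript out of bounds"
keep <- intersect(list_individuals, rownames(m))
sub_chr <- m[keep, keep, drop = FALSE]
```

drop = FALSE keeps the result a matrix even when only one row or column matches.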
Return corresponding row name instead of data in r
You can subset the row.names vector with the index of the max value of the column.
df <- data.frame(
x = 1:100
)
row.names(df)[which(df$x == max(df$x, na.rm = TRUE))]
# "100"
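When the maximum occurs more than once, the comparison with max returns every matching row name, while which.max returns only the first. A minimal sketch with made-up data (the row names r1 to r4 are assumptions for illustration):

```r
# Hypothetical data with a tied maximum
df <- data.frame(x = c(4, 9, 2, 9), row.names = c("r1", "r2", "r3", "r4"))

# All row names where x equals the maximum
all_max <- row.names(df)[which(df$x == max(df$x, na.rm = TRUE))]

# Only the first row name where x reaches the maximum
first_max <- row.names(df)[which.max(df$x)]
```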