Select rows of a matrix that meet a condition
This is easier to do if you convert your matrix to a data frame using as.data.frame(). In that case the previous answers (using subset or m$three) will work, otherwise they will not.
To perform the operation on a matrix, you can refer to a column by name:
m[m[, "three"] == 11,]
Or by number:
m[m[,3] == 11,]
Note that if only one row matches, the result is an integer vector, not a matrix.
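As a minimal sketch of the single-row case, the example below builds a hypothetical 3 x 3 integer matrix (the data and column names are assumptions for illustration, not the asker's data) and shows that adding drop = FALSE keeps the result as a matrix:

```r
# Hypothetical integer matrix; rows are (1 2 3), (4 5 6), (7 8 9)
m <- matrix(1:9, nrow = 3, byrow = TRUE,
            dimnames = list(NULL, c("one", "two", "three")))

# Exactly one row has three == 6, so the dimensions are dropped
one_row <- m[m[, "three"] == 6, ]

# drop = FALSE keeps the result as a 1 x 3 matrix
kept <- m[m[, "three"] == 6, , drop = FALSE]

is.matrix(one_row)  # FALSE
is.matrix(kept)     # TRUE
dim(kept)           # 1 3
```

Using drop = FALSE is usually the safer choice when later code expects a matrix regardless of how many rows match.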
Extract rows from a matrix that meet a condition
You can use %in% to see where subset.vector matches column 3 of mat1, like:
identical(mat2, mat1[mat1[,3] %in% subset.vector,])
#[1] TRUE
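The original mat1, mat2 and subset.vector are not shown, so here is a self-contained sketch with made-up values (all data below is an assumption) that builds the subset the same way:

```r
# Hypothetical data: matrix() fills column by column,
# so column 3 is 5, 6, 7, 8
mat1 <- matrix(c(1, 2, 3, 4,
                 10, 20, 30, 40,
                 5, 6, 7, 8), ncol = 3)
subset.vector <- c(6, 8)

# Logical vector: TRUE where column 3 appears in subset.vector
keep <- mat1[, 3] %in% subset.vector

# Keep the matching rows (rows 2 and 4); drop = FALSE guards
# against the single-match case collapsing to a vector
mat2 <- mat1[keep, , drop = FALSE]
mat2
```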
Select rows of a matrix with a specific condition
Your problem is with the &&. && is a logical operator that only works on scalar booleans (it short-circuits, so if the first expression is FALSE, the remaining expressions are never evaluated). For logical expressions on vectors, just use a plain &:
sub3 <- subset(data.df, V1 == "General0" & V2 == "0")
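To make the difference concrete, here is a small sketch on made-up vectors (v1 and v2 are hypothetical stand-ins for the columns V1 and V2):

```r
# Hypothetical stand-ins for the two columns
v1 <- c("General0", "General1", "General0")
v2 <- c("0", "0", "1")

# & compares element by element and returns one result per row
elementwise <- v1 == "General0" & v2 == "0"
elementwise  # TRUE FALSE FALSE

# && works on single values and short-circuits: the left side is
# FALSE, so stop() on the right is never evaluated
short_circuit <- FALSE && stop("never evaluated")
short_circuit  # FALSE
```

Because subset() needs one TRUE/FALSE per row, the elementwise form is the one that fits here.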
Your import is a bit complicated. read.table already returns a perfectly good data.frame of your dataset. Converting the output of read.table to a matrix and then back to a data.frame therefore has only one effect: all values are converted to characters (because you have one character column), and the data.frame is then built from those character values, so the columns V2 and V3 end up as factors.
If there is a valid reason to have all columns as factors, this is a valid (yet uncommon) approach. However, I can hardly imagine a use case for that. I would much prefer:
data <- read.table("sample1.txt", header = FALSE)
sub <- subset(data, V1 == 'General0' & V2 == 0)
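The effect of the round trip can be checked with a short sketch. The inline text below is a hypothetical stand-in for sample1.txt, whose contents are not shown:

```r
# Hypothetical two-row stand-in for the contents of sample1.txt
direct <- read.table(text = "General0 0 1.5\nGeneral1 1 2.5",
                     header = FALSE)

# Round trip through a matrix: one character column forces the whole
# matrix to character, so the numeric types are lost
roundtrip <- as.data.frame(as.matrix(direct))

sapply(direct, class)     # V2 and V3 are numeric types
sapply(roundtrip, class)  # V2 and V3 are no longer numeric
```

Whether the round-tripped columns come back as character or factor depends on the R version (the stringsAsFactors default changed in R 4.0.0), but in both cases the numeric type is gone.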
Edit
If you just need one column, you have at least three options (that are all well documented, by the way):
col3 <- sub3$V3
or
col3 <- with(data.df, V3[V1 == 'General0' & V2 == '0'])
or
col3 <- data.df$V3[data.df$V1 == 'General0' & data.df$V2 == '0']
Numpy select rows based on condition
Use a boolean mask:
mask = (z[:, 0] == 6)
z[mask, :]
This is much more efficient than np.where because you can use the boolean mask directly, without having the overhead of converting it to an array of indices first.
One liner:
z[z[:, 0] == 6, :]
Identify rows that meet condition and store in matrix
You can maybe try something like this:
set.seed(2) # fix the random numbers for reproducibility
Mat1 <- data.frame(matrix(nrow = 10, ncol =250, data = rnorm(250,0,1)))
seq1 <- seq(1, 247,3)
# select the interesting columns
Mat2 <- Mat1[,c(seq1)]
# create a matrix with the row names of the top 2 values for each interesting column
dat <- sapply(Mat2, function(x) head(row.names(Mat2)[order(x, decreasing = TRUE)], 2))
class(dat)
[1] "matrix"
dat[,1:4]
     X1  X4  X7  X10
[1,] "9" "3" "2" "7"
[2,] "3" "1" "5" "2"
Selecting rows that meet a condition depending on other rows in R
One method is to compute the time difference between infection events (event_diff). An incident is then any infection where this difference is greater than 2 years, or where it is 0, which marks the first event in a group (assuming multiple tests are not done on the same date). Looking at this now, I suspect there are better alternative solutions.
df <- data.frame(
patient_id = c(1,1,1,1,1,1,2,2,2,2),
infection = c("no", "yes", "yes", "no", "yes", "yes", "yes", "no", "no", "yes"),
date = c("2005-02-22", "2005-04-26", "2005-05-06", "2006-05-22", "2007-08-19", "2007-12-15", "2005-10-24", "2005-11-11", "2006-07-12", "2007-12-01")
)
df$date <- as.Date(df$date, "%Y-%m-%d")
library(dplyr)
df %>%
group_by(patient_id, infection) %>%
mutate(event_diff = coalesce(date - lag(date), 0)) %>%
mutate(incident = ifelse(infection == "yes" & (event_diff == 0 | event_diff > (365*2)), "yes", "no"))
   patient_id infection date       event_diff incident
        <dbl> <fct>     <date>     <drtn>     <chr>
 1          1 no        2005-02-22   0 days   no
 2          1 yes       2005-04-26   0 days   yes
 3          1 yes       2005-05-06  10 days   no
 4          1 no        2006-05-22 454 days   no
 5          1 yes       2007-08-19 835 days   yes
 6          1 yes       2007-12-15 118 days   no
 7          2 yes       2005-10-24   0 days   yes
 8          2 no        2005-11-11   0 days   no
 9          2 no        2006-07-12 243 days   no
10          2 yes       2007-12-01 768 days   yes
R function or loop for repeatedly selecting rows that meet a condition, saving as separate object, and renaming column headers
You could do:
a <- split(df, df$TYPE)
b <- sapply(names(a), function(x)setNames(a[[x]],
paste0(names(a[[x]]), sub(".*_", 'L', x))), simplify = FALSE)
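Since the asker's df is not shown, here is a sketch of the same idea on a made-up data frame (the TYPE values and the value column are assumptions), to show what the renamed pieces look like:

```r
# Hypothetical data; TYPE values are assumed to end in "_<number>"
df <- data.frame(TYPE = c("grp_1", "grp_1", "grp_2"),
                 value = c(10, 20, 30))

# One data frame per TYPE, stored in a named list
a <- split(df, df$TYPE)

# Append a suffix built from the list name to every column name:
# sub(".*_", "L", "grp_1") gives "L1"
b <- sapply(names(a), function(x) setNames(a[[x]],
    paste0(names(a[[x]]), sub(".*_", "L", x))), simplify = FALSE)

names(b)             # "grp_1" "grp_2"
names(b[["grp_1"]])  # "TYPEL1" "valueL1"
```

Keeping the pieces in a named list, rather than as separate objects, tends to make later steps easier because they can be looped over with lapply.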