Select Rows of a Matrix That Meet a Condition

Select rows of a matrix that meet a condition

This is easier if you convert your matrix to a data frame using as.data.frame(). Approaches based on subset() or m$three work on a data frame; they do not work on a plain matrix.

To perform the operation on the matrix itself, you can refer to the column by name:

m[m[, "three"] == 11,]

Or by number:

m[m[,3] == 11,]

Note that if only one row matches, the result is a vector (an integer vector for an integer matrix), not a matrix. Add drop = FALSE to the subscript to keep a one-row matrix.
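As a minimal sketch of the matrix subsetting above, assuming a small made-up integer matrix m (its values are not from the original question), the following shows the two-row case, the one-row case that drops to a vector, and the same one-row case with drop = FALSE:

```r
# Hypothetical example matrix; only the column name "three" comes from the answer
m <- matrix(c(1L, 2L, 3L,
              4L, 5L, 11L,
              7L, 8L, 11L),
            nrow = 3, byrow = TRUE,
            dimnames = list(NULL, c("one", "two", "three")))

# Two rows match, so the result is still a matrix
both <- m[m[, "three"] == 11, ]

# Only one row matches, so the result is simplified to a named vector
one_dropped <- m[m[, "three"] == 3, ]

# drop = FALSE keeps the result as a 1 x 3 matrix
one_kept <- m[m[, "three"] == 3, , drop = FALSE]
```

Using drop = FALSE is usually the safer choice when later code expects a matrix regardless of how many rows match.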

Extract rows from a matrix that meet a condition

You can use %in% to find the rows where column 3 of mat1 matches a value in subset.vector:

identical(mat2, mat1[mat1[,3] %in% subset.vector,])
#[1] TRUE
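A self-contained sketch of the same idea, assuming made-up contents for mat1 and subset.vector (the original question's data is not shown here):

```r
# Hypothetical data; only the names mat1 and subset.vector come from the answer
mat1 <- matrix(c(1, 2, 3,
                 4, 5, 6,
                 7, 8, 9,
                 10, 11, 12),
               ncol = 3, byrow = TRUE)
subset.vector <- c(6, 12)

# Logical vector: TRUE where the third column holds one of the wanted values
keep <- mat1[, 3] %in% subset.vector

# Keep the matching rows; drop = FALSE guards against a single match
result <- mat1[keep, , drop = FALSE]
```

Unlike ==, %in% never returns NA and does not recycle the right-hand side, which makes it a reasonable fit for matching against a set of values.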

Select rows of a matrix with a specific condition

Your problem is the &&. The && operator only works on single logical values, and it short-circuits: if the first expression is FALSE, the remaining expressions are never evaluated. For logical expressions on vectors, use a plain &, which works element by element:

sub3 <- subset(data.df, V1 == "General0" & V2 == "0")
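To illustrate the difference, here is a small sketch. The contents of data.df are assumed for the example (all columns are stored as character, as in the question's setup); they are not the original data:

```r
# Element-wise AND: compares the vectors position by position
x <- c(TRUE, TRUE, FALSE)
y <- c(TRUE, FALSE, FALSE)
elementwise <- x & y

# Scalar AND short-circuits: the right-hand side is never evaluated here,
# so the call to stop() does not raise an error
short <- FALSE && stop("never evaluated")

# Hypothetical stand-in for data.df with character columns
data.df <- data.frame(V1 = c("General0", "General0", "General1"),
                      V2 = c("0", "1", "0"),
                      V3 = c("a", "b", "c"),
                      stringsAsFactors = FALSE)

# Row filter using the vectorised operator
sub3 <- subset(data.df, V1 == "General0" & V2 == "0")
```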

Your import is more complicated than it needs to be. read.table already returns a perfectly good data.frame of your dataset. Converting its output to a matrix and then back to a data.frame has only one effect: all values are converted to character (because one column is character), and the rebuilt data.frame is created from those character values, so columns V2 and V3 end up as factors (in R versions before 4.0.0, where stringsAsFactors defaulted to TRUE; later versions keep them as character).

If there is a valid reason to have all columns as factors, this is a valid (yet uncommon) approach. However, I can hardly imagine a use case for that. I would much prefer:

data <- read.table("sample1.txt", header = FALSE)
sub <- subset(data, V1 == 'General0' & V2 == 0)


Edit

If you just need one column, you have at least three options (that are all well documented, by the way):

col3 <- sub3$V3

or

col3 <- with(data.df, V3[V1 == 'General0' & V2 == '0'])

or

col3 <- data.df$V3[data.df$V1 == 'General0' & data.df$V2 == '0']

Numpy select rows based on condition

Use a boolean mask:

mask = (z[:, 0] == 6)
z[mask, :]

This is typically more efficient than np.where because the boolean mask is used directly for indexing, without the overhead of first converting it to an array of indices.

One liner:

z[z[:, 0] == 6, :]

Identify rows that meet condition and store in matrix

You can maybe try something like this:

set.seed(2) # fix the random seed for reproducibility
Mat1 <- data.frame(matrix(nrow = 10, ncol =250, data = rnorm(250,0,1)))
seq1 <- seq(1, 247,3)

# select the interesting columns
Mat2 <- Mat1[,c(seq1)]

# create a matrix with the row names of the top 2 values for each interesting column
dat <- sapply(Mat2, function(x) head(row.names(Mat2)[order(x, decreasing = TRUE)], 2))
class(dat)
#[1] "matrix"

dat[,1:4]
#     X1  X4  X7  X10
#[1,] "9" "3" "2" "7"
#[2,] "3" "1" "5" "2"

Selecting rows that meet a condition depending on other rows in R

One method is to compute the time difference between consecutive records with the same patient and infection status (event_diff). An infection then counts as incident when this difference is greater than 2 years, or when it is 0, which marks the first record in its group (assuming multiple tests are not done on the same date). Looking at this now, I suspect there are better solutions.

df <- data.frame(
  patient_id = c(1,1,1,1,1,1,2,2,2,2),
  infection = c("no", "yes", "yes", "no", "yes", "yes", "yes", "no", "no", "yes"),
  date = c("2005-02-22", "2005-04-26", "2005-05-06", "2006-05-22", "2007-08-19", "2007-12-15", "2005-10-24", "2005-11-11", "2006-07-12", "2007-12-01")
)

df$date <- as.Date(df$date, "%Y-%m-%d")

library(dplyr)

df %>%
  group_by(patient_id, infection) %>%
  mutate(event_diff = coalesce(date - lag(date), 0)) %>%
  mutate(incident = ifelse(infection == "yes" & (event_diff == 0 | event_diff > (365*2)), "yes", "no"))

   patient_id infection date       event_diff incident
        <dbl> <fct>     <date>     <drtn>     <chr>
 1          1 no        2005-02-22   0 days   no
 2          1 yes       2005-04-26   0 days   yes
 3          1 yes       2005-05-06  10 days   no
 4          1 no        2006-05-22 454 days   no
 5          1 yes       2007-08-19 835 days   yes
 6          1 yes       2007-12-15 118 days   no
 7          2 yes       2005-10-24   0 days   yes
 8          2 no        2005-11-11   0 days   no
 9          2 no        2006-07-12 243 days   no
10          2 yes       2007-12-01 768 days   yes
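For readers without dplyr, the same logic can be sketched in base R. This is an alternative written for illustration, not the answer's own method: it swaps group_by/lag for ave() with diff(), stores event_diff as a plain number of days, and assumes the rows are already sorted by date within each patient.

```r
df <- data.frame(
  patient_id = c(1, 1, 1, 1, 1, 1, 2, 2, 2, 2),
  infection = c("no", "yes", "yes", "no", "yes", "yes", "yes", "no", "no", "yes"),
  date = as.Date(c("2005-02-22", "2005-04-26", "2005-05-06", "2006-05-22",
                   "2007-08-19", "2007-12-15", "2005-10-24", "2005-11-11",
                   "2006-07-12", "2007-12-01"))
)

# Days since the previous record with the same patient and infection status;
# the first record of each group gets 0
df$event_diff <- ave(as.numeric(df$date), df$patient_id, df$infection,
                     FUN = function(d) c(0, diff(d)))

# Incident: an infection that is first in its group or more than 2 years
# after the previous infection record
df$incident <- ifelse(df$infection == "yes" &
                        (df$event_diff == 0 | df$event_diff > 365 * 2),
                      "yes", "no")
```

With this data every combination of patient_id and infection occurs at least once; with sparser data, empty combinations may need separate handling.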

R function or loop for repeatedly selecting rows that meet a condition, saving as separate object, and renaming column headers

You could do:

a <- split(df, df$TYPE)

b <- sapply(names(a), function(x) {
  setNames(a[[x]], paste0(names(a[[x]]), sub(".*_", 'L', x)))
}, simplify = FALSE)
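A small worked sketch of the split-and-rename approach, assuming a made-up df whose TYPE values look like "type_1" and "type_2" (the original data is not shown, so the column names and values here are hypothetical):

```r
# Hypothetical input; only the column name TYPE comes from the answer
df <- data.frame(ID = 1:4,
                 VALUE = c(10, 20, 30, 40),
                 TYPE = c("type_1", "type_2", "type_1", "type_2"))

# One data frame per TYPE value, returned as a named list
a <- split(df, df$TYPE)

# Append a suffix such as "L1" (built from the part of the TYPE value after
# the last underscore) to every column name of each piece
b <- sapply(names(a), function(x) {
  setNames(a[[x]], paste0(names(a[[x]]), sub(".*_", "L", x)))
}, simplify = FALSE)

# Column names of the first piece
first_names <- names(b[["type_1"]])
```

Keeping the pieces in a named list such as b is usually easier to work with than creating separate objects; an individual piece is available as b[["type_1"]].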

