Select Rows That Are a Multiple of X

SELECT rows that are a multiple of x

You can use modulo for that.

SELECT * FROM `table` WHERE (`id` % 10) = 0

SELECT * FROM `table` WHERE (`id` MOD 10) = 0

SELECT * FROM `table` WHERE !MOD(`id`, 10)

Any of them should do.

How to select rows in R that partly appear a multiple of X times?

1) Calculate the length of each group and subset to those for which that length exceeds 2. No packages are used:

nr <- nrow(DF)
grp <- sub("_.*", "", rownames(DF))  # extract part before _
subset(DF, ave(1:nr, grp, FUN = length) > 2)  # ave() gives each row its group's size

giving:

      A  B C  D
X11_1 0 10 9  4
X11_2 4 12 8  2
X11_3 0  9 9 13
X21_1 2 10 0 40
X21_2 3  0 0  0
X21_3 1  3 9  0

2) Another approach is to assume that the part after the underscore is labelled 1, 2, 3, ... as in the question. Split the rownames into a two-column data frame r, subset r down to the rows whose second column, V2, is 3, and then keep only the rows of DF corresponding to the first column, V1, of r in that subset. No packages are used.

r <- read.table(text = rownames(DF), sep = "_")
DF[r$V1 %in% subset(r, V2 == 3)$V1, ]

Note

The input in reproducible form:

Lines <- "A,   B,  C,  D, 
X11_1, 0, 10, 9, 4,
X11_2, 4, 12, 8, 2,
X11_3, 0, 9, 9, 13,
X2_1, 7, 0, 3, 3,
X2_2, 0, 10, 0, 0,
X21_1, 2, 10, 0, 40,
X21_2, 3, 0, 0, 0,
X21_3, 1, 3, 9, 0,"
DF <- read.csv(text = Lines, strip.white = TRUE)[-5]  # [-5] drops the empty column left by the trailing commas

How to select (four) specific rows (multiple times) based on a column value in R?

Just to capture @Jasonaizkains' answer from the comments above, since pivoting is not strictly necessary in this case. Here it is with some play data:

library(dplyr)
id <- rep(10:13, 4) # four subjects
year <- rep(2013:2016, each = 4) # four years
gender <- sample(1:2, 16, replace = TRUE)
play <- tibble(id, gender, year) # data.frame of 16

play <- play[-9,] # removes row for id 10 in 2015

# removes every entry for any id that is missing one of the four years
play %>% group_by(id) %>% filter(n_distinct(year) >= 4) %>% ungroup()
#> # A tibble: 12 x 3
#>       id gender  year
#>    <int>  <int> <int>
#>  1    11      1  2013
#>  2    12      2  2013
#>  3    13      2  2013
#>  4    11      1  2014
#>  5    12      2  2014
#>  6    13      1  2014
#>  7    11      2  2015
#>  8    12      2  2015
#>  9    13      2  2015
#> 10    11      2  2016
#> 11    12      2  2016
#> 12    13      1  2016

Select rows by multiple conditions on columns

You need a mix of regex and concatenation of your columns, like so:

df1 <- dplyr::filter(df, grepl(paste(c("0/1", "1/0"), collapse = "|"),
                               paste(column1, column2, sep = "_")))
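For instance, with a toy df (column1 and column2 are the hypothetical column names from the snippet above, not from the original question), rows where either column contains "0/1" or "1/0" are kept:

library(dplyr)

# toy data; column1/column2 are assumed names for illustration
df <- data.frame(column1 = c("0/1", "1/1", "1/0", "0/0"),
                 column2 = c("1/1", "0/1", "1/1", "0/0"))

df1 <- dplyr::filter(df, grepl(paste(c("0/1", "1/0"), collapse = "|"),
                               paste(column1, column2, sep = "_")))
df1
#   column1 column2
# 1     0/1     1/1
# 2     1/1     0/1
# 3     1/0     1/1

paste(column1, column2, sep = "_") glues the two columns into a single string per row, so one grepl call can test both columns at once.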

Find if values in a column are a multiple of x

Do you want this? -

x = 2
df['operation'] = 0
df.loc[df['Volume'] % x != 0, 'operation'] = 1
df

Or you can use numpy.where -

import numpy as np

# on the left-hand side, specify the column you want the result written to
df['operation'] = np.where(df['Volume'] % x != 0, "Do something", "Do another thing")

If you have multiple if/else conditions, it's better to use np.select.
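For example, a minimal runnable sketch (the Volume values and the second divisor y are made up for illustration):

import numpy as np
import pandas as pd

df = pd.DataFrame({'Volume': [4, 9, 10, 15, 8]})  # toy data
x, y = 2, 5

# np.select evaluates the conditions in order; the first match wins
conditions = [df['Volume'] % x == 0,  # multiple of x
              df['Volume'] % y == 0]  # multiple of y
choices = ['multiple of 2', 'multiple of 5']
df['operation'] = np.select(conditions, choices, default='neither')
print(df)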

Selecting rows based on multiple conditions

library(dplyr)

set.seed(123)
df <- data.frame(loc.id = rep(1:9, each = 9), month = rep(1:9, times = 9),
                 x = runif(81, min = 0, max = 5))
set.seed(123)
df <- rbind(df, cbind(loc.id = 10, month = 1:9, x = runif(9)))

# within each loc.id: zero out month 9, then report the last month with x > 1
# (or the month of the maximum x when no month exceeds 1)
df %>%
  group_by(loc.id) %>%
  mutate(x = replace(x, 9, 0), y = cumsum(x > 1)) %>%
  summarise(y = ifelse(all(!y), which.max(x), which.max(y)))
# A tibble: 10 x 2
   loc.id     y
    <dbl> <int>
 1      1     8
 2      2     8
 3      3     8
 4      4     7
 5      5     8
 6      6     8
 7      7     7
 8      8     8
 9      9     7
10     10     5

Select rows based on whether value of a columns is in top X of columns

We can loop through the rows with apply (from base R), check whether any of the elements in 'a' or 'b' are %in% the top two of the sorted row values to create a logical index, and subset the rows based on that:

i1 <- apply(my.df, 1, function(x) any(x[1:2] %in% sort(x, decreasing = TRUE)[1:2]))
my.df[i1,]
#           a        b        c        d        e
#1   6.401462 5.318849 5.373496 5.101140 3.710973
#2   6.715845 4.786936 3.521965 4.264029 4.525138
#3   6.076211 5.356114 5.605134 5.443002 5.296778
#4   7.009623 5.275595 4.801874 4.355892 6.752737
#5   5.002059 6.163398 6.063694 2.409702 6.172111
#6   6.298305 3.291884 5.737053 4.701320 4.752406
#10  5.500374 4.400130 3.980433 6.203259 4.498614

Or use max.col from base R to create the logical index; that would be much faster and avoids the row-wise loop:

i1 <- max.col(my.df, "first")  # column index of each row's maximum
i2 <- max.col(replace(my.df, cbind(seq_len(nrow(my.df)), i1), -Inf), "first")  # second maximum
my.df[(i1 %in% 1:2) | (i2 %in% 1:2), ]  # keep rows whose top two include column 1 or 2

data

my.df <- structure(list(a = c(6.401462, 6.715845, 6.076211, 7.009623, 
5.002059, 6.298305, 4.856246, 5.03799, 4.903592, 5.500374), b = c(5.318849,
4.786936, 5.356114, 5.275595, 6.163398, 3.291884, 4.674743, 4.129333,
3.135622, 4.40013), c = c(5.373496, 3.521965, 5.605134, 4.801874,
6.063694, 5.737053, 5.550828, 4.797334, 5.879798, 3.980433),
d = c(5.10114, 4.264029, 5.443002, 4.355892, 2.409702, 4.70132,
7.501786, 5.143915, 5.639893, 6.203259), e = c(3.710973,
4.525138, 5.296778, 6.752737, 6.172111, 4.752406, 5.466611,
5.558161, 4.368915, 4.498614)), class = "data.frame", row.names = c("1",
"2", "3", "4", "5", "6", "7", "8", "9", "10"))

Numpy: Selecting Rows based on Multiple Conditions on Some of its Elements

An elegant option is np.equal:

Z[np.equal(Z[:, [0,1]], 1).all(axis=1)]

Or:

Z[np.equal(Z[:,0], 1) & np.equal(Z[:,1], 1)]
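A quick check with a made-up Z:

import numpy as np

# toy array; the conditions apply to the first two columns
Z = np.array([[1, 1, 5],
              [1, 0, 7],
              [1, 1, 2],
              [0, 1, 9]])

print(Z[np.equal(Z[:, [0, 1]], 1).all(axis=1)])
# [[1 1 5]
#  [1 1 2]]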

How can I SELECT rows with MAX(Column value), PARTITION by another column in MYSQL?

You are so close! All you need to do is select BOTH the home and its max date time, then join back to the topten table on BOTH fields:

SELECT tt.*
FROM topten tt
INNER JOIN
    (SELECT home, MAX(datetime) AS MaxDateTime
     FROM topten
     GROUP BY home) groupedtt
  ON tt.home = groupedtt.home
 AND tt.datetime = groupedtt.MaxDateTime

