Select N Rows Above and Below Match

Select N rows above and below match

This seems to be a simple question but is not as trivial as presumably expected.

The issue is that which(mtcars$vs == 1) returns a vector rather than a single value:

[1]  3  4  6  8  9 10 11 18 19 20 21 26 28 32

If another vector -1:1 (which is c(-1L, 0L, 1L)) is added to it, the normal R rules for operations on vectors of unequal lengths apply: The recycling rule says

Any short vector operands are extended by recycling their values until
they match the size of any other operands.

Therefore the shorter vector -1:1 will be recycled to the length of which(mtcars$vs == 1), i.e.,

rep(-1:1, length.out = length(which(mtcars$vs == 1)))

 [1] -1  0  1 -1  0  1 -1  0  1 -1  0  1 -1  0

Therefore, the result of

which(mtcars$vs == 1) + -1:1

is the element-wise sum of the elements of both vectors where the shorter vector has been recycled to match the length of the longer vector.

 [1]  2  4  7  7  9 11 10 18 20 19 21 27 27 32

which is propably not what the OP has expected.

In addition, we get the

Warning message:

In which(mtcars$vs == 1) + -1:1 :

longer object length is not a multiple of shorter object length

because which(mtcars$vs == 1) has length 14 and -1:1 has length 3.

Solution using `outer()`

In order to select the N rows above and below each matching row, we need to add -N:N to each row number returned by which(mtcars$vs == 1):

outer(which(mtcars$vs == 1), -1:1, `+`)

      [,1] [,2] [,3]
 [1,]    2    3    4
 [2,]    3    4    5
 [3,]    5    6    7
 [4,]    7    8    9
 [5,]    8    9   10
 [6,]    9   10   11
 [7,]   10   11   12
 [8,]   17   18   19
 [9,]   18   19   20
[10,]   19   20   21
[11,]   20   21   22
[12,]   25   26   27
[13,]   27   28   29
[14,]   31   32   33

Now, we have an array of all row numbers. Unfortunately, it cannot be used directly for subsetting because it contains duplicates and there are row numbers which do not exist in mtcars. So the the result has to be "post-processed" before it can be used for subsetting.

library(magrittr) # piping used for clarity
rn <- outer(which(mtcars$vs == 1), -1:1, `+`) %>% 
  as.vector() %>% 
  unique() %>% 
  Filter(function(x) x[1 <= x & x <= nrow(mtcars)], .)

rn

 [1]  2  3  4  5  6  7  8  9 10 11 12 17 18 19 20 21 22 25 26 27 28 29 31 32

mtcars[rn, ]

                   mpg cyl  disp  hp drat    wt  qsec vs am gear carb
Mazda RX4 Wag     21.0   6 160.0 110 3.90 2.875 17.02  0  1    4    4
Datsun 710        22.8   4 108.0  93 3.85 2.320 18.61  1  1    4    1
Hornet 4 Drive    21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1
Hornet Sportabout 18.7   8 360.0 175 3.15 3.440 17.02  0  0    3    2
Valiant           18.1   6 225.0 105 2.76 3.460 20.22  1  0    3    1
Duster 360        14.3   8 360.0 245 3.21 3.570 15.84  0  0    3    4
Merc 240D         24.4   4 146.7  62 3.69 3.190 20.00  1  0    4    2
Merc 230          22.8   4 140.8  95 3.92 3.150 22.90  1  0    4    2
Merc 280          19.2   6 167.6 123 3.92 3.440 18.30  1  0    4    4
Merc 280C         17.8   6 167.6 123 3.92 3.440 18.90  1  0    4    4
Merc 450SE        16.4   8 275.8 180 3.07 4.070 17.40  0  0    3    3
Chrysler Imperial 14.7   8 440.0 230 3.23 5.345 17.42  0  0    3    4
Fiat 128          32.4   4  78.7  66 4.08 2.200 19.47  1  1    4    1
Honda Civic       30.4   4  75.7  52 4.93 1.615 18.52  1  1    4    2
Toyota Corolla    33.9   4  71.1  65 4.22 1.835 19.90  1  1    4    1
Toyota Corona     21.5   4 120.1  97 3.70 2.465 20.01  1  0    3    1
Dodge Challenger  15.5   8 318.0 150 2.76 3.520 16.87  0  0    3    2
Pontiac Firebird  19.2   8 400.0 175 3.08 3.845 17.05  0  0    3    2
Fiat X1-9         27.3   4  79.0  66 4.08 1.935 18.90  1  1    4    1
Porsche 914-2     26.0   4 120.3  91 4.43 2.140 16.70  0  1    5    2
Lotus Europa      30.4   4  95.1 113 3.77 1.513 16.90  1  1    5    2
Ford Pantera L    15.8   8 351.0 264 4.22 3.170 14.50  0  1    5    4
Maserati Bora     15.0   8 301.0 335 3.54 3.570 14.60  0  1    5    8
Volvo 142E        21.4   4 121.0 109 4.11 2.780 18.60  1  1    4    2

How to SELECT N values ABOVE and BELOW from specific value

You can limit and offset inside your QUERY():

=QUERY(A1:C,"limit "&2+MIN(5,MATCH(D1,B:B,0))&" offset "&MAX(0,MATCH(D1,B:B,0)-5))

Sample Image

selecting row matching condition, and also one row above and one row below

You already have the bool so do not need to filter it by index , we can do shift for the mask

m = df['col1']<100
df = df[m.shift(-1)|m.shift()|m]

Select N rows above and below a specific row in pandas

Really simple using index.get_loc. Get the index of the label, and slice accordingly.

idx = df.index.get_loc('2015-01-17')
df.iloc[idx - 10 : idx + 10]

                   A
2015-01-07  0.262086
2015-01-08  0.836742
2015-01-09  0.094763
2015-01-10  0.133500
2015-01-11  0.285372
2015-01-12  0.338112
2015-01-13  0.451852
2015-01-14  0.163001
2015-01-15  0.247186
2015-01-16  0.227053
2015-01-17  0.837647
2015-01-18  0.918334
2015-01-19  0.514731
2015-01-20  0.207688
2015-01-21  0.700314
2015-01-22  0.363784
2015-01-23  0.811346
2015-01-24  0.079030
2015-01-25  0.051900
2015-01-26  0.520310

How to subset N rows above a selected point in a 'tidy' dataframe

I'll use a function I wrote in a different answer, https://stackoverflow.com/a/58716950/3358272, called leadlag. The premise for that function is similar to lead or lag (in dplyr-speak) but it has a cumulative effect.

Up front: I'm assuming that this "N prior" is per-group (per stock_name), not generally throughout all stock names.

For this data, I'll add a unique id to each row and find the rows to keep:

stock.data$rn <- seq_len(nrow(stock.data))
rownums <- merge(stock.data, other_data)$rn

From there, let's lead/lag the filtering:

stock.data %>%
  group_by(stock_name) %>%
  filter(leadlag(rn %in% rownums, bef=1, aft=0)) %>%
  ungroup()
# # A tibble: 4 x 4
#   stock_name price date          rn
#   <chr>      <dbl> <date>     <int>
# 1 Walmart      100 2012-01-01     1
# 2 Walmart      101 2012-03-01     2
# 3 Target       202 2012-03-01     5
# 4 Target       203 2012-04-01     6

and if you wanted N=2 before, then

stock.data %>%
  group_by(stock_name) %>%
  filter(leadlag(rn %in% rownums, bef=2, aft=0)) %>%
  ungroup()
# # A tibble: 5 x 4
#   stock_name price date          rn
#   <chr>      <dbl> <date>     <int>
# 1 Walmart      100 2012-01-01     1
# 2 Walmart      101 2012-03-01     2
# 3 Target       201 2012-01-01     4
# 4 Target       202 2012-03-01     5
# 5 Target       203 2012-04-01     6

Data

stock.data <- data.frame(
  stock_name = c("Walmart","Walmart","Walmart","Target","Target","Target"),
  price = c(100,101,102,201,202,203), 
  date = as.Date(c("2012-01-01", "2012-03-01", "2012-04-01", "2012-01-01",
                   "2012-03-01","2012-04-01"))
)
other_data <- data.frame(
  stock_name = c("Walmart", "Target"),
  date = as.Date(c("2012-03-01", "2012-04-01"))
)

A copy of the leadlag function defined in the other answer:

#' Lead/Lag a logical
#'
#' @param lgl logical vector
#' @param bef integer, number of elements to lead by
#' @param aft integer, number of elements to lag by
#' @return logical, same length as 'lgl'
#' @export
leadlag <- function(lgl, bef = 1, aft = 1) {
  n <- length(lgl)
  bef <- min(n, max(0, bef))
  aft <- min(n, max(0, aft))
  befx <- if (bef > 0) sapply(seq_len(bef), function(b) c(tail(lgl, n = -b), rep(FALSE, b)))
  aftx <- if (aft > 0) sapply(seq_len(aft), function(a) c(rep(FALSE, a), head(lgl, n = -a)))
  rowSums(cbind(befx, lgl, aftx), na.rm = TRUE) > 0
}

How to find matching row based on a condition and return N row above or below?

Let me know if this does the job. It chooses the subsequent indexes leaving off from the 3 PT ones, and then chooses all the rows in main_df with those index numbers.

final_home = home_df[home_slice]  
print(final_home.to_string())  # starting from where you left off

subsequent_rows = 3 # note it'll choose this value - 1, so pick 3 if you want 2

# returns a list of tuples that contain the ranges of indices following the initial event
index_ranges = home_df[home_df['PLAYER1_TEAM_NICKNAME'] == home_df['rebounder_team']].index.map(lambda x: range(x, x + subsequent_rows))
index_list=[]
# flatten the list of tuples to a list of all the index values we want
[index_list.extend(x) for x in index_ranges]
# go back to main_df and select all the rows with those index values
final = main_df[main_df.index.isin(index_list)]
print(final)

SELECT N rows before and after the row matching the condition?

Right, this works for me:

SELECT child.*
FROM stack as child,
(SELECT idstack FROM stack WHERE message LIKE '%hello%') as parent
WHERE child.idstack BETWEEN parent.idstack-2 AND parent.idstack+2;

Returning above and below rows of specific rows in r dataframe

Try that:

extract.with.context <- function(x, rows, after = 0, before = 0) {

  match.idx  <- which(rownames(x) %in% rows)
  span       <- seq(from = -before, to = after)
  extend.idx <- c(outer(match.idx, span, `+`))
  extend.idx <- Filter(function(i) i > 0 & i <= nrow(x), extend.idx)
  extend.idx <- sort(unique(extend.idx))

  return(x[extend.idx, , drop = FALSE])
}

dat <- data.frame(x = 1:26, row.names = letters)
extract.with.context(dat, c("a", "b", "j", "y"), after = 3, before = 1)
#    x
# a  1
# b  2
# c  3
# d  4
# e  5
# i  9
# j 10
# k 11
# l 12
# m 13
# x 24
# y 25
# z 26

Select N Rows Above and Below Match