Return row number(s) for a particular value in a column in a dataframe
Use which(mydata_2$height_chad1 == 2585)
Short example
df <- data.frame(x = c(1,1,2,3,4,5,6,3),
y = c(5,4,6,7,8,3,2,4))
df
x y
1 1 5
2 1 4
3 2 6
4 3 7
5 4 8
6 5 3
7 6 2
8 3 4
which(df$x == 3)
[1] 4 8
length(which(df$x == 3))
[1] 2
count(df, vars = "x")
x freq
1 1 2
2 2 1
3 3 2
4 4 1
5 5 1
6 6 1
df[which(df$x == 3),]
x y
4 3 7
8 3 4
As Matt Weller pointed out, you can use the length
function.
The count
function in plyr
can be used to return the count of each unique column value.
How to find row number of a value in R code
I would be tempted to use grepl
, which should give all the lines with matches and can be generalised for arbitrary strings.
mydata_2 <- read.table(textConnection("
sex age height_seca1 height_chad1 height_DL weight_alog1
1 F 19 1800 1797 180 70.0
2 F 19 1682 1670 167 69.0
3 F 21 1765 1765 178 80.0
4 F 21 1829 1833 181 74.0
5 F 21 1706 1705 170 103.0
6 F 18 1607 1606 160 76.0
7 F 19 1578 1576 156 50.0
8 F 19 1577 1575 156 61.0
9 F 21 1666 1665 166 52.0
10 F 17 1710 1716 172 65.0
11 F 28 1616 1619 161 65.5
12 F 22 1648 1644 165 57.5
13 F 19 1569 1570 155 55.0
14 F 19 1779 1777 177 55.0
15 M 18 1773 1772 179 70.0
16 M 18 1816 1809 181 81.0
17 M 19 1766 1765 178 77.0
18 M 19 1745 1741 174 76.0
19 M 18 1716 1714 170 71.0
20 M 21 1785 1783 179 64.0
21 M 19 1850 1854 185 71.0
22 M 31 1875 1880 188 95.0
23 M 26 1877 1877 186 105.5
24 M 19 1836 1837 185 100.0
25 M 18 1825 1823 182 85.0
26 M 19 1755 1754 174 79.0
27 M 26 1658 1658 165 69.0
28 M 20 1816 1818 183 84.0
29 M 18 1755 1755 175 67.0"),
sep = " ", header = TRUE)
which(grepl(1578, mydata_2$height_seca1))
The output is:
> which(grepl(1578, mydata_2$height_seca1))
[1] 7
>
[Edit] However, as pointed out in the comments, this will capture much more than the string 1578 (e.g. it also matches for 21578 etc) and thus should be used only if you are certain that you the length of the values you are searching will not be larger than the four characters or digits shown here.
And subsetting as per the other answer also works fine:
mydata_2[mydata_2$height_seca1 == 1578, ]
sex age height_seca1 height_chad1 height_DL weight_alog1
7 F 19 1578 1576 156 50
>
If you're looking for several different values, you could put them in a vector and then use the %in%
operator:
look.for <- c(1578, 1658, 1616)
> mydata_2[mydata_2$height_seca1 %in% look.for, ]
sex age height_seca1 height_chad1 height_DL weight_alog1
7 F 19 1578 1576 156 50.0
11 F 28 1616 1619 161 65.5
27 M 26 1658 1658 165 69.0
>
in R, how can I find the row number of the first occurrence and last occurrence of a value in a Matrix?
Try the code below
> diff(range(which(a == 1, arr.ind = TRUE)[, "row"]))
[1] 13
Data
> dput(a)
structure(list(V1 = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L),
V2 = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), V3 = c(0L,
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), V4 = c(0L, 0L, 0L, 0L,
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
0L, 0L, 0L, 0L, 0L, 0L), V5 = c(0L, 0L, 0L, 0L, 0L, 0L, 0L,
0L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 0L, 0L, 0L, 0L, 0L,
0L, 0L, 0L), V6 = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 0L,
0L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L
), V7 = c(0L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 0L, 0L, 0L, 0L,
0L, 0L, 0L, 0L, 0L, 1L, 1L, 0L, 0L, 0L, 0L, 0L, 0L), V8 = c(0L,
0L, 0L, 0L, 0L, 1L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L), V9 = c(0L, 0L, 0L, 0L,
0L, 1L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L,
0L, 0L, 0L, 0L, 0L, 0L), V10 = c(0L, 0L, 0L, 0L, 0L, 1L,
1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L,
0L, 0L, 0L, 0L), V11 = c(0L, 0L, 0L, 0L, 0L, 1L, 1L, 0L,
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L,
0L, 0L), V12 = c(0L, 0L, 0L, 0L, 0L, 1L, 1L, 0L, 0L, 0L,
0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 0L, 0L, 0L, 0L, 0L, 0L
), V13 = c(0L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 0L, 0L, 0L, 0L,
0L, 0L, 0L, 0L, 1L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), V14 = c(0L,
0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 1L,
1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), V15 = c(0L, 0L, 0L,
0L, 0L, 0L, 0L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 0L,
0L, 0L, 0L, 0L, 0L, 0L, 0L), V16 = c(0L, 0L, 0L, 0L, 0L,
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 1L, 0L,
0L, 0L, 0L, 0L, 0L), V17 = c(0L, 0L, 0L, 0L, 0L, 0L, 0L,
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L,
0L, 0L, 0L), V18 = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
0L), V19 = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L),
V20 = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), V21 = c(0L,
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), V22 = c(0L, 0L, 0L,
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
0L, 0L, 0L, 0L, 0L, 0L, 0L), V23 = c(0L, 0L, 0L, 0L, 0L,
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
0L, 0L, 0L, 0L, 0L), V24 = c(0L, 0L, 0L, 0L, 0L, 0L, 0L,
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
0L, 0L, 0L), V25 = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
0L)), class = "data.frame", row.names = c("1", "2", "3",
"4", "5", "6", "7", "8", "9", "10", "11", "12", "13", "14", "15",
"16", "17", "18", "19", "20", "21", "22", "23", "24", "25"))
Referencing Row Number in R
These are present by default as rownames
when you create a data.frame
.
R> df = data.frame('a' = rnorm(10), 'b' = runif(10), 'c' = letters[1:10])
R> df
a b c
1 0.3336944 0.39746731 a
2 -0.2334404 0.12242856 b
3 1.4886706 0.07984085 c
4 -1.4853724 0.83163342 d
5 0.7291344 0.10981827 e
6 0.1786753 0.47401690 f
7 -0.9173701 0.73992239 g
8 0.7805941 0.91925413 h
9 0.2469860 0.87979229 i
10 1.2810961 0.53289335 j
and you can access them via the rownames
command.
R> rownames(df)
[1] "1" "2" "3" "4" "5" "6" "7" "8" "9" "10"
if you need them as numbers, simply coerce to numeric by adding as.numeric
, as in as.numeric(rownames(df))
.
You don't need to add them, as if you know what you are looking for (say item df$c == 'i'
, you can use the which command:
R> which(df$c =='i')
[1] 9
or if you don't know the column
R> which(df == 'i', arr.ind=T)
row col
[1,] 9 3
you may access the element using df[9, 'c']
, or df$c[9]
.
If you wanted to add them you could use df$rownumber <- as.numeric(rownames(df))
, though this may be less robust than df$rownumber <- 1:nrow(df)
as there are cases when you might have assigned to rownames
so they will no longer be the default index numbers (the which command will continue to return index numbers even if you do assign to rownames
).
How to find a row number given multiple column conditions in R?
this should work.
which(example$gender == "F" & example$wage == 2100 & example$age == 24)
R: Obtain Row Number of a Tibble Using dplyr Package
You can use the following code:
library(dplyr)
tbl %>%
with(which(p == 1 & d == 1 & q == 1))
Output:
[1] 5
Which means it was the 5th row of your data.
get Row Number for R data frame
I think you have answered your own question.
The results of which() is a vector with the row numbers that you want to extract.
df <- data.frame(x = runif(20))
w <- which(df$x > 0.9)
w
[1] 9 11 14 16 20
data.frame(which=w, df=df[w, ])
which df
1 9 0.9437138
2 11 0.9995509
3 14 0.9237888
4 16 0.9526003
5 20 0.9191727
Related Topics
What Are 'User' and 'System' Times Measuring in R System.Time(Exp) Output
What Algorithm I Need to Find N-Grams
Ggplot2 Theme with No Axes or Grid
"Factor Has New Levels" Error for Variable I'm Not Using
Fastest Way to Read in 100,000 .Dat.Gz Files
Exceeding Memory Limit in R (Even with 24Gb Ram)
Using Lapply and Read.CSV on Multiple Files (In R)
Calculate Mean for Multiple Columns in Data.Frame
How to Clean Up R Memory Without Restarting My Pc
Ggplot2: Is There a Fix for Jagged, Poor-Quality Text Produced by Geom_Text()
How to Preserve Base Data Frame Rownames Upon Filtering in Dplyr Chain
Aesthetics Must Either Be Length One, or the Same Length as the Dataproblems
How to Make a Dummy Variable in R
Matching Multiple Columns on Different Data Frames and Getting Other Column as Result
Read CSV with Dates and Numbers
Using Annotate to Add Different Annotations to Different Facets