Get row and column indices of matches using `which()`
For your first question, you also need to pass `arr.ind = TRUE` to `which`:
> which(m == 1, arr.ind = TRUE)
     row col
[1,]   3   1
[2,]   2   2
[3,]   1   3
[4,]   4   3
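The original `m` is not shown, so here is a minimal sketch with a made-up 4 x 3 matrix that has 1s in the same positions as the output above. It only illustrates what `arr.ind = TRUE` changes: without it, `which` returns linear (column-major) positions rather than row/column pairs.

```r
# Hypothetical matrix; the real `m` from the question is not available.
m <- matrix(0, nrow = 4, ncol = 3)
m[cbind(c(3, 2, 1, 4), c(1, 2, 3, 3))] <- 1

# Default: linear indices, counted down the columns
linear <- which(m == 1)
# 3 6 9 12

# With arr.ind = TRUE: one (row, col) pair per match
hits <- which(m == 1, arr.ind = TRUE)
#      row col
# [1,]   3   1
# [2,]   2   2
# [3,]   1   3
# [4,]   4   3
```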
Get indices of matches with a column in a second data.table
Use `by = .EACHI` and add the resulting list column by reference:
dt2[ , res := dt1[ , i := .I][.SD, on = .(firstName), .(.(i)), by = .EACHI]$V1]
#    lid firstName res
# 1:   1     Maria  NA
# 2:   2       Jim 1,4
# 3:   3      Jack  NA
# 4:   4      Anne 3,5
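For a reproducible version, here is a sketch with made-up tables. The contents of `dt1` are an assumption, chosen so the result matches the output above, and `dt2` is passed explicitly in the join instead of `.SD`, which should be equivalent here. Note that `dt1[, i := .I]` also adds the helper column `i` to `dt1` by reference.

```r
library(data.table)

# Hypothetical inputs; the original tables are not shown in the answer.
dt1 <- data.table(firstName = c("Jim", "Bob", "Anne", "Jim", "Anne"))
dt2 <- data.table(lid = 1:4, firstName = c("Maria", "Jim", "Jack", "Anne"))

# Row numbers of dt1, stored as a column so the join can return them
dt1[, i := .I]

# For each row of dt2 (by = .EACHI), collect the matching dt1 row numbers
# into a single list element; unmatched names get NA.
dt2[, res := dt1[dt2, on = .(firstName), .(.(i)), by = .EACHI]$V1]
```

Each element of `res` is an integer vector, so `dt2$res[[2]]` gives the `dt1` rows where the name is "Jim".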
Get column index from data frame that matches numeric vector?
Here's a base R approach that compares every column in `dat` with `testVec` to see whether they are identical, then uses `which` to output the index of the matching column.
which(sapply(1:ncol(dat), function(x) identical(dat[,x], testVec)))
[1] 3
UPDATE
@nicola suggested a more concise syntax than my original code (see the comment under this answer):
which(sapply(dat, identical, y = testVec))
z
3
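Here is a small self-contained sketch of the same idea. `dat` and `testVec` below are made up for illustration. One thing to watch: `identical` is strict about type, so an integer column will not match a double vector holding the same values.

```r
# Hypothetical data; the third column holds the same values as testVec.
dat <- data.frame(x = c(1, 2, 3), y = c(4, 5, 6), z = c(7, 8, 9))
testVec <- c(7, 8, 9)

idx <- which(sapply(dat, identical, y = testVec))
# z
# 3

# Strict type check: the integer vector 7:9 is not identical to c(7, 8, 9)
idx_int <- which(sapply(dat, identical, y = 7:9))
# named integer(0)
```

If the types may differ, comparing with `isTRUE(all.equal(...))` instead of `identical` is one way to relax the check.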
get top 3 values per row and keep the column index
You can use `order` to get the indices and `head(x, 3)` to keep the top 3 values.
head(apply(m, 1, function(x) order(x, decreasing = TRUE)), 3)
     M001_0.6 M002_0.6 M004_0.6 M012_0.6 M013_0.6
[1,]        1        5        5        2        3
[2,]        5        2        3        3        4
[3,]        2        4        2        1        2
You can then return both the column indices and the values, transposed so that each row of the result corresponds to a row of `m`:
list(index = t(head(apply(m, 1, function(x) order(x, decreasing = TRUE)), 3)),
     values = t(head(apply(m, 1, function(x) sort(x, decreasing = TRUE)), 3)))
$index
[,1] [,2] [,3]
M001_0.6 1 5 2
M002_0.6 5 2 4
M004_0.6 5 3 2
M012_0.6 2 3 1
M013_0.6 3 4 2
$values
[,1] [,2] [,3]
M001_0.6 0.0016572027 0.0015840909 0.0015597202
M002_0.6 0.0005361538 0.0004630419 0.0004630419
M004_0.6 0.0017303146 0.0014134965 0.0012916433
M012_0.6 0.0107961884 0.0107230765 0.0105768528
M013_0.6 0.0018277971 0.0017546853 0.0015840909
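To make the shapes easier to follow, here is the same approach on a tiny made-up matrix (the values and row names are assumptions, not the data from the question). `apply(m, 1, ...)` returns one column per row of `m`, which is why the result is transposed with `t`.

```r
# Hypothetical 2 x 4 matrix standing in for `m`.
m <- matrix(c(5, 1, 9, 3,
              2, 8, 4, 6),
            nrow = 2, byrow = TRUE,
            dimnames = list(c("a", "b"), NULL))

# Column indices of the 3 largest values in each row
top_idx <- t(apply(m, 1, function(x) head(order(x, decreasing = TRUE), 3)))
#   [,1] [,2] [,3]
# a    3    1    4
# b    2    4    3

# The corresponding values
top_val <- t(apply(m, 1, function(x) head(sort(x, decreasing = TRUE), 3)))
#   [,1] [,2] [,3]
# a    9    5    3
# b    8    6    4
```

Taking `head` inside the function, as here, gives the same result as applying it to the whole matrix afterwards.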
Subset of a data frame according to the combination of rows and column indices
You could use `diag`:
diag(as.matrix(df[r.idx, c.idx]))
#[1] 21 35 21 32 26
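As a sketch of why this works, with a made-up `df`, `r.idx` and `c.idx`: `df[r.idx, c.idx]` builds the full cross product of the requested rows and columns, and the diagonal of that block holds the paired (row, column) elements. Base R matrix indexing with a two-column matrix, `cbind(r.idx, c.idx)`, should give the same values without building the whole block.

```r
# Hypothetical data and index vectors, for illustration only.
df <- data.frame(a = 1:5, b = 6:10, c = 11:15)
r.idx <- c(1, 3, 5)
c.idx <- c(2, 3, 1)

# Approach from the answer: diagonal of the 3 x 3 sub-block
via_diag <- diag(as.matrix(df[r.idx, c.idx]))
# 6 13 5

# Alternative: index the matrix with (row, col) pairs directly
via_pairs <- as.matrix(df)[cbind(r.idx, c.idx)]
# 6 13 5
```

The pair-indexing form may be preferable when the index vectors are long, since the intermediate block grows with the square of their length.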
Pandas: Comparing each row's value with index and replacing adjacent column's value
You need to use an indexing lookup. For this, first make sure that the values in A match the column names (0 -> 'ans_0'):
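import numpy as np
import pandas as pd

# idx: integer code per row; cols: the unique 'ans_<A>' labels, in order of first appearance
idx, cols = pd.factorize('ans_' + df['A'].astype(str))
# reorder the columns to match cols, then pick one value per row by position
df['B'] = df.reindex(cols, axis=1).to_numpy()[np.arange(len(df)), idx]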
Output:
                     A   B  ans_0  ans_3  ans_4
timestamp
2022-05-09 09:28:00  0  20     20    200    100
2022-05-09 09:28:01  3  80     10     80     50
2022-05-09 09:28:02  4  10     30     60     10