Select Values from Different Columns Based on a Variable Containing Column Names

An excuse to use the obscure .BY:

DT[, newval := .SD[[.BY[[1]]]], by=new]

   col1 col2 col3  new newval
1:    1    4   55 col1      1
2:    2    3   44 col2      3
3:    3   34   35 col2     34
4:    4   44   87 col3     87

How it works. The by=new clause splits the data into groups, one per distinct string in new. Within each group, .BY[[1]] holds that string (call it newname), and .SD[[newname]] selects the column with that name from .SD, the Subset of Data for the group.

Alternatives. get(.BY[[1]]) should work just as well in place of .SD[[.BY[[1]]]]. According to a benchmark run by @David, the two ways are equally fast.
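
For readers working in pandas rather than data.table, the same group-by-column-name idea can be sketched as follows. This is only an illustration built on the sample data shown above; the DataFrame name df and the helper list parts are assumptions, not part of the original answer.

```python
import pandas as pd

# Sample data copied from the printed data.table above.
df = pd.DataFrame({
    "col1": [1, 2, 3, 4],
    "col2": [4, 3, 34, 44],
    "col3": [55, 44, 35, 87],
    "new": ["col1", "col2", "col2", "col3"],
})

# Group rows by the column name stored in `new`; within each group,
# pick the values of the column that carries that name.
parts = [group[name] for name, group in df.groupby("new")]

# Each piece keeps its original row index, so the assignment aligns by index.
df["newval"] = pd.concat(parts)

print(df)
```

As in the data.table version, the work is done once per distinct column name rather than once per row.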

How do I return values from multiple columns when the column names are based on a variable result?

The following query using a dynamic UNPIVOT operation will do the work:

CREATE TABLE #yourTable ( [record id] INT,[current stage] VARCHAR(255), [met client] DATE, [contract agreed] DATE, [service completed] DATE, [on hold] DATE)

INSERT INTO #yourTable VALUES
(11111, 'met client', '2019-01-02', NULL, NULL, NULL),
(22222, 'contract agreed', '2019-01-02', '2019-01-20', NULL, NULL),
(33333, 'on hold', '2019-01-02', '2019-01-20', NULL, '2019-02-10'),
(44444, 'service completed', '2019-01-02', '2019-01-20', '2019-03-01', '2019-02-10')


DECLARE @col NVARCHAR(MAX) = '';
SELECT @col += ',' + QUOTENAME([current stage]) FROM #yourTable
SET @col = STUFF(@col,1,1,'')

EXEC ( 'SELECT unpiv.[record id], unpiv.[current stage], [Date] AS [Date_of_current_stage] FROM #yourTable UNPIVOT ([Date] FOR [Stage] IN ('+@col+') ) unpiv WHERE [current stage] = [Stage]')
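
For comparison, the same unpivot-then-filter logic can be sketched in pandas. This is an illustration only: it assumes the sample rows above are loaded into a DataFrame named df (a name not in the original) and keeps the dates as plain strings.

```python
import pandas as pd

# Same sample rows as the #yourTable INSERT above; None stands in for NULL.
df = pd.DataFrame({
    "record id": [11111, 22222, 33333, 44444],
    "current stage": ["met client", "contract agreed", "on hold", "service completed"],
    "met client": ["2019-01-02"] * 4,
    "contract agreed": [None, "2019-01-20", "2019-01-20", "2019-01-20"],
    "service completed": [None, None, None, "2019-03-01"],
    "on hold": [None, None, "2019-02-10", "2019-02-10"],
})

# Unpivot: one row per (record, stage column) pair.
long = df.melt(id_vars=["record id", "current stage"],
               var_name="Stage", value_name="Date")

# Keep only the row whose stage column matches the record's current stage.
result = long.loc[long["current stage"] == long["Stage"],
                  ["record id", "current stage", "Date"]]
result = result.sort_values("record id").reset_index(drop=True)

print(result)
```

Unlike the dynamic SQL, no column list has to be assembled as a string, because melt unpivots every column not listed in id_vars.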

Create new column containing names of other columns based on values of those columns

You can use the following code:

library(dplyr)
df %>%
  rowwise() %>%
  mutate(V4 = paste0(names(.)[c_across(everything()) == 1], collapse = ','))

Output:

# A tibble: 4 × 4
# Rowwise:
     V1    V2    V3 V4
  <dbl> <dbl> <dbl> <chr>
1     1     0     1 "V1,V3"
2     0     1     1 "V2,V3"
3     0     0     0 ""
4     1     1     1 "V1,V2,V3"

Data

df <- data.frame(
  V1 = c(1,0,0,1),
  V2 = c(0,1,0,1),
  V3 = c(1,1,0,1)
)

Calling a column name based on a different column's values?

A DataFrame usually contains multiple rows (and columns).

So if you ask whether a particular column (say xx) has some value:

df.xx == 20

you will get a boolean Series with:

  • indices copied from df,
  • value stating whether xx column in this row == 20.

So I assume that your question about a particular value in a given column
should actually be expressed as: does any element in this column
have a particular value?

You can check this with the any() method:

(df.xx == 20).any()

This time the result will be a single boolean.
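
The difference between the element-wise comparison and its any() reduction can be seen on a small frame. The data here is hypothetical and chosen only so that one row matches:

```python
import pandas as pd

# Hypothetical sample data; only the column name xx comes from the text above.
df = pd.DataFrame({"xx": [10, 20, 30]})

mask = df.xx == 20     # boolean Series, one entry per row of df
found = mask.any()     # single boolean: True if at least one row matches

print(mask.tolist())
print(found)
```

Only the reduced value is safe to use in an if statement; using the Series itself there raises a ValueError about ambiguous truth value.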

In your case you can write:

if (df.column_name == '1-5').any():
    result = df.Minutes

Of course, this leaves open what should happen if no row matches:
do you want a different column in the result variable?

Another approach is to store the column name in a variable,
say src_col, chosen by whatever logic you need.

Then, having this variable set, you can refer to the required column as:

result = df[src_col]

Note that this time:

  • the column name is between brackets,
  • but it is not surrounded with apostrophes,

so the target column name is expressed by the value of this variable.
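
A minimal sketch of this variable-based selection follows. The sample rows and the fallback column Hours are hypothetical; only column_name, Minutes and src_col come from the discussion above.

```python
import pandas as pd

# Hypothetical data; the Hours column is invented as a fallback for illustration.
df = pd.DataFrame({
    "column_name": ["1-5", "6-10"],
    "Minutes": [12, 30],
    "Hours": [0.2, 0.5],
})

# Decide which column to read, then index with the variable (no quotes around it).
src_col = "Minutes" if (df.column_name == "1-5").any() else "Hours"
result = df[src_col]

print(src_col)
print(result.tolist())
```

Attribute access such as df.src_col would look for a column literally named src_col, so the bracket form is the one to use with a variable.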

And a remark about the comment by Chris90:

If you write df.loc[df['column_name'] == '1-5','Minutes']
you will get values only from:

  • rows containing 1-5 (string) in column_name,
  • the Minutes column.

But you wrote that you wanted all values from this column.
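
To make the difference concrete, here is a small sketch with hypothetical rows. The filtered selection returns one value per matching row (a single value when only one row matches), while plain column access returns every row:

```python
import pandas as pd

# Hypothetical data with two rows matching '1-5'.
df = pd.DataFrame({
    "column_name": ["1-5", "6-10", "1-5"],
    "Minutes": [12, 30, 45],
})

# Boolean-mask selection: Minutes only from rows where column_name == '1-5'.
matching = df.loc[df["column_name"] == "1-5", "Minutes"]

# Whole column: Minutes from every row, regardless of column_name.
all_minutes = df["Minutes"]

print(matching.tolist())
print(all_minutes.tolist())
```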

Extract data frame columns based on multiple criteria on column names

It can be as straightforward as

df[c("id", grep("col", names(df), value = TRUE), "Gender")]

Select column dynamically based on value from another column in R

By looping through the sequence of rows, extract the value with get and assign it to create 'y':

dt[, y := .SD[, get(x), seq_len(.N)]$V1]
dt
#   a b c x y
#1: 2 5 1 a 2
#2: 3 7 2 b 7
#3: 5 7 3 c 3
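
A comparable row-wise lookup in pandas can be sketched with NumPy integer indexing instead of an explicit loop. This is a swapped-in technique, not a translation of the data.table call; the names value_cols, values and col_pos are assumptions made for the example.

```python
import numpy as np
import pandas as pd

# Same data as the data.table output above.
df = pd.DataFrame({
    "a": [2, 3, 5],
    "b": [5, 7, 7],
    "c": [1, 2, 3],
    "x": ["a", "b", "c"],
})

value_cols = ["a", "b", "c"]            # columns that x may point to
values = df[value_cols].to_numpy()      # 2-D array of candidate values

# Position of each row's target column; -1 would mean the name was not found.
col_pos = pd.Index(value_cols).get_indexer(df["x"])

# Pick one element per row: (row 0, col_pos[0]), (row 1, col_pos[1]), ...
df["y"] = values[np.arange(len(df)), col_pos]

print(df)
```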

select columns based on columns names containing a specific string in pandas

Alternative methods:

In [13]: df.loc[:, df.columns.str.startswith('alp')]
Out[13]:
       alp1      alp2
0  0.357564  0.108907
1  0.341087  0.198098
2  0.416215  0.644166
3  0.814056  0.121044
4  0.382681  0.110829
5  0.130343  0.219829
6  0.110049  0.681618
7  0.949599  0.089632
8  0.047945  0.855116
9  0.561441  0.291182

In [14]: df.loc[:, df.columns.str.contains('alp')]
Out[14]:
       alp1      alp2
0  0.357564  0.108907
1  0.341087  0.198098
2  0.416215  0.644166
3  0.814056  0.121044
4  0.382681  0.110829
5  0.130343  0.219829
6  0.110049  0.681618
7  0.949599  0.089632
8  0.047945  0.855116
9  0.561441  0.291182
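
The two methods give the same result above because every matching name begins with 'alp'. A small sketch with hypothetical column names (salp and beta are invented for illustration) shows where they differ:

```python
import pandas as pd

# Hypothetical frame: 'salp' contains 'alp' but does not start with it.
df = pd.DataFrame({
    "alp1": [1, 2],
    "alp2": [3, 4],
    "salp": [5, 6],
    "beta": [7, 8],
})

starts = df.loc[:, df.columns.str.startswith("alp")]
contains = df.loc[:, df.columns.str.contains("alp")]

print(starts.columns.tolist())
print(contains.columns.tolist())
```

Note that str.contains treats its pattern as a regular expression by default, so special characters in the search string need escaping or regex=False.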

Select column using the value in other row of a data.table in R

We can use get while looping through the sequence of rows:

DT[, W := get(Z), 1:nrow(DT)]

Or with eval(as.name()):

DT[, W := eval(as.name(Z)), 1:nrow(DT)]

Data Table - Select Value of Column by Name From Another Column

Another option:

d[ , value.of.col := diag(as.matrix(.SD)), .SDcols = d[ , name.of.col]]
> d
   value.1 value.2 name.of.col value.of.col
1:     one     two     value.1          one
2:     uno     dos     value.2          dos
3:       1       2     value.1            1

EDIT: a faster solution:

d[ , value.of.col :=
    melt(d, id.vars = 'name.of.col')[name.of.col == variable, value]]

