How to extract column names which have a specific value
You can add a new column to the existing DataFrame that contains, for each row, a list of all columns whose value in that row is 1.
Within the column parameter of withColumn you can iterate over all other columns and check for the wanted value:
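import org.apache.spark.sql.functions.{array, col, when}
import spark.implicits._ // needed for toDF; assumes a SparkSession named spark

val df = Seq((1, 2, 3), (4, 5, 6), (3, 1, 1)).toDF("col1", "col2", "col3")
df.show()
// change this array if you want to exclude columns from the check
val cols = df.schema.fieldNames
// when(...) without otherwise(...) yields null for columns that do not match
df.withColumn("result", array(
  cols.map {
    c: String => when(col(c).equalTo(1), c)
  }: _*
)).show()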
This prints:
//input data
+----+----+----+
|col1|col2|col3|
+----+----+----+
|   1|   2|   3|
|   4|   5|   6|
|   3|   1|   1|
+----+----+----+
//result
+----+----+----+--------------+
|col1|col2|col3|        result|
+----+----+----+--------------+
|   1|   2|   3|      [col1,,]|
|   4|   5|   6|          [,,]|
|   3|   1|   1|[, col2, col3]|
+----+----+----+--------------+
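The empty slots in the result are the nulls produced for columns that do not match. As a rough illustration of the row-wise logic (plain Python, not Spark; the function name matching_columns and its keep_gaps flag are made up for this sketch), the same check can be written with or without those gaps:

```python
# Same sample data as the Spark example above.
rows = [(1, 2, 3), (4, 5, 6), (3, 1, 1)]
cols = ["col1", "col2", "col3"]


def matching_columns(row, names, wanted, keep_gaps=True):
    """Return the names of the columns whose value equals `wanted`.

    With keep_gaps=True, non-matching columns are kept as None,
    which mirrors the nulls in the Spark array above.
    """
    marks = [name if value == wanted else None for name, value in zip(names, row)]
    if keep_gaps:
        return marks
    return [mark for mark in marks if mark is not None]


result_padded = [matching_columns(r, cols, 1) for r in rows]
result_compact = [matching_columns(r, cols, 1, keep_gaps=False) for r in rows]

print(result_padded)
print(result_compact)
```

The compact variant is usually what you want if the list is only used to read off the matching column names.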
Selecting all column names where value is greater than 0
You can compare the values with 0 using DataFrame.gt to get a boolean DataFrame, then use DataFrame.dot for matrix multiplication with the column names (each with a separator appended). Finally, remove the trailing separator by slicing with str[:-1]:
df['e'] = df.gt(0).dot(df.columns + ',').str[:-1]
print(df)
    a   b   c   d      e
0  12  21   0   0    a,b
1   0  23  22  22  b,c,d
2  23   0  33   0    a,c
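To see why the dot product yields joined names, it may help to spell out the arithmetic in plain Python (a sketch without pandas; the variable names are illustrative). Multiplying a string by True keeps it, multiplying by False gives an empty string, and the "sum" over a row is string concatenation:

```python
# Same values as the DataFrame above, stored column by column.
data = {
    "a": [12, 0, 23],
    "b": [21, 23, 0],
    "c": [0, 22, 33],
    "d": [0, 22, 0],
}
columns = list(data)
n_rows = len(data["a"])

labels = []
for i in range(n_rows):
    # Boolean mask for this row: is the value greater than 0?
    mask = [data[c][i] > 0 for c in columns]
    # "a," * True == "a," and "a," * False == "", so joining the products
    # concatenates only the names of the columns that passed the test.
    joined = "".join((c + ",") * flag for c, flag in zip(columns, mask))
    # Drop the trailing separator, like .str[:-1] in the pandas version.
    labels.append(joined[:-1])

print(labels)
```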
SELECT rows that have specified values in one of the columns
One way is to use XML:
SELECT t.*
FROM tab t
CROSS APPLY (SELECT * FROM tab t2 WHERE t.id = t2.id FOR XML RAW('a')) sub(c)
WHERE sub.c LIKE '%"I"%';
Output:
┌────┬──────┬────────┬─────────┬───────┐
│ ID │ Name │ Status │ Address │ Phone │
├────┼──────┼────────┼─────────┼───────┤
│  1 │ Tom  │ I      │ U       │ D     │
│  3 │ Pam  │ D      │ I       │ U     │
└────┴──────┴────────┴─────────┴───────┘
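The pattern '%"I"%' works because each column becomes an XML attribute whose value is wrapped in double quotes, so only a value that is exactly I can match. A small Python sketch of that idea (this is not SQL Server's serializer; the row with ID 2 is invented here, since the original table is not shown):

```python
import xml.etree.ElementTree as ET

# Sample rows; IDs 1 and 3 come from the output above, ID 2 is made up
# to show that a value merely starting with "I" does not match.
rows = [
    {"ID": "1", "Name": "Tom", "Status": "I", "Address": "U", "Phone": "D"},
    {"ID": "2", "Name": "Ian", "Status": "U", "Address": "D", "Phone": "D"},
    {"ID": "3", "Name": "Pam", "Status": "D", "Address": "I", "Phone": "U"},
]


def as_xml(row):
    # One element per row, one attribute per column,
    # similar in shape to what FOR XML RAW('a') produces.
    return ET.tostring(ET.Element("a", row), encoding="unicode")


# Equivalent of: WHERE sub.c LIKE '%"I"%'
matches = [row for row in rows if '"I"' in as_xml(row)]
matched_ids = [row["ID"] for row in matches]

print(as_xml(rows[0]))
print(matched_ids)
```

Note that this checks every column, including ID and Name, which is what the second query addresses.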
EDIT:
A slightly more advanced option that excludes some columns from the check, basically simulating SELECT * EXCEPT id, name:
SELECT DISTINCT t.*
FROM tab t
CROSS APPLY (VALUES(CAST((SELECT t.* for XML RAW) AS xml))) B(XMLData)
CROSS APPLY (SELECT 1 c
FROM B.XMLData.nodes('/row') AS C1(n)
CROSS APPLY C1.n.nodes('./@*') AS C2(a)
WHERE a.value('local-name(.)','varchar(100)') NOT IN ('id','name')
AND a.value('.','varchar(max)') = 'I') C;
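The query walks over every attribute of the row, skips the excluded names, and compares the remaining values with 'I'. The same filter, sketched in plain Python over dictionaries (the helper rows_with_value and the sample rows are hypothetical, and the name comparison here is case-sensitive, which may differ from your database collation):

```python
# Hypothetical sample rows; row 2 has "I" only in an excluded column.
rows = [
    {"id": 1, "name": "Tom", "status": "I", "address": "U", "phone": "D"},
    {"id": 2, "name": "I", "status": "U", "address": "D", "phone": "D"},
    {"id": 3, "name": "Pam", "status": "D", "address": "I", "phone": "U"},
]


def rows_with_value(rows, wanted, excluded=()):
    """Keep rows where any non-excluded column equals `wanted`."""
    skip = set(excluded)
    return [
        row
        for row in rows
        if any(value == wanted for column, value in row.items() if column not in skip)
    ]


with_exclusion = [r["id"] for r in rows_with_value(rows, "I", excluded=("id", "name"))]
without_exclusion = [r["id"] for r in rows_with_value(rows, "I")]

print(with_exclusion)
print(without_exclusion)
```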
Select specific columns, where the column names are in another df in R
The problem is that Y.variable.names is a data.frame, which you cannot use to subset another data.frame. You can check this by typing class(Y.variable.names).
So the solution to your problem is to subset Y.variable.names so that a vector of names is passed to select:
Y.Data = data %>% select(Y.variable.names[,1])
Bring a row for each specific column that is not empty, with the column name
You can use CROSS APPLY in concert with VALUES to unpivot your data:
Select A.ID
,B.Data
,A.RandomInformation
From YourTable A
Cross Apply ( values ('Data1',Data1)
,('Data2',Data2)
,('Data3',Data3)
,('Data4',Data4)
,('Data5',Data5)
,('Data6',Data6)
) B(Data,Value)
Where B.Value is not null
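For readers who want to see what the unpivot produces, here is a minimal plain-Python sketch of the same reshaping (not SQL; the sample rows are invented and only three of the six Data columns are used to keep it short). Each source row yields one output row per non-empty data column, tagged with the column name:

```python
# Invented sample rows standing in for YourTable.
rows = [
    {"ID": 1, "Data1": "x", "Data2": None, "Data3": "y", "RandomInformation": "foo"},
    {"ID": 2, "Data1": None, "Data2": None, "Data3": None, "RandomInformation": "bar"},
    {"ID": 3, "Data1": None, "Data2": "z", "Data3": None, "RandomInformation": "baz"},
]
data_columns = ["Data1", "Data2", "Data3"]

# Equivalent of CROSS APPLY (VALUES ...) B(Data, Value) WHERE B.Value IS NOT NULL:
# pair every row with every (column name, value) and keep the non-null ones.
unpivoted = [
    (row["ID"], name, row[name], row["RandomInformation"])
    for row in rows
    for name in data_columns
    if row[name] is not None
]

for line in unpivoted:
    print(line)
```

A row whose data columns are all empty, like ID 2 here, disappears from the result, which matches the behaviour of the WHERE clause in the query.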