Flattening of a 1 Row Table into a Key-Value Pair Table

Flattening of a 1 row table into a key-value pair table

A version where there is no dynamic involved. If you have column names that is invalid to use as element names in XML this will fail.

select T2.N.value('local-name(.)', 'nvarchar(128)') as [Key],
T2.N.value('text()[1]', 'nvarchar(max)') as Value
from (select *
from TableA
for xml path(''), type) as T1(X)
cross apply T1.X.nodes('/*') as T2(N)

A working sample:

declare @T table
(
Column1 varchar(10),
Column2 varchar(10),
Column3 varchar(10)
)

insert into @T values('V1','V2','V3')

select T2.N.value('local-name(.)', 'nvarchar(128)') as [Key],
T2.N.value('text()[1]', 'nvarchar(max)') as Value
from (select *
from @T
for xml path(''), type) as T1(X)
cross apply T1.X.nodes('/*') as T2(N)

Result:

Key                  Value
-------------------- -----
Column1 V1
Column2 V2
Column3 V3

Update

For a query with more than one table you could use for xml auto to get the table names in the XML. Note, if you use alias for table names in the query you will get the alias instead.

select X2.N.value('local-name(..)', 'nvarchar(128)') as TableName,
X2.N.value('local-name(.)', 'nvarchar(128)') as [Key],
X2.N.value('text()[1]', 'nvarchar(max)') as Value
from (
-- Your query starts here
select T1.T1ID,
T1.T1Col,
T2.T2ID,
T2.T2Col
from T1
inner join T2
on T1.T1ID = T2.T1ID
-- Your query ends here
for xml auto, elements, type
) as X1(X)
cross apply X1.X.nodes('//*[text()]') as X2(N)

SQL Fiddle

How to flatten out tab key value pair text format file in a table where key is column and value is data for the cell

Something like this could work:

SELECT facility_id, unknown_num, type_of_location, external_name,
address, city, state, county, zip_code
FROM (
SELECT key,
value facility_id,
LEAD(value, 1) OVER (ORDER BY seqnum) unknown_num,
LEAD(value, 2) OVER (ORDER BY seqnum) type_of_location,
LEAD(value, 3) OVER (ORDER BY seqnum) external_name,
LEAD(value, 4) OVER (ORDER BY seqnum) address,
LEAD(value, 5) OVER (ORDER BY seqnum) city,
LEAD(value, 6) OVER (ORDER BY seqnum) state,
LEAD(value, 7) OVER (ORDER BY seqnum) county,
LEAD(value, 8) OVER (ORDER BY seqnum) zip_code
FROM MDM_ODS.EAF_EPIC_IMPORT
ORDER BY seqnum)
WHERE key=1;

LEAD(X, N) OVER (ORDER BY <sort-order>) means return the value of column "X" that is "N" number of rows ahead of the current row with the rows ordered by <sort-order>.

Transform stacked tables of key value pairs to a single key value table dynamically

try:

={"Key", "Value"; FLATTEN(FILTER(A2:C, ISEVEN(ROW(A2:A)))), 
FLATTEN(FILTER(A2:C, ISODD(ROW(A2:A))))}

Sample Image

or:

={"Key", "Value"; FLATTEN(QUERY(A2:C, "skipping 2", )), 
FLATTEN(QUERY(A3:C, "skipping 2", ))}

Sample Image

How do I flatten a nested json key/value pair into a single array of values?

The below queries illustrate how to use FLATTEN and ARRAY_AGG to get the output you want.

  • FLATTEN unnests each array so that you can join on the values within.
  • ARRAY_AGG aggregates the values grouped by org.
  • The CASE statement accounts for org.relationships not always being an array.
CREATE OR REPLACE TABLE organizations (org variant) AS
SELECT parse_json('{relationships: [{ name: "mother", value: "a" }, { name: "siblings", value: [ "b", "c" ] } ] } ');

CREATE OR REPLACE TABLE people (person variant) AS
SELECT parse_json($1)
FROM
VALUES ('{id:"a", name: "Mary"}'),
('{id:"b", name: "Joe"}'),
('{id:"c", name: "John"}');

WITH org_people AS
(SELECT o.org,
relationship.value AS relationship,
CASE is_array(relationship:value)
WHEN TRUE THEN person_in_relationship.value
ELSE relationship:value
END AS person_in_relationship
FROM organizations o,
LATERAL FLATTEN(o.org:relationships) relationship ,
LATERAL FLATTEN(relationship.value:value, OUTER=>TRUE) person_in_relationship
)
SELECT op.org,
ARRAY_AGG(p.person) AS people
FROM org_people op
JOIN people p ON p.person:id = op.person_in_relationship
GROUP BY op.org;

How to spread a key-value pair across multiple columns and flatten the matrix based on another column?

EDIT: A typo in the sample data made pivot not work because of duplicates. pivot should work as well df = df.pivot(index='a', columns='b', values='c') or df = df.pivot('a', 'b', 'c'):

Tested on your specific version.

#pip install pandas==1.2.0
#both pivot and pivot_table should work on this version. However, this code for "pivot" would NOT work on earlier versions of pandas. Not sure the exact version but I think it was fixed in 2019.
import pandas as pd
df = pd.DataFrame({
'a': ['x_1', 'x_1', 'x_1', 'x_1', 'x_1', 'j_2', 'j_2', 'j_2', 'j_2', ],
'b': [1, 2, 3, 4, 5, 1, 2, 3, 5],
'c': [6, 3, 0, 1, 3.4, 4.5, 0.1, 0.2, 0.88]})
df = df.pivot_table(index='a', columns='b', values='c')
df
Out[1]:
b 1 2 3 4 5
a
j_2 4.5 0.1 0.2 NaN 0.88
x_1 6.0 3.0 0.0 1.0 3.40

How can I get a traditional single row result from a key-value oriented table layout

You are overnormalizing, but your example is incomplete. If you are looking at your original example and try to add one more row, you will see that it is not clear from your table to which row a value belongs. You need to add an item column:

root@localhost [kris]> create table overnormal ( id serial, k varchar(20) not null, v varchar(20) not null);
Query OK, 0 rows affected (0.96 sec)

root@localhost [kris]> insert into overnormal values ( 1, 'abc', 'value 1'), (2, 'def', 'value 2'), (3, 'geh', 'value 3');
Query OK, 3 rows affected (0.01 sec)
Records: 3 Duplicates: 0 Warnings: 0

root@localhost [kris]> select * from overnormal;
+----+-----+---------+
| id | k | v |
+----+-----+---------+
| 1 | abc | value 1 |
| 2 | def | value 2 |
| 3 | geh | value 3 |
+----+-----+---------+
3 rows in set (0.00 sec)

Let's add the item column:

root@localhost [kris]> alter table overnormal add column item integer unsigned not null;
Query OK, 3 rows affected (0.48 sec)
Records: 3 Duplicates: 0 Warnings: 0

root@localhost [kris]> update overnormal set item = 1;
Query OK, 3 rows affected (0.01 sec)
Rows matched: 3 Changed: 3 Warnings: 0

root@localhost [kris]> insert into overnormal values (4, 'abc', 'item 1/1', 2), (5, 'def', 'item 1/2', 2), (6, 'geh', 'item 1/3', 2);
Query OK, 3 rows affected (0.00 sec)
Records: 3 Duplicates: 0 Warnings: 0

root@localhost [kris]> select * from overnormal;
+----+-----+----------+------+
| id | k | v | item |
+----+-----+----------+------+
| 1 | abc | value 1 | 1 |
| 2 | def | value 2 | 1 |
| 3 | geh | value 3 | 1 |
| 4 | abc | item 1/1 | 2 |
| 5 | def | item 1/2 | 2 |
| 6 | geh | item 1/3 | 2 |
+----+-----+----------+------+
6 rows in set (0.00 sec)

You can transform this into a traditional table using the join of death:

root@localhost [kris]> select t1.item, t1.v as abc, t2.v as def, t3.v as geh 
from overnormal as t1
join overnormal as t2
on t1.item = t2.item
and t1.k = 'abc'
and t2.k = 'def'
join overnormal as t3
on t1.item = t3.item
and t3.k = 'geh';
+------+----------+----------+----------+
| item | abc | def | geh |
+------+----------+----------+----------+
| 1 | value 1 | value 2 | value 3 |
| 2 | item 1/1 | item 1/2 | item 1/3 |
+------+----------+----------+----------+
2 rows in set (0.00 sec)

You can sort this the usual way by assing an ORDER BY clause.

This query is not that bad, but will rapidly degrade in efficiency as your table gets wider, even with indexes, as the optimizer will get confused at 9 to 10 tables in a single join.

You are better off with a data model that closer to 3rd normal form and less close to DKNF, and yes, that is possible and less a problem than you think. You cannot handle completely arbitrary data types anyway without code making assumptions about the k-values that are legal and required in your application anyway (or, if you are not making such assumptions, you might as well serialize your in-app structure and store a blob).



Related Topics



Leave a reply



Submit