T-SQL Looping Through Xml Data Column to Derive Unique Set of Paths

T-SQL looping through XML data column to derive unique set of paths

The outdated approach with FROM OPENXML might be an option here. Check this answer.

At this link you'll find a function John Cappelletti posted from time to time, which will shred any XML (credits below the function's code).

But I'm not sure, what you are really trying to achieve... Why do you need the path? If you are interested in the values of all target nodes you might do something like this (deep search with // does not need the exact XPath)

 SELECT t.value(N'(text())[1]','nvarchar(max)')
FROM @xml.nodes('//target') AS A(t);

If you really need all and everything you can check this:

CREATE FUNCTION [dbo].[udf-XML-Hier](@XML xml)

Returns Table
As Return

with cte0 as (
Select Lvl = 1
,ID = Cast(1 as int)
,Pt = Cast(NULL as int)
,Element = x.value('local-name(.)','varchar(150)')
,Attribute = cast('' as varchar(150))
,Value = x.value('text()[1]','varchar(max)')
,XPath = cast(concat(x.value('local-name(.)','varchar(max)'),'[' ,cast(Row_Number() Over(Order By (Select 1)) as int),']') as varchar(max))
,Seq = cast(1000000+Row_Number() over(Order By (Select 1)) as varchar(max))
,AttData = x.query('.')
,XMLData = x.query('*')
From @XML.nodes('/*') a(x)
Union All
Select Lvl = p.Lvl + 1
,ID = Cast( (Lvl + 1) * 1024 + (Row_Number() Over(Order By (Select 1)) * 2) as int ) * 10
,Pt = p.ID
,Element = c.value('local-name(.)','varchar(150)')
,Attribute = cast('' as varchar(150))
,Value = cast( c.value('text()[1]','varchar(max)') as varchar(max) )
,XPath = cast(concat(p.XPath,'/',c.value('local-name(.)','varchar(max)'),'[',cast(Row_Number() Over(PARTITION BY c.value('local-name(.)','varchar(max)') Order By (Select 1)) as int),']') as varchar(max) )
,Seq = cast(concat(p.Seq,' ',10000000+Cast( (Lvl + 1) * 1024 + (Row_Number() Over(Order By (Select 1)) * 2) as int ) * 10) as varchar(max))
,AttData = c.query('.')
,XMLData = c.query('*')
From cte0 p
Cross Apply p.XMLData.nodes('*') b(c)
)
, cte1 as (
Select R1 = Row_Number() over (Order By Seq),A.*
From (
Select Lvl,ID,Pt,Element,Attribute,Value,XPath,Seq From cte0
Union All
Select Lvl = p.Lvl+1
,ID = p.ID + Row_Number() over (Order By (Select NULL))
,Pt = p.ID
,Element = p.Element
,Attribute = x.value('local-name(.)','varchar(150)')
,Value = x.value('.','varchar(max)')
,XPath = p.XPath + '/@' + x.value('local-name(.)','varchar(max)')
,Seq = cast(concat(p.Seq,' ',10000000+p.ID + Row_Number() over (Order By (Select NULL)) ) as varchar(max))
From cte0 p
Cross Apply AttData.nodes('/*/@*') a(x)
) A
)

Select A.R1
,R2 = IsNull((Select max(R1) From cte1 Where Seq Like A.Seq+'%'),A.R1)
,A.Lvl
,A.ID
,A.Pt
,A.Element
,A.Attribute
,A.XPath
,Title = Replicate('|---',Lvl-1)+Element+IIF(Attribute='','','@'+Attribute)
,A.Value
From cte1 A

/*
Source: http://beyondrelational.com/modules/2/blogs/28/posts/10495/xquery-lab-58-select-from-xml.aspx

Taken from John Cappelletti: https://stackoverflow.com/a/42729851/5089204

Declare @XML xml='<person><firstname preferred="Annie" nickname="BeBe">Annabelle</firstname><lastname>Smith</lastname></person>'
Select * from [dbo].[udf-XML-Hier](@XML) Order by R1
*/
GO

DECLARE @xml XML=
'<log>
<clients>
<client>
<section name ="Apps">
<questions>
<groupone>
<question>
<target>Age</target>
</question>
<question>
<target> Height</target>
</question>
<question>
<target> Weight</target>
</question>
</groupone>
<grouptwo name = "exercise">
<wording>what is your name</wording>
<question>
<id>1</id>
<target>def</target>
</question>
</grouptwo>
</questions>
</section>
</client>
</clients>
</log>';

SELECT * FROM dbo.[udf-XML-Hier](@xml);
GO

SQL Server FOR XML Path make repeating nodes

Add a column with NULL as value to generate a separate item node for each column.

SELECT 
t.col1 as 'item'
,NULL
,t.col2 as 'item'
,NULL
,t.col3 as 'item'
FROM dbo.tbl as t
FOR XML PATH('parent'), TYPE;

Result:

<parent>
<item>1</item>
<item>2</item>
<item>3</item>
</parent>

SQL Fiddle

Why does this work?

Columns without a name are inserted as text nodes. In this case the NULL value is inserted as a text node between the item nodes.

If you add actual values instead of NULL you will see what is happening.

SELECT 
t.col1 as 'item'
,'1'
,t.col2 as 'item'
,'2'
,t.col3 as 'item'
FROM dbo.tbl as t
FOR XML PATH('parent'), TYPE;

Result:

<parent>
<item>1</item>1<item>2</item>2<item>3</item></parent>

Another way to specify a column without a name is to use the wildcard character * as a column alias.

Columns with a Name Specified as a Wildcard Character

It is not necessary to use the wildcard in this case because the columns with NULL values don't have a column name but it is useful when you want values from actual columns but you don't want the column name to be a node name.

PL/SQL: Loop Through XML Column and Extract Value From Duplicate Elements

Try using XMLTable to
iterate through the values like a set.

Sample schema

--drop table table1;
create table table1 as
select xmltype('
<parent>
<child>Test</child>
<child>Test1</child>
<child>Test2</child>
</parent>') xml_column
from dual;

PL/SQL Loop

begin
for children in
(
select child
from table1
cross join xmltable
(
'/parent/child'
passing xml_column
columns child varchar2(100) path '/'
)
) loop
dbms_output.put_line(children.child);
end loop;
end;
/

DBMS_OUTPUT Results

Test
Test1
Test2

How to use SQL variable to iterate XML nodes

Your own code would work with just one more [1]. The function .modify() cannot interpret the [sql:variable(...)]... This might be any filter, even one with more than one result... So just change this to:

DECLARE @xml xml = '<Root>
<Contacts>
<Contact name="John Doe" type="REG" other="value" />
<Contact name="Jane Doe" type="REG" other="value" />
<Contact name="Jennifer Doe" type="REG" other="value" />
<Contact name="Jane Doe" type="REG" other="value" />
</Contacts>
</Root>';
DECLARE @i int = 1
SET @xml.modify('replace value of (/Root/Contacts/Contact)[sql:variable("@i")][1]/@type with "NEW"')
SELECT @xml

But I would go another path... You might read the whole lot as derived table and rebuild the XML like this:

SELECT
(
SELECT c.value('@name','nvarchar(max)') AS [@name]
--,c.value('@type','nvarchar(max)') AS [@type]
,'NEW' AS [@type]
,c.value('@other','nvarchar(max)') AS [@other]
FROM @xml.nodes('/Root/Contacts/Contact') AS A(c)
FOR XML PATH('Contact'),ROOT('Contact'),TYPE
)
FOR XML PATH('Root')

Another approach would be FLWOR:

SELECT @xml.query('
for $r in /Root/Contacts
return <Root><Contacts>
{
for $c in /Root/Contacts/Contact
return <Contact name="{$c/@name}" type="NEW" other="{$c/@other}"/>
}
</Contacts></Root>
')

Shredding XML From Execution Plans

A very straight way could be this (while @x is your XML-execution-plan):

DECLARE @x XML=
N'<root>
<ElementE1 AttributA1="A1-text belongs to E1[1]" OneMore="xyz">E1-Text 2</ElementE1>
<ElementE1 AttributA1="A1-text belongs to E1[2]">E1-Text 2</ElementE1>
<ElementParent>
<subElement test="sub"/>
Free text
</ElementParent>
</root>';

DECLARE @idoc INT;
EXEC sp_xml_preparedocument @idoc OUTPUT, @x;
SELECT * FROM OPENXML (@idoc, '*');
EXEC sp_xml_removedocument @idoc;

The result (not all columns)

+----+----------+----------+--------------+------+--------------------------+
| id | parentid | nodetype | localname | prev | text |
+----+----------+----------+--------------+------+--------------------------+
| 0 | NULL | 1 | root | NULL | NULL |
+----+----------+----------+--------------+------+--------------------------+
| 2 | 0 | 1 | ElementE1 | NULL | NULL |
+----+----------+----------+--------------+------+--------------------------+
| 3 | 2 | 2 | AttributA1 | NULL | NULL |
+----+----------+----------+--------------+------+--------------------------+
| 13 | 3 | 3 | #text | NULL | A1-text belongs to E1[1] |
+----+----------+----------+--------------+------+--------------------------+
| 4 | 2 | 2 | OneMore | NULL | NULL |
+----+----------+----------+--------------+------+--------------------------+
| 14 | 4 | 3 | #text | NULL | xyz |
+----+----------+----------+--------------+------+--------------------------+
| 5 | 2 | 3 | #text | NULL | E1-Text 2 |
+----+----------+----------+--------------+------+--------------------------+
| 6 | 0 | 1 | ElementE1 | 2 | NULL |
+----+----------+----------+--------------+------+--------------------------+
| 7 | 6 | 2 | AttributA1 | NULL | NULL |
+----+----------+----------+--------------+------+--------------------------+
| 15 | 7 | 3 | #text | NULL | A1-text belongs to E1[2] |
+----+----------+----------+--------------+------+--------------------------+
| 8 | 6 | 3 | #text | NULL | E1-Text 2 |
+----+----------+----------+--------------+------+--------------------------+
| 9 | 0 | 1 | ElementParent| 6 | NULL |
+----+----------+----------+--------------+------+--------------------------+
| 10 | 9 | 1 | subElement | NULL | NULL |
+----+----------+----------+--------------+------+--------------------------+
| 11 | 10 | 2 | test | NULL | NULL |
+----+----------+----------+--------------+------+--------------------------+
| 16 | 11 | 3 | #text | NULL | sub |
+----+----------+----------+--------------+------+--------------------------+
| 12 | 9 | 3 | #text | 10 | Free text |
+----+----------+----------+--------------+------+--------------------------+

The id shows clearly, that the algorithm is breadth first, there is no id=1 (why ever) and the nodetype allows to distinguish between elements, attributs and (floating) text. The prev column points to a sibling up in the chain. The missing columns are related to namespaces...

The approach with FROM OPENXML is outdated, but this is one of the rare situations it might still be very usefull...

You get a list with IDs and ParentIDs you might query with an recursive CTE... This depends on what you want to do with this afterwards...

In sql how do you loop through singleton values in an xquery for an xml?

It is relatively simple to do by using XML data type .nodes() and .value() methods.

It is called shredding, i.e. conversion of XML into a rectangular relational structure. No need to do any loops.

SQL

declare @x as xml
set @x = '<root>
<item>
<itemCharacteristicA>this</itemCharacteristicA>
<itemCharacteristicB>part of this</itemCharacteristicB>
</item>
<item>
<itemCharacteristicA>that</itemCharacteristicA>
<itemCharacteristicB>part of that</itemCharacteristicB>
</item>
</root>';

SELECT c.value('(itemCharacteristicA/text())[1]','VARCHAR(30)') AS itemCharacteristicA
, c.value('(itemCharacteristicB/text())[1]','VARCHAR(30)') AS itemCharacteristicB
FROM @x.nodes('/root/item') AS t(c);

Output

+---------------------+---------------------+
| itemCharacteristicA | itemCharacteristicB |
+---------------------+---------------------+
| this | part of this |
| that | part of that |
+---------------------+---------------------+


Related Topics



Leave a reply



Submit