Importing Xml into SQL Server

Import 'xml' into Sql Server

Try this:

DECLARE @XML XML = '<EventSchedule>
<Event Uid="2" Type="Main Event">
<IsFixed>True</IsFixed>
<EventKind>MainEvent</EventKind>
<Fields>
<Parameter Name="Type" Value="TV_Show"/>
<Parameter Name="Name" Value="The Muppets"/>
<Parameter Name="Duration" Value="00:30:00"/>
</Fields>
</Event>
<Event Uid="3" Type="Secondary Event">
<IsFixed>True</IsFixed>
<EventKind>SecondaryEvent</EventKind>
<Fields>
<Parameter Name="Type" Value="TV_Show"/>
<Parameter Name="Name" Value="The Muppets II"/>
<Parameter Name="Duration" Value="00:30:00"/>
</Fields>
</Event>
</EventSchedule>'

SELECT
EventUID = Events.value('@Uid', 'int'),
EventType = Events.value('@Type', 'varchar(20)'),
EventIsFixed =Events.value('(IsFixed)[1]', 'varchar(20)'),
EventKind =Events.value('(EventKind)[1]', 'varchar(20)')
FROM
@XML.nodes('/EventSchedule/Event') AS XTbl(Events)

Gives me an output of:

Sample Image

And of course, you can easily do an

INSERT INTO dbo.YourTable(EventUID, EventType, EventIsFixed, EventKind)
SELECT
......

to insert that data into a relational table.

Update: assuming you have your XML in files - you can use this code to load the XML file into an XML variable in SQL Server:

DECLARE @XmlFile XML

SELECT @XmlFile = BulkColumn
FROM OPENROWSET(BULK 'path-to-your-XML-file', SINGLE_BLOB) x;

and then use the above code snippet to parse the XML.

Update #2: if you need the parameters, too - use this XQuery statement:

SELECT
EventUID = Events.value('@Uid', 'int'),
EventType = Events.value('@Type', 'varchar(20)'),
EventIsFixed = Events.value('(IsFixed)[1]', 'varchar(20)'),
EventKind = Events.value('(EventKind)[1]', 'varchar(20)'),
ParameterType = Events.value('(Fields/Parameter[@Name="Type"]/@Value)[1]', 'varchar(20)'),
ParameterName = Events.value('(Fields/Parameter[@Name="Name"]/@Value)[1]', 'varchar(20)'),
ParameterDuration = Events.value('(Fields/Parameter[@Name="Duration"]/@Value)[1]', 'varchar(20)')
FROM
@XML.nodes('/EventSchedule/Event') AS XTbl(Events)

Results in:

Sample Image

Fastest method to import large XML into SQL-Server table

Dealing with NULL values is something special in XML.

The definition of a NULL value in XML is not existing. So

<a>
<b>hi</b>
<c></c>
<d/>
</a>
  • <a> is the root elelement.
  • <b> is an element with a text() node.
  • <c> is an empty element
  • <d> is a self-closing element
  • <e> is - uhm - not there...

Important hint: <c> and <d> are the same, absolutely no difference!

You can query for the element with

.value('(/a/b)[1]','nvarchar(100)')

And you can query for the the text() node specifically

.value('(/a/b/text())[1]','nvarchar(100)')

In this you find a possible answer (a bit hidden): You can do all your code without the NULL checking predicate, if you query for the text() node specifically.

Change this

ref.value('record[1][not(@xs:nil = "true")]' ,'varchar(100)')

to this

ref.value('(record[1]/text())[1]' ,'varchar(100)')

What might break this: If a <record>'s content may be an empty string you would get a NULL back and not ''. But it should be much faster... Hope, this is okay for you...

About performance: Read this answer. It covers your issue quite well. Especially the part where the time is consumed (follow the links in this answer).

Import Multi-level (element) XML file into MS SQL Server Table

You can build an insert statement around the SELECT below. There may be a better way to do this, but this can at least move you forward a bit.

declare @x as xml;

set @x = '<?xml version="1.0" encoding="UTF-8"?>
<TagValidationList>
<TVLHeader>
<SubmissionType>STVL</SubmissionType>
<SubmissionDateTime>2000-01-00T00:00:01Z</SubmissionDateTime>
<SSIOPHubID>0001</SSIOPHubID>
<HomeAgencyID>1002</HomeAgencyID>
<BulkIndicator>B</BulkIndicator>
<BulkIdentifier>100</BulkIdentifier>
<RecordCount>3</RecordCount>
</TVLHeader>
<TVLDetail>
<TVLTagDetails>
<HomeAgencyID>1234</HomeAgencyID>
<TagAgencyID>1100</TagAgencyID>
<TagSerialNumber>00123456</TagSerialNumber>
<TagStatus>X</TagStatus>
<TagClass>1</TagClass>
<TVLAccountDetails/>
</TVLTagDetails>
<TVLTagDetails>
<HomeAgencyID>2234</HomeAgencyID>
<TagAgencyID>1200</TagAgencyID>
<TagSerialNumber>00223456</TagSerialNumber>
<TagStatus>Y</TagStatus>
<TagClass>2</TagClass>
<TVLPlateDetails>
<PlateCountry>US</PlateCountry>
<PlateState>TX</PlateState>
<PlateNumber>123ABC</PlateNumber>
<PlateEffectiveFrom>2008-03-12T06:00:00Z</PlateEffectiveFrom>
</TVLPlateDetails>
<TVLAccountDetails/>
</TVLTagDetails>
<TVLTagDetails>
<HomeAgencyID>3234</HomeAgencyID>
<TagAgencyID>1300</TagAgencyID>
<TagSerialNumber>12345678</TagSerialNumber>
<TagStatus>Z</TagStatus>
<TagClass>3</TagClass>
<TVLPlateDetails>
<PlateCountry>US</PlateCountry>
<PlateState>OK</PlateState>
<PlateNumber>ABC321</PlateNumber>
</TVLPlateDetails>
<TVLAccountDetails>
<AccountNumber>654321</AccountNumber>
</TVLAccountDetails>
</TVLTagDetails>
</TVLDetail>
</TagValidationList>';

SELECT
t.x.value('HomeAgencyID[1]', 'varchar(4)') as HomeAgencyID,
t.x.value('TagAgencyID[1]', 'varchar(4)') as TagAgencyID,
t.x.value('TagSerialNumber[1]', 'varchar(8)') as TagSerialNumber,
t.x.value('TagStatus[1]', 'varchar(4)') as TagStatus,
t.x.value('TagClass[1]', 'varchar(4)') as TagClass,
t.x.value('(TVLPlateDetails/PlateCountry)[1]', 'varchar(4)') as PlateCountry,
t.x.value('(TVLPlateDetails/PlateState)[1]', 'varchar(4)') as PlateState,
t.x.value('(TVLPlateDetails/PlateNumber)[1]', 'varchar(12)') as PlateNumber,
t.x.value('(TVLAccountDetails/AccountNumber)[1]', 'varchar(12)') as AccountNumber

FROM @x.nodes('/TagValidationList/TVLDetail/TVLTagDetails') t(x)

The output is close to what you're looking for. You can ISNULL-'' some of the outputs if you need to.

HomeAgencyID TagAgencyID TagSerialNumber TagStatus TagClass PlateCountry PlateState PlateNumber  AccountNumber
------------ ----------- --------------- --------- -------- ------------ ---------- ------------ -------------
1234 1100 00123456 X 1 NULL NULL NULL NULL
2234 1200 00223456 Y 2 US TX 123ABC NULL
3234 1300 12345678 Z 3 US OK ABC321 654321

SQL Server: how import XML with node's fields

You can also do it with native XQuery support in SQL Server, without having to resort to the legacy OPENXML calls which are known for memory leaks and other problem.

Just use this code instead:

DECLARE @XML AS XML

SELECT @XML = XMLData FROM XMLwithOpenXML

SELECT
id = xc.value('@id', 'int'),
lon = xc.value('@lon', 'decimal(20,8)'),
lat = xc.value('@lat', 'decimal(20,8)'),
name = xc.value('(tag[@k="name"]/@v)[1]', 'varchar(50)'),
website = xc.value('(tag[@k="website"]/@v)[1]', 'varchar(50)')
FROM
@XML.nodes('/osm/node') AS XT(XC)

And I also recommend to always use the most appropriate datatype - here, the id looks like an INT - so use it as such - and lat and lon are clearly DECIMAL values ; don't just convert everything to strings because you're too lazy to figure out what to use!

Importing XML into SQL Server but trying to make multiple entries if multiple results exist for a child element

This should get you pretty far. It's completely untested, so please read the code, understand it, and make the appropriate changes to get it to work.

I've removed the function and inlined all the code into the loop instead, the function was too bulky for my taste. Now you should be able to see more clearly what's going on.

Effectively it's the exact same code two times, with a small extra step that adds self-references so you can query every product via its primary ID and and its secondary IDs in the same way, as discussed in the comments.

$connectionString = "Data Source=Apps2\Apps2;Initial Catalog=ICECAT;Integrated Security=SSPI"
$batchSize = 50000

# set up [files_index] datatable & read schema from DB
$files_index_table = New-Object System.Data.DataTable
$files_index_adapter = New-Object System.Data.SqlClient.SqlDataAdapter("SELECT * FROM files_index WHERE 0 = 1", $connectionString)
$files_index_adapter.Fill($files_index_table) | Out-Null
$files_index_bcp = New-Object SqlBulkCopy($connectionString)
$files_index_bcp.DestinationTableName = "dbo.files_index"
$files_index_count = 0

# set up [product_ids] datatable & read schema from DB
$product_ids_table = New-Object System.Data.DataTable
$product_ids_adapter = New-Object System.Data.SqlClient.SqlDataAdapter("SELECT * FROM product_ids WHERE 0 = 1", $connectionString)
$product_ids_adapter.Fill($product_ids_table) | Out-Null
$product_ids_bcp = New-Object System.Data.SqlClient.SqlBulkCopy($connectionString)
$product_ids_bcp.DestinationTableName = "dbo.product_ids"
$product_ids_count = 0

# main import loop
$xmlReader = New-Object System.Xml.XmlTextReader("C:\Scripts\icecat\files.index.xml")
while ($xmlReader.Read()) {
# skip any XML nodes that aren't elements
if ($xmlReader.NodeType -ne [System.Xml.XmlNodeType]::Element) { continue }

# handle <file> elements
if ($xmlReader.Name -eq "file") {
$files_index_count++

# remember current product ID, we'll need it when we hit the next <M_Prod_ID> element
$curr_product_id = $xmlReader.GetAttribute("Product_ID")
$is_new_file = $true

$newRow = $files_index_table.NewRow()
$newRow["Product_ID"] = $xmlReader.GetAttribute("Product_ID")
$newRow["path"] = $xmlReader.GetAttribute("path")
$newRow["Updated"] = $xmlReader.GetAttribute("Updated")
$newRow["Quality"] = $xmlReader.GetAttribute("Quality")
$newRow["Supplier_id"] = $xmlReader.GetAttribute("Supplier_id")
$newRow["Prod_ID"] = $xmlReader.GetAttribute("Prod_ID")
$newRow["Catid"] = $xmlReader.GetAttribute("Catid")
$newRow["On_Market"] = $xmlReader.GetAttribute("On_Market")
$newRow["Model_Name"] = $xmlReader.GetAttribute("Model_Name")
$newRow["Product_View"] = $xmlReader.GetAttribute("Product_View")
$newRow["HighPic"] = $xmlReader.GetAttribute("HighPic")
$newRow["HighPicSize"] = $xmlReader.GetAttribute("HighPicSize")
$newRow["HighPicWifiles_index_tableh"] = $xmlReader.GetAttribute("HighPicWifiles_index_tableh")
$newRow["HighPicHeight"] = $xmlReader.GetAttribute("HighPicHeight")
$newRow["Date_Added"] = $xmlReader.GetAttribute("Date_Added")
$files_index_table.Rows.Add($newRow) | Out-Null

if ($files_index_table.Rows.Count -eq $batchSize) {
$files_index_bcp.WriteToServer($files_index_table)
$files_index_table.Rows.Clear()
Write-Host "$files_index_count <file> elements processed so far"
}
# handle <M_Prod_ID> elements
} elseif ($xmlReader.Name -eq "M_Prod_ID") {
$product_ids_count++

# add self-reference row to the [product_ids] table
# only for the first <M_Prod_ID> per <file> we need to do this
if ($is_new_file) {
$newRow = $product_ids_table.NewRow()
$newRow["Product_ID"] = $curr_product_id # from above
$newRow["Alternative_ID"] = $curr_product_id
$product_ids_table.Rows.Add($newRow) | Out-Null
$is_new_file = $false
}

$newRow = $product_ids_table.NewRow()
$newRow["Product_ID"] = $curr_product_id # from above
$newRow["Alternative_ID"] = $xmlReader.Value
$product_ids_table.Rows.Add($newRow) | Out-Null

if ($product_ids_table.Rows.Count -eq $batchSize) {
$product_ids_bcp.WriteToServer($files_index_table)
$product_ids_table.Rows.Clear()
Write-Host "$product_ids_count <M_Prod_ID> elements processed so far"
}
}
}

# write any remaining rows to the server
if ($files_index_table.Rows.Count -gt 0) {
$files_index_bcp.WriteToServer($files_index_table)
$files_index_table.Rows.Clear()
}
Write-Host "$files_index_count <file> elements processed overall"

if ($product_ids_table.Rows.Count -gt 0) {
$product_ids_bcp.WriteToServer($product_ids_table)
$product_ids_table.Rows.Clear()
}
Write-Host "$product_ids_count <M_Prod_ID> elements processed overall"

Method for importing System.Data.Dataset XML file into SQL Server using T-SQL

Please try the following solution.

It is using T-SQL and XQuery methods .nodes() and .value().

I saved your XML as 'e:\Temp\NewDataSet.xml' file.

SQL Server XML data type can hold up to 2GB size wise.

If the performance of the suggested method is not that good, depending on the volume of the data, it is possible to load the entire XML file into a temporary table with one row and one column.

SQL

DECLARE @tbl1 TABLE (ID INT IDENTITY PRIMARY KEY,  col1 VARCHAR(50), col2 VARCHAR(50));
DECLARE @tbl2 TABLE (ID INT IDENTITY PRIMARY KEY, col1 VARCHAR(50), col2 VARCHAR(50));

DECLARE @xml XML;

SELECT @xml = XmlDoc
FROM OPENROWSET (BULK N'e:\Temp\NewDataSet.xml', SINGLE_BLOB, CODEPAGE='65001') AS Tab(XmlDoc);

INSERT INTO @tbl1 (col1, col2)
SELECT c.value('(col1/text())[1]', 'VARCHAR(50)') AS col1
, c.value('(col2/text())[1]','VARCHAR(50)') AS col2
FROM @xml.nodes('/NewDataSet/table1') AS t(c);

INSERT INTO @tbl2 (col1, col2)
SELECT c.value('(col1/text())[1]', 'VARCHAR(50)') AS col1
, c.value('(col2/text())[1]','VARCHAR(50)') AS col2
FROM @xml.nodes('/NewDataSet/table2') AS t(c);

-- test
SELECT * FROM @tbl1;
SELECT * FROM @tbl2;

Output

Table1

+----+--------+--------+
| ID | col1 | col2 |
+----+--------+--------+
| 1 | tkshrq | 6krrtq |
| 2 | k60stu | sqnhp9 |
+----+--------+--------+

Table2

+----+--------+--------+
| ID | col1 | col2 |
+----+--------+--------+
| 1 | 6k1thw | n2ocgz |
| 2 | 26kmw5 | ym3iwd |
+----+--------+--------+


Related Topics



Leave a reply



Submit