What Is the Fastest Way to Combine Two XML Files into One

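The easiest way to do this is with LINQ to XML. You can use either Union (with a value-based comparer such as XNode.EqualityComparer, because the default comparison is by reference) or Concat, depending on whether duplicates should be removed.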

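// requires: using System.Linq; using System.Xml.Linq;
var xml1 = XDocument.Load("file1.xml");
var xml2 = XDocument.Load("file2.xml");

//Combine and remove duplicates
//(XNode.EqualityComparer compares elements by value; the default comparer
//compares references, so it never treats elements from two files as equal)
var combinedUnique = xml1.Descendants("AllNodes")
    .Union<XElement>(xml2.Descendants("AllNodes"), XNode.EqualityComparer);

//Combine and keep duplicates
var combinedWithDups = xml1.Descendants("AllNodes")
    .Concat(xml2.Descendants("AllNodes"));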
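
One caveat is worth checking before relying on Union: without a comparer it compares XElement instances by reference, so elements loaded from two different files are never dropped as duplicates. The following is a minimal sketch of the difference, not a definitive implementation. The inline documents, the Root wrapper and the id attribute are assumptions made up for illustration; they stand in for file1.xml and file2.xml.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical inputs standing in for file1.xml and file2.xml.
var xml1 = XDocument.Parse("<Root><AllNodes id='1'/><AllNodes id='2'/></Root>");
var xml2 = XDocument.Parse("<Root><AllNodes id='2'/><AllNodes id='3'/></Root>");

// XNode.EqualityComparer compares nodes by value (deep equality).
IEqualityComparer<XElement> byValue = XNode.EqualityComparer;

var first = xml1.Descendants("AllNodes");
var second = xml2.Descendants("AllNodes");

// Default comparer: reference equality, so nothing is removed.
int byReference = first.Union(second).Count();

// Value comparer: the two id='2' elements collapse into one.
int deduplicated = first.Union(second, byValue).Count();

// Wrap the result in a new root so it can be saved as one document.
// XElement copies any node that already has a parent, so the inputs stay intact.
var merged = new XDocument(new XElement("Root", first.Union(second, byValue)));

Console.WriteLine($"{byReference} {deduplicated}"); // prints "4 3"
Console.WriteLine(merged);
```

Wrapping the sequence in a new root element is what turns the query result into a document that can be written out with merged.Save.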

PHP - best way to combine multiple XML files into one, then show as a webpage with XML formatting?

Using DOMDocument rather than SimpleXML allows you to do it very easily...

function mergeFile ( DOMDocument $target, $fileName )    {
    $source = new DOMDocument();
    // Drop the sub-document's own whitespace so the target controls the layout
    $source->preserveWhiteSpace = false;
    $source->load($fileName);

    // Copy the sub-document's root element into the target, then append it
    $import = $target->importNode($source->documentElement, true);
    $target->documentElement->appendChild($import);
}

$target = new DOMDocument();
$target->formatOutput = true;
$target->preserveWhiteSpace = true;
$target->loadXML('<?xml version="1.0" encoding="utf-8"?><animals></animals>');
mergeFile($target, "dog.xml");
mergeFile($target, "cat.xml");
mergeFile($target, "rabbit.xml");

// Re-load the serialized document so the saved file has a consistent layout
$target->loadXML($target->saveXML());
$target->save("animals.xml");

There are a few tweaks in there to ensure the output is formatted correctly. At the end, the document is re-loaded to produce the proper layout. Also, whitespace in the sub-documents is not preserved when they are loaded, so that the main document can handle the indentation.

The output file is...

<?xml version="1.0" encoding="utf-8"?>
<animals>
  <animal>
    <species>dog</species>
    <weight>10</weight>
    <length>2</length>
  </animal>
  <animal>
    <species>cat</species>
    <weight>2.5</weight>
    <length>1</length>
  </animal>
  <animal>
    <species>rabbit</species>
    <weight>0.6</weight>
    <length>0.3</length>
  </animal>
</animals>

How can I merge XML files?

"Automatic XML merge" sounds like a relatively simple requirement, but once you go into the details it gets complex pretty fast. Merging with C# or XSLT is much easier for a more specific task, like the one in the answer for the EF model. Using tools to assist with a manual merge can also be an option (see this SO question).

For reference (and to give an idea of the complexity), here's an open-source example from the Java world: XML merging made easy

Back to the original question. There are a few big gray areas in the task specification: when two elements should be considered equivalent (same name, matching selected or all attributes, or also the same position in the parent element); how to handle the situation when the original or merged XML has multiple equivalent elements; and so on.

The code below assumes that:

  • we only care about elements at the moment
  • elements are equivalent if element names, attribute names, and attribute values match
  • an element doesn't have multiple attributes with the same name
  • all equivalent elements from merged document will be combined with the first equivalent element in the source XML document.

// requires: using System.Linq; using System.Xml.Linq;

// determine which elements we consider the same
//
private static bool AreEquivalent(XElement a, XElement b)
{
    if (a.Name != b.Name) return false;
    if (!a.HasAttributes && !b.HasAttributes) return true;
    if (!a.HasAttributes || !b.HasAttributes) return false;
    if (a.Attributes().Count() != b.Attributes().Count()) return false;

    // every attribute of a must exist in b with the same value
    return a.Attributes().All(attA => b.Attributes(attA.Name)
        .Any(attB => attB.Value == attA.Value));
}

// Merge "merged" document B into "source" A
//
private static void MergeElements(XElement parentA, XElement parentB)
{
    // merge per-element content from parentB into parentA;
    // only direct child elements are compared at each level
    // (Elements() also skips text and comment nodes, which cannot be cast to XElement)
    //
    foreach (XElement childB in parentB.Elements())
    {
        // merge childB with first equivalent childA
        // equivalent childB1, childB2,.. will be combined
        //
        bool isMatchFound = false;
        foreach (XElement childA in parentA.Elements())
        {
            if (AreEquivalent(childA, childB))
            {
                MergeElements(childA, childB);
                isMatchFound = true;
                break;
            }
        }

        // if there is no equivalent childA, add childB into parentA
        // (Add copies childB, because it already has a parent)
        //
        if (!isMatchFound) parentA.Add(childB);
    }
}

It produces the desired result with the original XML snippets, but if the input XML documents are more complex and contain duplicate elements, the result will be more... interesting:

public static void Test()
{
    var a = XDocument.Parse(@"
        <Root>
            <LeafA>
                <Item1 />
                <Item2 />
                <SubLeaf><X/></SubLeaf>
            </LeafA>
            <LeafB>
                <Item1 />
                <Item2 />
            </LeafB>
        </Root>");
    var b = XDocument.Parse(@"
        <Root>
            <LeafB>
                <Item5 />
                <Item1 />
                <Item6 />
            </LeafB>
            <LeafA Name=""X"">
                <Item3 />
            </LeafA>
            <LeafA>
                <Item3 />
            </LeafA>
            <LeafA>
                <SubLeaf><Y/></SubLeaf>
            </LeafA>
        </Root>");

    MergeElements(a.Root, b.Root);
    Console.WriteLine("Merged document:\n{0}", a.Root);
}

Here's the merged document, showing how equivalent elements from document B were combined:

<Root>
  <LeafA>
    <Item1 />
    <Item2 />
    <SubLeaf>
      <X />
      <Y />
    </SubLeaf>
    <Item3 />
  </LeafA>
  <LeafB>
    <Item1 />
    <Item2 />
    <Item5 />
    <Item6 />
  </LeafB>
  <LeafA Name="X">
    <Item3 />
  </LeafA>
</Root>
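
The equivalence rule is the main design choice here, and "matching selected attributes" is one of the alternatives mentioned above. As a rough sketch of that variant only, the following compares the element name plus a caller-supplied list of key attributes. The function name AreEquivalentByKeys and the Color attribute are hypothetical and not part of the original answer.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Hypothetical variant: elements match when their names and the listed key attributes agree.
// A missing attribute converts to null, so two elements that both lack a key still match on it.
static bool AreEquivalentByKeys(XElement a, XElement b, params string[] keys) =>
    a.Name == b.Name &&
    keys.All(k => (string)a.Attribute(k) == (string)b.Attribute(k));

var first = XElement.Parse("<LeafA Name='X' Color='red' />");
var second = XElement.Parse("<LeafA Name='X' Color='blue' />");

Console.WriteLine(AreEquivalentByKeys(first, second, "Name"));          // True
Console.WriteLine(AreEquivalentByKeys(first, second, "Name", "Color")); // False
```

Swapping a rule like this in for AreEquivalent changes which elements get combined, so the merged output above would differ for inputs whose attributes only partly match.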

How to merge xml files into one file with two specific nodes using C#

Try the following, which uses LINQ to XML:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Xml;
using System.Xml.Linq;
using System.IO;

namespace ConsoleApplication1
{
    class Program
    {
        const string FOLDER = @"c:\temp\test\";
        static void Main(string[] args)
        {
            //merged elements
            XElement newFilters = new XElement("filters");

            foreach (string filename in Directory.GetFiles(FOLDER, "*.xml"))
            {
                XDocument doc = XDocument.Load(filename);
                XElement filters = doc.Descendants("filters").FirstOrDefault();
                //skip files that have no <filters> element
                if (filters == null) continue;
                string folderName = ((string)filters.Element("folder").Element("text")).Trim();

                //look for an already merged folder with the same name
                XElement newFolder = newFilters.Elements("folder").Where(x => ((string)x.Element("text")).Trim() == folderName).FirstOrDefault();
                if (newFolder == null)
                {
                    //first time this folder is seen: copy all children
                    newFilters.Add(filters.Elements());
                }
                else
                {
                    //folder already exists: append only the filter elements
                    List<XElement> xFilters = filters.Descendants("filter").ToList();
                    newFolder.Add(xFilters);
                }
            }

            //newFilters now holds the merged tree; print it (or write it out with newFilters.Save)
            Console.WriteLine(newFilters);
        }
    }
}

Smart merging of two XML files

You can use XmlDocument: open both sources, iterate through the nodes, and merge them into a new XmlDocument.

Also, with XmlDocument you can use LINQ to test for collisions, which simplifies the task.

// requires: using System.Collections.Generic; using System.Linq; using System.Xml;

XmlDocument MergeDocs(string SourceA, string SourceB)
{
    XmlDocument docA = new XmlDocument();
    XmlDocument docB = new XmlDocument();
    XmlDocument merged = new XmlDocument();

    docA.LoadXml(SourceA);
    docB.LoadXml(SourceB);

    // Only the root elements are merged; the XML declaration and other
    // document-level nodes are skipped. Both roots must have the same name,
    // because a document can hold only one root element.
    MergeChildren(docA.ChildNodes.OfType<XmlElement>(),
                  docB.ChildNodes.OfType<XmlElement>(), merged, merged);

    return merged;
}

XmlNode MergeNodes(XmlElement A, XmlElement B, XmlDocument TargetDoc)
{
    var merged = TargetDoc.CreateElement(A.Prefix, A.LocalName, A.NamespaceURI);

    // Copy all attributes from A, then the attributes of B that A lacks.
    // ImportNode keeps the attribute values; CreateAttribute alone would drop them.
    var fromA = A.Attributes.Cast<XmlAttribute>().ToList();
    var toAdd = B.Attributes.Cast<XmlAttribute>()
        .Where(attr => !fromA.Any(ata => ata.Name == attr.Name));

    foreach (var attrib in fromA.Concat(toAdd))
        merged.Attributes.Append((XmlAttribute)TargetDoc.ImportNode(attrib, true));

    MergeChildren(A.ChildNodes.Cast<XmlNode>(), B.ChildNodes.Cast<XmlNode>(), merged, TargetDoc);

    return merged;
}

// Shared by MergeDocs and MergeNodes: copies nodes whose name occurs on one side only,
// then merges the nodes whose name occurs on both sides.
void MergeChildren(IEnumerable<XmlNode> childsFromA, IEnumerable<XmlNode> childsFromB,
                   XmlNode Target, XmlDocument TargetDoc)
{
    var uniquesFromA = childsFromA.Where(ch => !childsFromB.Any(chb => chb.Name == ch.Name));
    var uniquesFromB = childsFromB.Where(ch => !childsFromA.Any(cha => cha.Name == ch.Name));

    foreach (var unique in uniquesFromA)
        Target.AppendChild(DeepCloneToDoc(unique, TargetDoc));

    // the second loop must walk B's unique nodes, not A's again
    foreach (var unique in uniquesFromB)
        Target.AppendChild(DeepCloneToDoc(unique, TargetDoc));

    // every A/B pair with the same name is merged, so repeated names
    // produce one merged node per pair
    var Duplicates = from chA in childsFromA
                     from chB in childsFromB
                     where chA.Name == chB.Name
                     select new { A = chA, B = chB };

    // elements are merged recursively; for text, comments and other
    // non-element nodes the copy from A is kept
    foreach (var grp in Duplicates)
        Target.AppendChild(grp.A is XmlElement elemA && grp.B is XmlElement elemB
            ? MergeNodes(elemA, elemB, TargetDoc)
            : DeepCloneToDoc(grp.A, TargetDoc));
}

XmlNode DeepCloneToDoc(XmlNode NodeToClone, XmlDocument TargetDoc)
{
    // ImportNode copies the node, its attributes (with values) and, because
    // deep is true, its whole subtree into the target document
    return TargetDoc.ImportNode(NodeToClone, true);
}

Note: I haven't tested this; it was written from memory, but it should give you an idea of how to proceed.
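
For the simpler case where the content of B only needs to be appended under the root of A, with no collision handling, XmlDocument.ImportNode may be enough as a baseline to compare against. This is a minimal sketch under that assumption. The inline XML strings and their element names are made up for illustration.

```csharp
using System;
using System.Xml;

// Hypothetical inputs; both documents share the root element name.
var docA = new XmlDocument();
var docB = new XmlDocument();
docA.LoadXml("<Root><A x='1'>one</A></Root>");
docB.LoadXml("<Root><B y='2'><C/></B></Root>");

// A node belongs to the document that created it, so it has to be imported
// (deep copy, including attributes and children) before it can be appended elsewhere.
foreach (XmlNode child in docB.DocumentElement.ChildNodes)
{
    XmlNode copy = docA.ImportNode(child, true);
    docA.DocumentElement.AppendChild(copy);
}

Console.WriteLine(docA.OuterXml);
```

Because ImportNode returns a copy, docB is left unchanged and can be reused or discarded afterwards.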


