Merging 2 similar XML Files

**gothic_type** · April 18th, 2008, 01:29 AM

The first thing to do is understand the structure of the DataSets created from reading the XML files (i.e. how the XML gets arranged into tables, columns and rows).

Taking the example you posted earlier (since I'm a bit unsure of your docx example):

Code:

<?xml version="1.0"?>
<catalog>
   <book id="bk101">
      <author>Gambardella, Matthew</author>
      <title>XML Developer's Guide</title>
      <genre>Computer</genre>
      <price>44.95</price>
   </book>
   <book id="bk102">
      <author>Randall, Cynthia</author>
      <title>Lover Birds</title>
      <genre>Romance</genre>
      <price>4.95</price>
   </book>
   <book id="bk103">
      <author>Vinzovskaia, Irina</author>
      <title>Piano Fort A</title>
      <genre>Romance</genre>
      <price>4.95</price>
   </book>
</catalog>

I would assume that reading this in will give you a DataSet with a "book" table consisting of 3 rows and one column per child element (author, title, genre, price). If you load it with DataSet.ReadXml, the "id" attribute should also show up as a fifth column, since attributes are inferred as columns. Confirm that first, either by reading the API documentation for whatever creates the DataSet or by inspecting it in the debugger.
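If you'd rather check the structure in code than in the debugger, something like the following would do it. This is only a rough sketch: it assumes the file is loaded with DataSet.ReadXml and schema inference, and DataSetInspector, Load and Describe are names I made up for the example.

```csharp
using System;
using System.Data;
using System.IO;

// Hypothetical helper, not part of .NET: loads XML and lists what ReadXml inferred.
public static class DataSetInspector
{
    // Reads XML text into a new DataSet, letting ReadXml infer tables and columns.
    public static DataSet Load(string xml)
    {
        DataSet ds = new DataSet();
        using (StringReader reader = new StringReader(xml))
        {
            ds.ReadXml(reader);
        }
        return ds;
    }

    // Prints each table with its row count, then each column with its data type.
    public static void Describe(DataSet ds)
    {
        foreach (DataTable table in ds.Tables)
        {
            Console.WriteLine("Table '{0}': {1} rows", table.TableName, table.Rows.Count);
            foreach (DataColumn column in table.Columns)
            {
                Console.WriteLine("  column '{0}' ({1})", column.ColumnName, column.DataType.Name);
            }
        }
    }
}
```

Calling DataSetInspector.Describe(DataSetInspector.Load(File.ReadAllText(path))) on each of your files (path being wherever the file lives) lets you compare the two layouts before writing any merge code.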

Once you've done that, you need to identify something that is unique to each of your entries, but that will be the same in both files that you are merging. In the example above, the "id" attribute fits: it is unique within a file, and the same book carries the same id in both files. In the docx example, it might be "Heading", "Sub Heading", etc (I don't know).
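A quick way to test whether a candidate column can serve as that identifier is to check that it exists and has no repeated or missing values. A minimal sketch, with KeyColumnCheck and IsUsableKey being hypothetical names for illustration:

```csharp
using System;
using System.Collections.Generic;
using System.Data;

// Hypothetical helper, not part of .NET.
public static class KeyColumnCheck
{
    // True when the column exists in the table, has no missing values,
    // and no two rows share the same value.
    public static bool IsUsableKey(DataTable table, string columnName)
    {
        if (!table.Columns.Contains(columnName))
        {
            return false;
        }

        HashSet<object> seen = new HashSet<object>();
        foreach (DataRow row in table.Rows)
        {
            object value = row[columnName];
            // A missing value cannot identify a row; a repeat means the column is not unique.
            if (value is DBNull || !seen.Add(value))
            {
                return false;
            }
        }
        return true;
    }
}
```

You would run it against the same table in both DataSets; a column only works as the merge key if it passes in both.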

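Now that you have all this information, you have a number of choices. Two that I can think of are subclassing DataSet and adding your own merge method (the built-in DataSet.Merge is not virtual, so it can't be overridden), or writing a utility method that takes two DataSets as arguments and returns a merged DataSet.

Code:

public class MyDataSet : DataSet
{
  // ...
  // DataSet.Merge is not virtual and returns void, so the custom merge
  // gets its own name (MergeWith is just a placeholder name).
  public DataSet MergeWith(DataSet ds)
  {
    // do merging
    return this;
  }
}

// or

public static class DataSetUtils
{
  // Members of a static class must themselves be static.
  public static DataSet Merge(DataSet ds1, DataSet ds2)
  {
    DataSet retDs = ...
    // do merging

    return retDs;
  }
}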
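One way the utility-method route could be filled in is to lean on the built-in DataSet.Merge, which matches rows on the primary key when the tables have one. This is a sketch under assumptions rather than the definitive way to do it: it assumes both DataSets contain the named table with the same columns, that the key column has no missing or duplicate values, and that on a key clash you want the row from the second DataSet to win. DataSetMergeSketch, MergeByKey and Prepare are made-up names.

```csharp
using System.Data;

// Hypothetical helper, not part of .NET.
public static class DataSetMergeSketch
{
    // Returns a new DataSet holding the rows of both inputs. Rows whose key
    // appears in both inputs end up with the values from ds2.
    public static DataSet MergeByKey(DataSet ds1, DataSet ds2, string tableName, string keyColumn)
    {
        // Work on copies so the callers' DataSets are left untouched.
        DataSet result = ds1.Copy();
        DataSet incoming = ds2.Copy();

        Prepare(result, tableName, keyColumn);
        Prepare(incoming, tableName, keyColumn);

        // With a primary key on both tables, Merge updates matching rows
        // and appends the ones that only exist in the incoming DataSet.
        result.Merge(incoming);
        return result;
    }

    // Sets the primary key and normalizes row states before merging.
    private static void Prepare(DataSet ds, string tableName, string keyColumn)
    {
        DataTable table = ds.Tables[tableName];
        table.PrimaryKey = new DataColumn[] { table.Columns[keyColumn] };
        ds.AcceptChanges();
    }
}
```

For the catalog example the call would be DataSetMergeSketch.MergeByKey(ds1, ds2, "book", "id"). If you need different rules for clashes (say, keeping the first file's values), you would have to loop over the rows yourself instead of calling Merge.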

You also probably want to make sure that the rows in each table in each DataSet are sorted in the same order by whichever column is unique. That way you can more or less walk both tables from row 0 to the end of the larger one, comparing keys as you go, without having to do anything fancy.
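To make the sorted walk concrete, here is a rough sketch that sorts both tables by the key and then steps through them side by side, reporting which keys are in both tables and which are only in one. It assumes the key column holds non-null strings (as it does when the data comes from inferred XML), and SortedWalk, Compare and SortedRows are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

// Hypothetical helper, not part of .NET.
public static class SortedWalk
{
    // Walks two tables in key order and reports where each key was found.
    public static List<string> Compare(DataTable first, DataTable second, string keyColumn)
    {
        DataRow[] left = SortedRows(first, keyColumn);
        DataRow[] right = SortedRows(second, keyColumn);
        List<string> report = new List<string>();

        int i = 0, j = 0;
        while (i < left.Length && j < right.Length)
        {
            string leftKey = (string)left[i][keyColumn];
            string rightKey = (string)right[j][keyColumn];
            int cmp = string.CompareOrdinal(leftKey, rightKey);

            if (cmp == 0) { report.Add("both: " + leftKey); i++; j++; }
            else if (cmp < 0) { report.Add("first only: " + leftKey); i++; }
            else { report.Add("second only: " + rightKey); j++; }
        }

        // Whatever is left over exists in only one of the tables.
        for (; i < left.Length; i++) report.Add("first only: " + (string)left[i][keyColumn]);
        for (; j < right.Length; j++) report.Add("second only: " + (string)right[j][keyColumn]);
        return report;
    }

    // Copies the rows into an array sorted by key, using the same ordinal
    // comparison as the walk so both sides agree on the order.
    private static DataRow[] SortedRows(DataTable table, string keyColumn)
    {
        DataRow[] rows = table.Select();
        Array.Sort(rows, (x, y) => string.CompareOrdinal((string)x[keyColumn], (string)y[keyColumn]));
        return rows;
    }
}
```

In a real merge you would replace the report.Add calls with whatever should happen in each case (copy the row, combine the values, skip it).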

There are probably much better ways of doing it, but you'd have to figure that out yourself. One possibility is, instead of using a DataSet, to work on the XML directly using the XML support in .NET.
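For the work-on-the-XML-directly option, a sketch using LINQ to XML (System.Xml.Linq, so .NET 3.5 or later) might look like this. It assumes the catalog layout from the example above, with "book" elements directly under the root and an "id" attribute as the key, and that a book from the second file replaces one with the same id from the first. XmlMergeSketch and MergeCatalogs are made-up names.

```csharp
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper, not part of .NET.
public static class XmlMergeSketch
{
    // Returns a new document containing the books of both inputs. A book from
    // the second document replaces a book with the same id from the first.
    public static XDocument MergeCatalogs(XDocument first, XDocument second)
    {
        // Copy the first document so neither input is modified.
        XDocument merged = new XDocument(first);

        foreach (XElement book in second.Root.Elements("book"))
        {
            string id = (string)book.Attribute("id");
            XElement existing = merged.Root.Elements("book")
                .FirstOrDefault(b => (string)b.Attribute("id") == id);

            if (existing != null)
            {
                existing.ReplaceWith(new XElement(book)); // same id: take the second file's version
            }
            else
            {
                merged.Root.Add(new XElement(book)); // new id: append a copy
            }
        }
        return merged;
    }
}
```

You would load each file with XDocument.Load, pass both to MergeCatalogs, and write the result out with Save. The lookup scans the merged books for every incoming book, which is fine for small files but would want a dictionary keyed on id for large ones.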
