Compare two DataTables to determine rows in one but not the other
Would I have to iterate through each row of each DataTable to check whether they are the same?
Seeing as you've loaded the data from a CSV file, you're not going to have any indexes or anything, so at some point, something is going to have to iterate through every row, whether it be your code, or a library, or whatever.
Anyway, this is an algorithms question, which is not my specialty, but my naive approach would be as follows:
1: Can you exploit any properties of the data? Are all the rows in each table unique, and can you sort them both by the same criteria? If so, you can do this:
- Sort both tables by their ID (using something like a quicksort). If they're already sorted then you win big.
- Step through both tables at once, skipping over any gaps in IDs in either table. Matching IDs mean the record exists in both tables; an ID found in only one table is a row the other one lacks.
This allows you to do it in two sorts plus one pass, so if my big-O notation is correct, it'd be O(m log m + n log n) for the sorts plus O(m + n) for the pass, which is pretty good.
(Revision: this is the approach that ΤΖΩΤΖΙΟΥ describes )
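A minimal sketch of the sort-and-step approach, assuming both tables have a unique integer column (the name "id" and the helper class SortedMergeDiff are made up for illustration):

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

public static class SortedMergeDiff
{
    // Returns the IDs found only in 'left' and only in 'right'.
    // Assumes idColumn holds unique int values in both tables.
    public static (List<int> OnlyInLeft, List<int> OnlyInRight) Diff(
        DataTable left, DataTable right, string idColumn = "id")
    {
        // Sort both ID sequences: O(m log m + n log n).
        int[] a = left.AsEnumerable().Select(r => r.Field<int>(idColumn)).OrderBy(x => x).ToArray();
        int[] b = right.AsEnumerable().Select(r => r.Field<int>(idColumn)).OrderBy(x => x).ToArray();

        var onlyInLeft = new List<int>();
        var onlyInRight = new List<int>();
        int i = 0, j = 0;

        // One merge-style pass over both arrays: O(m + n).
        while (i < a.Length && j < b.Length)
        {
            if (a[i] == b[j]) { i++; j++; }                 // ID present in both tables
            else if (a[i] < b[j]) onlyInLeft.Add(a[i++]);   // gap in the right table
            else onlyInRight.Add(b[j++]);                   // gap in the left table
        }

        // Whatever remains in either array has no partner.
        while (i < a.Length) onlyInLeft.Add(a[i++]);
        while (j < b.Length) onlyInRight.Add(b[j++]);

        return (onlyInLeft, onlyInRight);
    }
}
```

If the tables are already sorted by ID, the two OrderBy calls can be dropped and only the single pass remains.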
2: An alternative approach, which may be more or less efficient depending on how big your data is:
- Run through table 1, and for each row, stick its ID (or computed hashcode, or some other unique ID for that row) into a dictionary (or hashtable if you prefer to call it that).
- Run through table 2, and for each row, see if the ID (or hashcode etc.) is present in the dictionary. You're exploiting the fact that dictionaries have really fast lookup: O(1) on average. This step will be really fast, but you'll have paid the price of doing all those dictionary inserts, which is O(m) time and memory for table 1.
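The two steps above could look roughly like this. It is only a sketch: a HashSet stands in for the dictionary because only the keys are needed, and the column name "id" and the class HashLookupDiff are assumptions for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

public static class HashLookupDiff
{
    // Returns the rows of table2 whose ID does not occur in table1.
    // Assumes idColumn is an int column present in both tables.
    public static List<DataRow> RowsNotInFirst(
        DataTable table1, DataTable table2, string idColumn = "id")
    {
        // Step 1: one pass over table1, inserting every ID into the set.
        var seen = new HashSet<int>(
            table1.AsEnumerable().Select(r => r.Field<int>(idColumn)));

        // Step 2: one pass over table2 with an average O(1) lookup per row.
        return table2.AsEnumerable()
                     .Where(r => !seen.Contains(r.Field<int>(idColumn)))
                     .ToList();
    }
}
```

Swapping the two arguments gives the rows of table 1 that are missing from table 2.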
I'd be really interested to see what people with better knowledge of algorithms than myself come up with for this one :-)
Compare two DataTables and select the rows that are not present in second table
You can use LINQ, especially Enumerable.Except, which helps to find the ids in TableA that are not in TableB:
var idsNotInB = TableA.AsEnumerable().Select(r => r.Field<int>("id"))
    .Except(TableB.AsEnumerable().Select(r => r.Field<int>("id")));
DataTable TableC = (from row in TableA.AsEnumerable()
                    join id in idsNotInB
                    on row.Field<int>("id") equals id
                    select row).CopyToDataTable();
You can also use Where, but it'll be less efficient, because it scans TableB again for every row of TableA:
DataTable TableC = TableA.AsEnumerable()
    .Where(ra => !TableB.AsEnumerable()
        .Any(rb => rb.Field<int>("id") == ra.Field<int>("id")))
    .CopyToDataTable();
Comparing two datatables in C# and finding new, matching and non-matching records
You can try the LINQ methods that are available for Enumerable types, such as Intersect and Except. Here is an example of doing this.
// Get matching rows from the two tables.
// DataRowComparer.Default compares rows by column values; without it, rows are compared by reference.
IEnumerable<DataRow> matchingRows = table1.AsEnumerable().Intersect(table2.AsEnumerable(), DataRowComparer.Default);
// Get rows that are present in table2 but not in table1
IEnumerable<DataRow> rowsNotInTable1 = table2.AsEnumerable().Except(table1.AsEnumerable(), DataRowComparer.Default);
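One thing worth checking with Intersect and Except on DataRow: DataRow does not override Equals, so without a comparer the rows are compared by reference, and rows from two different tables never match. The sketch below contrasts that with DataRowComparer.Default, which compares column values. The class RowSetOps and its MakeTable helper are made up for illustration.

```csharp
using System;
using System.Data;
using System.Linq;

public static class RowSetOps
{
    // Hypothetical helper: builds a one-column table from the given ids.
    public static DataTable MakeTable(params int[] ids)
    {
        var table = new DataTable();
        table.Columns.Add("id", typeof(int));
        foreach (int id in ids) table.Rows.Add(id);
        return table;
    }

    // Default equality: two DataRow objects are equal only if they are the same instance.
    public static int CountMatchesByReference(DataTable a, DataTable b) =>
        a.AsEnumerable().Intersect(b.AsEnumerable()).Count();

    // Value equality: rows match when their column values are equal.
    public static int CountMatchesByValue(DataTable a, DataTable b) =>
        a.AsEnumerable().Intersect(b.AsEnumerable(), DataRowComparer.Default).Count();
}
```

For two tables holding ids 1, 2, 3 and 2, 3, 4, the reference-based count is 0 while the value-based count is 2.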
Comparing two DataTables to determine if it is modified
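Using a foreach loop inside another foreach loop is an N × N comparison, which you don't need to do.
Comparing the first row with the first row of the other table, the second with the second and so on, using the Zip extension method, is very useful for this case. It assumes both tables hold their rows in the same order, and Zip stops at the end of the shorter table.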
DataTable original;
DataTable modified;
// your stuff
modified = modified.AsEnumerable().Zip<DataRow, DataRow, DataRow>(original.AsEnumerable(), (DataRow modif, DataRow orig) =>
{
    // Compare the two rows column by column.
    if (!orig.ItemArray.SequenceEqual<object>(modif.ItemArray))
    {
        // SetModified throws InvalidOperationException unless the row state is Unchanged.
        modif.SetModified();
    }
    return modif;
}).CopyToDataTable<DataRow>();