Last Friday night I was doing a performance test on
DataTable class and literally felt an earthquake.
I tried four different methods to fetch data from an
instance of an arbitrary DataTable instance, matching arbitrary criteria.
I tried with 3K to 1 million records in the DataTable and
performed 500-5000 iteration of data selection using each method.
- DataTable.Select()
- Linq
- PLinq
- Traditional (by literally writing code to loop around the DataTable and use if/else clause to find match)
Traditional method was the fastest of them all.
Not very far behind the Traditional method were Linq &
PLinq.
PLinq however seemed to have adverse effect over Linq if the
volume of data in the DataTable was rather low.
The DataTable.Select() was the slowest and in fact poorly
slow.
It was 19-24 times slower than the others.
i.e If something takes 20mins using the other methods, it
would take 6-8 hours using DataTable.Select() method.
Resolution:
I created an extension class for DataTable with four
extension methods with the following signature to handle most of the use cases
where we are using DataTable.Select() method.
DataRow FindFirst(this DataTable table, Func<DataRow, bool>
criteria);
DataRow[] FindAll(this DataTable table, Func<DataRow, bool>
criteria);
bool Exists(this DataTable table, Func<DataRow, bool>
criteria);
DataTable Extract(this DataTable table, Func<DataRow, bool>
criteria);
The implementation of the extension class/methods is available for download here.
Scenario 1: Where we need all the rows matching a certain criteria.
Using
DataTable.Select()
DataRow[] rows = dt.Select(“A=’123’ AND
B=’XYZ’”);
Using the new extension method
DataRow[] rows = dt.FindAll((r)=>r[“A”].Equals(“123”)
&& r[“B”].Equals(“XYZ”));
Scenario 2: Where we need to check the existence of retain record
Using
DataTable.Select()
bool isPresent = dt.Select(“A=’123’ AND
B=’XYZ’”).Length>0;
Using the new extension method
bool isPresent =
dt.Exists((r)=>r[“A”].Equals(“123”) && r[“B”].Equals(“XYZ”));
Scenario 3: Where we need to get first matching record (usually
done on sorted DataTable)
Using
DataTable.Select()
DataRow matchingRecord =
dt.Select(“A=’123’ AND B=’XYZ’”).Rows[0];
Using the new extension method
bool isPresent =
dt.FindFirst((r)=>r[“A”].Equals(“123”) && r[“B”].Equals(“XYZ”));
Scenario 4: Where we need get all the matching rows in form of
another DataTable instance
Using
DataTable.Select()
DataTable dtMatch = SomeUtil.ConvertToDataTable(dt.Select(“A=’123’
AND B=’XYZ’”));
Using the new extension method
DataTable dtMatch =
dt.Extract((r)=>r[“A”].Equals(“123”) && r[“B”].Equals(“XYZ”));
Please do not use DataTable.Select() going forward, use
the Linq or Traditional method instead.