C# - Best way to remove duplicates from a DataTable depending on column values
I have a DataSet that contains only one table, so I am effectively working with a DataTable here.
The code you see below works, but I want the best and most efficient way to perform this task, because I work with a lot of data here.
Basically, the data from the table should later be stored in a database, where the primary key, of course, must be unique.
The primary key of the data I work with is in a column called Computer Name. Each entry also has a date in a column called Date.
I wrote a function that searches for duplicates in the Computer Name column, then compares the dates of these duplicates in order to delete all but the newest.
The function I wrote looks like this:
    // Requires: using System; using System.Collections.Generic; using System.Data;
    private void MergeDuplicate(DataSet importedData)
    {
        Dictionary<string, List<DataRow>> systems = new Dictionary<string, List<DataRow>>();
        DataSet importedDataCopy = importedData.Copy();
        importedData.Tables[0].Clear();

        // Group the copied rows by computer name.
        foreach (DataRow dr in importedDataCopy.Tables[0].Rows)
        {
            string systemName = dr["Computer Name"].ToString();
            if (!systems.ContainsKey(systemName))
            {
                systems.Add(systemName, new List<DataRow>());
            }
            systems[systemName].Add(dr);
        }

        foreach (KeyValuePair<string, List<DataRow>> entry in systems)
        {
            if (entry.Value.Count > 1)
            {
                int firstDataRowIndex = 0;
                int secondDataRowIndex = 1;
                while (entry.Value.Count > 1)
                {
                    DateTime time1 = Validation.ConvertStringIntoDateTime(entry.Value[firstDataRowIndex]["Date"].ToString());
                    DateTime time2 = Validation.ConvertStringIntoDateTime(entry.Value[secondDataRowIndex]["Date"].ToString());
                    // Delete the older entry so that only the newest one survives.
                    if (DateTime.Compare(time1, time2) <= 0)
                    {
                        entry.Value.RemoveAt(firstDataRowIndex);
                    }
                    else
                    {
                        entry.Value.RemoveAt(secondDataRowIndex);
                    }
                }
            }
            // Exactly one row per computer name is left at this point.
            importedData.Tables[0].ImportRow(entry.Value[0]);
        }
    }
My question is: given that this code works, what is the best and fastest/most efficient way to perform this task?
I appreciate any answers!
I think this can be done more efficiently. You copy the DataSet once with DataSet importedDataCopy = importedData.Copy(); and then you copy it again into a dictionary and delete the unnecessary data from the dictionary. I would rather remove the unnecessary information in a single pass. What about something like this:
    // Requires: using System; using System.Collections.Generic; using System.Data;
    private void MergeDuplicate(DataSet importedData)
    {
        Dictionary<string, DataRow> systems = new Dictionary<string, DataRow>();
        int i = 0;
        while (i < importedData.Tables[0].Rows.Count)
        {
            DataRow dr = importedData.Tables[0].Rows[i];
            string systemName = dr["Computer Name"].ToString();
            if (!systems.ContainsKey(systemName))
            {
                systems.Add(systemName, dr);
                // Nothing was removed, so move on to the next row.
                i++;
            }
            else
            {
                // The existing date is the date of the row already stored in the dictionary.
                DateTime existing = Validation.ConvertStringIntoDateTime(systems[systemName]["Date"].ToString());
                // The candidate date is the date of the current DataRow.
                DateTime candidate = Validation.ConvertStringIntoDateTime(dr["Date"].ToString());
                // If the candidate date is later than the existing date, replace the existing DataRow
                // with the candidate DataRow and delete the existing DataRow from the table.
                if (DateTime.Compare(existing, candidate) < 0)
                {
                    importedData.Tables[0].Rows.Remove(systems[systemName]);
                    systems[systemName] = dr;
                }
                else
                {
                    importedData.Tables[0].Rows.Remove(dr);
                }
                // A row at or before index i was removed, so the next unvisited row
                // has shifted into index i. Do not increment here.
            }
        }
    }
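To try the single-pass idea without the rest of the project, here is a minimal, self-contained sketch. It is not the asker's code: ParseDate is a hypothetical stand-in for Validation.ConvertStringIntoDateTime (whose behaviour is not shown in the question), the sample rows are made up for illustration, and the Date column is assumed to hold strings. It also differs from the function above in one respect: rows are collected in a list and removed after the loop, which avoids index bookkeeping at the cost of a small extra list.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Globalization;

public static class DuplicateDemo
{
    // Hypothetical stand-in for the question's Validation.ConvertStringIntoDateTime helper.
    public static DateTime ParseDate(string text)
    {
        return DateTime.Parse(text, CultureInfo.InvariantCulture);
    }

    // Keeps only the newest row per "Computer Name" and removes the others in place.
    public static void KeepNewestPerComputer(DataTable table)
    {
        var newest = new Dictionary<string, DataRow>();
        var toRemove = new List<DataRow>();

        foreach (DataRow row in table.Rows)
        {
            string name = row["Computer Name"].ToString();
            DataRow existing;
            if (!newest.TryGetValue(name, out existing))
            {
                // First time this computer name is seen.
                newest[name] = row;
            }
            else if (ParseDate(existing["Date"].ToString()) < ParseDate(row["Date"].ToString()))
            {
                // The current row is newer: it replaces the stored one.
                toRemove.Add(existing);
                newest[name] = row;
            }
            else
            {
                // The stored row is newer or equally old: drop the current row.
                toRemove.Add(row);
            }
        }

        // Remove after the loop so the Rows collection is not modified while enumerating it.
        foreach (DataRow row in toRemove)
        {
            table.Rows.Remove(row);
        }
    }

    // Illustrative sample data only.
    public static DataTable BuildSampleTable()
    {
        var table = new DataTable("Systems");
        table.Columns.Add("Computer Name", typeof(string));
        table.Columns.Add("Date", typeof(string));
        table.Rows.Add("PC-01", "2013-01-10");
        table.Rows.Add("PC-02", "2013-02-01");
        table.Rows.Add("PC-01", "2013-03-05");
        table.Rows.Add("PC-01", "2013-02-20");
        return table;
    }

    public static void Main()
    {
        DataTable table = BuildSampleTable();
        KeepNewestPerComputer(table);
        foreach (DataRow row in table.Rows)
        {
            Console.WriteLine("{0} {1}", row["Computer Name"], row["Date"]);
        }
    }
}
```

With the sample rows above, this prints PC-02 2013-02-01 followed by PC-01 2013-03-05, i.e. one row per computer name, each carrying the latest date. If two rows for the same name have identical dates, the one seen first is kept.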