How do I remove all duplicates from a DataTable in VB.NET?

Question

Consider my DataTable:

ID Name
1  AAA
2  BBB
3  CCC
1  AAA
4  DDD

The final output should be:

2 BBB
3 CCC
4 DDD

How can I remove these rows from the DataTable using VB.NET? Any help would be appreciated.

Answer 1

If you only want the rows that are unique (skipping every row that shares its ID and Name with another row), the following works:

Dim distinctRows = From r In tbl
       Group By Distinct = New With {Key .ID = CInt(r("ID")), Key .Name = CStr(r("Name"))} Into Group
       Where Group.Count = 1
       Select Distinct
' Create a new DataTable containing only the unique rows
Dim tblDistinct = (From r In tbl
       Join distinctRow In distinctRows
       On distinctRow.ID Equals CInt(r("ID")) _
       And distinctRow.Name Equals CStr(r("Name"))
       Select r).CopyToDataTable

If you want to remove the duplicates from the original table instead:

Dim tblDups = From r In tbl
       Group By Dups = New With {Key .ID = CInt(r("ID")), Key .Name = CStr(r("Name"))} Into Group
       Where Group.Count > 1
       Select Dups
Dim dupRowList = (From r In tbl
       Join dupRow In tblDups
       On dupRow.ID Equals CInt(r("ID")) _
       And dupRow.Name Equals CStr(r("Name"))
       Select r).ToList()

For Each dup In dupRowList 
    tbl.Rows.Remove(dup)
Next

Here is your sample data:

Dim tbl As New DataTable
tbl.Columns.Add(New DataColumn("ID", GetType(Int32)))
tbl.Columns.Add(New DataColumn("Name", GetType(String)))
Dim row = tbl.NewRow
row("ID") = 1
row("Name") = "AAA"
tbl.Rows.Add(row)
row = tbl.NewRow
row("ID") = 2
row("Name") = "BBB"
tbl.Rows.Add(row)
row = tbl.NewRow
row("ID") = 3
row("Name") = "CCC"
tbl.Rows.Add(row)
row = tbl.NewRow
row("ID") = 1
row("Name") = "AAA"
tbl.Rows.Add(row)
row = tbl.NewRow
row("ID") = 4
row("Name") = "DDD"
tbl.Rows.Add(row)
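
For an end-to-end check against the sample data, here is a minimal sketch that uses method syntax instead of the query syntax above. The module name `RemoveAllDuplicatesDemo` and the helper `RemoveRowsWithDuplicates` are hypothetical names chosen for illustration. It assumes the project references System.Data.DataSetExtensions (for `AsEnumerable` and `Field`) and that neither column contains DBNull.

```vbnet
Imports System
Imports System.Data
Imports System.Linq

Module RemoveAllDuplicatesDemo
    ' Hypothetical helper: deletes every row whose (ID, Name) pair occurs more than once.
    Sub RemoveRowsWithDuplicates(tbl As DataTable)
        ' ToList() materialises the query so rows can be removed while iterating the result.
        Dim dupRows = tbl.AsEnumerable().
            GroupBy(Function(r) Tuple.Create(r.Field(Of Integer)("ID"), r.Field(Of String)("Name"))).
            Where(Function(g) g.Count() > 1).
            SelectMany(Function(g) g).
            ToList()

        For Each dup In dupRows
            tbl.Rows.Remove(dup)
        Next
    End Sub

    Sub Main()
        ' Same sample data as in the question.
        Dim tbl As New DataTable()
        tbl.Columns.Add("ID", GetType(Integer))
        tbl.Columns.Add("Name", GetType(String))
        tbl.Rows.Add(1, "AAA")
        tbl.Rows.Add(2, "BBB")
        tbl.Rows.Add(3, "CCC")
        tbl.Rows.Add(1, "AAA")
        tbl.Rows.Add(4, "DDD")

        RemoveRowsWithDuplicates(tbl)

        ' Prints:
        ' 2 BBB
        ' 3 CCC
        ' 4 DDD
        For Each r As DataRow In tbl.Rows
            Console.WriteLine("{0} {1}", r("ID"), r("Name"))
        Next
    End Sub
End Module
```

Both copies of the `1 AAA` row are removed, which matches the output asked for in the question.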

Answer 2

You can filter with the DataTable's DefaultView.ToTable method, like this:

Public Sub RemoveDuplicateRows(ByRef rDataTable As DataTable)
    Dim pNewDataTable As DataTable
    Dim pCurrentRowCopy As DataRow
    Dim pColumnList As New List(Of String)
    Dim pColumn As DataColumn

    'Build column list
    For Each pColumn In rDataTable.Columns
        pColumnList.Add(pColumn.ColumnName)
    Next

    'Filter by all columns
    pNewDataTable = rDataTable.DefaultView.ToTable(True, pColumnList.ToArray)

    rDataTable = rDataTable.Clone

    'Import rows into original table structure
    For Each pCurrentRowCopy In pNewDataTable.Rows
        rDataTable.ImportRow(pCurrentRowCopy)
    Next
End Sub
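
One thing to be aware of: `ToTable(True, ...)` returns distinct rows, so one copy of each duplicated row is kept rather than all copies being dropped. The small sketch below illustrates this on the question's data. The module name `DistinctViewDemo` and the helper `BuildSample` are hypothetical names for illustration only.

```vbnet
Imports System
Imports System.Data

Module DistinctViewDemo
    ' Hypothetical helper that rebuilds the sample table from the question.
    Function BuildSample() As DataTable
        Dim tbl As New DataTable()
        tbl.Columns.Add("ID", GetType(Integer))
        tbl.Columns.Add("Name", GetType(String))
        tbl.Rows.Add(1, "AAA")
        tbl.Rows.Add(2, "BBB")
        tbl.Rows.Add(3, "CCC")
        tbl.Rows.Add(1, "AAA")
        tbl.Rows.Add(4, "DDD")
        Return tbl
    End Function

    Sub Main()
        Dim tbl As DataTable = BuildSample()

        ' distinct:=True collapses identical rows into a single copy.
        Dim distinctTbl As DataTable = tbl.DefaultView.ToTable(True, "ID", "Name")

        Console.WriteLine(tbl.Rows.Count)         ' 5
        Console.WriteLine(distinctTbl.Rows.Count) ' 4 (one "1 AAA" row survives)
    End Sub
End Module
```

If the goal is the three-row output from the question, where `1 AAA` disappears entirely, a grouping approach that filters on the group count is needed instead.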

Answer 3

Assuming you want to compare all columns, this should remove the duplicates from a DataTable (DT):

' Typing the range variable as DataColumn avoids late binding, so this also compiles with Option Strict On.
DT = DT.DefaultView.ToTable(True, (From c As DataColumn In DT.Columns Select c.ColumnName).ToArray())

Unless I overlooked it, this doesn't seem to be in the documentation (DataView.ToTable Method), but the following appears to do the same thing:

DT = DT.DefaultView.ToTable(True)

Answer 4

''' <summary>
''' Removes duplicate rows from a DataTable based on a specified column.
''' </summary>
''' <param name="TableToModify">The DataTable to modify.</param>
''' <param name="ColumnNameToCompare">The column name to compare for duplicates.</param>
Sub RemoveDuplicatesFromADataTable(ByRef TableToModify As DataTable, ByVal ColumnNameToCompare As String)
    'Start at the top row (Very important and the key for this to work!!).
    Dim Index1 As Integer = 0
    Dim rows As DataRowCollection = TableToModify.Rows
    '
    Do Until Index1 = rows.Count
        'Get the value from the current row.
        Dim primaryRow As DataRow = rows(Index1)
        Dim key As Object = primaryRow(ColumnNameToCompare)

        'Compare with all rows below, starting at the bottom.
        For Index2 As Integer = rows.Count - 1 To Index1 + 1 Step -1
            Dim secondaryRow As DataRow = rows(Index2)
            '
            If key.Equals(secondaryRow(ColumnNameToCompare)) Then
                rows.Remove(secondaryRow)
            End If
        Next
        '
        Index1 += 1
    Loop
End Sub

Note that TableToModify is passed by reference.

All credit goes to: https://www.vbforums.com/showthread.php?870631-已解决-数据表-删除-重复项
@jmcilhinney
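
As a possible alternative to the nested comparison loop, here is a sketch of a single-pass variant that tracks values already seen in a `HashSet`. This is a different technique from the credited code, not a copy of it. The names `HashSetDedupDemo` and `RemoveDuplicatesByColumn` are hypothetical. Like the routine above, it keeps the first row for each value in the chosen column and removes later ones.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Data

Module HashSetDedupDemo
    ' Hypothetical single-pass variant: keeps the first row for each value in columnName.
    Sub RemoveDuplicatesByColumn(table As DataTable, columnName As String)
        Dim seen As New HashSet(Of Object)()
        Dim i As Integer = 0

        ' Only advance the index when the current row is kept;
        ' after a removal the next row slides into position i.
        While i < table.Rows.Count
            If seen.Add(table.Rows(i)(columnName)) Then
                i += 1
            Else
                table.Rows.RemoveAt(i)
            End If
        End While
    End Sub

    Sub Main()
        Dim tbl As New DataTable()
        tbl.Columns.Add("ID", GetType(Integer))
        tbl.Columns.Add("Name", GetType(String))
        tbl.Rows.Add(1, "AAA")
        tbl.Rows.Add(2, "BBB")
        tbl.Rows.Add(3, "CCC")
        tbl.Rows.Add(1, "AAA")
        tbl.Rows.Add(4, "DDD")

        RemoveDuplicatesByColumn(tbl, "ID")

        Console.WriteLine(tbl.Rows.Count) ' 4
    End Sub
End Module
```

The table is modified in place, so the parameter does not need to be `ByRef` here.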
