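I'm using the following code in Power Query to remove empty columns from a table with many columns. It runs very slowly, and I'm looking for a way to speed it up. Basically, if every entry in a given column is null, the column should be removed.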
//Remove Empty Columns
ColumnstoKeep = List.Select(
Table.ColumnNames(#"Expanded"),each List.NonNullCount(Table.Column(#"Expanded",_)) <>0 ),
RemoveEmptyColumns = Table.SelectColumns(#"Expanded",ColumnstoKeep),
//Remove Empty Columns (source table buffered with Table.Buffer first)
bufferedTable = Table.Buffer( #"Expanded"),
ColumnstoKeep = List.Select(
Table.ColumnNames(#"bufferedTable"),each List.NonNullCount(Table.Column(#"bufferedTable",_)) <>0 ),
RemoveEmptyColumns = Table.SelectColumns(#"bufferedTable",ColumnstoKeep),
Here is another answer. It walks the columns in order and stops as soon as it reaches the first empty (all-null) column.
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlTSUTIyAhIoKFYHKoNNAipujE0CUzwWAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Column1 = _t, Column2 = _t, Column3 = _t, Column4 = _t, Column5 = _t, Column6 = _t, Column7 = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", Int64.Type}, {"Column2", Int64.Type}, {"Column3", Int64.Type}, {"Column4", type text}, {"Column5", type text}, {"Column6", type text}, {"Column7", type text}}),
sampleTable = Table.ReplaceValue(#"Changed Type","",null,Replacer.ReplaceValue,{"Column4", "Column5", "Column6", "Column7"}),
columnNames = Table.ColumnNames( sampleTable),
Custom1 = List.Generate(
()=> [name = columnNames{index}, index = 0] ,
each [index] < List.Count(columnNames) and List.NonNullCount(Table.Column(sampleTable,[name])) <>0,
each [name = columnNames{index}, index = [index]+ 1] ,
each [name]
)
in
Custom1
These are already some good suggestions. One approach I've used before is:
let
tbl = Source,
Headers = Table.ColumnNames( tbl ),
Result =
Table.SelectColumns(
tbl,
List.Select(
Headers,
each List.MatchesAny( Table.Column( tbl, _ ), each _ <> null ) )
)
in
Result
I haven't tested its performance, but it's worth a try. It uses List.MatchesAny to find every column that has at least one non-null value (see https://powerquery.how/list-matchesany/).
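To make the column test concrete, here is a minimal sketch of what List.MatchesAny returns for an all-null list versus a list with one value. The two sample lists and the step names are hypothetical and stand in for what Table.Column would return. Whether the engine stops scanning at the first match is not something this sketch shows, so any speed-up should be measured rather than assumed.

```m
let
    // Hypothetical sample columns, written as plain lists
    allNull = {null, null, null},
    someValues = {null, "x", null},
    // The condition is true only for items that are not null
    allNullHasValue = List.MatchesAny(allNull, each _ <> null),
    someValuesHasValue = List.MatchesAny(someValues, each _ <> null),
    // Collect both answers in one record for easy inspection
    Result = [AllNull = allNullHasValue, SomeValues = someValuesHasValue]
in
    Result
```

A column is kept by the query above exactly when this test returns true for it.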
Cheers, Rick
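This problem can be solved quite elegantly with the Table.Profile function, although admittedly it is a bit resource-intensive.
Here is my fxRemoveEmptyColumns function, which works well for me: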
(Source as table) as table =>
let
SourceProfile = Table.Buffer(Table.Profile(Source)),
// Buffer used to remove unexpected 'Expression.Error' occurring on some rows
SourceProfileEmptyRows = Table.SelectRows(SourceProfile, each [Count] = [NullCount]),
lstEmptyColumns = SourceProfileEmptyRows[Column],
SourceRemoveColumns = Table.RemoveColumns(Source, lstEmptyColumns)
in
SourceRemoveColumns
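For anyone who wants to try the function, a usage sketch might look like the following. The function body is the one from this post, inlined so the query is self-contained. The Sample table and the step name Cleaned are hypothetical and only there for illustration.

```m
let
    // Function from the post above, inlined so this sketch runs on its own
    fxRemoveEmptyColumns = (Source as table) as table =>
        let
            SourceProfile = Table.Buffer(Table.Profile(Source)),
            // A column is empty when its null count equals its row count
            SourceProfileEmptyRows = Table.SelectRows(SourceProfile, each [Count] = [NullCount]),
            lstEmptyColumns = SourceProfileEmptyRows[Column],
            SourceRemoveColumns = Table.RemoveColumns(Source, lstEmptyColumns)
        in
            SourceRemoveColumns,
    // Hypothetical sample table: column B holds only nulls
    Sample = Table.FromRecords({[A = 1, B = null], [A = 2, B = null]}),
    Cleaned = fxRemoveEmptyColumns(Sample)
in
    Cleaned
```

The intent is that only column A survives. Since Table.Profile also computes statistics such as Min, Max and DistinctCount for every column, it is worth timing this against the other approaches on a wide table.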