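I have 2 tables, #"Lead Data1" and #"Enrolment Data", with ~300k and ~16k rows respectively, sharing a key (Lead Number). I am using a separate query to pull #"Enrolment Data". I want to update the #"Lead Data1" columns {"Created On", "Course", "Program", "Business"} with the values from #"Enrolment Data", keeping the other columns. If any key (Lead Number) in #"Enrolment Data" is not found in #"Lead Data1", those records need to be appended to it.
I tried the following code: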
let
Source = Excel.Workbook(File.Contents("xxx.xlsb"), null, true),
#"Lead Data1" = Source{[Name="Lead Data"]}[Data],
#"Promoted Headers" = Table.PromoteHeaders(#"Lead Data1", [PromoteAllScalars=true]),
#"Removed Other Columns" = Table.SelectColumns(#"Promoted Headers",{"Created On", "Owner", "Lead Number", "Source", "Source type", "TL", "Course", "Program", "Business"}),
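Here I remove all the unnecessary columns; these are the columns I need in the output when combining #"Lead Data1" and #"Enrolment Data". I will use the #"Removed Other Columns" table again later to append the unmatched items.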
#"Removed Other Columns1" = Table.SelectColumns(#"Removed Other Columns",{"Owner", "Lead Number", "Source", "Source type", "TL"}),
I keep only the columns that need to be merged into #"Enrolment Data" (which does not have them).
#"Merged Queries" = Table.Join(#"Removed Other Columns1", {"Lead Number"}, #"Enrolment Data", {"Lead Number"}, JoinKind.RightOuter, 2),
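I am merging onto #"Enrolment Data" rather than the other way round because it has far fewer rows (if that makes a difference). This gives me the "updated" rows, including those whose keys do not match anything in #"Lead Data1".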
#"Rows to Remove" = Table.Column(#"Merged Queries","Lead Number"),
After the merge, I take the list of matched keys (the updated rows) from #"Enrolment Data".
#"Selected Rows" = Table.RemoveMatchingRows(#"Removed Other Columns", each List.Contains(#"Rows to Remove",[#"Lead Number"])),
I remove these rows from #"Lead Data1" (giving the "not updated" rows).
#"Appended Queries" = Table.Combine({#"Merged Queries", #"Selected Rows"})
I append the "updated" and "not updated" rows.
in #"Appended Queries"
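However, the query throws a "data source not found" error - I think this may be because I am referencing the #"Removed Other Columns" table incorrectly. Is there a way to fix this, or is my approach itself flawed? If so, any solution would be highly appreciated.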
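Your example uses the Table.Join function, but as far as I know it cannot handle columns that have the same name in both joined tables, so I guess your sample code has been reworked before posting.
Here is a proposal using Table.NestedJoin (after some changes to the column names you could eventually replace it with Table.Join; I guess Table.Join might give better performance), in which I perform a full outer join directly to get all the expected rows. Then I simply update the original columns (from "Lead Data") with the contents of the "Enrollment Data" columns wherever those are not null.
Types are lost by doing this (maybe there is some workaround?), so you have to retype the affected columns.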
let
    Source = LeadData,
    #"Merged Queries" = Table.NestedJoin(Source, {"Lead Number"}, #"Enrollment Data", {"Lead Number"}, "Enrollment Data", JoinKind.FullOuter),
    #"Expanded Enrollment Data" = Table.ExpandTableColumn(#"Merged Queries", "Enrollment Data", {"Created On", "Lead Number", "Course", "Program", "Business"}, {"Enrollment Data.Created On", "Enrollment Data.Lead Number", "Enrollment Data.Course", "Enrollment Data.Program", "Enrollment Data.Business"}),
    #"Replaced Lead Number Value" = Table.ReplaceValue(#"Expanded Enrollment Data",each [Lead Number],each if [Enrollment Data.Lead Number] <> null then [Enrollment Data.Lead Number] else [Lead Number],Replacer.ReplaceValue,{"Lead Number"}),
    #"Replaced Course Value" = Table.ReplaceValue(#"Replaced Lead Number Value",each [Course],each if [Enrollment Data.Course] <> null then [Enrollment Data.Course] else [Course],Replacer.ReplaceValue,{"Course"}),
    #"Replaced Program Value" = Table.ReplaceValue(#"Replaced Course Value",each [Program],each if [Enrollment Data.Program] <> null then [Enrollment Data.Program] else [Program],Replacer.ReplaceValue,{"Program"}),
    #"Replaced Business Value" = Table.ReplaceValue(#"Replaced Program Value",each [Business],each if [Enrollment Data.Business] <> null then [Enrollment Data.Business] else [Business],Replacer.ReplaceValue,{"Business"}),
    #"Replaced Created On Value" = Table.ReplaceValue(#"Replaced Business Value",each [Created On],each if [Enrollment Data.Created On] <> null then [Enrollment Data.Created On] else [Created On],Replacer.ReplaceValue,{"Created On"}),
    #"Removed Columns" = Table.RemoveColumns(#"Replaced Created On Value",{"Enrollment Data.Lead Number", "Enrollment Data.Course", "Enrollment Data.Program", "Enrollment Data.Business", "Enrollment Data.Created On"}),
    #"Changed Type" = Table.TransformColumnTypes(#"Removed Columns",{{"Lead Number", type text}, {"Course", type text}, {"Program", type text}, {"Business", type text}, {"Created On", type date}})
in
    #"Changed Type"
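On the question of keeping column types, one possible workaround is to rebuild the rows and reapply the table type of the original query. The following is only a minimal sketch under stated assumptions: the inline tables LeadData and EnrolmentData are made-up sample data, only the Course column is updated, and the "E" prefix is an arbitrary choice to avoid the duplicate column names that Table.Join rejects.

```powerquery
let
    // Hypothetical sample data standing in for the two queries
    LeadData = #table(
        type table [#"Lead Number" = nullable text, Owner = nullable text, Course = nullable text],
        {{"L1", "Ann", "Old course"}, {"L2", "Bob", "Maths"}}
    ),
    EnrolmentData = #table(
        type table [#"Lead Number" = nullable text, Course = nullable text],
        {{"L1", "New course"}, {"L3", "Physics"}}
    ),
    // Prefix the right-hand columns ("E.Lead Number", "E.Course") so the names no longer clash
    Prefixed = Table.PrefixColumns(EnrolmentData, "E"),
    Joined = Table.Join(LeadData, {"Lead Number"}, Prefixed, {"E.Lead Number"}, JoinKind.FullOuter),
    // Build each output row, preferring the enrolment value when it is not null
    Rows = Table.TransformRows(Joined, (r) => [
        #"Lead Number" = r[#"E.Lead Number"] ?? r[#"Lead Number"],
        Owner = r[Owner],
        Course = r[#"E.Course"] ?? r[Course]
    ]),
    // Reuse the table type of LeadData so the column types carry over
    Result = Table.FromRecords(Rows, Value.Type(LeadData))
in
    Result
```

With this sample data, L1 should take the enrolment course, L2 should stay unchanged, and L3 should be appended with a null Owner. The row order of a join result is not guaranteed, so sort afterwards if the order matters.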
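I don't know where your error comes from. Given your data, I could not get the code you posted to produce anything meaningful. So this may not help with your error, but if you can read your two tables into PQ, it will work.
Here is a way to merge the two tables. In this example, it is assumed that you have read the two tables into separate queries and then merge those two queries.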
let
    //Join the two tables
    Source = Table.NestedJoin(Lead_Data, {"Lead Number"}, Enrolment_Data, {"Lead Number"}, "Enrolment_Data", JoinKind.FullOuter),
    //Replace Lead number if Enrolment_Data has a new entry
    // ({0}? yields null instead of an error when the nested table has no rows)
    #"Update Lead Number" = Table.ReplaceValue(
        Source,
        each [Lead Number],
        each [Enrolment_Data][Lead Number]{0}?,
        (x,y,z)=>y??z,
        {"Lead Number"}),
    //Update other columns with entries from Enrolment_data if any are present
    // Note the custom replacer function will maintain the Text type of the columns
    #"Update Course/Program/Business" = List.Accumulate(
        {"Course","Program","Business"},
        #"Update Lead Number",
        (s,c)=>Table.ReplaceValue(
            s,
            each Record.Field(_,c),
            each Table.Column([Enrolment_Data],c){0}?,
            (x,y,z) as nullable text => if z <> null then z else y,
            {c})
        ),
    //Remove Enrolment_Data table column
    #"Removed Columns" = Table.RemoveColumns(#"Update Course/Program/Business",{"Enrolment_Data"}),
    //Sort by lead number if needed
    #"Sorted Rows" = Table.Sort(#"Removed Columns",{{"Lead Number", Order.Ascending}})
in
    #"Sorted Rows"
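When reading the first row of a nested table column like this, it may be worth keeping in mind that a nested join can leave some rows with an empty nested table (no match on the other side). The small sketch below, using made-up one-column tables, illustrates the difference between plain item access and optional item access in that situation.

```powerquery
let
    // Hypothetical nested tables: one without rows, one with a single row
    Empty = #table({"Course"}, {}),
    OneRow = #table({"Course"}, {{"Physics"}}),
    // Optional item access returns null when the index is out of range
    FromEmpty = Empty[Course]{0}?,
    FromOneRow = OneRow[Course]{0}?,
    // Plain item access raises an error on the empty column; try/otherwise catches it here
    Strict = try Empty[Course]{0} otherwise "error"
in
    [FromEmpty = FromEmpty, FromOneRow = FromOneRow, Strict = Strict]
```

Here FromEmpty is null, FromOneRow is "Physics", and Strict falls back to "error".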