PowerQuery M: Import data from 10 websites and put it into a single table

Question (votes: 0, answers: 1)

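Could someone help me fix this Power Query M code so that the output has no blank rows and shows all the data for a given date side by side? It would also be great if you could optimize the query (it takes a long time to load) so that it does not download everything from these sites, but only accesses these two containers on each site to download the tables:

  1. XPath="/html/body/main/div/div[4]/div/div/div/div[3]/div[1]/div/div[2]/div[2]/div[1]/table"
  2. XPath="/html/body/main/div/div[4]/div/div/div/div[3]/div[1]/div/div[2]/div[2]/div[2]/table"

Thank you very much!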


let
    // Define a list of slugs
    Slugs = {
        "IBIT",
        "FBTC",
        "BITB",
        "ARKB",
        "BTCO",
        "EZBC",
        "BRRR",
        "HODL",
        "BTCW",
        "GBTC"
    },

    // Define a function to construct URLs for each slug
    ConstructURL = (Slug) => "https://ycharts.com/companies/" & Slug & "/total_assets_under_management",

    // Define a function to apply transformation steps to each URL
    TransformData = (URL) =>
    let
        Source = Web.Page(Web.Contents(URL)),
        Data0 = Source{0}[Data],
        Data1 = Source{1}[Data],
        CombinedData = Table.Combine({Data0, Data1}),
        // Extract the company name from the URL
        CompanyName = Text.BetweenDelimiters(URL, "companies/", "/"),
        // Rename columns dynamically based on the company name
        RenamedColumns = Table.RenameColumns(CombinedData, {{"Date", "Date"}, {"Value", "Value_" & CompanyName}}),
        // Change the data type of the columns
        ChangedType = Table.TransformColumnTypes(RenamedColumns, {{"Date", type date}, {"Value_" & CompanyName, type text}})
    in
        ChangedType,

    // Construct URLs for each slug
    URLs = List.Transform(Slugs, each ConstructURL(_)),

    // Apply transformation to each URL and combine the results
    CombinedTables = List.Transform(URLs, each TransformData(_)),

    // Combine new data with existing data
    CombinedTable = if List.Count(CombinedTables) > 0 then Table.Combine(CombinedTables) else null
in
    CombinedTable


Update: I have managed to get everything into two columns, but I would rather have one date column and ten value columns:

let
    // Define a list of slugs
    Slugs = {
        "IBIT",
        "FBTC",
        "BITB",
        "ARKB",
        "BTCO",
        "EZBC",
        "BRRR",
        "HODL",
        "BTCW",
        "GBTC"
    },

    // Define a function to construct URLs for each slug
    ConstructURL = (Slug) => "https://ycharts.com/companies/" & Slug & "/total_assets_under_management",

    // Define a function to apply transformation steps to each URL
    TransformData = (URL) =>
    let
        Source = Web.Page(Web.Contents(URL)),
        Data0 = Source{0}[Data],
        Data1 = Source{1}[Data],
        CombinedData = Table.Combine({Data0, Data1}),
        // Extract the company name from the URL
        CompanyName = Text.BetweenDelimiters(URL, "companies/", "/"),
        // Change the data type of the columns
        ChangedType = Table.TransformColumnTypes(CombinedData, {{"Date", type date}, {"Value", type text}}),
        // Add a custom column for company name
        AddedCompanyColumn = Table.AddColumn(ChangedType, "Company", each CompanyName)
    in
        AddedCompanyColumn,

    // Construct URLs for each slug
    URLs = List.Transform(Slugs, ConstructURL),

    // Apply transformation to each URL and combine the results
    CombinedTables = List.Transform(URLs, each TransformData(_)),

    // Combine new data with existing data
    CombinedTable = if List.Count(CombinedTables) > 0 then Table.Combine(CombinedTables) else null
in
    CombinedTable

excel powerquery m
1 Answer
0 votes

Please try this.

let
    Comps = {"IBIT","FBTC","BITB"}, 
    GetTable = (Variable as text) =>
        let
            Source = Web.BrowserContents("https://ycharts.com/companies/" & Variable & "/total_assets_under_management"),
            Table1 = Html.Table(Source, {{"Column1", "DIV.col-6:nth-child(2) > TABLE.table:nth-child(1) > TBODY > TR > :nth-child(1)"}, {"Column2", "DIV.col-6:nth-child(2) > TABLE.table:nth-child(1) > TBODY > TR > :nth-child(2)"}}, [RowSelector="DIV.col-6:nth-child(2) > TABLE.table:nth-child(1) > TBODY > TR"]),
            Table2 = Html.Table(Source, {{"Column1", "DIV.col-6:nth-child(1) > TABLE.table:nth-child(1) > TBODY > TR > :nth-child(1)"}, {"Column2", "DIV.col-6:nth-child(1) > TABLE.table:nth-child(1) > TBODY > TR > :nth-child(2)"}}, [RowSelector="DIV.col-6:nth-child(1) > TABLE.table:nth-child(1) > TBODY > TR"]),
            CombinedTables = Table.Combine({Table1, Table2}),
            output = Table.RenameColumns(CombinedTables, {{"Column1", "Date"}, {"Column2", Variable}})
        in
            output,
    Tables = List.Transform(Comps, each GetTable(_)),
    MergedTables = List.Accumulate(List.Skip(Tables), Tables{0}, (state, current) => Table.Join(state, "Date", current, "Date"))
in
    MergedTables
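
One thing to keep in mind: Table.Join defaults to an inner join, so a date that is missing for any one ticker drops out of the merged result. If that matters, a possible alternative is to keep the long shape from the question's second query (Date, Value, Company) and pivot it. The following is only a minimal sketch: the table Long and its rows are made-up sample data standing in for the scraped result, and it assumes each Date/Company pair occurs at most once, since Table.Pivot without an aggregation function raises an error on duplicates.

```
let
    // Made-up sample rows in the long shape (Date, Value, Company);
    // these stand in for the combined scraped tables and are not real figures
    Long = #table(
        type table [Date = date, Value = text, Company = text],
        {
            {#date(2024, 5, 1), "1.0B", "IBIT"},
            {#date(2024, 5, 2), "1.1B", "IBIT"},
            {#date(2024, 5, 1), "2.0B", "FBTC"}
        }
    ),
    // Turn each distinct Company into its own column, filled from Value;
    // Date/Company combinations that do not exist become null
    Wide = Table.Pivot(Long, List.Distinct(Long[Company]), "Company", "Value")
in
    Wide
```

With this sample the result should have the columns Date, IBIT and FBTC, and the FBTC cell for 2 May is null rather than the whole row being dropped. In the real query, Long would be the CombinedTable step from the question's second version.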