How to show the second row using Pandas with duplicate row names in Python

Question

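I am writing a script to read an Excel worksheet that I cannot modify, and the worksheet contains several duplicated rows.

The problem I am running into is that once it finds the first row, it seems to stop adding to the index.

I cannot figure out how to make it skip the first row and index only the second one.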

import pandas as pd

# aisc_excel_file and profiles_lis_file are paths defined elsewhere in the script
df = pd.read_excel(aisc_excel_file)

# Open the profiles.lis file for writing
with open(profiles_lis_file, 'w') as profiles_lis:
    # Write the header to the profiles.lis file
    profiles_lis.write("PROFILE\tWIDTH\tHEIGHT\n")

    # Iterate through rows in the AISC DataFrame and write to profiles.lis
    for index, row in df.iterrows():
        profile_name = row['AISC_Manual_Label']
        width = row['h, in']
        height = row['b, in']

        # Write the profile data to the profiles.lis file
        profiles_lis.write(f"{profile_name}\t{width}\t{height}\n")

Here is the AISC Excel sheet: https://www.aisc.org/globalassets/product-files-not-searched/manuals/aisc-shapes-database-v16.0.xlsx

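I have two "AISC_Manual_Label" row labels, and I need the data from the second one.

I tried df.duplicated, but it always returns booleans and the index stays the same.

I also read a bit about Series, but I could not get the second row.

Any help would be greatly appreciated.

Thank you.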

Answer 1

If I understand your question correctly, you want to select a subset of columns and create a tab-separated-values file. If so, you can try this:

subset = df[["AISC_Manual_Label.1", "h.1", "b.1"]]

# subset.to_csv("output.txt", sep="\t") # uncomment to make a file
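
As a rough sketch of how that selection could be written out in the layout the question asks for, the snippet below uses a small made-up DataFrame in place of the real workbook (the name df_small and all values are hypothetical; the column names are the ones used in the snippet above). It assumes the PROFILE, WIDTH, HEIGHT header from the question should replace the pandas column names, and it keeps the question's own order of h first, then b.

```python
import io

import pandas as pd

# Hypothetical stand-in for the frame returned by pd.read_excel; values are made up.
df_small = pd.DataFrame({
    "AISC_Manual_Label": ["A1", "A2"],
    "AISC_Manual_Label.1": ["B1", "B2"],
    "h.1": [10.0, 20.0],
    "b.1": [5.0, 7.5],
})

# Keep only the second-occurrence columns.
subset = df_small[["AISC_Manual_Label.1", "h.1", "b.1"]]

# Write tab-separated text without the index, renaming the header on the way out.
# A StringIO buffer is used here so the sketch does not touch the file system;
# a file path such as "profiles.lis" could be passed instead.
buffer = io.StringIO()
subset.to_csv(buffer, sep="\t", index=False, header=["PROFILE", "WIDTH", "HEIGHT"])

tsv_lines = buffer.getvalue().splitlines()
print(tsv_lines)
```

Passing index=False drops the row numbers that to_csv would otherwise write as a first column, which the hand-written loop in the question did not emit either.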

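Note: when the original spreadsheet contains duplicate headers (which is the case here), pandas de-duplicates them by appending an incrementing suffix (.\d+), for example colx (first occurrence), colx.1 (second occurrence), colx.2 (third occurrence), and so on.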
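
To see the renaming in isolation, here is a minimal sketch that reads a tiny hypothetical table whose header repeats the name colx three times. It uses pd.read_csv on an in-memory string so that it runs without an Excel file; the suffixing described above is applied in the same way. The names raw, df_demo and second are made up for the illustration.

```python
import io

import pandas as pd

# Hypothetical input whose header row repeats the same name three times.
raw = "colx,colx,colx\n1,2,3\n4,5,6\n"

df_demo = pd.read_csv(io.StringIO(raw))

# The first name is kept as-is; later duplicates get .1, .2, ... appended.
print(list(df_demo.columns))

# The second occurrence is then addressable by its de-duplicated name.
second = df_demo["colx.1"]
print(second.tolist())
```

Printing df.columns on the real workbook is a quick way to check which suffixed names pandas actually produced before selecting them.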

Input used:

from io import BytesIO

import pandas as pd
import requests

url = "https://www.aisc.org/globalassets/product-files-not-searched/" \
      "manuals/aisc-shapes-database-v16.0.xlsx"

# Download the workbook and wrap the bytes in a file-like object
data = BytesIO(requests.get(url).content)

df = pd.read_excel(data, sheet_name="Database v16.0")
