Creating a MultiIndex from 2 rows with duplicated columns

Question

I have an Excel file that I read with pandas and convert to a DataFrame. Here is a sample of the DataFrame:

|               | salads_count | salads_count | salads_count | carrot_counts | carrot_counts | carrot_counts |
|---------------|--------------|--------------|--------------|---------------|---------------|---------------|
|               | 01.2016      | 02.2016      | 03.2016      | 01.2016       | 02.2016       | 03.2016       |
| farm_location |              |              |              |               |               |               |
| sweden        | 42           | 41           | 43           | 52            | 51            | 53            |

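This is a very strange format, but it is what the Excel file contains. To begin with, the first 2 rows are not even in MultiIndex form.

I managed to convert them to a MultiIndex with the code below, but some of the columns are duplicated (for example, salads_count appears several times):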

arrays = [df.columns.tolist(), df.iloc[0].tolist()]
tuples = list(zip(*arrays))
index = pd.MultiIndex.from_tuples(tuples)
df.columns = index

I would like to convert the columns to a MultiIndex, something like:

|               | salads_count |         |         | carrot_counts |         |         |
|---------------|--------------|---------|---------|---------------|---------|---------|
|               | 01.2016      | 02.2016 | 03.2016 | 01.2016       | 02.2016 | 03.2016 |
| farm_location |              |         |         |               |         |         |
| sweden        | 42           | 41      | 43      | 52            | 51      | 53      |

Or even better, like this:

|               | 01.2016      |              | 02.2016      |             |   |   |
|---------------|--------------|--------------|--------------|-------------|---|---|
|               | carrot_count | salads_count | carrot_count | salad_count |   |   |
| farm_location |              |              |              |             |   |   |
| sweden        | 52           | 42           | 51           | 41          |   |   |

How can I do this?

Answer 1

The best approach is to have read_excel build the MultiIndex columns directly, by passing the parameter header=[0,1]:

df = pd.read_excel(file, header=[0,1])

Then use swaplevel together with sort_index:

df = df.swaplevel(0,1, axis=1).sort_index(axis=1, level=0)
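
To try the second step without the Excel file, here is a minimal sketch. The DataFrame is built by hand as an assumed stand-in for what read_excel with header=[0,1] might return for the sample above; the values and labels are copied from the question's first table, and the names `columns` and `result` are only illustrative.

```python
import pandas as pd

# Assumed stand-in for the frame returned by pd.read_excel(file, header=[0, 1]):
# level 0 holds the measure names, level 1 holds the dates.
columns = pd.MultiIndex.from_product(
    [["salads_count", "carrot_counts"], ["01.2016", "02.2016", "03.2016"]]
)
df = pd.DataFrame(
    [[42, 41, 43, 52, 51, 53]],
    index=pd.Index(["sweden"], name="farm_location"),
    columns=columns,
)

# Move the dates to the outer level, then sort the columns so that
# the measures belonging to the same date sit next to each other.
result = df.swaplevel(0, 1, axis=1).sort_index(axis=1, level=0)
print(result)
```

Note that the date labels are plain strings here, so sort_index orders them lexicographically. That works within a single year, but "01.2017" would sort before "02.2016".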
