Find the indices of nan elements in nested lists and remove them

Question · 0 votes · 7 answers
names=[['Pat','Sam', np.nan, 'Tom', ''], ["Angela", np.nan, "James", ".", "Jackie"]]
values=[[1, 9, 1, 2, 1], [1, 3, 1, 5, 10]]

I have 2 lists: names and values. Each value has a name, i.e. Pat corresponds to the value 1, Sam corresponds to the value 9, and so on.

I want to remove the nan entries from names and the corresponding entries from values.

That is, I want a new_names list that looks like this:

[['Pat','Sam', 'Tom', ''], ["Angela", "James", ".", "Jackie"]]

and a new_values list that looks like this:

[[1, 9, 2, 1], [1, 1, 5, 10]]

My attempt was to first find the indices of these nan entries:

all_nan_idx = []
for idx, name in enumerate(names):
  if pd.isnull(name):
      all_nan_idx.append(idx)

However, this does not account for the nested lists.
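One way this index-based attempt could be extended to the nested lists (a minimal sketch using the sample data above) is to record (outer, inner) index pairs for the nan names and then delete them in reverse order, so that earlier deletions do not shift the indices collected later:

import numpy as np
import pandas as pd

names = [['Pat', 'Sam', np.nan, 'Tom', ''], ["Angela", np.nan, "James", ".", "Jackie"]]
values = [[1, 9, 1, 2, 1], [1, 3, 1, 5, 10]]

# Record an (outer, inner) index pair for every nan name.
all_nan_idx = []
for i, sublist in enumerate(names):
    for j, name in enumerate(sublist):
        if pd.isnull(name):
            all_nan_idx.append((i, j))

# Delete in reverse order so the earlier index pairs stay valid.
for i, j in reversed(all_nan_idx):
    del names[i][j]
    del values[i][j]

print(names)   # [['Pat', 'Sam', 'Tom', ''], ['Angela', 'James', '.', 'Jackie']]
print(values)  # [[1, 9, 2, 1], [1, 1, 5, 10]]

Note that this mutates names and values in place rather than building new_names and new_values.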

python list nested-lists
7 Answers
1 vote

Just this?

import numpy as np
import pandas as pd

names=[['Pat','Sam', np.nan, 'Tom', ''], ["Angela", np.nan, "James", ".", "Jackie"]]
values=[[1, 9, 1, 2, 1], [1, 3, 1, 5, 10]]

new_names = []
new_values = []
for names_, values_ in zip(names, values):
    n = []
    v = []
    for name, value in zip(names_, values_):
        if not pd.isnull(name):
            n.append(name)
            v.append(value)
    new_names.append(n)
    new_values.append(v)

1 vote

There is probably some hard-to-read comprehension that can do this (a possible version is sketched after the output below), but here is a step-by-step approach:

import numpy as np

names = [
    ['Pat', 'Sam', np.nan, 'Tom', ''],
    ["Angela", np.nan, "James", ".", "Jackie"]
    ]
values = [
    [1, 9, 1, 2, 1],
    [1, 3, 1, 5, 10]
    ]

new_names = []
new_values = []

for nn, vv in zip(names, values):
    new_names.append([])
    new_values.append([])
    for n, v in zip(nn, vv):
        if n is not np.nan:  # identity check works because the lists hold the np.nan object itself
            new_names[-1].append(n)
            new_values[-1].append(v)


print(new_names)
print(new_values)

Output:

[['Pat', 'Sam', 'Tom', ''], ['Angela', 'James', '.', 'Jackie']]
[[1, 9, 2, 1], [1, 1, 5, 10]]
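The comprehension alluded to above might look roughly like this (a sketch that iterates over each zipped sublist twice):

new_names = [[n for n, v in zip(nn, vv) if n is not np.nan]
             for nn, vv in zip(names, values)]
new_values = [[v for n, v in zip(nn, vv) if n is not np.nan]
              for nn, vv in zip(names, values)]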

0 votes

Using a recursive function:

import numpy as np

def filter_nan(names, values):
  new_names, new_values = [], []

  for name, value in zip(names, values, strict=True):
    if name is np.nan:
      continue

    # Recurse into nested sublists before keeping the pair.
    if isinstance(name, list) and isinstance(value, list):
      name, value = filter_nan(name, value)

    new_names.append(name)
    new_values.append(value)

  return new_names, new_values

Try it out:

names = [['Pat', 'Sam', np.nan, 'Tom', ''], ["Angela", np.nan, "James", ".", "Jackie"]]
values = [[1, 9, 1, 2, 1], [1, 3, 1, 5, 10]]

print(filter_nan(names, values))

'''
(
  [['Pat', 'Sam', 'Tom', ''], ['Angela', 'James', '.', 'Jackie']],
  [[1, 9, 2, 1], [1, 1, 5, 10]]
)
'''
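Because the function calls itself on nested sublists, it should also cope with deeper nesting (note that zip(..., strict=True) requires Python 3.10+). A quick sketch with made-up, more deeply nested data:

deep_names = [[['Pat', np.nan], ['Sam']], [[np.nan, 'Tom']]]
deep_values = [[[1, 2], [9]], [[3, 4]]]

print(filter_nan(deep_names, deep_values))
# ([[['Pat'], ['Sam']], [['Tom']]], [[[1], [9]], [[4]]])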

0 votes

Perhaps a bit much, but here is another option:

import numpy as np

names = [['Pat', 'Sam', np.nan, 'Tom', ''], ["Angela", np.nan, "James", ".", "Jackie"]]
values = [[1, 9, 1, 2, 1], [1, 3, 1, 5, 10]]

new_names = []
new_values = []

for aux_list in zip(names, values):
    # Pair each name with its value, drop pairs whose name is nan, then unzip back into two tuples.
    filtered_names, filtered_values = zip(*filter(lambda x: x[0] is not np.nan, zip(*aux_list)))
    new_names.append(list(filtered_names))
    new_values.append(list(filtered_values))

0 votes

Here is a better and simpler way to handle such cases:

import numpy as np

names=[['Pat','Sam', np.nan, 'Tom', ''], ["Angela", np.nan, "James", ".", "Jackie"]]
values=[[1, 9, 1, 2, 1], [1, 3, 1, 5, 10]]

new_names = []
new_values = []

for i in range(len(names)):
    new_names.append([])
    new_values.append([])
    for j in range(len(names[i])):
        if not isinstance(names[i][j], float):
            new_names[i].append(names[i][j])
            new_values[i].append(values[i][j])
            
print(new_names)
print(new_values)
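This works because np.nan is a Python float while every name that should be kept here is a string; a quick check of that assumption:

import numpy as np

print(isinstance(np.nan, float))   # True
print(isinstance('Pat', float))    # False
print(isinstance('', float))       # False

If the names could legitimately contain floats, a pd.isnull check would be safer.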




0 votes

Here is a solution using pandas:

import pandas as pd


result = []
for n, v in zip(names, values):
    n = pd.Series(n).dropna()
    result.append((n.tolist(), pd.Series(v).loc[n.index].tolist()))

names, values = map(list, zip(*result))
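The alignment works because dropna() keeps the Series' original integer index, which .loc then reuses to pick the matching values. A quick check with the first sublist from the question (not part of the original answer):

import numpy as np
import pandas as pd

n = pd.Series(['Pat', 'Sam', np.nan, 'Tom', '']).dropna()
print(n.index.tolist())                                   # [0, 1, 3, 4]
print(pd.Series([1, 9, 1, 2, 1]).loc[n.index].tolist())   # [1, 9, 2, 1]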

You can also use a one-liner (if you are using Python >= 3.8):

import pandas as pd

names, values = map(list, zip(*(
    ((s := pd.Series(n).dropna()).tolist(), pd.Series(v).loc[s.index].tolist())
    for n, v in zip(names, values)
)))

0 votes

To do this efficiently in a single statement, you can transpose the input lists into sequences of name-value pairs so that a generator expression can filter out the null names, and then transpose them back into two lists:

import pandas as pd

new_names, new_values = map(list, zip(*(
    map(list, zip(*(
        (name, value)
        for name, value in zip(*pairs)
        if not pd.isnull(name)
    )))
    for pairs in zip(names, values)
)))

Demo: https://replit.com/@blhsing/EnormousHarshFreesoftware#main.py
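The transpose step used twice above is the zip(*...) idiom; a small illustration of what it does to a list of pairs:

pairs = [('Pat', 1), ('Sam', 9), ('Tom', 2)]
print(list(zip(*pairs)))   # [('Pat', 'Sam', 'Tom'), (1, 9, 2)]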
