Python: What am I doing wrong? (Ties in a Python function)

Question

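I have a list of tuples holding some employees' work hours for a month, and I want to pick the employee of the month based on the highest number of hours worked. But what if some of the tuples are tied? How should the function change so that it includes every employee who is tied for the top?

This is what I have done so far.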

```python
work_hours = [("Abby",400),("Billy",500),("Cassie",700),("David",700)]

def employee_check(work_hours):

    current_max_hours = 0
    employee_of_the_month = ""

    for a,b in work_hours:
        if b >= current_max_hours:
            employee_of_the_month = a
            current_max_hours = b

        else:
            pass

    return (employee_of_the_month,current_max_hours)

employee_check(work_hours)
```

Then I get only the last one as my output.

```python
('David', 700)
```

Answer 1

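You store only one employee as the employee of the month, even when there is a tie. You need to modify the code so that it stores multiple employees in case of a tie.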

def employee_check(work_hours: list[tuple[str, int]]) -> tuple[list[str], int]:
    """
    Determines the employee(s) with the highest number of work hours in a month.
    
    :param work_hours: List of tuples containing employee names and their corresponding work hours.
    
    :return: A tuple containing a list of employee(s) of the month and the highest number of work hours.
    """
    current_max_hours = 0
    employees_of_the_month = []
    for employee, hours in work_hours:
        if hours > current_max_hours:
            employees_of_the_month = [employee]
            current_max_hours = hours
        elif hours == current_max_hours:
            employees_of_the_month.append(employee)
    return (employees_of_the_month, current_max_hours)

work_hours = [("Abby", 400), ("Billy", 500), ("Cassie", 700), ("David", 700)]
print(employee_check(work_hours))

Output:

(['Cassie', 'David'], 700)

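As the output above shows, the function now returns a list of the employees of the month together with the highest number of work hours.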
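For comparison, the same result can be reached with the built-in `max()` in two passes over the list. This is only a minimal sketch: the name `employee_check_max` and the choice to return `([], 0)` for an empty list are assumptions made for illustration and are not part of the answer above.

```python
def employee_check_max(work_hours):
    """Return (names tied for the most hours, that number of hours)."""
    if not work_hours:
        # Assumption: an empty list yields no winners and 0 hours.
        return ([], 0)
    # First pass: find the highest number of hours.
    max_hours = max(hours for _, hours in work_hours)
    # Second pass: keep every employee who reached it, in list order.
    winners = [name for name, hours in work_hours if hours == max_hours]
    return (winners, max_hours)


work_hours = [("Abby", 400), ("Billy", 500), ("Cassie", 700), ("David", 700)]
print(employee_check_max(work_hours))  # (['Cassie', 'David'], 700)
```

Reading the list twice costs a little more than the single loop, but it avoids having to reset the list of winners whenever a new maximum appears.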

Answer 2

If your data is in a pandas DataFrame, you can use the following one-liner. If you want the output in a different format, it is easy to modify.

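import pandas as pd
import numpy as np
# If the data is already in a dataframe
df = pd.DataFrame(index=['Abby','Billy','Cassie','David'], data=[400,500,700,700], columns=['hours'])

# With pandas: compare against max() so that all tied rows are kept
# (idxmax() alone would return only the first maximum, 'Cassie')
print(df[df['hours'] == df['hours'].max()])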

Below is one way to do it with numpy. You seem to be new to Python, so I am showing this as an example so that you can also become more familiar with numpy:

# With numpy
my_array = df['hours'].to_numpy()
my_max = np.max(my_array)
my_idx = np.argwhere(my_array == my_max).flatten()
# flatten converts it from a 2x1 array to a 2-array (ie from 2 dimensions to 1)

# Note that iloc uses position-based indices, not label-based.
print(df.iloc[my_idx,])

If the data is in a list of tuples, you can convert it to a DataFrame as follows:

# If the data is in a list of tuples
work_hours = [("Abby",400),("Billy",500),("Cassie",700),("David",700)]
# Pass the list directly: going through np.array() would turn the hours into strings
df2 = pd.DataFrame(work_hours, columns=['employee','hours'])
# up to you if you want to make the employee the index

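As a general rule, in interpreted languages such as Python or R, you should try to vectorize your code and avoid loops where possible. If you are not familiar with this concept, look it up.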
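To make "vectorize" concrete, here is a small sketch that finds the tied winners once with an explicit loop and once with a single array comparison. The variable names are chosen for illustration only, and the data is the example list from the question split into two numpy arrays.

```python
import numpy as np

names = np.array(["Abby", "Billy", "Cassie", "David"])
hours = np.array([400, 500, 700, 700])
max_hours = hours.max()

# Loop version: inspect one element at a time in Python.
loop_winners = []
for name, h in zip(names, hours):
    if h == max_hours:
        loop_winners.append(str(name))

# Vectorized version: one comparison over the whole array gives a boolean
# mask, and indexing with the mask selects every tied employee at once.
mask = hours == max_hours
vectorized_winners = names[mask].tolist()

print(loop_winners)        # ['Cassie', 'David']
print(vectorized_winners)  # ['Cassie', 'David']
```

Both versions give the same answer; the vectorized one pushes the per-element work into numpy's compiled code, which is where the speed-up on large arrays comes from.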

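When loops are unavoidable, numba (https://numba.pydata.org/) can work wonders in speeding up code. Obviously it makes no difference in this small example, but keep it in mind for more computationally intensive tasks.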

Answer 3

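Your comparison replaces the stored employee whenever the hours are greater than or equal to current_max_hours, so if a later item in the list has the same value as an item before it, it replaces the current value.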

In this case the list is ordered ("Cassie",700) -> ("David",700). Is 700 >= 700? Yes, so Cassie is replaced by David.
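The replacement can be watched step by step with a small trace. The helper `trace_winner` and its `keep_first_on_tie` flag are hypothetical names introduced only for this sketch; the loop body mirrors the one in the question.

```python
work_hours = [("Abby", 400), ("Billy", 500), ("Cassie", 700), ("David", 700)]


def trace_winner(work_hours, keep_first_on_tie):
    """Replay the question's loop and record the stored winner after each item."""
    current_max_hours = 0
    employee_of_the_month = ""
    history = []
    for name, hours in work_hours:
        if keep_first_on_tie:
            replace = hours > current_max_hours    # strict: a tie keeps the earlier name
        else:
            replace = hours >= current_max_hours   # the question's comparison
        if replace:
            employee_of_the_month = name
            current_max_hours = hours
        history.append(employee_of_the_month)
    return history


print(trace_winner(work_hours, keep_first_on_tie=False))
# ['Abby', 'Billy', 'Cassie', 'David']
print(trace_winner(work_hours, keep_first_on_tie=True))
# ['Abby', 'Billy', 'Cassie', 'Cassie']
```

Switching `>=` to `>` only changes which single name survives. A single string variable cannot hold both tied employees, so reporting ties needs a collection.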

One possible solution is to use sorted to order the list by who has the most hours, and then show only those at the top.

work_hours = [("Abby",400), ("Billy",500), ("Cassie",700), ("David",700)]

def employee_check(work_hours: list[tuple[str, int]]) -> list[tuple[str, int]]:
    sorted_work_hours = sorted(work_hours, key=lambda x: x[1], reverse=True)
    employees_of_the_month = []
    for worker, hours in sorted_work_hours:
        if hours == sorted_work_hours[0][1]:
            employees_of_the_month.append((worker, hours))
    return employees_of_the_month

print(employee_check(work_hours))

Output:

[('Cassie', 700), ('David', 700)]
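Because the sorted list places all tied employees at the front, the scan can stop at the first entry that falls below the top value. The sketch below does this with `itertools.takewhile`; the name `employee_check_sorted` and the empty-list guard are assumptions added for illustration, not part of the answer above.

```python
from itertools import takewhile


def employee_check_sorted(work_hours):
    """Return the (name, hours) pairs tied for the most hours."""
    if not work_hours:
        # Assumption: an empty input simply yields an empty result.
        return []
    # sorted() is stable, so tied employees keep their original order.
    ranked = sorted(work_hours, key=lambda pair: pair[1], reverse=True)
    top_hours = ranked[0][1]
    # Stop reading as soon as an entry has fewer hours than the leader.
    return list(takewhile(lambda pair: pair[1] == top_hours, ranked))


work_hours = [("Abby", 400), ("Billy", 500), ("Cassie", 700), ("David", 700)]
print(employee_check_sorted(work_hours))  # [('Cassie', 700), ('David', 700)]
```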
