Reducing the time complexity of a HackerRank sorting problem in Python 3

Question (votes: 4, answers: 1)

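I am trying to solve a sorting problem from HackerRank in Python 3: https://www.hackerrank.com/challenges/fraudulent-activity-notifications/problem

The problem requires finding the running median of each sublist of the trailing d days.

My code passes the sample test cases but fails some of the actual test cases because it is terminated on timeout. I suspect that calling sort() every time to find the median is what makes it slow.

How can I improve my code?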

def activityNotifications(expenditure, d):
    totalDays = len(expenditure)
    notified = 0

    for x in range(d, totalDays):
        check = expenditure[x-d:x]
        check.sort()

        if d % 2 == 0:
            median = (check[int(d/2)] + check[int((d-2)/2)])/2
        else:
            median = check[int((d-1)/2)]

        if expenditure[x] >= median * 2:
            notified += 1

    return notified

Tags: python, python-3.x, sorting, median

1 Answer (2 votes)
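To find the median, you sort the subarray on every iteration. That is not really efficient, especially when d is not small: each iteration costs O(d log d), so the whole run costs O(n * d log d).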

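To find the median we need the values in sorted order, but we do not need the sort() method to get them. Since every expenditure[i] lies in the range [0; 200], counting sort sounds like a good idea here. Basically, we store the frequency of each number i in counts[i]. To read the values back in sorted order, we just iterate over every j with counts[j] > 0, in increasing order of j.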
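A minimal sketch of the counting-sort idea described above, assuming integer inputs in [0; 200]. The names `counting_sort` and `max_value` are illustrative and do not appear in the original answer.

```python
def counting_sort(values, max_value=200):
    """Return a sorted copy of values, assuming integers in [0, max_value]."""
    counts = [0] * (max_value + 1)
    for v in values:
        counts[v] += 1  # counts[v] = how many times v occurs

    result = []
    for j, c in enumerate(counts):
        result.extend([j] * c)  # emit j exactly counts[j] times
    return result


print(counting_sort([2, 3, 4, 2, 3]))  # [2, 2, 3, 3, 4]
```

The cost is one pass over the input plus one pass over the 201 possible values, independent of the order of the input.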

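So, if counts holds the frequencies of the numbers in the window [i; i + d) of expenditure (a window of length d), we can find the median by inspecting at most 201 entries of counts (see the code for details). Moving to the next window [i+1; i+d+1) only requires decrementing the frequency of the number that leaves, counts[expenditure[i]] -= 1, and incrementing the frequency of the number that enters, counts[expenditure[i+d]] += 1. This approach takes O(n * 201) time and O(201) space.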
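The two steps above (reading the median from the frequency table, then sliding the window by one position) can be sketched in isolation from the notification logic. This is only an illustration under the same [0; 200] assumption; `median_from_counts` and `window_medians` are hypothetical helper names, not part of the original answer.

```python
def median_from_counts(counts, d):
    """Median of the d values whose frequencies are stored in counts."""
    seen = 0    # number of values covered so far, in sorted order
    prev = -1   # last value with a non-zero count
    for j, c in enumerate(counts):
        if c == 0:
            continue
        seen += c
        if 2 * seen > d:  # j is the upper-middle element
            # Even d: the lower-middle element is prev when exactly
            # d // 2 values lie strictly below j, otherwise it is j too.
            if d % 2 == 0 and seen - c == d // 2:
                return (prev + j) / 2
            return j
        prev = j
    raise ValueError("counts holds fewer than d values")


def window_medians(values, d, max_value=200):
    """Medians of every length-d window, updated in O(1) per slide."""
    counts = [0] * (max_value + 1)
    for v in values[:d]:
        counts[v] += 1
    medians = [median_from_counts(counts, d)]
    for i in range(len(values) - d):
        counts[values[i]] -= 1      # value leaving the window
        counts[values[i + d]] += 1  # value entering the window
        medians.append(median_from_counts(counts, d))
    return medians


print(window_medians([2, 3, 4, 2, 3, 6, 8, 4, 5], 5))  # [3, 3, 4, 4, 5]
print(window_medians([1, 2, 3, 4, 4], 4))              # [2.5, 3.5]
```

Each median lookup scans at most 201 entries, which is where the O(n * 201) bound comes from.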

Now, see the code below:

def activityNotifications(expenditure, d):
    totalDays = len(expenditure)
    counts = [0] * 201  # counts[v] = occurrences of v in the current window
    notifications = 0

    for i in range(totalDays):
        # now we have enough data to check if there was any fraudulent activity
        if i >= d:
            # counts holds the frequencies of the numbers in range [i - d; i)
            current_num_of_numbers = 0
            prev_number = -1
            for j in range(201):
                if counts[j] > 0:
                    current_num_of_numbers += counts[j]
                    # now we can determine the median because we have enough numbers
                    if d < (2 * current_num_of_numbers):
                        if (d % 2 == 0) and (current_num_of_numbers - counts[j] == d // 2):
                            median = (prev_number + j) / 2
                        else:
                            median = j
                        # if the condition is met then send a notification
                        if expenditure[i] >= (median * 2):
                            notifications += 1
                        break
                    prev_number = j
            # slide the window: drop the oldest day, which is index i - d
            counts[expenditure[i - d]] -= 1
        counts[expenditure[i]] += 1

    return notifications

