有没有更优雅的方法来使用java 8从列表中获取随机未使用的项?

问题描述 投票:8回答:3

要重构的功能......

<T> T notUsedRandomItem(List<T> allItems, List<T> usedItems) {
    return allItems.stream()
            .filter(item -> !usedItems.contains(item))
            .sorted((o1, o2) -> new Random().nextInt(2) - 1)
            .findFirst()
            .orElseThrow(() -> new RuntimeException("Did not find item!"));
}

功能可能会像这样使用......

System.out.println(
            notUsedRandomItem(
                    Arrays.asList(1, 2, 3, 4), 
                    Arrays.asList(1, 2)
            )
    ); // Should print either 3 or 4

编辑:通过针对人员列表运行它们来收集建议的实现和测试效率。

edit2:为Person类添加了缺少的equals方法。

import java.util.*;
import java.util.concurrent.TimeUnit;
import java.util.function.BiFunction;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

import static java.util.stream.Collectors.toList;

class Functions {

    <T> T notUsedRandomItemOriginal(List<T> allItems, List<T> usedItems) {
        return allItems.stream()
                .filter(item -> !usedItems.contains(item))
                .sorted((o1, o2) -> new Random().nextInt(2) - 1)
                .findFirst()
                .orElseThrow(() -> new RuntimeException("Did not find item!"));
    }

    <T> T notUsedRandomItemByAominè(List<T> allItems, List<T> usedItems) {
        List<T> distinctItems = allItems.stream()
                .filter(item -> !usedItems.contains(item))
                .collect(toList());

        if (distinctItems.size() == 0) throw new RuntimeException("Did not find item!");

        return distinctItems.get(new Random().nextInt(distinctItems.size()));
    }

    <T> T notUsedRandomItemByEugene(List<T> allItems, List<T> usedItems) {

        // this is only needed because your input List might not support removeAll
        List<T> left = new ArrayList<>(allItems);
        List<T> right = new ArrayList<>(usedItems);

        left.removeAll(right);

        return left.get(new Random().nextInt(left.size()));
    }

    <T> T notUsedRandomItemBySchaffner(List<T> allItems, List<T> usedItems) {

        Set<T> used = new HashSet<>(usedItems);
        List<T> all = new ArrayList<>(allItems);

        Collections.shuffle(all);

        for (T item : all) if (!used.contains(item)) return item;

        throw new RuntimeException("Did not find item!");
    }
}

public class ComparingSpeedOfNotUsedRandomItemFunctions {

    public static void main(String[] plaa) {
        runFunctionsWith(100);
        runFunctionsWith(1000);
        runFunctionsWith(10000);
        runFunctionsWith(100000);
        runFunctionsWith(200000);
    }

    static void runFunctionsWith(int itemCount) {

        TestConfiguration testConfiguration = new TestConfiguration();
        Functions functions = new Functions();

        System.out.println("Function execution time with " + itemCount + " items...");

        System.out.println("Schaffner: " +
                testConfiguration.timeSpentForFindingNotUsedPeople(
                        itemCount, (allPeople, usedPeople) ->
                                functions.notUsedRandomItemBySchaffner(allPeople, usedPeople)
                ));

        System.out.println("Eugene: " +
                testConfiguration.timeSpentForFindingNotUsedPeople(
                        itemCount, (allPeople, usedPeople) ->
                                functions.notUsedRandomItemByEugene(allPeople, usedPeople)
                ));

        System.out.println("Aominè: " +
                testConfiguration.timeSpentForFindingNotUsedPeople(
                        itemCount, (allPeople, usedPeople) ->
                                functions.notUsedRandomItemByAominè(allPeople, usedPeople)
                ));

        System.out.println("Original: " +
                testConfiguration.timeSpentForFindingNotUsedPeople(
                        itemCount, (allPeople, usedPeople) ->
                                functions.notUsedRandomItemOriginal(allPeople, usedPeople)
                ));
    }

}

class TestConfiguration {

    Long timeSpentForFindingNotUsedPeople(int numberOfPeople, BiFunction<List<Person>, List<Person>, Person> function) {

        ArrayList<Person> people = new ArrayList<>();
        IntStream.range(1, numberOfPeople).forEach(i -> people.add(new Person()));
        Collections.shuffle(people);

        List<Person> halfOfPeople =
                people.stream()
                        .limit(numberOfPeople / 2)
                        .collect(Collectors.toList());

        Collections.shuffle(halfOfPeople);

        long before = System.nanoTime();
        Person foundItem = function.apply(people, halfOfPeople);
        long after = System.nanoTime();

        // Return -1 if function do not return valid answer
        if (halfOfPeople.contains(foundItem))
            return (long) -1;

        return TimeUnit.MILLISECONDS.convert(after - before, TimeUnit.NANOSECONDS);
    }

    class Person {
        public final String name = UUID.randomUUID().toString();

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (o == null || getClass() != o.getClass()) return false;

            Person person = (Person) o;

            return name != null ? name.equals(person.name) : person.name == null;
        }

        @Override
        public int hashCode() {
            return name != null ? name.hashCode() : 0;
        }
    }
}

结果:

Function execution time with 100 items...
Schaffner: 0
Eugene: 1
Aominè: 2
Original: 5
Function execution time with 1000 items...
Schaffner: 0
Eugene: 14
Aominè: 13
Original: 5
Function execution time with 10000 items...
Schaffner: 2
Eugene: 564
Aominè: 325
Original: 348
Function execution time with 20000 items...
Schaffner: 3
Eugene: 1461
Aominè: 1418
Original: 1433
Function execution time with 30000 items...
Schaffner: 3
Eugene: 4616
Aominè: 2832
Original: 4567
Function execution time with 40000 items...
Schaffner: 4
Eugene: 10889
Aominè: 4903
Original: 10394

结论

当列表大小达到10000个项目时,到目前为止只有Schaffner的实现可用。

而且因为阅读相当简单,我会选择它作为最优雅的解决方案。

java java-stream
3个回答
3
投票

您应该使用HashSets来提高性能:

<T> T notUsedRandomItem(List<T> allItems, List<T> usedItems) {

    Set<T> used = new HashSet<>(usedItems);
    Set<T> all = new HashSet<>(allItems);

    all.removeIf(used::contains); // or all.removeAll(used)

    if (all.isEmpty()) throw new RuntimeException("Did not find item!");

    int skip = new Random().nextInt(all.size());
    Iterator<T> it = all.iterator();
    for (int i = 0; i < skip; i++) it.next();

    return it.next();
}

如果它们属于all集,则会从used集中删除元素。正在使用Set.removeIfSet.contains时,元素的去除是最佳的w.r.t性能。然后,在结果集中跳过随机数量的元素,最后,返回该集合的下一个元素。


另一种方法是首先对all列表进行洗牌,然后简单地迭代并返回不属于used集的第一个元素:

<T> T notUsedRandomItem(List<T> allItems, List<T> usedItems) {

    Set<T> used = new HashSet<>(usedItems);
    List<T> all = new ArrayList<>(allItems);

    Collections.shuffle(all);

    for (T item : all) if (!used.contains(item)) return item;

    throw new RuntimeException("Did not find item!");
}

编辑:检查最后一段代码,我现在意识到没有必要洗牌整个列表。相反,您可以随机化allItems列表的索引并返回不属于used集的第一个元素:

<T> T notUsedRandomItem(List<T> allItems, List<T> usedItems) {

    Set<T> used = new HashSet<>(usedItems);

    return new Random().ints(allItems.size(), 0, allItems.size())
        .mapToObj(allItems::get)
        .filter(item -> !used.contains(item))
        .findAny()
        .orElseThrow(() -> new RuntimeException("Did not find item!"));
}

4
投票

我可以想到这一点,但不知道它与现有解决方案相比如何扩展:

<T> T notUsedRandomItem(List<T> allItems, List<T> usedItems) {

    // this is only needed because your input List might not support removeAll
    List<T> left = new ArrayList<>(allItems);
    List<T> right = new ArrayList<>(usedItems);

    left.removeAll(right);

    return left.get(new Random().nextInt(left.size()));

}

要记住的一件事是sorted是一个有状态操作,因此它会对整个“diff”进行排序,但是你只能从中检索一个元素。你的Comparator也是错的,因为你可能会说它们是不同的两个值o1o2 - 这可能会以神秘的方式打破。


1
投票

你已经进入Comparator中间操作的sorted似乎错误和奇怪的方式使用Comparator到我的眼睛;这与@Eugene在帖子中提到的内容有关。

因此,我建议你避免任何类型的陷阱,并始终按照预期的方式使用API​​;而已。

如果你真的想要一个来自上述列表的随机元素;唯一可能的方法是找到两个列表的所有不同元素。所以我们无法提高这方面的速度。

一旦完成,我们只需要在包含不同元素的列表范围内生成一个随机整数,并在其中包含至少一个元素的情况下为其索引。

虽然我不得不承认,在没有使用流的情况下,可能有更好的方法来完成手头的任务;这是我如何稍微修改你的代码以消除.sorted((o1, o2) -> new Random().nextInt(2) - 1)的滥用。

<T> T notUsedRandomItem(List<T> allItems, List<T> usedItems) {
      List<T> distinctItems =  allItems.stream()
                 .filter(item -> !usedItems.contains(item))
                 .collect(toList());

      if(distinctItems.size() == 0) throw new RuntimeException("Did not find item!");

      return distinctItems.get(new Random().nextInt(distinctItems.size()));
}
© www.soinside.com 2019 - 2024. All rights reserved.