要重构的功能......
<T> T notUsedRandomItem(List<T> allItems, List<T> usedItems) {
return allItems.stream()
.filter(item -> !usedItems.contains(item))
.sorted((o1, o2) -> new Random().nextInt(2) - 1)
.findFirst()
.orElseThrow(() -> new RuntimeException("Did not find item!"));
}
功能可能会像这样使用......
System.out.println(
notUsedRandomItem(
Arrays.asList(1, 2, 3, 4),
Arrays.asList(1, 2)
)
); // Should print either 3 or 4
编辑:通过针对人员列表运行它们来收集建议的实现和测试效率。
edit2:为Person类添加了缺少的equals方法。
import java.util.*;
import java.util.concurrent.TimeUnit;
import java.util.function.BiFunction;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import static java.util.stream.Collectors.toList;
class Functions {
<T> T notUsedRandomItemOriginal(List<T> allItems, List<T> usedItems) {
return allItems.stream()
.filter(item -> !usedItems.contains(item))
.sorted((o1, o2) -> new Random().nextInt(2) - 1)
.findFirst()
.orElseThrow(() -> new RuntimeException("Did not find item!"));
}
<T> T notUsedRandomItemByAominè(List<T> allItems, List<T> usedItems) {
List<T> distinctItems = allItems.stream()
.filter(item -> !usedItems.contains(item))
.collect(toList());
if (distinctItems.size() == 0) throw new RuntimeException("Did not find item!");
return distinctItems.get(new Random().nextInt(distinctItems.size()));
}
<T> T notUsedRandomItemByEugene(List<T> allItems, List<T> usedItems) {
// this is only needed because your input List might not support removeAll
List<T> left = new ArrayList<>(allItems);
List<T> right = new ArrayList<>(usedItems);
left.removeAll(right);
return left.get(new Random().nextInt(left.size()));
}
<T> T notUsedRandomItemBySchaffner(List<T> allItems, List<T> usedItems) {
Set<T> used = new HashSet<>(usedItems);
List<T> all = new ArrayList<>(allItems);
Collections.shuffle(all);
for (T item : all) if (!used.contains(item)) return item;
throw new RuntimeException("Did not find item!");
}
}
public class ComparingSpeedOfNotUsedRandomItemFunctions {
public static void main(String[] plaa) {
runFunctionsWith(100);
runFunctionsWith(1000);
runFunctionsWith(10000);
runFunctionsWith(100000);
runFunctionsWith(200000);
}
static void runFunctionsWith(int itemCount) {
TestConfiguration testConfiguration = new TestConfiguration();
Functions functions = new Functions();
System.out.println("Function execution time with " + itemCount + " items...");
System.out.println("Schaffner: " +
testConfiguration.timeSpentForFindingNotUsedPeople(
itemCount, (allPeople, usedPeople) ->
functions.notUsedRandomItemBySchaffner(allPeople, usedPeople)
));
System.out.println("Eugene: " +
testConfiguration.timeSpentForFindingNotUsedPeople(
itemCount, (allPeople, usedPeople) ->
functions.notUsedRandomItemByEugene(allPeople, usedPeople)
));
System.out.println("Aominè: " +
testConfiguration.timeSpentForFindingNotUsedPeople(
itemCount, (allPeople, usedPeople) ->
functions.notUsedRandomItemByAominè(allPeople, usedPeople)
));
System.out.println("Original: " +
testConfiguration.timeSpentForFindingNotUsedPeople(
itemCount, (allPeople, usedPeople) ->
functions.notUsedRandomItemOriginal(allPeople, usedPeople)
));
}
}
class TestConfiguration {
Long timeSpentForFindingNotUsedPeople(int numberOfPeople, BiFunction<List<Person>, List<Person>, Person> function) {
ArrayList<Person> people = new ArrayList<>();
IntStream.range(1, numberOfPeople).forEach(i -> people.add(new Person()));
Collections.shuffle(people);
List<Person> halfOfPeople =
people.stream()
.limit(numberOfPeople / 2)
.collect(Collectors.toList());
Collections.shuffle(halfOfPeople);
long before = System.nanoTime();
Person foundItem = function.apply(people, halfOfPeople);
long after = System.nanoTime();
// Return -1 if function do not return valid answer
if (halfOfPeople.contains(foundItem))
return (long) -1;
return TimeUnit.MILLISECONDS.convert(after - before, TimeUnit.NANOSECONDS);
}
class Person {
public final String name = UUID.randomUUID().toString();
@Override
public boolean equals(Object o) {
if (this == o) return true;
if (o == null || getClass() != o.getClass()) return false;
Person person = (Person) o;
return name != null ? name.equals(person.name) : person.name == null;
}
@Override
public int hashCode() {
return name != null ? name.hashCode() : 0;
}
}
}
结果:
Function execution time with 100 items...
Schaffner: 0
Eugene: 1
Aominè: 2
Original: 5
Function execution time with 1000 items...
Schaffner: 0
Eugene: 14
Aominè: 13
Original: 5
Function execution time with 10000 items...
Schaffner: 2
Eugene: 564
Aominè: 325
Original: 348
Function execution time with 20000 items...
Schaffner: 3
Eugene: 1461
Aominè: 1418
Original: 1433
Function execution time with 30000 items...
Schaffner: 3
Eugene: 4616
Aominè: 2832
Original: 4567
Function execution time with 40000 items...
Schaffner: 4
Eugene: 10889
Aominè: 4903
Original: 10394
结论
当列表大小达到10000个项目时,到目前为止只有Schaffner的实现可用。
而且因为阅读相当简单,我会选择它作为最优雅的解决方案。
您应该使用HashSet
s来提高性能:
<T> T notUsedRandomItem(List<T> allItems, List<T> usedItems) {
Set<T> used = new HashSet<>(usedItems);
Set<T> all = new HashSet<>(allItems);
all.removeIf(used::contains); // or all.removeAll(used)
if (all.isEmpty()) throw new RuntimeException("Did not find item!");
int skip = new Random().nextInt(all.size());
Iterator<T> it = all.iterator();
for (int i = 0; i < skip; i++) it.next();
return it.next();
}
如果它们属于all
集,则会从used
集中删除元素。正在使用Set.removeIf
和Set.contains
时,元素的去除是最佳的w.r.t性能。然后,在结果集中跳过随机数量的元素,最后,返回该集合的下一个元素。
另一种方法是首先对all
列表进行洗牌,然后简单地迭代并返回不属于used
集的第一个元素:
<T> T notUsedRandomItem(List<T> allItems, List<T> usedItems) {
Set<T> used = new HashSet<>(usedItems);
List<T> all = new ArrayList<>(allItems);
Collections.shuffle(all);
for (T item : all) if (!used.contains(item)) return item;
throw new RuntimeException("Did not find item!");
}
编辑:检查最后一段代码,我现在意识到没有必要洗牌整个列表。相反,您可以随机化allItems
列表的索引并返回不属于used
集的第一个元素:
<T> T notUsedRandomItem(List<T> allItems, List<T> usedItems) {
Set<T> used = new HashSet<>(usedItems);
return new Random().ints(allItems.size(), 0, allItems.size())
.mapToObj(allItems::get)
.filter(item -> !used.contains(item))
.findAny()
.orElseThrow(() -> new RuntimeException("Did not find item!"));
}
我可以想到这一点,但不知道它与现有解决方案相比如何扩展:
<T> T notUsedRandomItem(List<T> allItems, List<T> usedItems) {
// this is only needed because your input List might not support removeAll
List<T> left = new ArrayList<>(allItems);
List<T> right = new ArrayList<>(usedItems);
left.removeAll(right);
return left.get(new Random().nextInt(left.size()));
}
要记住的一件事是sorted
是一个有状态操作,因此它会对整个“diff”进行排序,但是你只能从中检索一个元素。你的Comparator
也是错的,因为你可能会说它们是不同的两个值o1
和o2
- 这可能会以神秘的方式打破。
你已经进入Comparator
中间操作的sorted
似乎错误和奇怪的方式使用Comparator
到我的眼睛;这与@Eugene在帖子中提到的内容有关。
因此,我建议你避免任何类型的陷阱,并始终按照预期的方式使用API;而已。
如果你真的想要一个来自上述列表的随机元素;唯一可能的方法是找到两个列表的所有不同元素。所以我们无法提高这方面的速度。
一旦完成,我们只需要在包含不同元素的列表范围内生成一个随机整数,并在其中包含至少一个元素的情况下为其索引。
虽然我不得不承认,在没有使用流的情况下,可能有更好的方法来完成手头的任务;这是我如何稍微修改你的代码以消除.sorted((o1, o2) -> new Random().nextInt(2) - 1)
的滥用。
<T> T notUsedRandomItem(List<T> allItems, List<T> usedItems) {
List<T> distinctItems = allItems.stream()
.filter(item -> !usedItems.contains(item))
.collect(toList());
if(distinctItems.size() == 0) throw new RuntimeException("Did not find item!");
return distinctItems.get(new Random().nextInt(distinctItems.size()));
}