将列表划分为n个大小的列表的有效方法

问题描述 投票:48回答:13

我有一个数组,我想分成更小的n个数组,并对每个数组执行操作。我目前的做法是

用Java中的ArrayLists实现(任何伪代码都可以)

    for (int i = 1; i <= Math.floor((A.size() / n)); i++) {
            ArrayList temp = subArray(A, ((i * n) - n),
                    (i * n) - 1);
            // do stuff with temp
        }

    private ArrayList<Comparable> subArray(ArrayList A, int start,
                int end) {
            ArrayList toReturn = new ArrayList();
            for (int i = start; i <= end; i++) {
                toReturn.add(A.get(i));
            }
            return toReturn;
        }

其中A是列表,n是所需列表的大小

我相信这种方式在处理相当大的列表(大小高达100万)时花费了太多时间,所以我试图弄清楚什么会更有效率。

java arrays arraylist partitioning
13个回答
92
投票

你想要做一些利用List.subList(int, int)视图而不是复制每个子列表的东西。要轻松地做到这一点,请使用GuavaLists.partition(List, int)方法:

List<Foo> foos = ...
for (List<Foo> partition : Lists.partition(foos, n)) {
  // do something with partition
}

请注意,与许多事物一样,这不是非常有效的List不是RandomAccess(如LinkedList)。


0
投票

这是一种将List分区为子列表数组的方法,它确保除最后一个子列表之外的所有子列表都具有相同数量的元素:

static <T> List<T>[] split(List<T> source, int numPartitions) {
    if (numPartitions < 2)
        return new List[]{source};

    final int sourceSize = source.size(),
        partitions = numPartitions > sourceSize ? sourceSize: numPartitions,
        increments = sourceSize / partitions;

    return IntStream.rangeClosed(0, partitions)
        .mapToObj(i -> source.subList(i*increments, Math.min((i+1)*increments, sourceSize)))
        .toArray(List[]::new);
}

如果你想保证numPartitions数组大小,那么你想要:

static <T> List<T>[] split(List<T> source, int numPartitions) {
    if (numPartitions < 2)
        return new List[]{source};

    final int sourceSize = source.size(),
        partitions = numPartitions > sourceSize ? sourceSize: numPartitions,
        increments = sourceSize / partitions;

    return IntStream.range(0, partitions)
        .mapToObj(i -> source.subList(i*increments, i == partitions-1 ? sourceSize : (i+1)*increments))
        .toArray(List[]::new);
}

0
投票
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SubListTest
{
    public static void main(String[] args)
    {
        List<String> alphabetNames = new ArrayList<String>();

        // populate alphabetNames array with AAA,BBB,CCC,.....
        int a = (int) 'A';
        for (int i = 0; i < 26; i++)
        {
            char x = (char) (a + i);
            char[] array = new char[3];
            Arrays.fill(array, x);
            alphabetNames.add(new String(array));
        }

        int[] maxListSizes = new int[]
        {
            5, 10, 15, 20, 25, 30
        };

        for (int maxListSize : maxListSizes)
        {
            System.out.println("######################################################");
            System.out.println("Partitioning original list of size " + alphabetNames.size() + " in to sub lists of max size "
                + maxListSize);

            ArrayList<List<String>> subListArray = new ArrayList<List<String>>();
            if (alphabetNames.size() <= maxListSize)
            {
                subListArray.add(alphabetNames);
            }
            else
            {
                // based on subLists of maxListSize X
                int subListArraySize = (alphabetNames.size() + maxListSize - 1) / maxListSize;
                for (int i = 0; i < subListArraySize; i++)
                {
                    subListArray.add(alphabetNames.subList(i * maxListSize,
                        Math.min((i * maxListSize) + maxListSize, alphabetNames.size())));
                }
            }

            System.out.println("Resulting number of partitions " + subListArray.size());

            for (List<String> subList : subListArray)
            {
                System.out.println(subList);
            }
        }
    }
}

输出:

######################################################
Partitioning original list of size 26 in to sub lists of max size 5
Resulting number of partitions 6
[AAA, BBB, CCC, DDD, EEE]
[FFF, GGG, HHH, III, JJJ]
[KKK, LLL, MMM, NNN, OOO]
[PPP, QQQ, RRR, SSS, TTT]
[UUU, VVV, WWW, XXX, YYY]
[ZZZ]
######################################################
Partitioning original list of size 26 in to sub lists of max size 10
Resulting number of partitions 3
[AAA, BBB, CCC, DDD, EEE, FFF, GGG, HHH, III, JJJ]
[KKK, LLL, MMM, NNN, OOO, PPP, QQQ, RRR, SSS, TTT]
[UUU, VVV, WWW, XXX, YYY, ZZZ]
######################################################
Partitioning original list of size 26 in to sub lists of max size 15
Resulting number of partitions 2
[AAA, BBB, CCC, DDD, EEE, FFF, GGG, HHH, III, JJJ, KKK, LLL, MMM, NNN, OOO]
[PPP, QQQ, RRR, SSS, TTT, UUU, VVV, WWW, XXX, YYY, ZZZ]
######################################################
Partitioning original list of size 26 in to sub lists of max size 20
Resulting number of partitions 2
[AAA, BBB, CCC, DDD, EEE, FFF, GGG, HHH, III, JJJ, KKK, LLL, MMM, NNN, OOO, PPP, QQQ, RRR, SSS, TTT]
[UUU, VVV, WWW, XXX, YYY, ZZZ]
######################################################
Partitioning original list of size 26 in to sub lists of max size 25
Resulting number of partitions 2
[AAA, BBB, CCC, DDD, EEE, FFF, GGG, HHH, III, JJJ, KKK, LLL, MMM, NNN, OOO, PPP, QQQ, RRR, SSS, TTT, UUU, VVV, WWW, XXX, YYY]
[ZZZ]
######################################################
Partitioning original list of size 26 in to sub lists of max size 30
Resulting number of partitions 1
[AAA, BBB, CCC, DDD, EEE, FFF, GGG, HHH, III, JJJ, KKK, LLL, MMM, NNN, OOO, PPP, QQQ, RRR, SSS, TTT, UUU, VVV, WWW, XXX, YYY, ZZZ]

-1
投票

关于什么

Arrays.copyOfRange( original, from, to )

?


-1
投票

使用Java 8的一行:

IntStream.range(0, list.size() / batchSize + 1)
        .mapToObj(i -> list.subList(i * batchSize,
                Math.min(i * batchSize + batchSize, list.size())))
        .filter(s -> !s.isEmpty()).collect(Collectors.toList());

13
投票

如果您正在使用列表,我使用“Apache Commons Collections 4”库。它在ListUtils类中有一个分区方法:

...
int targetSize = 100;
List<Integer> largeList = ...
List<List<Integer>> output = ListUtils.partition(largeList, targetSize);

这种方法改编自http://code.google.com/p/guava-libraries/


11
投票

例如:

    int partitionSize = 10;
    List<List<String>> partitions = new ArrayList<>();

    for (int i=0; i<yourlist.size(); i += partitionSize) {
        partitions.add(yourlist.subList(i, Math.min(i + partitionSize, yourlist.size())));
    }

    for (List<String> list : partitions) {
        //Do your stuff on each sub list
    }

3
投票

好吧,在我看到ColinD的回答(+1)之前我自己写了一篇,使用Guava绝对是要走的路。单独留下太有趣了,所以下面给出了列表的副本而不是视图,因此GUava的效率肯定比这更高。我发布这个是因为写它很有趣而不是暗示它有效:

Hamcrest测试(无论如何):

assertThat(chunk(asList("a", "b", "c", "d", "e"), 2), 
           equalTo(asList(asList("a", "b"), asList("c", "d"), asList("e"))));

代码:

public static <T> Iterable<Iterable<T>> chunk(Iterable<T> in, int size) {
    List<Iterable<T>> lists = newArrayList();
    Iterator<T> i = in.iterator();
    while (i.hasNext()) {
        List<T> list = newArrayList();
        for (int j=0; i.hasNext() && j<size; j++) {
            list.add(i.next());
        }
        lists.add(list);
    }
    return lists;
}

2
投票
public <E> Iterable<List<E>> partition(List<E> list, final int batchSize)
{
    assert(batchSize > 0);
    assert(list != null);
    assert(list.size() + batchSize <= Integer.MAX_VALUE); //avoid overflow

    int idx = 0;

    List<List<E>> result = new ArrayList<List<E>>();

    for (idx = 0; idx + batchSize <= list.size(); idx += batchSize) {
        result.add(list.subList(idx, idx + batchSize));
    }
    if (idx < list.size()) {
        result.add(list.subList(idx, list.size()));
    }

    return result;
}

2
投票

如果您不想使用库,这是我的解决方案

1.分为N个相等的部分:

private <T> List<List<T>> nPartition(List<T> objs, final int N) {
    return new ArrayList<>(IntStream.range(0, objs.size()).boxed().collect(
            Collectors.groupingBy(e->e%N,Collectors.mapping(e->objs.get(e), Collectors.toList())
                    )).values());
}

2.要分组N个项目:

private <T> List<List<T>> nPartition(List<T> objs, final int N) {
    return new ArrayList<>(IntStream.range(0, objs.size()).boxed().collect(
            Collectors.groupingBy(e->e/N,Collectors.mapping(e->objs.get(e), Collectors.toList())
                    )).values());
    }

在这里行动:https://ideone.com/QiQnbE


1
投票

我刚刚实现了一个列表分区,因为我无法使用库。

所以我想在这里分享我的代码:

import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

public class ListPartitioning<T> implements Iterable<List<T>> {

  private final List<T> list;
  private final int partitionSize;

  public ListPartitioning(List<T> list, int partitionSize) {
    if (list == null) {
      throw new IllegalArgumentException("list must not be null");
    }
    if (partitionSize < 1) {
      throw new IllegalArgumentException("partitionSize must be 1 or greater");
    }
    this.list = list;
    this.partitionSize = partitionSize;
  }

  @Override
  public Iterator<List<T>> iterator() {
    return new ListPartitionIterator<T>(list, partitionSize);
  }

  private static class ListPartitionIterator<T> implements Iterator<List<T>> {

    private int index = 0;

    private List<T> listToPartition;
    private int partitionSize;
    private List<T> nextPartition;

    public ListPartitionIterator(List<T> listToPartition, int partitionSize) {
      this.listToPartition = listToPartition;
      this.partitionSize = partitionSize;
    }

    @Override
    public boolean hasNext() {
      return index < listToPartition.size();
    }

    @Override
    public List<T> next() {
      if (!hasNext()) {
        throw new NoSuchElementException();
      }

      int partitionStart = index;
      int partitionEnd = Math.min(index + partitionSize, listToPartition.size());

      nextPartition = listToPartition.subList(partitionStart, partitionEnd);
      index = partitionEnd;
      return nextPartition;
    }

    @Override
    public void remove() {
      if (nextPartition == null) {
        throw new IllegalStateException("next must be called first");
      }

      nextPartition.clear();
      index -= partitionSize;
      nextPartition = null;
    }
  }
}

并且基于testng的单元测试。

import org.testng.Assert;
import org.testng.annotations.Test;

import java.util.*;


public class ListPartitioningTest {

  @Test(expectedExceptions = IllegalArgumentException.class)
  public void nullList() {
    ListPartitioning<String> lists = new ListPartitioning<String>(null, 1);
  }

  @Test(groups = Group.UNIT_TEST, expectedExceptions = IllegalArgumentException.class)
  public void wrongPartitionSize() {
    ListPartitioning<String> lists = new ListPartitioning<String>(new ArrayList<String>(), 0);
  }


  @Test()
  public void iteratorTest() {
    List<Integer> integers = Arrays.asList(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15);
    ListPartitioning<Integer> listPartitioning = new ListPartitioning<Integer>(integers, 7);
    Iterator<List<Integer>> partitionIterator = listPartitioning.iterator();
    Assert.assertNotNull(partitionIterator);

    Assert.assertTrue(partitionIterator.hasNext(), "next partition (first)");
    List<Integer> partition = partitionIterator.next();
    Assert.assertEquals(partition, Arrays.asList(0, 1, 2, 3, 4, 5, 6));

    Assert.assertTrue(partitionIterator.hasNext(), "next partition (second)");
    partition = partitionIterator.next();
    Assert.assertEquals(partition, Arrays.asList(7, 8, 9, 10, 11, 12, 13));

    Assert.assertTrue(partitionIterator.hasNext(), "next partition (third)");
    partition = partitionIterator.next();
    Assert.assertEquals(partition, Arrays.asList(14, 15));

    Assert.assertFalse(partitionIterator.hasNext());
  }

  @Test(expectedExceptions = NoSuchElementException.class)
  public void noSuchElementException() {
    List<Integer> integers = Arrays.asList(1);
    ListPartitioning<Integer> listPartitioning = new ListPartitioning<Integer>(integers, 2);
    Iterator<List<Integer>> partitionIterator = listPartitioning.iterator();
    List<Integer> partition = partitionIterator.next();
    partition = partitionIterator.next();
  }

  @Test(expectedExceptions = IllegalStateException.class)
  public void removeWithoutNext() {
    List<Integer> integers = new ArrayList<Integer>(Arrays.asList(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15));
    ListPartitioning<Integer> listPartitioning = new ListPartitioning<Integer>(integers, 7);
    Iterator<List<Integer>> partitionIterator = listPartitioning.iterator();
    partitionIterator.remove();
  }

  @Test()
  public void remove() {
    List<Integer> integers = new ArrayList<Integer>(Arrays.asList(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15));
    ListPartitioning<Integer> listPartitioning = new ListPartitioning<Integer>(integers, 7);
    Iterator<List<Integer>> partitionIterator = listPartitioning.iterator();

    partitionIterator.next();
    partitionIterator.next();

    partitionIterator.remove();
    Assert.assertTrue(partitionIterator.hasNext(), "next partition ");
    List<Integer> partition = partitionIterator.next();
    Assert.assertEquals(partition, Arrays.asList(14, 15));

    Assert.assertFalse(partitionIterator.hasNext());

    Assert.assertEquals(integers, Arrays.asList(0, 1, 2, 3, 4, 5, 6, 14, 15));
  }
}

0
投票

如果您正在处理数组,可以使用System.arraycopy()

 int[] a = {1,2,3,4,5};

 int[] b = new int[2];
 int[] c = new int[3];

 System.arraycopy(a, 0, b, 0, 2); // b will be {1,2}
 System.arraycopy(a, 2, c, 0, 3); // c will be {3,4,5}

0
投票

由于您希望优化性能,因此应使用并行流而不是for循环。这样您就可以使用多个线程。

Lists.partition(A, n).parallelStream().forEach({
    //do stuff with temp
});

您还可以使用其他方式与流进行混合,例如,如果匹配您的目的,则收集或映射。

© www.soinside.com 2019 - 2024. All rights reserved.