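Do I/O in chunks. Start processing chunks as soon as one becomes available, while further chunks are read in the background (but never more than X chunks are read ahead). Process chunks in parallel as they are received. Consume each processed chunk in read order, i.e. in the original order in which the chunks were read.
I have set up an MWE class that imitates this scenario, and it works to some extent: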
Flux.fromFile(path, some-function-to-define-chunk) // done with Flux.generate in MWE below
    .prefetchOnIoThread(x-count: int) // at this point we try to maintain a buffer filled with x-count pre-read chunks
    .parallelMapOrdered(n-threads: int, limit-process-ahead: int)
        // n-threads: are constantly trying to drain the x-count buffer, doing some transformation
        // limit-process-ahead: as the operation results are needed in order, if we encounter an
        // input element that takes a while to process, we don't want the pipeline to run too far
        // ahead of this problematic element (to not overflow the buffers and use too much memory)
    .consume(TMapped v)
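To make the read-ahead part of the pseudocode concrete (the hypothetical `prefetchOnIoThread` step), here is a minimal sketch that deliberately uses plain `java.util.concurrent` instead of Reactor. The class name `ReadAheadSketch`, the method `readAll` and the parameter `maxReadAhead` are names I made up for illustration; the `Iterator` stands in for the real chunked file reader.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Optional;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class ReadAheadSketch {

    /**
     * Drains source on a background "reader" thread while the caller consumes.
     * At most maxReadAhead chunks sit in the buffer (the reader may hold one more
     * while blocked in put). Chunks must be non-null; error propagation is omitted.
     */
    public static <T> List<T> readAll(Iterator<T> source, int maxReadAhead) {
        BlockingQueue<Optional<T>> buffer = new ArrayBlockingQueue<>(maxReadAhead);

        Thread reader = new Thread(() -> {
            try {
                while (source.hasNext()) {
                    buffer.put(Optional.of(source.next())); // blocks while the buffer is full
                }
                buffer.put(Optional.empty()); // end-of-input marker
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "reader");
        reader.setDaemon(true);
        reader.start();

        List<T> consumed = new ArrayList<>();
        try {
            Optional<T> chunk;
            while ((chunk = buffer.take()).isPresent()) {
                consumed.add(chunk.get()); // stand-in for handing the chunk to processing
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return consumed;
    }

    public static void main(String[] args) {
        System.out.println(readAll(Arrays.asList(0, 1, 2, 3, 4).iterator(), 2)); // prints [0, 1, 2, 3, 4]
    }
}
```

The bounded queue is what enforces the "no more than X chunks" rule: the reader simply blocks in `put` once the consumer falls behind. This only covers the single-consumer case; the parallel, ordered processing is a separate concern.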
Current attempt using Reactor (MWE):
Dependency: implementation 'io.projectreactor:reactor-core:3.3.5.RELEASE'
import reactor.core.Disposable;
import reactor.core.publisher.Flux;
import reactor.core.publisher.ParallelFlux;
import reactor.core.scheduler.Schedulers;

import java.time.LocalTime;
import java.time.format.DateTimeFormatter;
import java.util.concurrent.atomic.AtomicInteger;

public class Tmp {

    // DateTimeFormatter is thread-safe (SimpleDateFormat is not); log() is called from several threads
    static final DateTimeFormatter fmt = DateTimeFormatter.ofPattern("HH:mm:ss.SSS");
    static long millisRead = 1; // time taken to "read" a chunk
    static long millisProcess = 100; // time taken to "process" a chunk

    public static void main(String[] args) {
        log("Before flux construct");

        // Step 1: Generate / IO
        Flux<Integer> f = Flux.generate( // imitate IO
                AtomicInteger::new,
                (atomicInteger, synchronousSink) -> {
                    sleepQuietly(millisRead);
                    Integer next = atomicInteger.getAndIncrement();
                    if (next > 50) {
                        synchronousSink.complete();
                        log("Emitting complete");
                    } else {
                        log("Emitting next : %d", next);
                        synchronousSink.next(next);
                    }
                    return atomicInteger;
                },
                atomicInteger -> log("State consumer called: pos=%s", atomicInteger.get()));
        f = f.publishOn(Schedulers.elastic());
        f = f.subscribeOn(Schedulers.elastic());
        ParallelFlux<Integer> pf = f.parallel(2, 2); // 2 rails, prefetch 2
        pf = pf.runOn(Schedulers.elastic(), 2);

        // Step 2: transform in parallel
        pf = pf.map(i -> { // imitate processing steps
            log("Processing begin: %d", i);
            sleepQuietly(millisProcess); // 100x the time it takes to create an input for this operation
            log("Processing done : %d", i);
            return 1000 + i;
        });

        // Step 3: use transformed data, preferably in order of generation
        Disposable sub = pf.sequential(3).subscribe(
                next -> log("Finally got: %d", next),
                err -> err.printStackTrace(),
                () -> log("Complete!"));

        // Keep the main thread alive until the pipeline terminates
        while (!sub.isDisposed()) {
            log("Waiting pipeline completion...");
            sleepQuietly(500);
        }
        log("Main done");
    }

    public static void log(String message) {
        Thread t = Thread.currentThread();
        System.out.printf("[%s] @ [%s]: %s\n", t.getName(), fmt.format(LocalTime.now()), message);
    }

    public static void log(String format, Object... args) {
        log(String.format(format, args));
    }

    public static void sleepQuietly(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // restore the interrupt flag
            throw new IllegalStateException(e); // keep the cause
        }
    }
}
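For comparison, the behaviour I want from steps 2 and 3 (parallel processing, in-order consumption, bounded process-ahead) can be sketched without Reactor. This is only an illustration with plain `java.util.concurrent`, not a Reactor solution: `OrderedParallelMapSketch`, `mapOrdered`, `nThreads` and `limitProcessAhead` are hypothetical names, and the source is read on the calling thread rather than on a separate I/O thread.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

public class OrderedParallelMapSketch {

    /** Maps elements on nThreads workers and returns the results in input order. */
    public static <T, R> List<R> mapOrdered(Iterator<T> source, Function<T, R> mapper,
                                            int nThreads, int limitProcessAhead) {
        if (limitProcessAhead < 1) {
            throw new IllegalArgumentException("limitProcessAhead must be >= 1");
        }
        ExecutorService pool = Executors.newFixedThreadPool(nThreads);
        Deque<Future<R>> inFlight = new ArrayDeque<>(); // submission order == input order
        List<R> results = new ArrayList<>();
        try {
            while (source.hasNext() || !inFlight.isEmpty()) {
                // Top up the window: never more than limitProcessAhead unconsumed elements.
                while (source.hasNext() && inFlight.size() < limitProcessAhead) {
                    T item = source.next();
                    Callable<R> task = () -> mapper.apply(item);
                    inFlight.addLast(pool.submit(task));
                }
                // Wait for the oldest element, so a slow one holds back everything behind it.
                results.add(inFlight.removeFirst().get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdownNow();
        }
    }

    private static void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        List<Integer> out = mapOrdered(Arrays.asList(0, 1, 2, 3, 4, 5).iterator(), i -> {
            sleep(i == 0 ? 200 : 10); // the first element is deliberately slow
            return 1000 + i;
        }, 2, 3);
        System.out.println(out); // prints [1000, 1001, 1002, 1003, 1004, 1005]
    }
}
```

The queue of futures acts as a sliding window: while the slow element 0 is still running, at most `limitProcessAhead - 1` later elements are submitted, and nothing is consumed out of order. In this sketch the window bounds reading and processing together, which is simpler than the two separate limits in the pseudocode.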
Given the lack of answers, I will post my own thoughts.