I am running into a problem where, after the application has been running for a while, messages sent to remote actors in an Akka cluster arrive with very high latency (10+ s). The logs suggest that the provider tries to resolve the ActorPaths of old actors that are unrelated to the ActorRef I am actually sending the message to. You can find excerpts from the logs at the bottom of this question.
My application consists of three ActorSystems that interact via akka-cluster, with the remote transport configured via artery; the application.conf is given below.
I programmatically create a remote actor as the child of a local actor (like so), and then do the same again from that child actor (so I end up with parent, child, and grandchild living in different ActorSystems).
If the grandchild sends a message to its parent right at application start, the message does arrive with reasonable latency (&lt;10 ms). Note that I run all three systems on the same physical machine.
However, once I have started (and killed) additional actors on the three systems, the latency of such messages becomes impractical. I believe this is related to the LocalActorRefProvider of one of the ActorSystems trying to resolve the ActorPaths of previously created (and long dead) actors. I will post an excerpt of the application log below that shows this happening.
I do not understand why any other references need to be resolved at all when I send a message specifically to a (remote) ActorRef, and I would really appreciate any help or pointers.
// application.conf(s)
akka {
  actor {
    provider = "cluster"
    serializers {
      kryo = "com.twitter.chill.akka.AkkaSerializer"
    }
    serialization-bindings {
      "java.io.Serializable" = kryo
      "akka.actor.Props" = kryo
    }
    enable-additional-serialization-bindings = on
  }
  remote {
    log-sent-messages = on
    log-received-messages = on
    log-remote-lifecycle-events = off
    artery {
      enabled = on
      transport = tcp
      canonical.hostname = "127.0.0.1" // these obv differ depending on the system
      canonical.port = 7900
    }
  }
  cluster {
    seed-nodes = [ // set in AbstractSystem
      "akka://app@127.0.0.1:7900"
    ]
    # WARN Don't use auto-down feature of Akka Cluster in production.
    # auto-down-unreachable-after = 10s
  }
  loglevel = "DEBUG"
}
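As the comment above notes, the hostname/port lines differ per system. A sketch of what the overrides for a second system might look like -- the port 7901 is an assumption, only 7900 appears in the question:

```hocon
# second system's application.conf -- identical to the above except for:
akka.remote.artery.canonical.hostname = "127.0.0.1"  # same machine for all three systems
akka.remote.artery.canonical.port = 7901             # assumed; only 7900 is shown above
```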
// ParentActor.java
public class ParentActor extends ClusterAwareActor {
    public static final String DEFAULT_NAME = "ParentActor";

    public static Props props() {
        return Props.create(ParentActor.class, () -> new ParentActor());
    }

    ParentActor() {}

    private final Instant initTime = Instant.now();

    @Override
    public Receive createReceive() {
        return receiveBuilder()
                .match(ClusterEvent.CurrentClusterState.class, this::handle)
                .match(ClusterEvent.MemberUp.class, this::handle)
                .build().orElse(super.createReceive());
    }

    @Override
    protected void handle(ClusterEvent.CurrentClusterState msg) {
        super.handle(msg);
        Collection<ActorRef> children = this.remoteActorsOf(ChildActor.props(), ChildActor.DEFAULT_NAME, SystemRole.SLAVE, true);
    }
}
// excerpt from ClusterAwareActor
// [ ... ]
protected Collection<ActorRef> remoteActorsOf(Props props, String actorName, SystemRole role, boolean skipOwnSystem) {
    Collection<ActorRef> results = Lists.newArrayList();
    Collection<Member> members = this.systemRoleMemberMap.get(role);
    for (Member member : members) {
        if (cluster.selfMember().equals(member) && skipOwnSystem) continue;
        Props remoteProps = props.withDeploy(new Deploy(new RemoteScope(member.address())));
        if (actorName != null) {
            results.add(this.getContext().actorOf(remoteProps, CustomStringUtils.randomActorName(actorName))); // this is just so I don't have name clashes
        } else {
            results.add(this.getContext().actorOf(remoteProps));
        }
    }
    return results;
}
// ChildActor.java
public class ChildActor extends ClusterAwareActor {
    public static final String DEFAULT_NAME = "ChildActor";

    public static Props props() {
        return Props.create(ChildActor.class, () -> new ChildActor());
    }

    ChildActor() {}

    private final Instant initTime = Instant.now();

    @Override
    public Receive createReceive() {
        return receiveBuilder()
                .match(ClusterEvent.CurrentClusterState.class, this::handle)
                .match(Messages.PongWithTime.class, this::handle)
                .build().orElse(super.createReceive());
    }

    @Override
    protected void handle(ClusterEvent.CurrentClusterState msg) {
        super.handle(msg);
        Collection<ActorRef> children = this.remoteActorsOf(GrandChildActor.props(), GrandChildActor.DEFAULT_NAME, SystemRole.SLAVE, true);
        Preconditions.checkState(children.size() == 1);
        Lists.newArrayList(children).get(0).tell(new Messages.PingMessage(), self());
        log().info("[{}] Sent PING at {} ms", DEFAULT_NAME, Duration.between(this.initTime, Instant.now()).toMillis());
    }

    private void handle(Messages.PongWithTime msg) {
        log().info("[{}] Received PONG at {} ms, transmission took ~ {} ms",
                DEFAULT_NAME,
                Duration.between(this.initTime, Instant.now()).toMillis(),
                Duration.between(msg.getSendTime(), Instant.now()).toMillis());
    }
}
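The transmission time logged by the PONG handler is just the difference between the Instant stamped by the sender and the receiver's clock at arrival. Since all three systems run on one physical machine, both JVMs share a wall clock and the measurement is meaningful; across machines it would also include clock skew. A minimal stdlib sketch of the same computation (the timestamps are chosen to be consistent with the log excerpt further down, not taken from real runs):

```java
import java.time.Duration;
import java.time.Instant;

public class TransmissionTime {
    // Same computation as the PONG handler: elapsed wall-clock time between
    // the sender's stamp and the receiver's "now". Only meaningful when both
    // JVMs share a clock -- which holds here, all systems run on one machine.
    static long transmissionMillis(Instant sendTime, Instant receiveTime) {
        return Duration.between(sendTime, receiveTime).toMillis();
    }

    public static void main(String[] args) {
        Instant sent = Instant.parse("2019-10-05T22:37:12.477Z");
        Instant received = Instant.parse("2019-10-05T22:37:32.711Z");
        System.out.println(transmissionMillis(sent, received) + " ms"); // 20234 ms
    }
}
```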
// GrandChildActor.java
public class GrandChildActor extends AbstractLoggingActor {
    public static final String DEFAULT_NAME = "GrandChildActor";

    public static Props props() {
        return Props.create(GrandChildActor.class, () -> new GrandChildActor());
    }

    GrandChildActor() {}

    @Override
    public Receive createReceive() {
        return receiveBuilder()
                .match(Messages.PingMessage.class, this::handle)
                .build();
    }

    private void handle(Messages.PingMessage __) {
        getContext().getParent().tell(new Messages.PongWithTime(Instant.now()), self());
    }
}
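The Messages holder itself is not shown in the question; a hypothetical reconstruction from how the two message types are used above (the field and accessor names are inferred from `getSendTime()`, and implementing Serializable matches the kryo serialization-binding in application.conf):

```java
import java.io.Serializable;
import java.time.Instant;

// Hypothetical sketch of the Messages class used by the actors above.
public final class Messages {
    // Empty ping; its arrival alone triggers the PONG reply.
    public static final class PingMessage implements Serializable {}

    // Pong carrying the Instant stamped by GrandChildActor just before sending,
    // so the receiver can compute the transmission time.
    public static final class PongWithTime implements Serializable {
        private final Instant sendTime;

        public PongWithTime(Instant sendTime) {
            this.sendTime = sendTime;
        }

        public Instant getSendTime() {
            return sendTime;
        }
    }

    public static void main(String[] args) {
        Messages.PongWithTime pong = new Messages.PongWithTime(Instant.now());
        System.out.println("PONG stamped at " + pong.getSendTime());
    }
}
```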
All of this works fine (i.e., messages get through reasonably quickly) as long as no other actors have been created and killed beforehand. Once they have, however, transmitting a single message (which wraps nothing but an Instant) can take 10 to 30 seconds.
These are the logs of the ActorSystem that contains the ChildActor:
[INFO] [10/05/2019 22:37:12.431] [dispatcher] [akka://app@127.0.0.1:7700/remote/akka/app@127.0.0.1:7900/user/ParentActor/ChildActor] Created GrandChildActor and sent PingMessage.
[...]
[DEBUG] [10/05/2019 22:37:13.668] [dispatcher] [akka.actor.LocalActorRefProvider(akka://app)] Resolve (deserialization) of path [remote/akka/app@127.0.0.1:7900/user/path/to/other/unrelated/actor] doesn't match an active actor. It has probably been stopped, using deadLetters.
[DEBUG] [10/05/2019 22:37:13.738] [dispatcher] [akka.actor.LocalActorRefProvider(akka://app)] Resolve (deserialization) of path [remote/akka/app@127.0.0.1:7900/user/path/to/other/unrelated/actor] doesn't match an active actor. It has probably been stopped, using deadLetters.
[DEBUG] [10/05/2019 22:37:13.809] [dispatcher] [akka.actor.LocalActorRefProvider(akka://app)] Resolve (deserialization) of path [remote/akka/app@127.0.0.1:7900/user/path/to/other/unrelated/actor] doesn't match an active actor. It has probably been stopped, using deadLetters.
[...] // there is a metric ton of these
[DEBUG] [10/05/2019 22:37:32.498] [dispatcher] [akka.actor.LocalActorRefProvider(akka://app)] Resolve (deserialization) of path [remote/akka/app@127.0.0.1:7900/user/path/to/other/unrelated/actor] doesn't match an active actor. It has probably been stopped, using deadLetters.
[DEBUG] [10/05/2019 22:37:32.568] [dispatcher] [akka.actor.LocalActorRefProvider(akka://app)] Resolve (deserialization) of path [remote/akka/app@127.0.0.1:7900/user/path/to/other/unrelated/actor] doesn't match an active actor. It has probably been stopped, using deadLetters.
[DEBUG] [10/05/2019 22:37:32.638] [dispatcher] [akka.actor.LocalActorRefProvider(akka://app)] Resolve (deserialization) of path [remote/akka/app@127.0.0.1:7900/user/path/to/other/unrelated/actor] doesn't match an active actor. It has probably been stopped, using deadLetters.
[DEBUG] [10/05/2019 22:37:32.708] [dispatcher] [akka.actor.LocalActorRefProvider(akka://app)] Resolve (deserialization) of path [remote/akka/app@127.0.0.1:7900/user/path/to/other/unrelated/actor] doesn't match an active actor. It has probably been stopped, using deadLetters.
// before finally
[INFO] [10/05/2019 22:37:32.711] [dispatcher] [akka://app@127.0.0.1:7700/remote/akka/app@127.0.0.1:7900/user/ParentActor/ChildActor] Received Lookup response. Transmission took 20234 ms.
With the information being incomplete and redacted, it is impossible to pinpoint the exact cause, but Akka does not attempt to resolve unrelated ActorRefs on its own. These log messages indicate that another node of the cluster is actually sending messages to those dead actors.