I'm trying to create a custom datatype (which I call AttribStruct) using Open MPI's Java bindings, but when I try to run the program I get an invalid datatype error. I suspect it's because the array sizes in the struct are determined at runtime.
My goal is to send n AttribStruct buffers in each send and recv operation. To do that, I concatenate all n individual buffers into one large buffer so they can be sent together; the receiving side then deconstructs the buffer.
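To make the concatenation step concrete, here is a minimal sketch of the packing logic using plain java.nio (no MPI calls; in the real program the buffer would come from MPI.newByteBuffer so the library can use it directly). The class name PackDemo, the no-padding layout assumption, and the field order (objectives, constraints, variables, then the two int flags, matching the addDouble/addInt order in the constructor below) are my assumptions, not taken from mpi.Struct:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class PackDemo {
    // Bytes occupied by one struct, assuming no padding:
    // (objCount + varCount + constrCount) doubles plus two int flags.
    static int structBytes(int objCount, int varCount, int constrCount) {
        return 8 * (objCount + varCount + constrCount) + 2 * 4;
    }

    // Concatenate n identically laid-out structs into one big buffer.
    // Write order mirrors the addDouble/addInt calls in the constructor:
    // objectives, constraints, variables, then the two validity flags.
    static ByteBuffer pack(double[][] objectives, double[][] constraints,
                           double[][] variables, int[] validObj, int[] validConstr) {
        int n = objectives.length;
        int per = structBytes(objectives[0].length, variables[0].length,
                              constraints[0].length);
        ByteBuffer buf = ByteBuffer.allocateDirect(n * per)
                                   .order(ByteOrder.nativeOrder());
        for (int i = 0; i < n; i++) {
            for (double d : objectives[i])  buf.putDouble(d);
            for (double d : constraints[i]) buf.putDouble(d);
            for (double d : variables[i])   buf.putDouble(d);
            buf.putInt(validObj[i]);
            buf.putInt(validConstr[i]);
        }
        buf.flip(); // ready for reading/sending
        return buf;
    }
}
```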
Here is the AttribStruct class:
import mpi.Struct;

public class AttribStruct extends Struct {

    // Field offsets (returned by addDouble/addInt) for the
    // objectives, variables, and constraints sections
    private final int objectives;
    private final int variables;
    private final int constraints;

    // Ints used as booleans: 1 if the corresponding values are valid, 0 otherwise
    private final int validObjectiveFunctionsValues;
    private final int validConstraintsViolationValues;

    public final int objCount;
    public final int varCount;
    public final int constrCount;

    public AttribStruct(int objCount, int varCount, int constrCount) {
        this.objCount = objCount;
        this.varCount = varCount;
        this.constrCount = constrCount;

        // Array sizes are only known at runtime
        this.objectives = addDouble(this.objCount);
        this.constraints = addDouble(this.constrCount);
        this.variables = addDouble(this.varCount);
        this.validObjectiveFunctionsValues = addInt();
        this.validConstraintsViolationValues = addInt();
    }

    @Override
    protected Data newData() {
        return new Data();
    }

    public class Data extends Struct.Data {
        /*
         * Getters go here
         */
        /*
         * Setters go here
         */
    } // End -- Data
}
Here is an example of how I send multiple structs:
AttribStruct attr = new AttribStruct(4, 3, 0);
/*
* Build buffer goes here
*/
attr.getType().commit();
MPI.COMM_WORLD.iSend(toSend, n, attr.getType(), target, 0);
attr.getType().free();
toSend is the n separate AttribStruct buffers concatenated together, n is the number of AttribStructs I'm sending, attr is an instance of the AttribStruct class, target is the rank of the node I'm communicating with, and 0 is just a placeholder tag.
Here is a rough example of how the target node receives the message:
AttribStruct attr = new AttribStruct(4, 3, 0);
attr.getType().commit();
MPI.COMM_WORLD.recv(msgBuffer, n, attr.getType(), MASTER_RANK, 0);
attr.getType().free();
/*
* Deconstruct buffer goes here
*/
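The "deconstruct buffer" step could be sketched like this, again with plain java.nio and absolute reads so the buffer's position is untouched. The class name UnpackDemo, the helper names, and the no-padding layout are my assumptions; the read order mirrors the packing order (objectives, constraints, variables, two int flags):

```java
import java.nio.ByteBuffer;

public class UnpackDemo {
    // Bytes per struct, assuming the packed layout has no padding
    static int structBytes(int objCount, int varCount, int constrCount) {
        return 8 * (objCount + varCount + constrCount) + 2 * 4;
    }

    // Read the objectives of struct #index out of the concatenated buffer
    static double[] readObjectives(ByteBuffer buf, int index,
                                   int objCount, int varCount, int constrCount) {
        int base = index * structBytes(objCount, varCount, constrCount);
        double[] out = new double[objCount];
        for (int i = 0; i < objCount; i++) {
            out[i] = buf.getDouble(base + 8 * i); // absolute read
        }
        return out;
    }

    // Read the "objectives valid" flag of struct #index,
    // which sits right after all the doubles
    static int readValidObjectives(ByteBuffer buf, int index,
                                   int objCount, int varCount, int constrCount) {
        int base = index * structBytes(objCount, varCount, constrCount);
        return buf.getInt(base + 8 * (objCount + varCount + constrCount));
    }
}
```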
However, when I run the program, I get the following error message:
[node01:13219] *** An error occurred in MPI_Recv
[node01:13219] *** reported by process [203161601,0]
[node01:13219] *** on communicator MPI_COMM_WORLD
[node01:13219] *** MPI_ERR_TYPE: invalid datatype
[node01:13219] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
[node01:13219] *** and potentially your MPI job)
Does my overall strategy make sense? If so, can you help me figure out what I'm doing wrong? Let me know if you need more details; I've omitted some code to reduce clutter.