我正在尝试从桌面 Eclipse 中运行的 Java 程序连接到远程 HDFS。我能够连接。但在尝试读取数据时出现此异常:
原因:org.apache.hadoop.ipc.RpcException:RPC 响应超出最大数据量
有人可以帮忙解决这个问题吗?
我有一个非常基本的代码用于读取测试数据。错误来自 hdfs.open();
FileSystem hdfs =null;
String uriPath = "hdfs://" + Constants.HOST + ":" + Constants.PORT+ "/test/hello_world.txt";
String hadoopBase ="hdfs://" + Constants.HOST + ":" + Constants.PORT;
Configuration conf = new Configuration();
conf.set("fs.default.name", hadoopBase);
URI uri;
InputStream inputStream = null;
try {
uri = new URI(uriPath);
hdfs = FileSystem.get(uri, conf);
Path path = new Path(uri);
inputStream = hdfs.open(path);
IOUtils.copyBytes(inputStream, System.out, 4096, false);
} catch (URISyntaxException e) {
// TODO Auto-generated catch block
e.printStackTrace();
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
} finally {
try {
hdfs.close();
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
IOUtils.closeStream(inputStream);
}
这是完整的异常:
java.io.IOException: Failed on local exception: org.apache.hadoop.ipc.RpcException: RPC response exceeds maximum data length;
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:785)
at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1485)
at org.apache.hadoop.ipc.Client.call(Client.java:1427)
at org.apache.hadoop.ipc.Client.call(Client.java:1337)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:227)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
at com.sun.proxy.$Proxy10.getBlockLocations(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getBlockLocations(ClientNamenodeProtocolTranslatorPB.java:255)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:398)
at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:163)
at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:155)
at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:335)
at com.sun.proxy.$Proxy11.getBlockLocations(Unknown Source)
at org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:826)
at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:815)
at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:804)
at org.apache.hadoop.hdfs.DFSInputStream.fetchLocatedBlocksAndGetLastBlockLength(DFSInputStream.java:319)
at org.apache.hadoop.hdfs.DFSInputStream.openInfo(DFSInputStream.java:281)
at org.apache.hadoop.hdfs.DFSInputStream.<init>(DFSInputStream.java:270)
at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:1115)
at org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:325)
at org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:321)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:333)
at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:786)
at DataUtil.readData(DataUtil.java:29)
at main(Main.java:24)
Caused by: org.apache.hadoop.ipc.RpcException: RPC response exceeds maximum data length
at org.apache.hadoop.ipc.Client$IpcStreams.readResponse(Client.java:1800)
at org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse(Client.java:1155)
at org.apache.hadoop.ipc.Client$Connection.run(Client.java:1052)
检查您的 core-site.xml :
<property>
<name>fs.default.name</name>
<value>hdfs://host:port</value>
</property>
此端口可以是 9000 或 8020。 确保您在代码或命令中使用相同的端口
尝试这个解决方案: 将此配置添加到 hdfs-site.xml
<property>
<name>ipc.maximum.data.length</name>
<value>134217728</value>
</property>
首先需要检查活动namenode的namenode.log中的真实响应数据长度。 消息必须类似于:
WARN org.apache.hadoop.ipc.Server: Large response size 786010791 for call Call#3
知道响应数据长度后,可以根据实际大小更改参数 ipc.maximum.data.length。
顺便说一句,问题可能出在客户端,就像我的情况一样。因此只需将参数添加到客户端 core-site.xml 或直接添加到命令中即可。 例如:
hadoop distcp -D ipc.maximum.response.length=1073741824 ...