How do I correctly access Hadoop 3.2.1 with the Java API?


I installed Hadoop 3.2.1 in pseudo-distributed mode on a remote CentOS server and checked the firewall configuration; ports 9000 and 9800-9900 are open. My local environment is Windows with no Hadoop running, yet I still had to install Hadoop locally, otherwise the application throws an exception saying HADOOP_HOME cannot be found, which also confuses me.
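(For context: on Windows the Hadoop client looks for winutils.exe under %HADOOP_HOME%\bin, which is where that exception comes from. A minimal sketch of the usual workaround, assuming a Hadoop distribution with winutils is unpacked at the path below, which is hypothetical:)

    // Point the client at a local Hadoop unpack so it can find bin\winutils.exe.
    // The path is an assumption; adjust it to wherever Hadoop is unpacked.
    System.setProperty("hadoop.home.dir", "C:\\hadoop-3.2.1");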

My Maven dependencies are as follows:

            <dependency>
                <groupId>org.apache.hadoop</groupId>
                <artifactId>hadoop-hdfs-client</artifactId>
                <version>3.2.0</version>
                <scope>provided</scope>
            </dependency>
            <dependency>
                <groupId>org.apache.hadoop</groupId>
                <artifactId>hadoop-hdfs</artifactId>
                <version>3.2.0</version>
            </dependency>
            <dependency>
                <groupId>org.apache.hadoop</groupId>
                <artifactId>hadoop-common</artifactId>
                <version>3.2.0</version>
            </dependency>

My test is as follows:

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.junit.Test;

    @Test
    public void testHDFSRepository() throws IOException {
        Configuration conf = new Configuration();
        conf.set("fs.hdfs.impl", "org.apache.hadoop.hdfs.DistributedFileSystem");
        conf.set("fs.defaultFS", "hdfs://remoteip:9000");
        FileSystem fs = FileSystem.get(conf);
        fs.copyFromLocalFile(new Path("C:\\Users\\admin\\Desktop\\111.txt"), new Path("/demo/111.txt"));
    }

It then throws the following exception, and a 0 B file is uploaded to Hadoop:

org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /demo/111.txt could only be written to 0 of the 1 minReplication nodes. There are 1 datanode(s) running and 1 node(s) are excluded in this operation.
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:2219)
    at org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.chooseTargetForNewBlock(FSDirWriteFileOp.java:294)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2789)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:892)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:574)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:528)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1070)
    at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:999)
    at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:927)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2915)

The NameNode log reads as follows:

2020-02-18 19:41:09,526 WARN org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to place enough replicas, still in need of 1 to reach 1 (unavailableStorages=[DISK], storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, newBlock=true) All required storage types are unavailable:  unavailableStorages=[DISK], storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}
2020-02-18 19:41:09,526 INFO org.apache.hadoop.ipc.Server: IPC Server handler 9 on default port 9000, call Call#4 Retry#0 org.apache.hadoop.hdfs.protocol.ClientProtocol.addBlock from xx.xx.xx.xx:13049
java.io.IOException: File /demo/111.txt could only be written to 0 of the 1 minReplication nodes. There are 1 datanode(s) running and 1 node(s) are excluded in this operation.
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:2219)
    at org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.chooseTargetForNewBlock(FSDirWriteFileOp.java:294)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2789)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:892)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:574)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:528)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1070)
    at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:999)
    at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:927)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2915)

On the other hand, I can successfully use bin/hdfs dfs -put on the remote server itself.

Looking at the error message, I really do have only one DataNode. Should I add more DataNodes?

java hadoop
1 Answer

Your DataNode process is not healthy:

"1 node(s) are excluded in this operation."

You need to check its logs or the NameNode UI to determine the problem.
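A common cause of this symptom with a remote pseudo-distributed cluster (an assumption here, not confirmed by the logs above) is that the NameNode hands the client the DataNode's internal IP, the Windows client cannot connect to it, and that DataNode ends up excluded from the write pipeline. A minimal client-side sketch, assuming the DataNode's hostname resolves from your machine and its transfer port (9866 by default in Hadoop 3) is reachable:

    Configuration conf = new Configuration();
    conf.set("fs.defaultFS", "hdfs://remoteip:9000");
    // Ask the NameNode to return DataNode hostnames instead of internal IPs,
    // so the client dials an address it can actually resolve and reach.
    conf.setBoolean("dfs.client.use.datanode.hostname", true);
    FileSystem fs = FileSystem.get(conf);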

"I can successfully use bin/hdfs dfs -put on the remote server"

Then download the CLI tools locally and configure them against the remote server.
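As a sketch of that setup (the address is an assumption; substitute your server's), the local client's etc/hadoop/core-site.xml points fs.defaultFS at the remote NameNode, after which hdfs dfs -put runs from Windows the same way it does on the server:

    <!-- etc/hadoop/core-site.xml on the local Windows machine (sketch) -->
    <configuration>
        <property>
            <name>fs.defaultFS</name>
            <value>hdfs://remoteip:9000</value>
        </property>
    </configuration>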
