Apache Drill S3: No default schema selected

Question

I am trying out Apache Drill. I am new to this whole environment and am just trying to understand how Apache Drill works.

I am trying to query JSON data stored on S3 using Apache Drill. My bucket was created in US East (N. Virginia). I created a new Storage Plugin for S3 by following this link.

Here is the configuration of my new S3 Storage Plugin:

{
  "type": "file",
  "enabled": true,
  "connection": "s3a://testing-drill/",
  "config": {
    "fs.s3a.access.key": "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
    "fs.s3a.secret.key": "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
  },
  "workspaces": {
    "root": {
      "location": "/",
      "writable": false,
      "defaultInputFormat": null,
      "allowAccessOutsideWorkspace": false
    },
    "tmp": {
      "location": "/tmp",
      "writable": true,
      "defaultInputFormat": null,
      "allowAccessOutsideWorkspace": false
    }
  },
  "formats": {
    "psv": {
      "type": "text",
      "extensions": [
        "tbl"
      ],
      "delimiter": "|"
    },
    "csv": {
      "type": "text",
      "extensions": [
        "csv"
      ],
      "delimiter": ","
    },
    "tsv": {
      "type": "text",
      "extensions": [
        "tsv"
      ],
      "delimiter": "\t"
    },
    "parquet": {
      "type": "parquet"
    },
    "json": {
      "type": "json",
      "extensions": [
        "json"
      ]
    },
    "avro": {
      "type": "avro"
    },
    "sequencefile": {
      "type": "sequencefile",
      "extensions": [
        "seq"
      ]
    },
    "csvh": {
      "type": "text",
      "extensions": [
        "csvh"
      ],
      "extractHeader": true,
      "delimiter": ","
    }
  }
}

I have also configured my core-site-example.xml as follows:

<configuration>

    <property>
        <name>fs.s3a.access.key</name>
        <value>xxxxxxxxxxxxxxxxxxxx</value>
    </property>

    <property>
        <name>fs.s3a.secret.key</name>
        <value>xxxxxxxxxxxxxxxxxxxxxxxx</value>
    </property>

    <property>
        <name>fs.s3a.endpoint</name>
        <value>s3.us-east-1.amazonaws.com</value>
    </property>

</configuration>

But when I try to use/set the workspace with the following command:

USE shiv.`root`;

it gives me the following error:

Error: VALIDATION ERROR: Schema [shiv.root] is not valid with respect to either root schema or current default schema.

Current default schema:  No default schema selected

[Error Id: 6d9515c0-b90f-48aa-9dc5-0c660f1c06ca on ip-10-0-3-241.ec2.internal:31010] (state=,code=0)

If I try to run show schemas;, I get the following error:

show schemas;
Error: SYSTEM ERROR: AmazonS3Exception: Status Code: 400, AWS Service: Amazon S3, AWS Request ID: EEB438A6A0A5E667, AWS Error Code: null, AWS Error Message: Bad Request

Fragment 0:0

[Error Id: 85883537-9b4f-4057-9c90-cdaedec116a8 on ip-10-0-3-241.ec2.internal:31010] (state=,code=0)

I cannot work out the root cause of this problem.

Answer 1

I ran into a similar problem when using Apache Drill with GCS (Google Cloud Storage).

Running the query USE gcs.data produced the following error.

VALIDATION ERROR: Schema [gcs.data] is not valid with respect to either root schema or current default schema.

Current default schema:  No default schema selected

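I ran SHOW SCHEMAS, and there was no gcs.data schema.

I went ahead and created a data folder in my GCS bucket; gcs.data then appeared in SHOW SCHEMAS, and the USE gcs.data query worked.

From my limited experience with Apache Drill, my understanding is that for file-type storage, if a workspace points to a folder that does not exist, Drill throws this error.

GCS and S3 are both file-type storage, so you may be hitting the same problem.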
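To make the check described above concrete, here is a minimal sketch in Python that reads a storage plugin configuration and reports which workspaces point at folders that are not present. It only encodes my observation, not documented Drill behavior. The function name missing_workspaces and the existing_folders argument are hypothetical: the caller is assumed to have already listed the top-level folders of the bucket by some other means.

```python
import json


def missing_workspaces(plugin_config: str, existing_folders: set[str]) -> list[str]:
    """Return names of workspaces whose location is not in existing_folders.

    plugin_config is the storage plugin JSON as text. existing_folders is a
    caller-supplied set of folder paths known to exist in the bucket
    (an assumption of this sketch; it does not talk to S3 or GCS itself).
    """
    config = json.loads(plugin_config)
    missing = []
    for name, workspace in (config.get("workspaces") or {}).items():
        # Normalize "/data/" to "/data"; keep the bucket root as "/".
        location = workspace["location"].rstrip("/") or "/"
        # The bucket root always exists, so only sub-folders are checked.
        if location != "/" and location not in existing_folders:
            missing.append(name)
    return sorted(missing)


# Trimmed-down version of the GCS plugin configuration from this answer.
PLUGIN = """
{
  "type": "file",
  "connection": "gs://my-gcs-bkt",
  "workspaces": {
    "data": {"location": "/data"},
    "tmp": {"location": "/tmp"},
    "root": {"location": "/"}
  }
}
"""

# Suppose only /tmp exists in the bucket so far.
print(missing_workspaces(PLUGIN, {"/tmp"}))  # ['data']
```

Any workspace this reports is a candidate for the folder that needs to be created. In the question's configuration the tmp workspace points at /tmp, so it may be worth checking whether that folder exists in the testing-drill bucket.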

Here is my GCS storage configuration:

{
  "type": "file",
  "connection": "gs://my-gcs-bkt",
  "config": null,
  "workspaces": {
    "data": {
      "location": "/data",
      "writable": true,
      "defaultInputFormat": null,
      "allowAccessOutsideWorkspace": false
    },
    "tmp": {
      "location": "/tmp",
      "writable": true,
      "defaultInputFormat": null,
      "allowAccessOutsideWorkspace": false
    },
    "root": {
      "location": "/",
      "writable": false,
      "defaultInputFormat": null,
      "allowAccessOutsideWorkspace": false
    }
  },
  "formats": {
    "parquet": {
      "type": "parquet"
    },
    "json": {
      "type": "json",
      "extensions": [
        "json"
      ]
    },
    "tsv": {
      "type": "text",
      "extensions": [
        "tsv"
      ],
      "delimiter": "\t"
    },
    "csvh": {
      "type": "text",
      "extensions": [
        "csvh"
      ],
      "extractHeader": true,
      "delimiter": ","
    },
    "csv": {
      "type": "text",
      "extensions": [
        "csv"
      ],
      "delimiter": ","
    },
    "psv": {
      "type": "text",
      "extensions": [
        "tbl"
      ],
      "delimiter": "|"
    }
  },
  "enabled": true
}
