How do I query a WKT column in Parquet with Drill?


I have a text WKT column of geospatial data in a Parquet file that I want to query in Apache Drill. I am running Drill version 1.21.1.

The Parquet file has this layout (output from parquet-tools):

...
############ Column(wkt) ############
name: wkt
path: wkt
max_definition_level: 1
max_repetition_level: 0
physical_type: BYTE_ARRAY
logical_type: String
converted_type (legacy): UTF8
compression: ZSTD (space_saved: 66%)
...

I get this error when using ST_AsGeoJSON:

apache drill> select ST_AsGeoJSON(wkt) from dfs.`/Users/matth/projects/parquet` limit 1;
Error: SYSTEM ERROR: GeometryException: invalid shape type

So I printed the WKT and then tried passing it in as a string literal:

apache drill> select wkt from dfs.`/Users/matth/projects/ll/parquet` limit 1;
+----------------------------------------------------------------------------------+
|                                       wkt                                        |
+----------------------------------------------------------------------------------+
| POLYGON((-101.70891499999999 41.1128509,-101.7109674 41.112846499999996,-101.7109632 41.112755899999996,-101.70867039999999 41.112760699999995,-101.7086707 41.1128514,-101.70891499999999 41.1128509)) |
+----------------------------------------------------------------------------------+

The geometry is valid in Postgres:

select ST_IsValid(ST_GeomFromText('POLYGON((-101.70891499999999 41.1128509,-101.7109674 41.112846499999996,-101.7109632 41.112755899999996,-101.70867039999999 41.112760699999995,-101.7086707 41.1128514,-101.70891499999999 41.1128509))'));

But it does not work in Drill:

apache drill> select ST_AsGeoJSON('POLYGON((-101.70891499999999 41.1128509,-101.7109674 41.112846499999996,-101.7109632 41.112755899999996,-101.70867039999999 41.112760699999995,-101.7086707 41.1128514,-101.70891499999999 41.1128509))');
Error: SYSTEM ERROR: GeometryException: invalid shape type


Please, refer to logs for more information.

The error in the logs is (I can post more; truncated here):

[Error Id: 1d5eed81-8eea-4e4a-96b3-9151a9282dc5 on siwenna:31010]
org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: GeometryException: invalid shape type


Please, refer to logs for more information.

...
Caused by: org.apache.drill.exec.work.foreman.ForemanException: Unexpected exception during fragment initialization: Error while applying rule ReduceAndSimplifyProjectRule, args [rel#10013:LogicalProject.NONE.ANY([]).[](input=RelSubset#10012,exprs=[ST_ASGEOJSON('POLYGON((-101.70891499999999 41.1128509,-101.7109674 41.112846499999996,-101.7109632 41.112755899999996,-101.70867039999999 41.112760699999995,-101.7086707 41.1128514,-101.70891499999999 41.1128509))')])]
    at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:301)
    ... 3 common frames omitted
Caused by: java.lang.RuntimeException: Error while applying rule ReduceAndSimplifyProjectRule, args [rel#10013:LogicalProject.NONE.ANY([]).[](input=RelSubset#10012,exprs=[ST_ASGEOJSON('POLYGON((-101.70891499999999 41.1128509,-101.7109674 41.112846499999996,-101.7109632 41.112755899999996,-101.70867039999999 41.112760699999995,-101.7086707 41.1128514,-101.70891499999999 41.1128509))')])]
...
Caused by: com.esri.core.geometry.GeometryException: invalid shape type
    at com.esri.core.geometry.OperatorImportFromWkbLocal.importFromWkb(OperatorImportFromWkbLocal.java:366)
    at com.esri.core.geometry.OperatorImportFromWkbLocal.executeOGC(OperatorImportFromWkbLocal.java:183)
    at com.esri.core.geometry.ogc.OGCGeometry.fromBinary(OGCGeometry.java:597)
    at org.apache.drill.exec.udfs.gis.STAsGeoJSON.eval(STAsGeoJSON.java:51)
    at org.apache.drill.exec.expr.fn.interpreter.InterpreterEvaluator.evaluateFunction(InterpreterEvaluator.java:149)
    at org.apache.drill.exec.expr.fn.interpreter.InterpreterEvaluator$EvalVisitor.visitFunctionHolderExpression(InterpreterEvaluator.java:363)
    ... 26 common frames omitted
2023-05-17 10:17:47,507 [Client-1] INFO  o.a.d.e.r.u.BlockingResultsListener - [#53] Query failed: 
org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: GeometryException: invalid shape type
...

I tried a few other things, including ST_GeoFromText, where I got this error:

apache drill> select ST_GeoFromText(wkt) from dfs.`/Users/matth/projects/parquet`;
Error: VALIDATION ERROR: From line 1, column 8 to line 1, column 26: No match found for function signature ST_GeoFromText(<ANY>)
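
One possible reading of the two errors, offered as a sketch rather than a confirmed fix: the stack trace shows STAsGeoJSON.eval calling OGCGeometry.fromBinary, which suggests ST_AsGeoJSON expects a binary geometry value rather than WKT text, and the validation error is consistent with ST_GeoFromText not being a registered function name. Drill's GIS functions include ST_GeomFromText (spelled "Geom", not "Geo"), which parses WKT into a binary geometry. Assuming that function is available in this 1.21.1 install, wrapping the column may work:

```sql
-- Assumption: ST_GeomFromText is available in this Drill build.
-- It converts the WKT text column into the binary geometry form
-- that ST_AsGeoJSON appears to expect (per the fromBinary call in the trace).
SELECT ST_AsGeoJSON(ST_GeomFromText(wkt)) AS geojson
FROM dfs.`/Users/matth/projects/parquet`
LIMIT 1;
```

The same wrapping would apply to the string-literal test: pass the POLYGON text to ST_GeomFromText first, then hand the result to ST_AsGeoJSON.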