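I am trying to create a view from a DataFrame in Glue 4.0, but I get an error: AnalysisException: Table or view not found. The tables in the Glue database are stored in Hudi format.
Code: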
import sys
from awsglue.transforms import *
from pyspark.context import SparkContext
from awsglue.context import GlueContext
from awsglue.job import Job
from pyspark.sql.functions import *
sc = SparkContext.getOrCreate()
glueContext = GlueContext(sc)
spark = glueContext.spark_session
job = Job(glueContext)
# Define the Glue Data Catalog database and table names
database_name = "hudi_db"
table4_name = 'd_person'
table4 = glueContext.create_data_frame.from_catalog(
    database=database_name,
    table_name=table4_name,
)
rows = table4.count()
distinct_rows = table4.distinct().count()
print(f"Number of rows in data frame: {rows} and distinct rows are: {distinct_rows}")
table4.createOrReplaceTempView(table4_name + '_glue_view')
custom_sql_query = """
SELECT count(*)
FROM d_person_glue_view
"""
# Execute the custom SQL query
result_df = spark.sql(custom_sql_query)
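Is any additional configuration needed? What could be causing this error?
Thank you.
I have tried the following:
Here is how to read a Hudi table into a DataFrame; it works well for me: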
spark.read.format("hudi").load(S3_basePath).createOrReplaceTempView("test")
res = spark.sql("select * from test")
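In case it is useful, the same idea can be wrapped in a small helper. This is only a minimal sketch, not a confirmed fix for the error above: the function name count_rows_via_temp_view and the S3 path in the usage comment are hypothetical, and it assumes Hudi support is already enabled for the job (for Glue 4.0 this is usually done with the --datalake-formats job parameter set to hudi). It registers the view and runs the SQL on the same Spark session, since a temp view is only visible to the session that created it.

```python
def count_rows_via_temp_view(spark, base_path, view_name):
    """Load a Hudi table from base_path, register it as a temp view,
    and return its row count using a SQL query on the same session."""
    # Assumption: the Hudi libraries are available to this Spark session.
    df = spark.read.format("hudi").load(base_path)

    # Temp views are scoped to the session that created them, so the
    # query below must run on this same `spark` object.
    df.createOrReplaceTempView(view_name)

    result = spark.sql(f"SELECT count(*) AS row_count FROM {view_name}")
    return result.collect()[0]["row_count"]


# Example call inside a Glue job, where `spark` is glueContext.spark_session
# and the S3 path is a hypothetical placeholder:
# count_rows_via_temp_view(spark, "s3://my-bucket/hudi/d_person/", "d_person_glue_view")
```

Passing the session in as an argument keeps the helper independent of how the job builds its GlueContext.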
Hope this helps!