I am trying to use the REPLACE WHERE clause on a Delta table in Azure Databricks. Here is the setup to reproduce the issue:
CREATE TABLE mymaintable (dt DATE, name STRING, YN string) USING delta;
INSERT INTO mymaintable VALUES ('2024-03-01', 'N1', 'Y');
INSERT INTO mymaintable VALUES ('2024-03-01', 'N2', 'N');
INSERT INTO mymaintable VALUES ('2024-03-01', 'N3', 'Y');
INSERT INTO mymaintable VALUES ('2024-03-02', 'N1', 'N');
INSERT INTO mymaintable VALUES ('2024-03-02', 'N2', 'N');
INSERT INTO mymaintable VALUES ('2024-03-02', 'N3', 'N');
INSERT INTO mymaintable VALUES ('2024-03-03', 'N1', 'Y');
INSERT INTO mymaintable VALUES ('2024-03-03', 'N2', 'Y');
INSERT INTO mymaintable VALUES ('2024-03-03', 'N3', 'Y');
CREATE TABLE myincrementaltable (dt DATE, name STRING, YN string) USING delta;
INSERT INTO myincrementaltable VALUES ('2024-03-03', 'N1', 'X');
INSERT INTO myincrementaltable VALUES ('2024-03-03', 'N2', 'Z');
INSERT INTO myincrementaltable VALUES ('2024-03-04', 'Q1', 'X');
INSERT INTO myincrementaltable VALUES ('2024-03-04', 'Q2', 'Z');
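That is the setup. Now I want to replace the matching rows in the main table with the contents of the incremental table.
This works: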
INSERT INTO mymaintable
REPLACE WHERE dt >= "2024-03-03"
TABLE myincrementaltable
But this does not:
INSERT INTO mymaintable
REPLACE WHERE dt >= (SELECT MAX(dt) from mymaintable)
TABLE myincrementaltable
It fails with the error:
AnalysisException: [TABLE_OR_VIEW_NOT_FOUND] The table or view `mymaintable` cannot be found. Verify the spelling and correctness of the schema and catalog.
Is there a way to do this?
Thanks!
I'm not sure how to do this in a single SQL statement, but I would try the following in a separate Python cell:
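# "sandbox" is the schema used in this example; adjust it to wherever your tables live.
# Read the cutoff first; for a DATE column the value comes back as a datetime.date.
max_dt = spark.sql("SELECT MAX(dt) FROM sandbox.mymaintable").collect()[0][0]
# Inline the date as a literal so that REPLACE WHERE contains no subquery.
sql_string = f"""
INSERT INTO sandbox.mymaintable
REPLACE WHERE dt >= '{max_dt}'
TABLE sandbox.myincrementaltable"""
spark.sql(sql_string)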
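If you end up doing this in more than one place, the string building could be pulled into a small helper. This is only a sketch: `build_replace_where_sql` is a name I made up, it assumes the filter column is called `dt` and is a DATE (so the value collected from Spark is a `datetime.date`), and the identifier check is just a cautious guess at what table names you will pass in. It also raises on an empty main table, where `MAX(dt)` returns NULL and there is no cutoff to compare against.

```python
import datetime
import re

# Accepts plain or dotted identifiers such as table, schema.table or catalog.schema.table.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*){0,2}")


def build_replace_where_sql(target: str, source: str, cutoff: datetime.date) -> str:
    """Build an INSERT INTO ... REPLACE WHERE statement with a literal cutoff date."""
    if cutoff is None:
        # MAX(dt) is NULL when the target table has no rows.
        raise ValueError("no cutoff date: the target table appears to be empty")
    for name in (target, source):
        if not _IDENTIFIER.fullmatch(name):
            raise ValueError(f"unexpected table name: {name!r}")
    return (
        f"INSERT INTO {target}\n"
        f"REPLACE WHERE dt >= '{cutoff.isoformat()}'\n"
        f"TABLE {source}"
    )


# Hypothetical usage in a Databricks notebook, where `spark` is already defined:
# max_dt = spark.sql("SELECT MAX(dt) FROM sandbox.mymaintable").collect()[0][0]
# spark.sql(build_replace_where_sql("sandbox.mymaintable", "sandbox.myincrementaltable", max_dt))
```

The helper only produces the statement text, so you can print it and check it before handing it to `spark.sql`.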
Give it a try. Hope it helps.