spark-cassandra-connector: deleteFromCassandra method in Python

Question (votes: 0, answers: 1)

I am using Spark, Cassandra, and the Spark-Cassandra-Connector in a Databricks notebook. According to the connector's documentation, rows can be deleted with 'deleteFromCassandra':
https://github.com/datastax/spark-cassandra-connector/blob/master/doc/5_saving.md
https://datastax-oss.atlassian.net/browse/SPARKC-349
Here is my Python script:

from pyspark.sql.functions import col

def read_table(tableName, keyspace, columns):
  # Load a Cassandra table as a DataFrame and keep only the requested columns
  dfData = (spark
        .read
        .format("org.apache.spark.sql.cassandra")
        .options(table = tableName, keyspace = keyspace)
        .load()
        .select(*columns))
  return dfData

emails='[email protected]'.split(",")
df = read_table(my_table, my_keyspace, ["*"]).where(col("email").isin(emails))
# This call raises the AttributeError shown below
df.rdd.deleteFromCassandra(my_keyspace, my_table)

This fails with:

AttributeError: 'RDD' object has no attribute 'deleteFromCassandra'

I noticed that all of the examples they provide are in Scala. Does that mean the 'deleteFromCassandra' function is not available in Python?

apache-spark cassandra spark-cassandra-connector
1 answer
0 votes

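This is not possible with the stock Spark Cassandra Connector, because its Python bindings only support DataFrames. It should be possible with pyspark-cassandra, however, which is also available on the Spark Packages site as --packages anguenot:pyspark-cassandra:2.4.0. Something like this: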

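# Assumption: importing pyspark_cassandra is what adds the Cassandra methods to RDD
import pyspark_cassandra

# In PySpark, rdd is a property of the DataFrame, not a method, so no parentheses
dataFrame.rdd.deleteFromCassandra(keyspace, table)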
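
Since the method only exists on the RDD when an add-on package such as pyspark-cassandra has attached it (an assumption about how that package exposes it), a small guard can turn the bare AttributeError from the question into a more informative message. The following is only a sketch; delete_rows is a hypothetical helper name, not part of Spark or of either connector, and it assumes the method accepts the keyspace and table as its first two positional arguments, as in the snippet above.

```python
def delete_rows(rdd, keyspace, table):
    """Call rdd.deleteFromCassandra(keyspace, table) if the method exists.

    delete_rows is a hypothetical helper for illustration only.
    """
    # Look the method up defensively instead of letting AttributeError escape
    delete = getattr(rdd, "deleteFromCassandra", None)
    if delete is None:
        raise RuntimeError(
            "This RDD has no deleteFromCassandra method. Check that "
            "pyspark-cassandra is installed on the cluster and has been "
            "imported in this notebook."
        )
    return delete(keyspace, table)
```

It would be called as delete_rows(df.rdd, my_keyspace, my_table), using the variable names from the question.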