How to create a table with the Spark Cassandra Connector?


I recently started working with the Spark Cassandra Connector. So far I have created my tables by hand and was then able to save data to them. Here is a simplified snippet based on the documentation:

-- Table created manually via CQL:
CREATE TABLE test.words (word text PRIMARY KEY, count int);

// Data then saved from Spark:
val collection = sc.parallelize(Seq(("cat", 30), ("fox", 40)))
collection.saveToCassandra("test", "words", SomeColumns("word", "count"))

Is there a way to create the table programmatically, inferring the schema from a case class, without writing the raw CQL query myself?

scala apache-spark cassandra spark-cassandra-connector
1 Answer

Yes, you can use saveAsCassandraTable or saveAsCassandraTableEx, as described in the documentation. The first function creates the table automatically from your data (note that it will use a single column as the partition key). The second lets you customize the schema by specifying the partition key, clustering columns, and so on, like this (code from the documentation):

import com.datastax.spark.connector._
import com.datastax.spark.connector.cql.{ColumnDef, TableDef, PartitionKeyColumn, ClusteringColumn, RegularColumn}
import com.datastax.spark.connector.types.{UUIDType, DoubleType, IntType}

// Define each column with its name, role, and Cassandra type
val p1Col = new ColumnDef("col1", PartitionKeyColumn, UUIDType)
val c1Col = new ColumnDef("col2", ClusteringColumn(0), UUIDType)
val c2Col = new ColumnDef("col3", ClusteringColumn(1), DoubleType)
val rCol  = new ColumnDef("col4", RegularColumn, IntType)

// Create table definition: keyspace, table name, partition key columns,
// clustering columns, regular columns
val table = TableDef("test", "words", Seq(p1Col), Seq(c1Col, c2Col), Seq(rCol))

// Map the RDD into a custom data structure (outData is a case class
// whose fields match the columns above) and create the table
val rddOut = rdd.map(s => outData(s._1, s._2(0), s._2(1), s._3))
rddOut.saveAsCassandraTableEx(table, SomeColumns("col1", "col2", "col3", "col4"))
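For the simpler case where the inferred schema is acceptable, saveAsCassandraTable derives the column names and types from the RDD's element type alone. A minimal sketch, assuming a SparkContext `sc` with the connector configured; the WordCount case class and the table name "words2" are illustrative, not from the original post:

import com.datastax.spark.connector._

// Hypothetical case class; column names are derived from the field names
case class WordCount(word: String, count: Int)

val rdd = sc.parallelize(Seq(WordCount("cat", 30), WordCount("fox", 40)))

// Creates test.words2 automatically (first column becomes the partition key),
// then writes the rows
rdd.saveAsCassandraTable("test", "words2")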