I'm new to Cassandra and I'm trying to save a Spark DataFrame to a Cassandra database. Creating the table throws an exception: "SyntaxException: no viable alternative at input".
import com.datastax.spark.connector.cql.CassandraConnector

val sparkContext = spark.sparkContext
//Set the log level
sparkContext.setLogLevel("WARN")
//Connect Spark to Cassandra and execute CQL statements from Spark applications
val connector = CassandraConnector(sparkContext.getConf)
connector.withSessionDo(session =>
{
session.execute("DROP KEYSPACE IF EXISTS my_keyspace")
session.execute("CREATE KEYSPACE my_keyspace WITH replication = {'class':'SimpleStrategy', 'replication_factor':1}")
session.execute("USE my_keyspace")
session.execute("CREATE TABLE mytable('Inbound_Order_No' varchar,'Material' varchar,'Container_net_weight' double,'Shipping_Line' varchar,'Container_No' varchar,'Month' int,'Day' int,'Year' int,'Job_Run_Date' timestamp, PRIMARY KEY(Inbound_Order_No,Container_No))")
df.write
.format("org.apache.spark.sql.cassandra")
.mode("overwrite")
.option("confirm.truncate", "true")
.option("spark.cassandra.connection.host", "localhost")
.option("spark.cassandra.connection.port", "9042")
.option("keyspace", "my_keyspace")
.option("table", "mytable")
.save()
}
)
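I can't track down the error, so I'm asking for help. Note: I'm doing this on Windows, and everything is set up locally. I've also shared my Spark code, so if you spot any other mistakes, please point them out.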
session.execute("CREATE TABLE mytable(\"Inbound_Order_No\" varchar,\"Material\" varchar,\"Container_net_weight\" double,\"Shipping_Line\" varchar,\"Container_No\" varchar,\"Month\" int,\"Day\" int,\"Year\" int,\"Job_Run_Date\" timestamp, PRIMARY KEY(\"Inbound_Order_No\",\"Container_No\"))")
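Use double quotes, not single quotes, for case-sensitive column names. In CQL, single quotes delimit string literals, so the parser rejects them where it expects a column name.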
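If you would rather not hand-escape every quote inside the Scala string, the statement can be assembled from a list of columns. This is only a sketch: `CqlQuoting`, `quoteIdent`, and `createTable` are hypothetical helper names, not part of the Cassandra driver or the Spark connector. It relies on the CQL rule that a double quote inside a quoted identifier is escaped by doubling it.

```scala
// Hypothetical helper for illustration; not part of any Cassandra or Spark API.
object CqlQuoting {
  // Wrap an identifier in double quotes, doubling any embedded double quote.
  def quoteIdent(name: String): String =
    "\"" + name.replace("\"", "\"\"") + "\""

  // Build a CREATE TABLE statement from (column name, CQL type) pairs
  // and the list of primary key columns.
  def createTable(table: String,
                  columns: Seq[(String, String)],
                  primaryKey: Seq[String]): String = {
    val cols = columns.map { case (name, cqlType) => quoteIdent(name) + " " + cqlType }
    val pk = primaryKey.map(quoteIdent).mkString(", ")
    "CREATE TABLE " + quoteIdent(table) + " (" + cols.mkString(", ") + ", PRIMARY KEY (" + pk + "))"
  }
}
```

For example, `CqlQuoting.createTable("mytable", Seq("Inbound_Order_No" -> "varchar", "Container_No" -> "varchar"), Seq("Inbound_Order_No", "Container_No"))` returns a string that can be passed to `session.execute`. Quoting an all-lowercase table name such as `mytable` gives the same table as leaving it unquoted.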
session.execute("CREATE TABLE mytable(Inbound_Order_No varchar,Material varchar,Container_net_weight double,Shipping_Line varchar,Container_No varchar,Month int,Day int,Year int,Job_Run_Date timestamp, PRIMARY KEY(Inbound_Order_No,Container_No))")
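Use the query above if lowercase column names are acceptable. Cassandra treats unquoted identifiers as case-insensitive and stores them in lowercase; only double-quoted identifiers keep their case.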
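The folding rule can be illustrated with a small sketch. `CqlCase`, `storedName`, and `lowerCased` are hypothetical names that only mimic the rule for illustration; they do not call Cassandra. One assumption worth checking against your connector version: the Spark Cassandra connector may match DataFrame column names to table columns case-sensitively, in which case the DataFrame columns would need to be lowercased before writing to a table created with unquoted names.

```scala
import java.util.Locale

// Hypothetical illustration of CQL identifier case folding; does not talk to Cassandra.
object CqlCase {
  // Name Cassandra would store for an identifier as written in a CQL statement:
  // quoted identifiers keep their case, unquoted ones are folded to lowercase.
  def storedName(identifier: String): String =
    if (identifier.length >= 2 && identifier.startsWith("\"") && identifier.endsWith("\""))
      identifier.substring(1, identifier.length - 1).replace("\"\"", "\"")
    else
      identifier.toLowerCase(Locale.ROOT)

  // Lowercase a list of column names, e.g. a DataFrame's columns.
  def lowerCased(columns: Seq[String]): Seq[String] =
    columns.map(_.toLowerCase(Locale.ROOT))
}
```

With Spark, the lowercased names could be applied with `df.toDF(CqlCase.lowerCased(df.columns.toSeq): _*)` before calling `df.write`.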