I'm streaming a Kafka topic into Spark, but I'm running into a problem importing KafkaUtils:
import sys

from pyspark import SparkContext
from pyspark import SparkConf
from pyspark.streaming import StreamingContext
from pyspark.streaming.kafka import KafkaUtils  # this is the import that fails

if __name__ == "__main__":
    sc = SparkContext(appName="WordCount")
    ssc = StreamingContext(sc, 2)  # 2-second batch interval
    brokers, topic = sys.argv[1:]  # e.g. localhost:9092 Jim_Topic
    # Direct stream of (key, value) records read from the given topic
    kvs = KafkaUtils.createDirectStream(ssc, [topic], {"metadata.broker.list": brokers})
    lines = kvs.map(lambda x: x[1])  # keep only the message value
    counts = lines.flatMap(lambda line: line.split(" ")) \
        .map(lambda word: (word, 1)) \
        .reduceByKey(lambda a, b: a + b)
    counts.pprint()
    ssc.start()
    ssc.awaitTermination()
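For reference, here is a plain-Python sketch of what the flatMap / map / reduceByKey chain computes for a single batch. It does not use Spark at all; `count_words` is a hypothetical helper introduced only for illustration, and the sample lines are made up.

```python
from collections import Counter


def count_words(lines):
    """Count words across an iterable of text lines, splitting on single spaces."""
    counts = Counter()
    for line in lines:
        # flatMap(line.split(" ")) followed by map(word -> (word, 1))
        for word in line.split(" "):
            # reduceByKey(a + b): sum the 1s per distinct word
            counts[word] += 1
    return dict(counts)


print(count_words(["a b a", "b c"]))  # {'a': 2, 'b': 2, 'c': 1}
```

In the streaming job the same counting is applied independently to every 2-second batch, which is why pprint() shows fresh counts each interval rather than a running total.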
Cannot find reference 'kafka' in '__init__.py'
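One way to narrow this down is to check whether the installed pyspark ships that module at all. The snippet below is only a sketch: `module_available` is a hypothetical helper, and it relies on nothing beyond the standard library's importlib.

```python
import importlib.util


def module_available(name):
    """Return True if the module `name` can be located in this environment.

    Parent packages are imported as part of the lookup; the module itself is not.
    """
    try:
        return importlib.util.find_spec(name) is not None
    except ModuleNotFoundError:
        # A parent package (for example pyspark itself) is not installed.
        return False


print(module_available("pyspark.streaming.kafka"))
```

If this prints False while `pyspark.streaming` is available, the import fails because that pyspark build simply has no `kafka` submodule, not because of a project or IDE setting.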
The package I'm currently using is org.apache.spark:spark-sql-kafka-0-10_2.12:3.0.0-preview2
Change it to the Spark Streaming library org.apache.spark:spark-streaming-kafka-0-8_2.11:3.0.0-preview2
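As far as I know, the pyspark.streaming.kafka module and the Kafka 0.8 integration were dropped in Spark 3.0, so the switch above may only help on a Spark 2.x installation. If you stay on 3.x, the spark-sql-kafka-0-10 package from the question is intended for Structured Streaming instead. Below is a minimal sketch of the same word count under that assumption; `kafka_source_options` and `run_word_count` are hypothetical names, and the broker address and topic are whatever you pass on the command line.

```python
import sys


def kafka_source_options(brokers, topic):
    """Build the options for Spark's Kafka source (hypothetical helper)."""
    return {"kafka.bootstrap.servers": brokers, "subscribe": topic}


def run_word_count(brokers, topic):
    # Imported lazily so the helper above can be used without pyspark installed.
    from pyspark.sql import SparkSession
    from pyspark.sql.functions import col, explode, split

    spark = SparkSession.builder.appName("WordCount").getOrCreate()

    # Each Kafka record arrives with a binary `value` column; cast it to text.
    lines = (
        spark.readStream.format("kafka")
        .options(**kafka_source_options(brokers, topic))
        .load()
        .selectExpr("CAST(value AS STRING) AS line")
    )

    # One row per word, then a running count per distinct word.
    words = lines.select(explode(split(col("line"), " ")).alias("word"))
    counts = words.groupBy("word").count()

    # "complete" mode reprints the full aggregate table on every trigger.
    query = counts.writeStream.outputMode("complete").format("console").start()
    query.awaitTermination()


if __name__ == "__main__" and len(sys.argv) == 3:
    run_word_count(sys.argv[1], sys.argv[2])
```

This would be submitted with something like `spark-submit --packages org.apache.spark:spark-sql-kafka-0-10_2.12:3.0.0-preview2 wordcount.py localhost:9092 Jim_Topic`, where the script name and broker address are placeholders. Unlike the DStream version, the counts here accumulate over the life of the query rather than resetting every batch.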