Unable to send data from a Kafka topic to Elasticsearch

Question (votes: 0, answers: 1)

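I am trying to build a data pipeline with MongoDB (as the source), Elasticsearch (as the sink), and Kafka in between, using the Mongo connector to get data from MongoDB into my Kafka topic. I have successfully received the data from MongoDB in my Kafka topic. Here is a sample of the data captured from MongoDB: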

{"_id": {"_data": "825E88FED8000000012B022C0100296E5A10044D2CA180FAF94580B30CFA4B3CC80E1546645F696400645E88FED793AFA61A58411B2A0004"}, "operationType": "insert", "clusterTime": {"$timestamp": {"t": 1586036440, "i": 1}}, "fullDocument": {"_id": {"$oid": "5e88fed793afa61a58411b2a"}, "name": "Lefèvre Mathis", "phoneNumber": 87640262, "phoneNumber2": 98462768, "phoneNumber3": 50591075, "email": "Lefè[email protected]", "websiteUrl": "www.LefèvreMathis.fr", "legalInformation": {"companyName": "Duval EI", "siren": 7.3887975858196E13, "nic": 28866, "siret": 7.3887975858196E13, "ape": "49.53", "tva": "FR-1173030343", "description": "Blanditiis et placeat voluptas hic et. Quae et autem inventore ut enim fugit. Nihil velit in ut magnam."}, "professionType": {"type": "Hotel", "category": "professionnel"}, "operator": {"name": "Orange"}, "address": [{"city": "Paris", "street": "Quartier Les Halles, Paris 1er Arrondissement, Paris, Île-de-France, France métropolitaine, 75001, France", "zipCode": 75001, "latitude": "48.86330665", "longitude": "2.348370623761905"}], "openingTimeSet": [{"day": "Lundi", "opening": "08:00", "closing": "18:00"}, {"day": "Mardi", "opening": "08:00", "closing": "18:00"}, {"day": "Mercredi", "opening": "08:00", "closing": "18:00"}, {"day": "Jeudi", "opening": "08:00", "closing": "18:00"}, {"day": "Vendredi", "opening": "08:00", "closing": "18:00"}, {"day": "Samedi", "opening": "08:00", "closing": "18:00"}, {"day": "Dimanche", "opening": "08:00", "closing": "18:00"}], "_class": "com.sofrecom.elasticsearch.model.Subscriber"}, "ns": {"db": "elasticsearchApp", "coll": "subscriber"}, "documentKey": {"_id": {"$oid": "5e88fed793afa61a58411b2a"}}}

The problem is that when I run the ES sink connector, I get this exception:

Caused by: org.apache.kafka.connect.errors.DataException: Converting byte[] to Kafka Connect data failed due to serialization error: 
at org.apache.kafka.connect.json.JsonConverter.toConnectData(JsonConverter.java:355)
at org.apache.kafka.connect.storage.Converter.toConnectData(Converter.java:86)
at org.apache.kafka.connect.runtime.WorkerSinkTask.lambda$convertAndTransformRecord$1(WorkerSinkTask.java:485)
at org.apache.kafka.connect.runtime.errors.RetryWithToleranceOperator.execAndRetry(RetryWithToleranceOperator.java:128)
at org.apache.kafka.connect.runtime.errors.RetryWithToleranceOperator.execAndHandleError(RetryWithToleranceOperator.java:162)
... 13 more

Caused by: org.apache.kafka.common.errors.SerializationException: java.io.CharConversionException: Invalid UTF-32 character 0x658b027b (above 0x0010ffff) at char #1, byte #7)

This is my kafka-connect configuration:

  CONNECT_BOOTSTRAP_SERVERS: kafka:9092
  CONNECT_REST_ADVERTISED_HOST_NAME: connect
  CONNECT_REST_PORT: 8083
  CONNECT_GROUP_ID: compose-connect-group
  CONNECT_CONFIG_STORAGE_TOPIC: docker-connect-configs
  CONNECT_OFFSET_STORAGE_TOPIC: docker-connect-offsets
  CONNECT_STATUS_STORAGE_TOPIC: docker-connect-status
  CONNECT_KEY_CONVERTER: org.apache.kafka.connect.json.JsonConverter
  CONNECT_VALUE_CONVERTER:  org.apache.kafka.connect.json.JsonConverter
  CONNECT_INTERNAL_KEY_CONVERTER: org.apache.kafka.connect.json.JsonConverter
  CONNECT_INTERNAL_VALUE_CONVERTER: org.apache.kafka.connect.json.JsonConverter
  CONNECT_CONFIG_STORAGE_REPLICATION_FACTOR:  1
  CONNECT_OFFSET_STORAGE_REPLICATION_FACTOR:  1
  CONNECT_STATUS_STORAGE_REPLICATION_FACTOR:  1
  CONNECT_PLUGIN_PATH: '/usr/share/java,/etc/kafka-connect/jars'
  CONNECT_CONFLUENT_TOPIC_REPLICATION_FACTOR: 1

My es-sink-connector:

{ "name": "sink", "config": { "connector.class": "io.confluent.connect.elasticsearch.ElasticsearchSinkConnector", "connection.url": "http://172.21.0.4:9200", "type.name": "subscriber", "topics": "test5.elasticsearchApp.subscriber", "key.ignore": "false","value.converter.schemas.enable": "false","schema.ignore": "true","value.converter":"org.apache.kafka.connect.json.JsonConverter" } }

And the mongodb-source-connector:

{ "name": "mongo-source", "config": { "connector.class": "com.mongodb.kafka.connect.MongoSourceConnector","tasks.max":1,"connection.uri":"mongodb://mongo1:27017,mongo2:27017","database":"elasticsearchApp","collection":"subscriber", "topic.prefix":"test15","value.converter":"org.apache.kafka.connect.storage.StringConverter"} }

When I try to use the JSON converter in the MongoDB connector instead, the payload comes back as a string when I consume from the Kafka topic:

{"schema":{"type":"string","optional":false},"payload":"{\"_id\": {\"_data\": \"825E89EA94000000012B022C0100296E5A10044D2CA180FAF94580B30CFA4B3CC80E1546645F696400645E89EA94FC56002500157F490004\"}, \"operationType\": \"insert\", \"clusterTime\": {\"$timestamp\": {\"t\": 1586096788, \"i\": 1}}, \"fullDocument\": {\"_id\": {\"$oid\": \"5e89ea94fc56002500157f49\"}, \"name\": \"Lefèvre Mathis\", \"phoneNumber\": 87640262, \"phoneNumber2\": 98462768, \"phoneNumber3\": 50591075, \"email\": \"Lefè[email protected]\", \"websiteUrl\": \"www.LefèvreMathis.fr\", \"legalInformation\": {\"companyName\": \"Duval EI\", \"siren\": 7.3887975858196E13, \"nic\": 28866, \"siret\": 7.3887975858196E13, \"ape\": \"49.53\", \"tva\": \"FR-1173030343\", \"description\": \"Blanditiis et placeat voluptas hic et. Quae et autem inventore ut enim fugit. Nihil velit in ut magnam.\"}, \"professionType\": {\"type\": \"Hotel\", \"category\": \"professionnel\"}, \"operator\": {\"name\": \"Orange\"}, \"address\": [{\"city\": \"Paris\", \"street\": \"Quartier Les Halles, Paris 1er Arrondissement, Paris, Île-de-France, France métropolitaine, 75001, France\", \"zipCode\": 75001, \"latitude\": \"48.86330665\", \"longitude\": \"2.348370623761905\"}], \"openingTimeSet\": [{\"day\": \"Lundi\", \"opening\": \"08:00\", \"closing\": \"18:00\"}, {\"day\": \"Mardi\", \"opening\": \"08:00\", \"closing\": \"18:00\"}, {\"day\": \"Mercredi\", \"opening\": \"08:00\", \"closing\": \"18:00\"}, {\"day\": \"Jeudi\", \"opening\": \"08:00\", \"closing\": \"18:00\"}, {\"day\": \"Vendredi\", \"opening\": \"08:00\", \"closing\": \"18:00\"}, {\"day\": \"Samedi\", \"opening\": \"08:00\", \"closing\": \"18:00\"}, {\"day\": \"Dimanche\", \"opening\": \"08:00\", \"closing\": \"18:00\"}], \"_class\": \"com.sofrecom.elasticsearch.model.Subscriber\"}, \"ns\": {\"db\": \"elasticsearchApp\", \"coll\": \"subscriber\"}, \"documentKey\": {\"_id\": {\"$oid\": \"5e89ea94fc56002500157f49\"}}}"}
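To make the shape of that message concrete, here is a minimal Python sketch. It is not part of the original setup: the sample record is a shortened stand-in for the output above, and the helper name `unwrap` is hypothetical. It shows that `payload` is itself a JSON-encoded string, so a consumer has to parse it a second time to reach the document fields.

```python
import json

# Shortened stand-in for the record value shown above: a schema/payload
# envelope whose payload is a JSON document serialized as a *string*.
inner = {
    "operationType": "insert",
    "fullDocument": {"name": "Lefèvre Mathis"},
    "ns": {"db": "elasticsearchApp", "coll": "subscriber"},
}
raw_value = json.dumps(
    {
        "schema": {"type": "string", "optional": False},
        "payload": json.dumps(inner, ensure_ascii=False),
    },
    ensure_ascii=False,
)


def unwrap(raw):
    """Return the document inside a schema/payload envelope (hypothetical helper)."""
    envelope = json.loads(raw)
    payload = envelope["payload"]
    # With a string schema the payload is text, so it needs a second parse.
    if isinstance(payload, str):
        return json.loads(payload)
    return payload


envelope = json.loads(raw_value)
doc = unwrap(raw_value)

print(type(envelope["payload"]).__name__)  # str
print(doc["operationType"], doc["ns"]["coll"])
```

A sink that reads only the outer envelope therefore sees a single string value, not the nested fields of `fullDocument`.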
mongodb elasticsearch apache-kafka-connect data-migration
1 Answer
0 votes
  1. If you don't want the Mongo connector to produce a string payload, don't use this option:

    "value.converter":"org.apache.kafka.connect.storage.StringConverter"
    
  2. You will need this in the sink, because the JSON on the topic contains both `schema` and `payload`:

    "value.converter.schemas.enable": "true"
    
  3. You will need to use an Elasticsearch index mapping to parse the string, because Connect won't do that for you.
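As a rough illustration of points 1 and 2, the two changes can be sketched as edits to the connector configs from the question. This is only a sketch in Python and has not been checked against a running Connect cluster; the helper names `apply_source_suggestion` and `apply_sink_suggestion` are hypothetical. Dropping the connector-level `value.converter` makes the source fall back to the worker-level converter, which is `JsonConverter` in the question's worker configuration.

```python
import copy
import json

# Connector configs copied from the question.
source = {
    "name": "mongo-source",
    "config": {
        "connector.class": "com.mongodb.kafka.connect.MongoSourceConnector",
        "tasks.max": 1,
        "connection.uri": "mongodb://mongo1:27017,mongo2:27017",
        "database": "elasticsearchApp",
        "collection": "subscriber",
        "topic.prefix": "test15",
        "value.converter": "org.apache.kafka.connect.storage.StringConverter",
    },
}
sink = {
    "name": "sink",
    "config": {
        "connector.class": "io.confluent.connect.elasticsearch.ElasticsearchSinkConnector",
        "connection.url": "http://172.21.0.4:9200",
        "type.name": "subscriber",
        "topics": "test5.elasticsearchApp.subscriber",
        "key.ignore": "false",
        "value.converter.schemas.enable": "false",
        "schema.ignore": "true",
        "value.converter": "org.apache.kafka.connect.json.JsonConverter",
    },
}


def apply_source_suggestion(connector):
    """Point 1: drop the StringConverter override (hypothetical helper)."""
    updated = copy.deepcopy(connector)
    updated["config"].pop("value.converter", None)
    return updated


def apply_sink_suggestion(connector):
    """Point 2: tell the sink's JsonConverter to expect schema + payload."""
    updated = copy.deepcopy(connector)
    updated["config"]["value.converter.schemas.enable"] = "true"
    return updated


new_source = apply_source_suggestion(source)
new_sink = apply_sink_suggestion(sink)
print(json.dumps(new_source, indent=2))
print(json.dumps(new_sink, indent=2))
```

The printed JSON documents are what would be submitted to the Connect REST API in place of the originals; the originals are left unmodified so the two versions can be compared.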

I'm not sure whether there is a bug in the Mongo connector. I've never used it, but I would think the JSON Converter should work, or at least Avro.
