How can I use a Kafka connector from Java code?

Problem description

Currently I am running the Kafka SpoolDir connector in standalone mode. After adding the required configuration to the properties files, I start the connector with:

kafka/bin/connect-standalone.sh connect-standalone.properties file-source.properties

Is there any way to start the connector (standalone/distributed) purely from Java code, the same way we write consumer and producer code in Java?

apache-kafka apache-kafka-connect
2 Answers
1 vote

ConnectStandalone is the Java class that this command starts, but the Connect framework is not meant to be run as an embedded service.

You can see the source code here; it starts the server and parses the configuration files.
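
In other words, the same entry point the shell script calls can be invoked from your own main method, although this is a workaround rather than a supported embedding API. A minimal sketch, assuming the connect-runtime jar (which contains org.apache.kafka.connect.cli.ConnectStandalone) is on the classpath and the two property files are in the working directory:

import org.apache.kafka.connect.cli.ConnectStandalone;

public class StartStandaloneConnect {
    public static void main(String[] args) throws Exception {
        // Same arguments as connect-standalone.sh:
        // first the worker properties, then one or more connector properties files.
        ConnectStandalone.main(new String[] {
                "connect-standalone.properties",
                "file-source.properties"
        });
    }
}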


0 votes
Connect producers and consumers: internally, Kafka Connect uses standard Java producers and consumers to communicate with Kafka, and it configures default settings for these producer and consumer instances. These settings include properties that ensure data is delivered to Kafka in order and without any data loss.
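
The exact properties Connect applies are internal and vary by version, but the intent (ordered, no-loss delivery) maps onto well-known producer settings. A minimal sketch of a plain Java producer configured in that spirit; the broker address, topic, and values below are illustrative, not Connect's actual defaults:

import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class OrderedNoLossProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

        // Settings in the spirit of "in order and without data loss" (illustrative values):
        props.put(ProducerConfig.ACKS_CONFIG, "all");                         // wait for all in-sync replicas
        props.put(ProducerConfig.MAX_IN_FLIGHT_REQUESTS_PER_CONNECTION, "1"); // keep retries from reordering
        props.put(ProducerConfig.RETRIES_CONFIG, Integer.MAX_VALUE);          // retry transient failures

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("my-topic", "key", "value"));
        }
    }
}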

Kafka Connectors are ready-to-use components, which can help us to import data from external systems into Kafka topics and export data from Kafka topics into external systems. We can use existing connector implementations for common data sources and sinks or implement our own connectors.
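
For the "implement our own connectors" case, a connector is a pair of Java classes built against the org.apache.kafka:connect-api artifact. A bare-bones sketch; the class names, version string, and config key are made up for illustration, and a real source task would return records read from the external system from poll():

import java.util.Collections;
import java.util.List;
import java.util.Map;

import org.apache.kafka.common.config.ConfigDef;
import org.apache.kafka.connect.connector.Task;
import org.apache.kafka.connect.source.SourceConnector;
import org.apache.kafka.connect.source.SourceRecord;
import org.apache.kafka.connect.source.SourceTask;

// Minimal custom source connector skeleton (names are placeholders for illustration).
public class MySourceConnector extends SourceConnector {
    public static final String TOPIC_CONFIG = "topic";
    private static final ConfigDef CONFIG_DEF = new ConfigDef()
            .define(TOPIC_CONFIG, ConfigDef.Type.STRING, ConfigDef.Importance.HIGH, "Target topic");

    private Map<String, String> props;

    @Override public String version() { return "0.0.1"; }

    @Override public void start(Map<String, String> props) { this.props = props; }

    @Override public Class<? extends Task> taskClass() { return MySourceTask.class; }

    @Override public List<Map<String, String>> taskConfigs(int maxTasks) {
        // A single task that receives the connector's configuration unchanged.
        return Collections.singletonList(props);
    }

    @Override public void stop() { }

    @Override public ConfigDef config() { return CONFIG_DEF; }

    // Minimal task: poll() is where records from the external system would be produced.
    public static class MySourceTask extends SourceTask {
        @Override public String version() { return "0.0.1"; }
        @Override public void start(Map<String, String> props) { }
        @Override public List<SourceRecord> poll() throws InterruptedException {
            return null; // no data available in this skeleton
        }
        @Override public void stop() { }
    }
}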

Deploying a connector is as simple as following these steps (an illustrative configuration is sketched after the list):

Build the JAR with mvn package.

Find the JAR in your target folder.

Create a connect property file.

Create a directory and place the JAR file in it, e.g. <path-to-confluent>/share/java/kafka-connect-<your-plugin>
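
For the SpoolDir connector from the question, a minimal sketch of what the connector property file might look like; the connector class name and the individual keys depend on the plugin version, so treat every value here as a placeholder to check against the plugin's documentation:

# file-source.properties (illustrative values)
name=spooldir-csv-source
connector.class=com.github.jcustenborder.kafka.connect.spooldir.SpoolDirCsvSourceConnector
tasks.max=1
topic=spooldir-demo
input.path=/data/incoming
finished.path=/data/finished
error.path=/data/error
input.file.pattern=.*csv
schema.generation.enabled=true

In connect-standalone.properties, plugin.path must include the directory created in the last step (e.g. plugin.path=<path-to-confluent>/share/java), after which the worker is started exactly as in the question:

kafka/bin/connect-standalone.sh connect-standalone.properties file-source.properties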

Kafka Connect is designed to have the following key properties:

Broad copying by default: Quickly define connectors that copy vast quantities of data between systems to keep configuration overhead to a minimum. The default unit of work should be an entire database, even if it is also possible to define connectors that copy individual tables.

Streaming and batch: Support copying to and from both streaming and batch-oriented systems.

Scales to the application: Scale down to a single process running one connector in development, testing or a small production environment, and scale up to an organization-wide service for copying data between a wide variety of large scale systems.

Focus on copying data only: Focus on reliable, scalable data copying; leave transformation, enrichment, and other modifications of the data up to frameworks that focus solely on that functionality. Correspondingly, data copied by Kafka Connect must integrate well with stream processing frameworks.

Parallel: Parallelism should be included in the core abstractions, providing a clear avenue for the framework to provide automatic scalability.

Accessible connector API: It must be easy to develop new connectors. The API and runtime model for implementing new connectors should make it simple to use the best library for the job and quickly get data flowing between systems. Where the framework requires support from the connector, (for example, for recovering from faults), all the tools required should be included in the Kafka Connect APIs.