Kafka Connector for Amazon S3
This connector copies Kafka messages into Amazon S3 object storage.
Quick Start:
Follow steps 1 through 5 from the 'Quick Start' section of Kafka's documentation (http://kafka.apache.org/documentation.html#quickstart)
To summarize:
- Download & install
tar -xzf kafka_2.11-0.9.0.0.tgz
cd kafka_2.11-0.9.0.0
- Start Zookeeper
bin/zookeeper-server-start.sh config/zookeeper.properties
- Open a new Terminal window & start the server
bin/kafka-server-start.sh config/server.properties
- Open a new Terminal window & create a topic
bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic test
- In the same Terminal window, start the producer & send a few messages
bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test
This is a message
This is another message
- Open a new Terminal window & confirm that the messages were received by the Kafka server.
bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic test --from-beginning
- Get the source code for this project in a new Terminal window.
git clone https://github.com/dilteam/s3-kafka-connector
cd s3-kafka-connector
- Open connect-s3-sink.properties & add the S3-related properties.
vi src/main/resources/connect-s3-sink.properties
Set these properties:
ACCESS_KEY=
SECRET_KEY=
BUCKET_NAME=
FOLDER_NAME=
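For reference, a sink-connector properties file normally also carries the standard Kafka Connect keys (name, connector.class, tasks.max, topics) alongside the S3 settings above. A sketch of what the filled-in file might look like; the connector.class value is a placeholder, not this project's actual class name, so take it from the source:

```properties
# Standard Kafka Connect sink keys (key names are Kafka Connect's own);
# the connector.class value below is a placeholder -- use the fully
# qualified class name from this project's source.
name=s3-sink
connector.class=<this project's S3 sink connector class>
tasks.max=1
topics=test

# S3 settings from this README (fill in your own values)
ACCESS_KEY=<your AWS access key>
SECRET_KEY=<your AWS secret key>
BUCKET_NAME=<your S3 bucket>
FOLDER_NAME=<target folder/prefix>
```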
- Compile. (You should have Maven installed.)
mvn clean package
- Start the S3 Kafka Connector.
bin/startConnector.sh
- Now enter messages in the Producer window. They will show up on S3 at s3://BUCKET_NAME/FOLDER_NAME/
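This README does not document how messages are laid out under s3://BUCKET_NAME/FOLDER_NAME/. A common convention for Kafka sink connectors is to name objects by topic, partition, and offset; the sketch below shows one such scheme. The `objectKey` helper and its format are illustrative assumptions, not this project's code:

```java
// Sketch of a possible S3 object-key scheme for a Kafka sink connector.
// The topic/partition/offset layout is an assumption for illustration,
// not this project's documented behavior.
public class S3KeyScheme {

    // Builds a key like: FOLDER_NAME/test/partition-0/offset-00000000000000000042
    // Zero-padding the offset keeps keys sortable in S3 listings.
    static String objectKey(String folder, String topic, int partition, long offset) {
        return String.format("%s/%s/partition-%d/offset-%020d",
                folder, topic, partition, offset);
    }

    public static void main(String[] args) {
        System.out.println(objectKey("FOLDER_NAME", "test", 0, 42));
    }
}
```

Keying objects this way also makes writes idempotent: re-delivering the same record overwrites the same object instead of creating a duplicate.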
TODO:
- Allow users to partition data as per their needs.
- Test performance. (Would we get S3 SlowDown (503) errors?)
- Add more S3 configuration parameters.
- and more...