We are using the Storm-Kafka integration, where a spout reads from a Kafka topic.
These are the versions of Storm, Kafka and ZooKeeper we are using:
Storm : apache-storm-0.9.2-incubating
Kafka : kafka_2.8.0-0.8.1.1
Zookeeper : zookeeper-3.4.6
I am facing the following issues at the spout.
1) Messages are marked as failed even though the average processing time is less than the max.topology.timeout value, and we aren't getting any exceptions in any of the bolts.
2) The topology finally emits to a Kafka producer, i.e. to another topic, but the messages are duplicated because of replays.
3) Consumer groups aren't working properly with the Storm-Kafka integration.
a. When we give the same group id to the Kafka consumers of two different topologies, both still read the same messages.
b. If we have 2 consumers with different consumer group ids in different topologies, it works fine when both topologies are deployed at the same time, but not when one of them is deployed after some messages have already been loaded into the topic and read by the first topology.
Kindly help me with the above points, as they are hampering the overall scope of the project and its timelines.
We never ran 0.9.2-incubating in production as it didn't pass our tests. I don't recall what problems we hit. I do know we didn't end up using it.
1. Average time isn't useful in your scenario. Your average can be below your timeout while many individual tuples still take longer than that timeout, and every one of those gets failed and replayed. What matters is per-tuple processing latency, and that is going to vary widely.
2. You need to work out your issues with #1 above. Then figure out how to make your system work with at-least-once processing. You can't have exactly-once processing in a distributed system. You can get usually-once, but that is the best you can do. Chapter 4 discusses this towards the end. Ideally, you want a system where duplicating a message doesn't matter for anything other than wasted processing.
3. That isn't something I'm capable of helping you hammer out via this forum. I'd suggest creating the simplest topology possible and figuring out what in your code/configuration is causing the issue.
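To make point 1 concrete, here is a rough sketch in plain Java (no Storm dependency). The latency values and the 30-second timeout are made-up numbers for illustration, not measurements from your topology; the class and method names are hypothetical too.

```java
import java.util.Arrays;

public class LatencyCheck {
    // Arithmetic mean of the observed per-tuple latencies, in milliseconds.
    static double mean(long[] latenciesMs) {
        return Arrays.stream(latenciesMs).average().orElse(0.0);
    }

    // Number of tuples whose latency is strictly above the timeout.
    static long countTimedOut(long[] latenciesMs, long timeoutMs) {
        return Arrays.stream(latenciesMs).filter(l -> l > timeoutMs).count();
    }

    public static void main(String[] args) {
        long timeoutMs = 30_000; // assumed timeout, illustration only
        // Hypothetical latencies: most tuples are fast, two are very slow.
        long[] latenciesMs = {2_000, 3_000, 1_500, 2_500, 45_000,
                              2_000, 1_000, 60_000, 2_000, 1_000};
        System.out.println("mean ms: " + mean(latenciesMs));
        System.out.println("timed out: " + countTimedOut(latenciesMs, timeoutMs));
    }
}
```

With these numbers the mean is 12 seconds, well under the timeout, yet two of the ten tuples exceed it and would be failed. Looking at the distribution of latencies (maximum, high percentiles) tells you more than the average does.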
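For point 2, one common way to make duplicates harmless is to have the final step ignore message ids it has already handled. The following is only a minimal sketch of that idea in plain Java, not Storm or Kafka API code. `DedupSink`, `publish` and the id format are hypothetical names; it assumes each message carries a stable id that survives replays (for example one derived from the source partition and offset), and it keeps ids in memory, where a real system would need a bounded or persistent store.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class DedupSink {
    // Ids already handled. In-memory only: an assumption for this sketch.
    private final Set<String> seenIds = new HashSet<>();
    // Stand-in for the downstream topic.
    private final List<String> published = new ArrayList<>();

    // Publishes the payload unless this id was seen before.
    // Returns true if published, false if skipped as a duplicate.
    boolean publish(String messageId, String payload) {
        if (!seenIds.add(messageId)) {
            return false; // replayed tuple: only wasted processing
        }
        published.add(payload);
        return true;
    }

    List<String> published() {
        return published;
    }

    public static void main(String[] args) {
        DedupSink sink = new DedupSink();
        sink.publish("p0-42", "order created");
        sink.publish("p0-42", "order created"); // replay of the same message
        sink.publish("p0-43", "order shipped");
        System.out.println(sink.published().size() + " messages published");
    }
}
```

The replayed message is dropped, so the downstream list ends up with two entries instead of three. If the downstream consumer can apply the same message twice without harm, you may not need the id check at all.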