Bill Bejeck

author
since Oct 11, 2006

Recent posts by Bill Bejeck

Hi Noorul,

That's a great question.  While I have experience working with Apache Spark, I don't have any experience working with securing an Apache Spark cluster, so I can't really answer that question.

Thanks,
Bill
Hi Noorul,

I'm sure there is some limit to the number of processors you can add, but I don't know what that number is.
As you add more processors, it will take more time for a record to work through the topology.

Cheers,
Bill
Hi Taleh,

Other than helping with your general programming and Java skills, this book really isn't suited to studying for a certification.

Cheers,
Bill
Hi Mauricio,

KSQL is a way to use SQL to run continuous queries over Kafka.  Since KSQL uses Kafka Streams under the covers, when you create a KSQL "Table" you are using a KTable.  

For more resources: Chapter 5 is devoted solely to the KTable, and for more information on KSQL you can look at https://docs.confluent.io/current/ksql/docs/index.html.

Yes, you can create a KTable to read from a source topic.
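If it helps to see the idea in isolation: a KTable is essentially a changelog stream compacted down to the latest value per key. Here's a toy plain-Java sketch of that stream/table duality (no Kafka dependencies; the keys and values are made up for illustration):

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class KTableSketch {
    public static void main(String[] args) {
        // A stream of (key, value) updates, like records arriving on a topic
        List<Map.Entry<String, String>> changelog = List.of(
            Map.entry("alice", "NY"),
            Map.entry("bob", "SF"),
            Map.entry("alice", "LA")   // a later update for the same key
        );
        // The "table" view keeps only the latest value per key
        Map<String, String> table = new LinkedHashMap<>();
        for (Map.Entry<String, String> record : changelog) {
            table.put(record.getKey(), record.getValue());
        }
        System.out.println(table); // {alice=LA, bob=SF}
    }
}
```

That's all a KTable is conceptually; Kafka Streams just maintains this view for you, backed by a state store, as records flow in from the source topic.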

HTH,
Bill
Hi Noorul,

If I understand your question, when using more than 1 partition in a topic, records are read in-order in a partition, but the order of records is not guaranteed across partitions.  

You don't need to worry about using offsets in a multi-node Kafka cluster; the offsets are only used by the consumer when reconnecting so it can pick up where it left off.
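To illustrate why per-partition ordering is usually enough: records with the same key always land in the same partition, so updates for any given key stay in order. A toy sketch of key-based partitioning (illustrative only; Kafka's real default partitioner uses a murmur2 hash of the key bytes, not Java's hashCode):

```java
public class PartitionSketch {
    // Simplified stand-in for Kafka's default partitioner
    static int partitionFor(String key, int numPartitions) {
        return Math.abs(key.hashCode() % numPartitions);
    }

    public static void main(String[] args) {
        int numPartitions = 3;
        // The same key always maps to the same partition, so all updates
        // for "order-42" are read back in the order they were written,
        // even though the topic has 3 partitions.
        boolean sameKeySamePartition =
            partitionFor("order-42", numPartitions)
                == partitionFor("order-42", numPartitions);
        System.out.println(sameKeySamePartition); // true
    }
}
```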

HTH,
Bill
Hi Timur,

The DSL makes it easier to write Kafka Streams applications and handles pretty much all of the low-level details of building a topology.  However, the DSL is "opinionated" in that you can only create what the DSL offers you.  With the Processor API you have to do all the low-level work of putting your topology together yourself, but you have no restrictions on what you can build.

Overall my recommendation is to start with the DSL and if you find you need some functionality the DSL doesn't offer, then use the Processor API.
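As a toy analogy in plain Java (no Kafka dependencies): the DSL lets you declare the steps and handles the composition for you, while the Processor API has you wire the nodes together yourself. Both end up building the same topology:

```java
import java.util.List;
import java.util.function.Function;

public class TopologySketch {
    public static void main(String[] args) {
        // "DSL style": declare the steps; composition is handled for you
        Function<String, String> dsl =
            ((Function<String, String>) String::trim)
                .andThen(String::toUpperCase);

        // "Processor API style": wire the processing nodes yourself
        List<Function<String, String>> nodes =
            List.of(String::trim, String::toUpperCase);
        String record = "  hello  ";
        for (Function<String, String> node : nodes) {
            record = node.apply(record);
        }

        // Both pipelines produce the same result
        System.out.println(dsl.apply("  hello  ").equals(record)); // true
    }
}
```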

Cheers,
Bill
Hi Germán,

The book mostly covers the "at least once" semantics.  I have included an appendix covering "at most once" semantics though.

Cheers,
Bill
Hi Will,

They are similar technologies, but I have never used Storm, so I can't really give you a good answer to that.

Kafka provides real-time streaming capabilities, while Storm will read from Kafka to perform some operations.  

So with that brief answer in mind, I would say Kafka Streams provides seamless integration with Kafka, and you would use it in place of Storm.

What I can tell you is that Kafka continues to grow and has a rich ecosystem around it.

Thanks,
Bill
Hi R.J.,

When you say Queue, I'm going to assume you are referring to a Kafka topic.  
The process you describe is the exact way you keep track of the last record you consumed.  
I don't know what version of Kafka you are running, but you can store the committed offsets in Kafka itself.  
With a KafkaConsumer, you can specify the commit interval via configs, and after the specified time elapses, the consumer will persist the last read offset into Kafka (an internal topic named __consumer_offsets).

Then when you restart, the consumer will communicate with the Kafka broker and pick up where it left off, with no need to do a manual seek yourself.
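For reference, the relevant consumer configs look something like this (a hypothetical consumer.properties sketch; the keys are real Kafka consumer configs, the values are just illustrative):

```properties
# Consumers sharing this id track their progress as one group
group.id=my-consumer-group
# Commit the last read offset automatically...
enable.auto.commit=true
# ...every 5 seconds
auto.commit.interval.ms=5000
# Only used when no committed offset exists yet for the group
auto.offset.reset=earliest
```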

HTH,
Bill
Hi Lanny,

Good question.  

First of all, you need Kafka to run Kafka Streams.  

Right now Kafka is the de facto standard for getting data into other streaming technologies like Spark and Flink.
But with those technologies, you need a separate cluster to get up and running.  
With Kafka Streams, it's just another application you deploy. You can run it on a single machine or multiple machines.  
Best of all, you can increase or decrease the number of machines you are running on while the application is running; there's no need to stop, add a node to the cluster, and restart.
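The piece that makes this elastic is the application.id config: every instance started with the same id joins the same logical application, and the input partitions are divided among whatever instances are running, rebalancing as instances come and go. A hypothetical sketch (the keys are real Kafka Streams configs, the values are illustrative):

```properties
# Instances sharing this id split the input topic's partitions among
# themselves; start or stop an instance and the work rebalances
application.id=stock-analysis-app
bootstrap.servers=localhost:9092
```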

Thanks,
Bill
Here's one example of the topics I like to blog about: http://codingjunkie.net/guava-eventbus (Event Programming with the Guava EventBus)
I also cover other topics, right now I'm focusing on Hadoop/MapReduce.

Thanks
Well, I am a little disappointed in my score, but a passing score is a passing score! Here are my results:

The maximum possible score is 400; the minimum to pass is 320.
General Considerations (maximum = 100): 81
Documentation (maximum = 70): 70
O-O Design (maximum = 30): 30
GUI (maximum = 40): 31
Locking (maximum = 80): 44
Data store (maximum = 40): 40
Network server (maximum = 40): 40



I'm not sure why I got hit so hard on the locking section, but I still passed, so I guess it is OK. Does anyone know how, or if it is even possible, to get more detailed feedback? Thanks to all for the help with the questions answered.
Hi,

I have finished my assignment, and as I am about to submit, I wanted to do one last sanity check. My project is the UrlyBird version 1.2.1. According to the requirements, the application is started by entering a command on the command line. The requirements clarify the usage of these commands as the following:

The mode flag must be either "server", indicating the server program must run, "alone", indicating standalone mode, or left out entirely, in which case the network client and gui must run.

My interpretation of this is that to run in network mode, one would enter
java -jar runme server to start the server, and then enter
java -jar runme from another command shell to run the client/GUI. To run the application in non-network mode, one would simply execute java -jar runme alone.
Is this correct?
Thanks,
Bill
Has anyone passed the exam first time around using Java 6?
Hi,

I am just about done and ready to submit my project. In my choices.txt file I used the following format:

Issue: The design of GUI
Solution: To implement the design.....

This format seems clear to me, I wanted to get some other opinions.

Thanks,
Bill