What is the use of frameworks like Apache Spark and Kafka in AI? Can machine learning be learned without knowing Apache Spark and Kafka?
What advantages does someone who knows these frameworks have over someone who only knows libraries like NumPy, Pandas, etc.?
My 2 cents: there is small-data ML and big-data ML. Tools like Pandas (used in Python) are really "small data" tools, because they can need 5-10 times the size of the data set in RAM: http://wesmckinney.com/blog/apache-arrow-pandas-internals/. So if you need to work with data sets of, say, 5 GB or more, there is a strong chance you will need another tool such as Spark ML. In the big-data ML space, some popular options are cloud systems: AWS EMR (which runs Spark), AWS SageMaker, Google BigQuery (which now has ML built in), etc.
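To see why the memory question matters in practice, here is a minimal sketch (the column names and sizes are made up for illustration) that measures how much RAM a Pandas DataFrame actually occupies. Note this only measures the resident size of the columns; intermediate operations like merges and group-bys can allocate temporary copies on top of this, which is where the 5-10x figure comes from.

```python
import numpy as np
import pandas as pd

# Hypothetical data set: one million rows, two numeric columns.
n = 1_000_000
df = pd.DataFrame({
    "user_id": np.arange(n, dtype=np.int64),   # 8 bytes per row
    "score": np.random.rand(n),                # 8 bytes per row (float64)
})

# deep=True also counts object (e.g. string) columns accurately.
bytes_in_ram = df.memory_usage(deep=True).sum()
print(f"{bytes_in_ram / 1e6:.0f} MB in RAM")  # ~16 MB of raw columns
```

Scaling this up, a 5 GB CSV can easily become 10-25 GB of working memory during a join or aggregation, which is the point at which a distributed engine like Spark ML starts to make sense.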