Recent posts by Noah Gift

Great questions by everyone, including the winners!
In the example shown there are a couple of "pragmatic" approaches that could be tried first, i.e., in the spirit of using the highest-level tools first.  I would approach the problem like this:

Step 1:  Spin up an AWS Sagemaker Instance
Step 2:  Create a Jupyter Notebook
Step 3:  Clean up the data, do some clustering, and see if the clusters suggest types of transactions, then plot them (with one axis being transaction size)
Step 4:  Add the cluster labels to the original data set, predict which cluster a new transaction would be assigned to, and deploy via Sagemaker (to either production or other team members in about one line of code)

Off the top of my head, a workflow like this could be a rapid way to approach the problem; a rough sketch of Steps 3 and 4 is below.
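Here is a minimal sketch of what Steps 3 and 4 might look like in a notebook.  The file name (transactions.csv) and column names (transaction_size, num_items) are hypothetical; substitute whatever fields your data set actually has.

import pandas as pd
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans

# Step 3: load the cleaned data and cluster it
df = pd.read_csv("transactions.csv")              # hypothetical file
features = df[["transaction_size", "num_items"]]  # hypothetical columns
kmeans = KMeans(n_clusters=4, random_state=0).fit(features)

# Step 4: attach the cluster labels to the original data set and plot,
# with one axis being transaction size
df["cluster"] = kmeans.labels_
df.plot.scatter(x="transaction_size", y="num_items",
                c="cluster", colormap="viridis")
plt.show()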
This is an interesting question, and taking it from the "pragmatic" viewpoint, I would say Python.  Python is a language I actually tried to drop in favor of fancier languages, but it is just too productive.  A lot of AI programming is experimentation, and that experimental style lends itself to something like Python.

If I had to pick what I would hope the language of the future would be, it would look something like F# or Swift for AI.  Swift could be a stealth language that takes market share from Python; I'm not sure, but it is fun to use.
10-20 years out is far enough that I am not sure what will be happening, or that there will even be a concept of web development.  In the next 1-2 years, though, yes, I do think it will be very common to use AI APIs.  A great example is iOS and the CoreML framework: the iOS developer may not be the person training the Machine Learning model, but they know enough about it to use the model in their application.  My suspicion is that much of the hard stuff in AI is going to become a commodity (hyperparameter tuning, model selection, etc.), and there will be tremendous value in people who can create solutions with AI APIs.
My 2 cents on this conversation: at first I felt R and Python were the two strong choices for ML, but recently Python has essentially dominated the space.  If you want to do cutting-edge Machine Learning and Deep Learning, there is a very strong chance you will need to use Python.  The good news, if you are more familiar with Java, is that Python is a very easy language, especially if you ignore 80% of it and stick with only the parts associated with your tasks.

Also, a decent approach if you are "cloud-native" is to write the parts you must in Python, and then use Java with, say, AWS Lambda; then you get the best of both worlds.
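As a rough illustration of that split (not from the book; all names here are hypothetical), the Python side could be a small Lambda handler that does only the ML-specific work, which a Java service then invokes through the AWS SDK:

import json

def predict(record):
    # Stand-in for real model inference (e.g. a scikit-learn model
    # loaded at cold start); here just a toy scoring rule.
    return 1.0 if record.get("amount", 0) > 100 else 0.0

def handler(event, context):
    # Lambda entry point: a Java caller posts one record as JSON
    record = json.loads(event["body"])
    return {"statusCode": 200,
            "body": json.dumps({"score": predict(record)})}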
In addition to those great books, which are inspiring to me, I think the latest book I published, Pragmatic AI: An Introduction to Cloud-Based Machine Learning, is a fun and light way to make progress with AI on day 1.  It has lots of real-world projects and pointers toward approaches for building things.

A lot of free companion Colab notebooks are linked below, so you can get an idea of what you are in for:

https://github.com/noahgift/managed_ml_systems_and_iot#colab-noteboks
https://github.com/noahgift/functional_intro_to_python#safari-online-training--essential-machine-learning-and-exploratory-data-analysis-with-python-and-jupyter-notebook
The best advice most people can get on getting to the next level as a programmer is to create a solution.  If you want to do something in AI, pick something that sounds fun, like training your own DeepLens project using MXNet: https://aws.amazon.com/deeplens/.

I have a bunch of ideas for project directions you might enjoy here:

https://github.com/noahgift/functional_intro_to_python#safari-online-training--essential-machine-learning-and-exploratory-data-analysis-with-python-and-jupyter-notebook
https://github.com/noahgift/managed_ml_systems_and_iot#colab-noteboks

In a nutshell, my advice is to build things that you find personally interesting, and you will get to the next level in the shortest amount of time.
Yes, I think a lot of the examples in the book could be helpful in thinking about AI in finance.  I show how straightforward it is to use AI Natural Language APIs from cloud vendors like AWS, Azure, and GCP.  I also have some examples that show how to collect data from APIs and how to scrape data.  An easy AI application would be to look at the sentiment of news and message boards about an individual stock.

For example, is a message board full of negative sentiment?  This could mean a couple of things:

A.  The stock is going to have a tough time
B.  There is a fraud-detection angle that could alert the company to an organized effort to short the stock.

This could be coded up very quickly with a pragmatic AI approach.
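As a hedged sketch (not an example from the book), scoring message-board posts with the AWS Comprehend NLP API could look like this.  The sample posts are made up, and AWS credentials are assumed to be configured:

import boto3

comprehend = boto3.client("comprehend", region_name="us-east-1")

posts = ["This company is doomed, sell now",
         "Great earnings report, very bullish"]

for post in posts:
    # Returns a label like POSITIVE/NEGATIVE plus per-label scores
    result = comprehend.detect_sentiment(Text=post, LanguageCode="en")
    print(result["Sentiment"], "->", post)

Aggregating those sentiment labels over time, per ticker, would give the raw signal for either interpretation A or B above.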
I have talked with quite a few people in the energy sector in the last year, and yes, I think the same "pragmatic AI" approach is valid.  In energy, some of the problems I have encountered involve ways to use dynamic capacity, say, a solar system integrated into a building.  The complexity in this problem is the existing infrastructure and staff (who may be less receptive to change).  By using "off the shelf" frameworks like AWS Sagemaker or Google BigQuery ML, results can be created quickly, which gains the trust of the people involved in change management.

Additionally, there is a lot going on with edge ML right now.  You can see some developing ideas here:
https://github.com/noahgift/managed_ml_systems_and_iot

Using "off the shelf" chips that talk to high level frameworks are a great way to quickly get results.  In industries like energy, which have a huge legacy technology base, getting results quickly and making the IT part, "the easy part" is a way of limiting the risk of a project.  The opposite approach, and one I don't recommend, is to try to be on the leading edge of cutting edge technology in say, Deep Learning.  Limiting the complexity of technology in my opinion is the secret to getting goals accomplished, and this is what is pragmatic.
One particular way to handle things in a pragmatic manner would be to leverage "by the book" recommendations from a Cloud provider:  AWS, Azure, GCP.  Use their recommended tools and workflows to create a cloud-native architecture that can work with an existing banking architecture.  A good example would be to read through some of the AWS whitepapers:  https://aws.amazon.com/whitepapers/, then apply that thinking to creating AI solutions for banking.
My 2 cents on this: there is small-data ML and big-data ML.  Tools like Pandas (used in Python) are really for "small data" because they can require 5-10 times the size of the data set in RAM: http://wesmckinney.com/blog/apache-arrow-pandas-internals/.  So if you need to work with data sets of, say, 5 GB or more, there is a strong chance you will need another tool like Spark ML.  In the big-data ML space some popular tools are cloud systems: EMR (which has Spark), AWS Sagemaker, Google BigQuery (which has ML built in now), etc.
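A quick way to check which side of that line you are on is to measure the in-memory footprint once the data is loaded into Pandas (the file name here is hypothetical):

import pandas as pd

df = pd.read_csv("transactions.csv")  # hypothetical data set
mem_gb = df.memory_usage(deep=True).sum() / 1e9
print(f"In-memory size: {mem_gb:.2f} GB")
# Per the linked post, budget roughly 5-10x this figure for real
# Pandas work; past a few GB, a tool like Spark ML is the safer bet.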
Randy,

I think there will be at least a couple of flavors of AI.  There is "moonshot" AI, where people are working on projects like completely autonomous vehicles and greater-than-human intelligence.  My guess is things won't be exactly what we imagine, because guessing the future is notoriously difficult (to channel Yogi Berra).

What I do see as inevitable is "Pragmatic AI", which is intelligent, or enhanced, automation.  Automation is a multi-century trend, and the examples are too numerous to mention.  Machine Learning is mostly about predicting a value and discovering hidden patterns.  Enhanced automation continues that multi-century trend and adds the ability to do those things.  A great example is the semi-autonomous driving mode in a car.  This isn't "full moonshot AI", but it is "enhanced automation", and it makes the automation user's life much better.

Google's Gmail has been slowly introducing features like suggested replies to messages and auto-completion of entire phrases while typing.  The email isn't writing itself, but parts of a tedious task are automated.  I would put this in the pragmatic AI, or "enhanced automation", area.

I think there are incredible opportunities for huge sections of the population to be "pragmatic AI" practitioners, because they see automation possibilities that could be "enhanced" by using AI.  They don't necessarily need to know how Deep Learning works to create that solution, just as a factory owner doesn't need a PhD in Robotics.  Google is a big advocate of this approach and calls it "democratizing AI".  And I agree: this will happen, and I believe it will happen before "moonshot" AI.

So, this book is a lot about that topic with examples to get people started.

Gary,

My recommendation is to use a greedy algorithm, and I have a simulation as proof: https://github.com/noahgift/or/blob/master/tsp_greedy_random_start.py.  In this approach to solving the traveling salesman problem, I randomly choose a starting point and then always pick the shortest path to the next city.
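For readers who don't want to open the repo, here is a hedged sketch of that nearest-neighbor heuristic (with made-up coordinates, not the simulation's actual code):

import math
import random

cities = {"A": (0, 0), "B": (1, 5), "C": (4, 1), "D": (6, 6)}

def dist(a, b):
    (x1, y1), (x2, y2) = cities[a], cities[b]
    return math.hypot(x2 - x1, y2 - y1)

def greedy_tour(start=None):
    # Random starting point, then always travel to the closest
    # unvisited city (the greedy step)
    current = start or random.choice(list(cities))
    tour, remaining = [current], set(cities) - {current}
    while remaining:
        current = min(remaining, key=lambda c: dist(tour[-1], c))
        tour.append(current)
        remaining.remove(current)
    return tour

print(greedy_tour())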

For solving many real-world problems, this approach makes sense.  Step one would be to look at the provider of the APIs.  If they are leaders in the space (Amazon, Microsoft, and Google, in that order), then they are a safe choice to build solutions on top of: https://www.zdnet.com/article/google-cloud-platform-breaks-into-leader-category-in-gartners-magic-quadrant/.  Then try out the AI API or service, say, Google AutoML or Amazon Rekognition.  If it solves the problem for your needs, you're done; if not, try another vendor; and if that doesn't work, train the model yourself using a managed service like AWS Sagemaker.
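To give a feel for why "try the highest-level API first" is so cheap to attempt, here is a hedged sketch of calling Amazon Rekognition with boto3 (the image file name is made up; AWS credentials are assumed):

import boto3

rekognition = boto3.client("rekognition", region_name="us-east-1")

with open("example.jpg", "rb") as f:  # hypothetical image
    response = rekognition.detect_labels(Image={"Bytes": f.read()},
                                         MaxLabels=5)

for label in response["Labels"]:
    print(label["Name"], round(label["Confidence"], 1))

If a few lines like this solve the problem, the greedy path terminates there; if not, you drop down one level of abstraction.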

This is the way I think, and I believe 80-90% of people working in AI will be served best by the greedy approach.
Salil,

You don't need any math background to understand the book.  Not making math a central focus was another deliberate choice.

I don't cover Ensemble Learning and Deep Learning in great detail in this book, but it could be a good "philosophical" step on the journey to both of those topics.  In the Google Cloud section, I have an example of using TPUs (I think this may be the first book with such an example, since I was given alpha access to them), and some Deep Learning is done via TensorFlow on a TPU.  It is a very basic example, though.
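For scale, a "very basic" TensorFlow model of that kind is only a handful of lines.  This is not the book's actual example, just an assumed minimal sketch:

import tensorflow as tf

# Tiny binary classifier; on Google Cloud a TPUStrategy scope would
# wrap the model creation, while on a laptop this runs on CPU as-is.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation="relu", input_shape=(4,)),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
model.summary()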

Ensemble Learning is only talked about in the context of the Netflix Prize, where I mention that the winning approach, an ensemble learning method, wasn't implemented because of the complexity of putting it into production.  I think there is a lesson here: complex ML techniques may ultimately not make it into production if operationalization is not accounted for.  This is where Managed Machine Learning systems (Sagemaker, etc.) play a role: they abstract away ML operational complexity.