Alexey Grigorev

Greenhorn
since Dec 08, 2011

Recent posts by Alexey Grigorev

There's no dependency. To be honest, I wouldn't recommend reading the Java book unless you really need Java. Java is not that popular for ML these days.

So you can go to Machine Learning Bookcamp directly.
Data science solves business problems using data. AI is just one of the tools in the data scientist's arsenal, but there are others. Machine learning is a part of AI, and it is what data scientists use quite often for solving business problems.

So, to answer your question, if you want to do data science, you should learn the basics of machine learning.

My book "Machine Learning Bookcamp" can help you with that - check it here: http://mlbookcamp.com/
Hi Paul,

There are a few points that come to mind. They are more conceptual, though, and in the end it still boils down to writing a lot of code.

First, there's a training phase in ML projects, where you get some dataset and produce a model. It's not similar to CRUD, but quite similar to traditional ETL processes.

After that, we need to apply the model.

One option is to put the model behind a web service. Often it's a service that gets a POST request and replies with predictions. Nothing unusual.
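To make the web service idea concrete, here is a minimal sketch using only the Python standard library. Everything in it is an assumption for illustration: the "model" is a hard-coded linear formula standing in for whatever the training phase produced, and the feature names, weights, and port are made up.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical stand-in for a trained model: a fixed linear formula.
# In a real project the weights would come from the training phase.
WEIGHTS = {"age_years": -500.0, "engine_hp": 100.0}
BIAS = 10000.0


def predict_price(features):
    """Return a price estimate for one car described by a dict of numbers."""
    return BIAS + sum(WEIGHTS[name] * features[name] for name in WEIGHTS)


def handle_request_body(body):
    """Turn the JSON bytes of a POST request into the JSON bytes of a reply."""
    features = json.loads(body)
    reply = {"predicted_price": predict_price(features)}
    return json.dumps(reply).encode("utf-8")


class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read exactly as many bytes as the client announced.
        length = int(self.headers.get("Content-Length", 0))
        response = handle_request_body(self.rfile.read(length))
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(response)))
        self.end_headers()
        self.wfile.write(response)


def run_server(port=9696):
    """Serve predictions until interrupted (the port is an arbitrary choice)."""
    HTTPServer(("localhost", port), PredictHandler).serve_forever()
```

Calling `run_server()` starts the service; a client then sends a POST with a body like `{"age_years": 4, "engine_hp": 120}` and gets a JSON reply with the prediction. Keeping the prediction logic in plain functions, separate from the HTTP handler, makes it easy to test without starting a server.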

There's a slight conceptual difference though: machine learning applications make predictions. These predictions may sometimes be wrong.

In CRUD, you can't be wrong: you save whatever data the user gives you and later show it back. If there are no bugs, it just works.
As a programmer, you already have the most essential skill: being able to code.

Now you just need to pick a few projects and do them. Luckily, my book will help =)

Maybe you've heard about the Microsoft bot that learned from Twitter and started posting racist, pro-Nazi tweets in less than a day? https://www.theverge.com/2016/3/24/11297050/tay-microsoft-chatbot-racist

It's an old story, but Microsoft learned to be careful the hard way. Now they are doing a lot of research to prevent that from happening again.
As it happens, I have a book that teaches the basics of ML and also Python. Maybe it'll be interesting for you: http://mlbookcamp.com/
Java is not that popular for AI/ML, so you might be better off learning some Python first. It's not that difficult: after Java, learning the basics of Python will take a couple of hours. I know that from experience, as this is how much time it took me to learn it (I was a Java dev back then).

If you really want to do it with Java, there are a couple of resources that you may find useful. For example, Mastering Java for Data Science (https://www.packtpub.com/product/mastering-java-for-data-science/9781782174271)
Hi Frank,

Yes, there's one chapter about training deep learning models with TensorFlow. In this chapter, we're predicting the category of clothes: we're determining whether we have a picture of a T-shirt, shirt, pants, and so on. This chapter is almost finished and will be released soon.

The code is already available. You can check it here: https://github.com/alexeygrigorev/mlbookcamp-code/blob/master/chapter-07-neural-nets/07-neural-nets-train.ipynb


Next, I'll work on another chapter, also about TensorFlow - it's going to be about deploying deep learning models to production.
I'd start with the basics of information retrieval: things like TF-IDF, bag of words, etc.

There's a good introduction book available for free: https://nlp.stanford.edu/IR-book/information-retrieval-book.html
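
To give a feel for what these terms mean, here is a rough sketch of bag of words and TF-IDF in plain Python. It assumes one common variant (term frequency normalized by document length, inverse document frequency as log(N / df)) and a naive whitespace tokenizer; real libraries differ in the details.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Lowercase the text, split on whitespace, and count each word."""
    return Counter(text.lower().split())


def tf_idf(documents):
    """Return one {word: weight} dict per document."""
    bags = [bag_of_words(doc) for doc in documents]
    n_docs = len(bags)
    # Document frequency: in how many documents each word appears.
    doc_freq = Counter(word for bag in bags for word in bag)
    weights = []
    for bag in bags:
        total = sum(bag.values())
        weights.append({
            word: (count / total) * math.log(n_docs / doc_freq[word])
            for word, count in bag.items()
        })
    return weights


docs = ["the cat sat", "the dog barked"]
for doc, doc_weights in zip(docs, tf_idf(docs)):
    print(doc, doc_weights)
```

A word that appears in every document (here "the") gets a weight of zero, while words specific to one document get a positive weight. That is the whole idea: rare words tell you more about what a document is about.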
I agree with Lucian: project-based learning is a great way to focus on getting practical skills without going unnecessarily deep into theory.

Kaggle is a great source of inspiration for projects. I do recommend checking it.

You can also check a repository with the projects from my book: https://github.com/alexeygrigorev/mlbookcamp-code

There are 4 example projects so far:

- predicting the price of a car
- churn prediction (determining which customers are likely to stop using the services of a company)
- credit risk scoring
- classifying the type of clothes
Do you want to predict a number? For example, the price of a car. Then it's a regression problem.

Do you want to predict a class? For example, is an email spam or not? Then it's a classification problem.
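
A tiny sketch of the difference, in plain Python. The data is made up, and the two methods (a least-squares line and a nearest-neighbor lookup) are picked only because they are short; they are not the only way to do either task.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept (regression)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    return slope, mean_y - slope * mean_x


def nearest_neighbor_label(examples, x):
    """Return the label of the closest training example (classification)."""
    closest = min(examples, key=lambda example: abs(example[0] - x))
    return closest[1]


# Regression: the target is a number (car price from car age, made-up data).
ages = [1, 2, 3, 4]
prices = [18000, 16000, 14000, 12000]
slope, intercept = fit_line(ages, prices)
predicted_price = slope * 5 + intercept

# Classification: the target is a class (spam or not, from the number
# of links in an email, made-up data).
emails = [(0, "not spam"), (1, "not spam"), (8, "spam"), (12, "spam")]
predicted_class = nearest_neighbor_label(emails, 10)

print(predicted_price, predicted_class)
```

The first model outputs a number on a continuous scale, the second picks one label from a fixed set. That is the only thing you need to look at to tell the two problem types apart.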
A typical way of detecting fraud is looking at anomalies:

- are there big sums of money moved around?
- are there many small transactions from linked accounts to some common place?
- were these accounts sleeping but then suddenly woke up and started to move money?

And things like that. A machine learning algorithm can detect such suspicious activity and flag transactions for further analysis.
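
As a rough illustration of the first check (unusually big sums), here is a small sketch in plain Python. It uses a modified z-score based on the median, which is my choice for this example and not something taken from a particular fraud system; the threshold of 3.5 is a commonly used default, and the amounts are made up.

```python
import statistics


def flag_unusual_amounts(amounts, threshold=3.5):
    """Return indexes of amounts whose modified z-score exceeds the threshold."""
    median = statistics.median(amounts)
    # Median absolute deviation: a spread estimate that outliers barely affect.
    mad = statistics.median(abs(amount - median) for amount in amounts)
    if mad == 0:
        # All amounts are (nearly) identical, so nothing stands out.
        return []
    return [
        index for index, amount in enumerate(amounts)
        if 0.6745 * abs(amount - median) / mad > threshold
    ]


amounts = [20, 25, 22, 19, 24, 21, 5000, 23]
print(flag_unusual_amounts(amounts))
```

Here only the 5000 transaction is flagged. The median is used instead of the mean because a single huge transfer would otherwise shift the "normal" level and partly hide itself. Real systems combine many such signals and look at links between accounts, not just amounts.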


Check this article for more details: https://tech.olx.com/detecting-fraud-rings-with-unsupervised-learning-554bedf29dbf
Are you interested in learning the theory or in using it for practical applications?

If it's the second, then knowing Python and SQL is enough to get started.

Check my book for more details: http://mlbookcamp.com/
> 1. Do I need to know Big Data and/or Hadoop to learn ML ?

No, not really. Python (or the ability to learn it quickly) and SQL should be enough to get started.

> 2. Does ML cover Big data and Hadoop ?

Big data is quite an ambiguous term; it may mean many things. ML doesn't include big data, Hadoop, or similar tools. But these tools may be useful for ML, even though Hadoop is not that popular anymore. Often, people use tools like Spark and Flink for doing similar tasks.

> 3. Does ML cover SAS and/or R ?

You can do ML with SAS and R, but Python is the de facto standard these days.