Friday, May 11, 2012

Machine Learning for Hackers - Book Review

I was following the Machine Learning courses on the Coursera website when I had the chance to review this book. I find it difficult to start programming without a real motivation or the pressure of a deadline, and I hate myself for that. Reading or following purely theoretical lectures is, of course, useful for improving your knowledge and spotting similar patterns in different problems, but it is only when you start to put what you have learned into practice that your mind stores the information in a more solid way.

This book tries to be as light on mathematics as possible, diving directly into practice with the R language.
After introducing the R language, the authors explain the main categories of machine learning algorithms, such as supervised and unsupervised learning.
  • Categorization: using a Bayesian classifier written in R, we learn to separate spam messages from legitimate ones.
  • Priority Ranking: this chapter describes the features Google uses to prioritize received emails and provides the code to write your own ranker based on these features.
  • Linear Regression: this chapter explains one of the most powerful tools of machine learning, linear regression. The case study shows how to build a system that predicts the page views of the top 1000 websites. The chapter also explains different error measures that can be used to test how well the model works.
  • Nonlinear Regression and Regularization: the first part of the chapter explains nonlinear regression, which applies when the prediction can't be expressed by the linear formula Prediction = Constant + X + Y + Z. The second part covers regularization, a way to prevent overfitting, which happens when the model fits the training data very well but is not able to fit the test data. This is because the fit to the training data is so precise that the model also starts to capture the noise (outliers) in the training data.
  • Optimization: once we have defined an error measure for the model, we can tune the model by optimizing its parameters, for example by choosing the values that minimize that error measure. The case study of this chapter is a code-breaking system.
  • PCA: this technique reduces the complexity of the data by reducing the dimensionality of the problem, and is useful for extracting a vector that summarizes the whole data set.
  • MDS: sometimes it is difficult to see the relationships between different features without graphical help. The MDS algorithm makes it possible to cluster all the data and plot these clusters in a meaningful way.
  • k-Nearest Neighbors (k-NN): sometimes it isn't possible to define a general model for the whole dataset, so we make a prediction using the nearest candidates and give each one a score. In the case study this approach is used to build a recommendation system that suggests packages based on the ones already installed.
  • Social Graph: this chapter explains how social websites define their connectivity model and how to study the relationships, building a "Who to follow" system for Twitter.
  • SVM: the support vector machine is another powerful machine learning tool, widely used when it isn't possible to define a linear decision boundary. The case study of this chapter is a comparison of the prediction algorithms used in the previous chapters.
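
As a rough illustration of the regression, error-measure and regularization items in the list above, here is a minimal sketch in Python rather than the book's R. It is not code from the book: the function names `fit_ridge_1d` and `rmse`, the penalty parameter `lam`, and the toy data are all made up for this example. It fits a one-feature line by least squares, optionally shrinks the slope with a ridge-style penalty, and measures the error with the root mean squared error.

```python
from math import sqrt


def fit_ridge_1d(xs, ys, lam=0.0):
    """Fit y ~ intercept + slope * x with one feature.

    Minimises the squared error plus lam * slope**2 (the intercept is not
    penalised). With lam = 0 this is ordinary least squares.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums around the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    # The penalty lam enlarges the denominator and shrinks the slope.
    slope = sxy / (sxx + lam)
    intercept = mean_y - slope * mean_x
    return intercept, slope


def rmse(xs, ys, intercept, slope):
    """Root mean squared error of the fitted line on (xs, ys)."""
    n = len(xs)
    squared = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return sqrt(squared / n)


# Hypothetical toy data lying exactly on y = 1 + 2x.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

if __name__ == "__main__":
    for lam in (0.0, 10.0):
        intercept, slope = fit_ridge_1d(xs, ys, lam)
        print(lam, intercept, slope, rmse(xs, ys, intercept, slope))
```

With `lam=0.0` the fit is plain least squares. Raising `lam` pulls the slope toward zero, which makes the error on the training data worse in exchange for a simpler model that is less likely to chase noise.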
From my point of view, this is a good book for anyone who wants to dive into machine learning without knowing anything about math, statistics and probability. Honestly, I think this approach doesn't work well, at least for me, because some algorithms are difficult to use correctly, or to relate to our own problems, without the abstraction that math provides. Theory is also useful for better understanding how some parameters affect the prediction. Sometimes the book presents notions that need theory, but they are explained instead with a "change the value and see what happens" method.
Even though this is not an R programming book, the language is not clearly explained in the text, so either some previous study of the language is probably necessary, or translating the code into a more common language such as Python will help in understanding the problems. I also noticed some inconsistencies between the text and the code: I ran into errors using the code printed in the text, while the code on the CD-ROM works perfectly.

In conclusion, I would suggest this book to anyone who wants to learn machine learning concepts on their own, but I suggest using the online courses to study the theory behind these algorithms.
