Monday, July 31, 2017

Part 10: Model Selection & Boosting

Here we'll try to answer a few questions using Model Selection techniques.
  • How do I find the most appropriate Machine Learning model for my business problem?
  • How do I deal with the bias-variance tradeoff when building a model and evaluating its performance? Answer: k-Fold Cross Validation.
  • How do I improve model performance by choosing the optimal values for the hyper-parameters (the parameters that are not learned)? Answer: Grid Search.
The Model Selection techniques are:
  • k-Fold Cross Validation - Used to Evaluate Model Performance
  • Grid Search - Used to Improve Model Performance (Finding optimal values for the hyper-parameters)
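To make the k-Fold idea concrete, here is a minimal sketch in plain Python. It is only an illustration, not the course's implementation: the synthetic 1-D data, the toy nearest-centroid model and the helper names (k_fold_indices, fit_centroids, predict, cross_validate) are all assumptions made up for this example. Each fold is used as the test set exactly once, and the model is trained on the remaining folds.

```python
import random
import statistics


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle the sample indices and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def fit_centroids(xs, ys):
    """Toy model (assumption for illustration): per-class mean of a 1-D feature."""
    centroids = {}
    for label in set(ys):
        values = [x for x, y in zip(xs, ys) if y == label]
        centroids[label] = sum(values) / len(values)
    return centroids


def predict(centroids, x):
    """Predict the class whose centroid is closest to x."""
    return min(centroids, key=lambda label: abs(x - centroids[label]))


def cross_validate(xs, ys, k=10):
    """Return one accuracy per fold; each fold is the test set exactly once."""
    accuracies = []
    for test_idx in k_fold_indices(len(xs), k):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(xs)) if i not in held_out]
        model = fit_centroids([xs[i] for i in train_idx],
                              [ys[i] for i in train_idx])
        correct = sum(predict(model, xs[i]) == ys[i] for i in test_idx)
        accuracies.append(correct / len(test_idx))
    return accuracies


# Synthetic data (assumption): class 0 centred at 0.0, class 1 centred at 3.0.
rng = random.Random(42)
xs = [rng.gauss(0.0, 1.0) for _ in range(100)] + [rng.gauss(3.0, 1.0) for _ in range(100)]
ys = [0] * 100 + [1] * 100

accuracies = cross_validate(xs, ys, k=10)
print(f"mean accuracy: {statistics.mean(accuracies):.3f}")
print(f"std deviation: {statistics.stdev(accuracies):.3f}")
```

A high mean accuracy with a small standard deviation across the folds is usually read as low bias and low variance, while a large spread between folds suggests the model's performance depends heavily on which data it was trained on.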
Finally, we'll learn XGBoost, one of the most powerful and widely used Machine Learning models.

Cheat Sheet: 

For a given dataset, the first step is to identify the type of business problem:
  • Regression (has a dependent variable with a continuous outcome)
  • Classification (has a dependent variable with a categorical outcome)
  • Clustering (no dependent variable)
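The second step is to find out whether the problem is linearly separable or not, e.g. choose SVM for a linear problem and Kernel SVM for a non-linear one. Grid Search is a practical way to answer this question, because it can try both linear and non-linear models and report which one performs best under cross validation.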
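As a rough sketch of that second step, the snippet below uses scikit-learn's GridSearchCV to compare a linear SVM against an RBF Kernel SVM. The dataset (make_moons), the candidate values for C and gamma, and the choice of 10 folds are assumptions picked for illustration, not values from the course.

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic two-class data that is not linearly separable (assumption).
X, y = make_moons(n_samples=300, noise=0.2, random_state=0)

# Scale the features inside the pipeline so each CV fold is scaled
# using statistics from its own training part only.
pipeline = make_pipeline(StandardScaler(), SVC())

# Two sub-grids: one for the linear model, one for the non-linear (RBF) model.
# The candidate values below are illustrative guesses.
param_grid = [
    {"svc__kernel": ["linear"], "svc__C": [0.1, 1, 10]},
    {"svc__kernel": ["rbf"], "svc__C": [0.1, 1, 10], "svc__gamma": [0.1, 1, 10]},
]

# Every parameter combination is scored with 10-fold cross validation.
search = GridSearchCV(pipeline, param_grid, scoring="accuracy", cv=10)
search.fit(X, y)

print("best parameters:", search.best_params_)
print(f"best mean CV accuracy: {search.best_score_:.3f}")
```

If the best parameters pick the linear kernel, the problem is probably close to linearly separable; if they pick the RBF kernel, a non-linear model is likely the better fit for this data.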


Hope this helps!!

Arun Manglick
