Assignment: Classifiers and their Decision Boundaries
First part
Consider the following classifiers (a hedged code sketch of equivalents follows the list):
- a classification tree of depth 1 (a so-called "stump"; set the parameter "Limit the depth to" to 1)
- a classification tree of depth 3
- logistic regression
- SVM with an RBF (radial basis function) kernel and g=1
- a random forest with 50 trees
- a nearest neighbors classifier with the number of neighbors set to 5
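If you want to sanity-check these models outside the Orange canvas, the scikit-learn classes below are one possible mapping; this mapping is our own assumption, since the assignment itself only prescribes the Orange widgets.

```python
# Sketch: scikit-learn counterparts of the six classifiers
# (an assumed mapping; the assignment itself uses Orange widgets, not code).
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier

classifiers = {
    "stump (depth-1 tree)": DecisionTreeClassifier(max_depth=1),
    "tree (depth 3)": DecisionTreeClassifier(max_depth=3),
    "logistic regression": LogisticRegression(),
    "SVM, RBF kernel, g=1": SVC(kernel="rbf", gamma=1.0),
    "random forest (50 trees)": RandomForestClassifier(n_estimators=50),
    "kNN (k=5)": KNeighborsClassifier(n_neighbors=5),
}
```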
For each of the classifiers above, paint:
- (A) a data set where the classifier finds the "right" decision boundary
- (B) a data set where the classifier fails to find the "right" decision boundary
Demonstrate A and B using scatter plots. A minimal schema contains Paint Data, Predictions, and Scatter Plot, plus a learner (say, Tree) that receives the data and passes a classifier to Predictions. In the scatter plot, you can color the dots by the predicted class and set the shape to represent the true class value. For instance, for a classification tree, the Scatter Plot widget could look something like the following:
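If you would rather see a code analogue of that schema, here is a rough sketch. It assumes a hand-painted-style data set of two blobs (the blob positions and the matplotlib styling are our own illustrative choices, not part of the assignment); color encodes the predicted class and marker shape the true class, as in the Scatter Plot widget.

```python
# Sketch: code analogue of Paint Data -> Tree -> Predictions -> Scatter Plot.
# The two blobs below are illustrative stand-ins for hand-painted data.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X = np.vstack([rng.normal([0.3, 0.3], 0.08, (50, 2)),   # class 0 blob
               rng.normal([0.7, 0.7], 0.08, (50, 2))])  # class 1 blob
y = np.repeat([0, 1], 50)

pred = DecisionTreeClassifier(max_depth=1).fit(X, y).predict(X)

# Color = predicted class, marker shape = true class.
for true_cls, marker in [(0, "o"), (1, "^")]:
    m = y == true_cls
    plt.scatter(X[m, 0], X[m, 1], c=pred[m], marker=marker,
                cmap="coolwarm", vmin=0, vmax=1, edgecolors="k")
plt.xlabel("x")
plt.ylabel("y")
plt.show()
```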
Second part
Demonstrate the effect of regularization strength for SVM with an RBF kernel (modify the value of g; smaller g gives a smoother, more regularized boundary) or for neural networks (modify the number of layers and the number of neurons per layer). A code sketch of the g sweep follows.
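The sketch below shows the same effect in code: it sweeps g for an RBF SVM and draws the resulting decision boundaries. The data set (scikit-learn's two moons) and the particular g values are illustrative assumptions, not part of the assignment.

```python
# Sketch: how the RBF width g changes the decision boundary.
# Small g -> smooth, heavily regularized boundary; large g -> wiggly overfit.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.svm import SVC
from sklearn.datasets import make_moons

X, y = make_moons(n_samples=200, noise=0.25, random_state=0)
xx, yy = np.meshgrid(np.linspace(-2, 3, 200), np.linspace(-1.5, 2, 200))

fig, axes = plt.subplots(1, 3, figsize=(12, 4))
for ax, g in zip(axes, [0.1, 1.0, 100.0]):  # illustrative g values
    clf = SVC(kernel="rbf", gamma=g).fit(X, y)
    zz = clf.predict(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)
    ax.contourf(xx, yy, zz, alpha=0.3, cmap="coolwarm")
    ax.scatter(X[:, 0], X[:, 1], c=y, cmap="coolwarm", edgecolors="k")
    ax.set_title(f"g = {g}")
plt.show()
```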
In the homework, just show us the graphs (press Ctrl-C or Cmd-C in the widget to copy the image to the clipboard), not the entire widget. With each graph, you may also report the AUC scores you get on your data with 10-fold cross-validation (a code sketch follows). It may happen that for some of the classifiers you won't be able to paint a data set matching A or B. If this is the case, please share your intuition about why.
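In the canvas, the Test and Score widget computes this score for you; a minimal code equivalent might look like the sketch below (the choice of classifier and data set here is just an example).

```python
# Sketch: 10-fold cross-validated AUC, mirroring Orange's Test and Score.
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC
from sklearn.datasets import make_moons

X, y = make_moons(n_samples=200, noise=0.25, random_state=0)
auc = cross_val_score(SVC(kernel="rbf", gamma=1.0), X, y,
                      cv=10, scoring="roc_auc")
print(f"AUC: {auc.mean():.3f} +/- {auc.std():.3f}")
```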
Happy painting!