DAaML/z4

From WikiZMSI

Boosting (AdaBoost / RealBoost with shallow trees)

Classroom

  • Implement a class representing the AdaBoost ensemble classifier. Its constructor should allow the caller to specify what kind of weak classifiers are applied in the ensemble (e.g. by passing the weak learner's Python class and its parameters). The constructor should also allow the caller to specify the desired number of boosting rounds (T).
  • Remark: if the shallow decision trees implemented in the previous assignment are to be used as weak classifiers, they need to be extended to accept and use weights of data examples, so that the current boosting weights can be passed to each new tree (a new weak classifier) before its fit is executed.
  • Within the fit function (the main boosting loop), report information about the successive weak classifiers to the screen: their errors (both zero-one and exponential) and their importance coefficients (alphas).
  • Prepare a suitable predict function. If needed, provide an additional decision_function in the base learner class (e.g. within the decision tree) to force weak learners to return binary responses {-1, +1} rather than class labels.
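The points above can be sketched as follows. This is a minimal illustration, not a reference implementation: a depth-1 stump (`WeightedStump`, an illustrative name) stands in for the shallow CART tree from the previous assignment, and the per-round printout shows the required zero-one error, exponential error (tracked as the running product of the normalizers Z_t), and alpha.

```python
import numpy as np

class WeightedStump:
    """Depth-1 tree accepting example weights; a stand-in for the
    course's shallow CART tree (all names here are illustrative)."""
    def fit(self, X, y, sample_weight):
        best = (np.inf, 0, 0.0, 1)
        for j in range(X.shape[1]):              # candidate feature
            for thr in np.unique(X[:, j]):       # candidate threshold
                for sign in (1, -1):             # branch polarity
                    pred = np.where(X[:, j] <= thr, -sign, sign)
                    err = sample_weight[pred != y].sum()
                    if err < best[0]:
                        best = (err, j, thr, sign)
        _, self.j_, self.thr_, self.sign_ = best
        return self

    def decision_function(self, X):
        # responses already in {-1, +1}, as the assignment requires
        return np.where(X[:, self.j_] <= self.thr_, -self.sign_, self.sign_)

class AdaBoost:
    def __init__(self, weak_cls=WeightedStump, weak_params=None, T=16):
        self.weak_cls = weak_cls            # class of the weak classifier
        self.weak_params = weak_params or {}
        self.T = T                          # number of boosting rounds

    def fit(self, X, y, verbose=True):
        # labels y are expected in {-1, +1}
        m = len(y)
        w = np.full(m, 1.0 / m)             # boosting weights
        self.clfs, self.alphas = [], []
        exp_err = 1.0                       # running product of normalizers Z_t
        for t in range(1, self.T + 1):
            clf = self.weak_cls(**self.weak_params).fit(X, y, sample_weight=w)
            pred = clf.decision_function(X)
            err01 = np.clip(w[pred != y].sum(), 1e-12, 1 - 1e-12)
            alpha = 0.5 * np.log((1.0 - err01) / err01)
            w = w * np.exp(-alpha * y * pred)
            Z = w.sum()                     # normalizer of round t
            exp_err *= Z                    # exponential error after t rounds
            w /= Z
            self.clfs.append(clf)
            self.alphas.append(alpha)
            if verbose:
                print(f"t={t}: zero-one err={err01:.4f}, "
                      f"exp err={exp_err:.4f}, alpha={alpha:.4f}")
        return self

    def decision_function(self, X):
        return sum(a * clf.decision_function(X)
                   for a, clf in zip(self.alphas, self.clfs))

    def predict(self, X):
        return np.sign(self.decision_function(X))
```

Any weak learner whose fit accepts a sample_weight argument (e.g. the extended tree from the previous assignment) can be plugged in via weak_cls and weak_params.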

Homework

  • Perform batch experiments on the Olivetti data with AdaBoost coupled with CART decision trees. Elements to experiment with: the number of PCA features used (n = 20, 50, 100), the number of boosting rounds (T = 16, 32, 64), and the depths of the trees acting as weak classifiers (depth_limit = 1, 2, 3). Report and plot training and testing errors along the progress of the algorithm (over t = 1, 2, ..., T). Prepare a short PDF report on the above experiments.
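One possible shape for the experiment grid is sketched below, under two stated assumptions: random data stands in for the Olivetti faces (the real set can be fetched with sklearn.datasets.fetch_olivetti_faces, which requires a download), and scikit-learn's AdaBoostClassifier stands in for the ensemble class built in the classroom part; its staged_predict gives the per-round errors needed for the progress plots. A reduced grid is used here to keep the sketch fast.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 256))       # stand-in for flattened face images
y = rng.integers(0, 2, size=200)      # stand-in binary labels

results = {}
for n in (20, 50):                    # PCA feature counts (homework: 20, 50, 100)
    X_pca = PCA(n_components=n).fit_transform(X)
    X_tr, X_te, y_tr, y_te = train_test_split(
        X_pca, y, test_size=0.25, random_state=0)
    for T in (16, 32):                # boosting rounds (homework also: 64)
        for depth in (1, 2):          # weak-tree depths (homework also: 3)
            clf = AdaBoostClassifier(
                DecisionTreeClassifier(max_depth=depth), n_estimators=T)
            clf.fit(X_tr, y_tr)
            # zero-one error after each round t = 1..T, for the plots
            train_curve = [np.mean(p != y_tr) for p in clf.staged_predict(X_tr)]
            test_curve = [np.mean(p != y_te) for p in clf.staged_predict(X_te)]
            results[(n, T, depth)] = (train_curve, test_curve)
```

The resulting curves in results can be plotted (e.g. with matplotlib) per (n, T, depth) combination and summarized in the PDF report.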