Machine Learning

Learning materials

Time slots for the 'exam'

Friday, Feb 4, 1-2 pm. We will use the same zoom meeting as for the lectures.
(Other days will be announced later.)

Office hours

TBA

Assignment 1

Linear regression (0.5 points) and ridge regression (iterative method: 0.5 points; closed-form formula: 0.25 points). Deadline: labs during the week of 18-22 October.
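
To make the two ridge variants concrete, here is a minimal NumPy sketch, not a reference solution. The function names, the learning rate and the number of iterations are illustrative assumptions. The sketch has no intercept term; if you add a column of ones for it, note that this version would penalise the intercept as well.

```python
import numpy as np

def ridge_closed_form(X, y, lam=0.0):
    """Solve (X^T X + lam * I) w = X^T y; lam = 0 gives ordinary least squares."""
    n_features = X.shape[1]
    A = X.T @ X + lam * np.eye(n_features)
    return np.linalg.solve(A, X.T @ y)

def ridge_gradient_descent(X, y, lam=0.0, lr=0.1, n_iter=2000):
    """Minimise (||Xw - y||^2 + lam * ||w||^2) / n by plain gradient descent.

    Dividing by n does not change the minimiser, but it keeps a fixed
    learning rate usable for different sample sizes.
    """
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    for _ in range(n_iter):
        grad = (2.0 / n_samples) * (X.T @ (X @ w - y) + lam * w)
        w -= lr * grad
    return w

# Synthetic data (an assumption for illustration only)
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(100, 3))
y_demo = X_demo @ np.array([1.5, -2.0, 0.5]) + 0.1 * rng.normal(size=100)

w_formula = ridge_closed_form(X_demo, y_demo, lam=1.0)
w_iterative = ridge_gradient_descent(X_demo, y_demo, lam=1.0)
```

On well-conditioned data the two weight vectors should agree closely, which is a convenient sanity check for the iterative method.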

Assignment 2

Write a neural network from scratch for a classification problem of your choice. It is enough to implement a 2-layer dense neural network and experiment with some of the hyperparameters. With a good implementation you can easily add more layers and experiment with the number of layers as well, but this is not obligatory.
Suggestions: use the MNIST database with handwritten digits (see below); use Layer as a basic building block so that you may use efficient matrix operations.
You may look at the forward pass/backpropagation example, calculated by hand for a very small 2x2 network.
For debugging you may adapt the unit tests from a (bad) example implementation. (This implementation uses Neuron as a base building block, which is a bad idea.)
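
As an illustration of the Layer suggestion, here is a minimal sketch of a dense layer that processes a whole batch with matrix operations. The sigmoid activation, the squared-error loss, the method names and the helper functions train and predict are assumptions made for this example, not requirements of the assignment.

```python
import numpy as np

class Layer:
    """Fully connected layer with a sigmoid activation (an illustrative choice)."""

    def __init__(self, n_in, n_out, rng):
        self.W = rng.normal(0.0, 1.0 / np.sqrt(n_in), size=(n_in, n_out))
        self.b = np.zeros(n_out)

    def forward(self, x):
        # x has shape (batch, n_in); cache the values that backward() needs
        self.x = x
        self.a = 1.0 / (1.0 + np.exp(-(x @ self.W + self.b)))
        return self.a

    def backward(self, grad_a, lr):
        # grad_a is dLoss/d(activation); the return value is dLoss/d(input)
        grad_z = grad_a * self.a * (1.0 - self.a)
        grad_x = grad_z @ self.W.T  # computed before the weights are updated
        self.W -= lr * (self.x.T @ grad_z)
        self.b -= lr * grad_z.sum(axis=0)
        return grad_x

def predict(layers, X):
    out = X
    for layer in layers:
        out = layer.forward(out)
    return out

def train(layers, X, Y, lr=0.5, epochs=500):
    """Full-batch gradient descent on the loss 0.5 * sum((out - Y)^2) / n."""
    for _ in range(epochs):
        out = predict(layers, X)
        grad = (out - Y) / len(X)
        for layer in reversed(layers):
            grad = layer.backward(grad, lr)
    return layers
```

With this design a deeper network is just a longer list, for example `[Layer(784, 30, rng), Layer(30, 10, rng)]` for MNIST with one-hot encoded targets.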

Database with handwritten digits:
You may take the files mnist_loader.py and mnist.pkl.gz from this repository. Please read the documentation in mnist_loader.py to learn how to use it (it's simple!).

Assignment 3

Data. Look for multivariate, categorical data sets. Iris, breast cancer and Titanic (not on that webpage) are among the most commonly used (but Iris is not the best choice, as it has too few observations). Minimal program: model the data with decision trees; try pruning and find alpha by cross-validation (or by checking on test data); try at least 2 of these three: bagging, random forests, boosting.
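
One possible shape of the minimal program, sketched with scikit-learn. The data set (the breast cancer data bundled with scikit-learn, which may differ from the version on the data webpage), the split, the number of folds and the ensemble settings are all assumptions chosen for illustration.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Candidate alphas come from the cost-complexity pruning path of a full tree.
path = DecisionTreeClassifier(random_state=0).cost_complexity_pruning_path(
    X_train, y_train
)
# Drop the last alpha (it prunes down to the root) and guard against
# tiny negative values caused by floating-point rounding.
alphas = np.unique(np.clip(path.ccp_alphas[:-1], 0.0, None))

# Choose alpha by 5-fold cross-validation on the training data.
search = GridSearchCV(
    DecisionTreeClassifier(random_state=0), {"ccp_alpha": alphas}, cv=5
)
search.fit(X_train, y_train)
best_alpha = search.best_params_["ccp_alpha"]

# Compare the pruned tree with two ensemble methods on the held-out data.
models = {
    "pruned tree": search.best_estimator_,
    "random forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "boosting": GradientBoostingClassifier(random_state=0),
}
scores = {
    name: model.fit(X_train, y_train).score(X_test, y_test)
    for name, model in models.items()
}
```

Printing `best_alpha` and `scores` gives a first comparison; plotting the cross-validated accuracy against alpha (available in `search.cv_results_`) shows how sensitive the tree is to pruning.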

Assignment 4

Use k-means and hierarchical clustering on a data set of your choice. Run k-means several times (the result depends on the random initialization) and use at least 2 different linkage types for hierarchical clustering. You may use e.g. scikit-learn. Deadline: January 21.
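
A minimal scikit-learn sketch of both parts. The Iris data, the choice of 3 clusters, the seeds and the two linkage types are assumptions for illustration; any data set and settings of your choice will do.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering, KMeans
from sklearn.datasets import load_iris

X, _ = load_iris(return_X_y=True)  # the labels are ignored: clustering is unsupervised

# k-means: one initialisation per run (n_init=1) so that the runs can differ.
inertias = [
    KMeans(n_clusters=3, n_init=1, random_state=seed).fit(X).inertia_
    for seed in range(5)
]

# Hierarchical clustering with two different linkage types.
labels_by_linkage = {
    linkage: AgglomerativeClustering(n_clusters=3, linkage=linkage).fit_predict(X)
    for linkage in ("ward", "single")
}
```

Comparing the values in `inertias` shows how much k-means depends on its starting point, and comparing the cluster sizes (for example with `np.bincount`) shows how strongly the linkage type shapes the result.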

Assignment 5

Use CNNs or RNNs for a problem of your choice. For example, you may use a CNN for image classification, or an RNN for time series prediction. You may use Keras or some other framework. Deadline: January 25.

Rules