Tema predmeta so sodobni algoritmi za učenje iz podatkovnih tokov. Učili se bomo o odprtih izzivih na področju (inkrementalni modeli za nadzorovano učenje, stiskanje podatkov, odkrivanje spremembe v porazdelitvi toka (concept drift), gručenje iz podatkov, specializirane statistike za vrednotenje uspešnosti). S pridobljenim znanjem bo študent sposoben uporabljati svoje znanje o strojnem učenju pri aplikacijah, ki so povezane z obilico vsakdanjih podatkov (finančne in bančne transakcije, vremenski podatki, senzorski podatki itd.).

Predmet bo organiziran kot kombinacija predavanj in laboratorijskih vaj (te bodo izvedene z uporabo statističnega paketa R). V okviru vaj bodo študenti znanje aplicirali na izbranem problemu, ki je lahko direktno povezan tudi s tematiko doktorske naloge. V preostanku semestra bo organizirano tudi medsebojno tekmovanje za izdelavo najbolj točnega napovednega modela na podanih podatkih.

Incremental Learning from Data Streams

This course teaches state-of-the-art algorithms for learning from data streams. It guides students through the major open challenges in the field: incremental models for supervised learning, data compression, detection of changes in the stream's distribution (concept drift), clustering from streams, and specialized statistics for evaluating performance. With this knowledge, students will be able to apply their machine learning skills in applications that draw on the abundance of everyday data (financial and bank transactions, weather data, sensor readings, etc.).

The course combines lectures with hands-on lab exercises carried out in the statistical package R. In the labs, students apply the acquired knowledge to a problem of their choice, which may be directly related to the topic of their doctoral thesis. For the remainder of the semester, students also compete to build the most accurate predictive model on a provided dataset.