
  • Tools for the analysis of large databases (Orodja za analizo velikih podatkovnih baz)

    We will cover data mining and machine learning algorithms for analyzing very large amounts of data. The course runs in parallel with the Mining massive data sets course at Stanford University (Prof. Jure Leskovec). It usually runs from the first half of January to mid-March.

    Mining massive data sets

    The course will discuss data mining and machine learning algorithms for analyzing very large amounts of data.

    Topics include:

    MapReduce and Spark; Frequent Itemsets and Association Rules; Locality-Sensitive Hashing; Clustering; Dimensionality Reduction; Recommender Systems; Link Analysis (PageRank & Extensions); Community Detection in Graphs; Learning Embeddings; Graph Representation Learning; Graph Neural Networks; Large-Scale Supervised Machine Learning; Mining Data Streams; Computational Advertising; Optimizing Submodular Functions; Multi-Armed Bandits

    Assignments and grading:

    • 4 homework assignments requiring coding and theory (40%)
    • Final exam (30%)
    • Weekly Colab notebooks (30%)

    Useful links:

    All deadlines at FRI are exactly the same as Stanford deadlines.

  • Video lectures from the current course:

  • Submit Colab notebooks here every week, no later than Friday at 9 am.

    Your submission should be a ZIP file containing:

    • Jupyter notebook in HTML format (download the notebook as an .ipynb file, then run "jupyter nbconvert --to html <file_name.ipynb>" in a command prompt).
    • Text file with answers to the questions (the submission page always includes a document with the questions).

    Each file should use the following naming convention:

    colab<number>_<name>_<surname>.html
    colab<number>_<name>_<surname>.txt
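    The naming convention and the ZIP packaging above can be scripted. The following is a minimal sketch using only the Python standard library; the helper names colab_file_names and build_colab_zip are hypothetical, and the archive name is an assumption, since the instructions above only prescribe the names of the two files inside the ZIP.

```python
import zipfile
from pathlib import Path


def colab_file_names(number: int, name: str, surname: str) -> tuple[str, str]:
    """Return the (.html, .txt) file names for one weekly Colab submission."""
    stem = f"colab{number}_{name}_{surname}"
    return f"{stem}.html", f"{stem}.txt"


def build_colab_zip(folder: Path, number: int, name: str, surname: str) -> Path:
    """Pack the two required files found in `folder` into one ZIP archive."""
    file_names = colab_file_names(number, name, surname)

    # Check both files up front so no half-written archive is left behind.
    for file_name in file_names:
        if not (folder / file_name).is_file():
            raise FileNotFoundError(f"missing required file: {file_name}")

    # Assumption: the archive name is not prescribed, so reuse the same stem.
    zip_path = folder / f"colab{number}_{name}_{surname}.zip"
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for file_name in file_names:
            # arcname keeps the files at the top level of the archive.
            archive.write(folder / file_name, arcname=file_name)
    return zip_path
```

    For example, build_colab_zip(Path("."), 0, "Ana", "Novak") expects colab0_Ana_Novak.html and colab0_Ana_Novak.txt in the current folder (the student name here is made up for illustration).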

    • Please pay attention to the questions in the attached PDF; they are part of the Colab 0 notebook!

      The submission deadline has been extended: Tuesday, 20 January 2026, 09:00

    • Please pay attention to the questions in the attached PDF; they are part of the Colab 1 notebook!

      The submission deadline has been extended: Tuesday, 20 January 2026, 09:00

  • Your submission (every second Friday 9:00 CET) should be a ZIP file containing three files:

    • file <name>_<surname>.pdf: written report.
    • file <name>_<surname>.zip: all the requested code. Use subfolders ("q1", "q2", ...) for particular questions. Include at least the .ipynb and .html files; .py files are welcome too.
    • Cover sheet (make sure you state your collaborators and the date of submission).
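    Before uploading, the layout described above can be checked automatically. The sketch below is only an illustration: check_homework_zip is a hypothetical helper, and because the file name of the cover sheet is not specified here, it only counts the top-level files instead of looking for a particular cover sheet name.

```python
import io
import zipfile
from pathlib import PurePosixPath


def check_homework_zip(zip_path, name: str, surname: str) -> list[str]:
    """Return a list of problems found in a homework ZIP (empty if none)."""
    problems = []
    report = f"{name}_{surname}.pdf"
    code_zip = f"{name}_{surname}.zip"

    with zipfile.ZipFile(zip_path) as outer:
        # Ignore directory entries; only real files count.
        members = [m for m in outer.namelist() if not m.endswith("/")]
        if report not in members:
            problems.append(f"missing written report {report}")
        # Assumption: the cover sheet name is unknown, so only count files.
        if len(members) != 3:
            problems.append(f"expected 3 files, found {len(members)}")
        if code_zip not in members:
            problems.append(f"missing code archive {code_zip}")
            return problems

        # Open the nested code archive directly from memory.
        with zipfile.ZipFile(io.BytesIO(outer.read(code_zip))) as inner:
            suffixes = {PurePosixPath(n).suffix.lower() for n in inner.namelist()}

    for required in (".ipynb", ".html"):
        if required not in suffixes:
            problems.append(f"code archive has no {required} file")
    return problems
```

    An empty list means the archive matches the three-file layout; any other result lists what to fix before submitting.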

    Note that you submit homework only here on Ucilnica. Do not submit anything to GradeScope or the SNAP website!

    Late days: you may use the “late days” allowance twice in total, and only once per homework. A late homework must be submitted no later than 9:00 CET on the first Tuesday after the regular deadline.
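    The late-day rule amounts to simple date arithmetic, sketched below. The function names, the status strings, and the use of naive datetimes (all times taken as CET) are assumptions made for illustration; the example date in the usage note is not an announced course deadline.

```python
from datetime import datetime, timedelta

TUESDAY = 1         # datetime.weekday() counts Monday as 0
MAX_LATE_USES = 2   # the allowance may be used twice in total


def late_cutoff(regular_deadline: datetime) -> datetime:
    """Return the same clock time on the first Tuesday after the deadline."""
    days_ahead = (TUESDAY - regular_deadline.weekday()) % 7 or 7
    return regular_deadline + timedelta(days=days_ahead)


def submission_status(submitted: datetime,
                      regular_deadline: datetime,
                      late_uses_so_far: int) -> str:
    """Classify one homework submission under the late-day rule."""
    if submitted <= regular_deadline:
        return "on time"
    within_cutoff = submitted <= late_cutoff(regular_deadline)
    if within_cutoff and late_uses_so_far < MAX_LATE_USES:
        return "late"      # consumes one of the two late-day uses
    return "too late"      # past the cutoff, or no late-day uses left
```

    For a regular deadline on a Friday at 9:00, such as datetime(2026, 1, 16, 9, 0), late_cutoff returns the following Tuesday at 9:00, four days later.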