Weekly outline

  • General


    Mining Massive Data Sets

    Tools for the Analysis of Large Databases
    The course takes place in lecture room P04 on Mondays at 16:00.
    The exam will be on Thursday, 21 March, at 16:00 in lecture room P03.

    Instructor: Jure Leskovec
    Teaching Assistants: Matej Guid, Aleš Papič

    Email: matej.guid@fri.uni-lj.si, ap5327@student.uni-lj.si

    Schedule: This course starts in the second week of January. We will follow the CS246 schedule, which means that you will also have to do homework assignments during the exam break.

    The course will discuss data mining and machine learning algorithms for analyzing very large amounts of data. Topics include: MapReduce and Spark/Hadoop, frequent itemsets and association rules, near-neighbor search in high dimensions, locality-sensitive hashing (LSH), dimensionality reduction, recommendation systems, clustering, analysis of massive graphs, link analysis (PageRank, HITS), web spam (TrustRank), proximity search on graphs, large-scale supervised machine learning, mining data streams, learning through experimentation, web advertising, and optimizing submodular functions.

    This course is offered in collaboration with Stanford University, which offers it as CS246, and is taught by a Stanford lecturer. You will not attend the lectures live; instead, you will follow them through video recordings, which will be available for download. At FRI we will organize short weekly review sessions and consultations.


    USEFUL LINKS

    Course website: http://web.stanford.edu/class/cs246/

    Important info:


    Classes

    Additional materials: https://web.stanford.edu/class/cs246/index.html#coursework

    Reference text: http://www.mmds.org/

    Short weekly Gradiance quizzes:
    • sign up at http://www.gradiance.com/services and register as slo_<name>_<family_name>, using code 3DBCAD12
    • quizzes are posted every Tuesday
    • each quiz is due 9 days later, on Thursday at 23:59 Pacific Time (PT), but it is better to submit earlier!
    • you may submit as many times as you like; your score is based on the most recent submission

    Piazza (forums & discussions)
    • you should have received an invitation via email (let us know if you didn't!)

    More information about the course is on the CS246 Stanford web page. All announcements will be posted on either of the class Piazza forums. All deadlines at FRI are exactly the same as the Stanford deadlines.


  • Slides from the exercise sessions and additional materials

  • Gradiance Quizzes

    Do not submit Gradiance quizzes here; this is set up only so that you can see the due dates on učilnica.

  • Homework Submissions

    Your submission should consist of two files:

    • file <name>_<surname>.pdf: written report. Please use the cover sheet as the first page.
    • file <name>_<surname>.zip: all the requested code.

    Note that you submit homework only here on učilnica. Do not submit anything to Gradescope or the SNAP website!

    Along with this online submission, you are also required to submit the written report on paper. Bring the paper copy to Matej's office (if he is not there, slide it under the office door).

    Late days: you may use “late days” twice during the course, but only once per homework. A late submission must arrive no later than 9:00 CET on the first Tuesday after the regular deadline.