Mining Massive Data Sets 2024
Tools for the analysis of large databases
We will survey machine learning and knowledge discovery algorithms that can process very large amounts of data. The course runs in parallel with the Mining Massive Data Sets course at Stanford University (Prof. Jure Leskovec).
Planned course period: January to March 2025.
Mining Massive Data Sets
The course will discuss data mining and machine learning algorithms for analyzing very large amounts of data. Topics include:
- MapReduce and Spark
- Frequent itemsets and Association rules
- Near Neighbor Search in High Dimensions
- Locality Sensitive Hashing (LSH)
- Dimensionality reduction: SVD and CUR
- Recommender Systems
- Clustering
- Analysis of massive graphs
- Link Analysis: PageRank, HITS
- Web spam and TrustRank
- Proximity search on graphs
- Large-scale supervised Machine Learning
- Mining data streams
- Learning through experimentation
- Web advertising
- Optimizing submodular functions
Assignments and grading:
- 4 homework assignments requiring coding and theory (40%)
- Final exam (30%)
- Weekly Colab notebooks (30%)
Useful links:
- Course website: http://web.stanford.edu/class/cs246/
- Handouts (PDF): http://web.stanford.edu/class/cs246/handouts/CS246_Info_Handout.pdf
- Reference book: http://www.mmds.org/
All deadlines at FRI are exactly the same as Stanford deadlines.
Video lectures from past courses:
- 2024: https://snap.stanford.edu/class/cs246-videos-2024/
  username: snap, password: cs246-videos-2024
- 2023: https://snap.stanford.edu/class/cs246-videos-2023/
  username: snap, password: cs246-spring2023-videoarchive
- 2022: https://snap.stanford.edu/class/cs246-videos-2022/
  username: cs246, password: mining2022
- 2019: http://snap.stanford.edu/class/cs246-videos-2019/
- 2018: http://snap.stanford.edu/class/cs246-videos-2018/