IP (A) (Tools for Analyzing Large Databases)
Tools for Analyzing Large Databases
We will survey machine learning and knowledge discovery algorithms that can process very large amounts of data. The course runs in parallel with the Mining massive data sets course at Stanford University (Prof. Jure Leskovec).
Planned course schedule: January to March 2025.
Mining massive data sets
The course will discuss data mining and machine learning algorithms for analyzing very large amounts of data. Topics include:
- MapReduce and Spark
- Frequent itemsets and Association rules
- Near Neighbor Search in High Dimensions
- Locality Sensitive Hashing (LSH)
- Dimensionality reduction: SVD and CUR
- Recommender Systems
- Clustering
- Analysis of massive graphs
- Link Analysis: PageRank, HITS
- Web spam and TrustRank
- Proximity search on graphs
- Large-scale supervised Machine Learning
- Mining data streams
- Learning through experimentation
- Web advertising
- Optimizing submodular functions
Assignments and grading:
- 4 homework assignments requiring coding and theory (40%)
- Final exam (30%)
- Weekly Colab notebooks (30%)
Useful links:
- Course website: http://web.stanford.edu/class/cs246/
- Handouts (PDF): http://web.stanford.edu/class/cs246/handouts/CS246_Info_Handout.pdf
- Reference book: http://www.mmds.org/
All deadlines at FRI are exactly the same as Stanford deadlines.
Video lectures from past courses:
- 2024: https://snap.stanford.edu/class/cs246-videos-2024/
username: snap, password: cs246-videos-2024
- 2023: https://snap.stanford.edu/class/cs246-videos-2023/
username: snap, password: cs246-spring2023-videoarchive
- 2022: https://snap.stanford.edu/class/cs246-videos-2022/
username: cs246, password: mining2022
- 2019: http://snap.stanford.edu/class/cs246-videos-2019/
- 2018: http://snap.stanford.edu/class/cs246-videos-2018/
The exam will start right after you receive the printed exams.
- The exam is worth 30% of your course grade.
- The exam lasts 3 hours (180 minutes).
This exam is open-book and open-notes. You may use notes (including digitally created ones), lecture slides, and any reference material. However, answers should be written in your own words.
Acceptable uses of computer:
- You may access the Internet, but you may not communicate with any other person. Similarly, AI-driven code completion tools including ChatGPT and GitHub Copilot are not allowed.
- You may use your computer to write code or do any scientific computation, though writing code is not required to solve any of the problems in this exam.
- You can use your computer as a calculator or an e-reader.
See the "Final Exam Review Session" lecture in the Winter Course 2022.