The goal of the course is to deepen the machine learning knowledge that students acquired during their undergraduate studies. In the course we examine the most successful approaches in depth, learning how they work and what their limitations are. The course prepares students for further, more advanced study of machine learning approaches, or for applying machine learning methods in practice.

Course content:

What machine learning is, what its basic principles are, and what we want to achieve.
Linear regression and regularization, cost functions.
Model evaluation.
Gradient descent and stochastic gradient descent, and why these methods are useful in machine learning.
Classification with logistic regression.
Generalized linear models.
Ensemble methods.
Kernel methods.
Artificial neural networks.
Dimensionality reduction methods.
Explaining machine learning models.
Reinforcement learning.

How similar are living organisms? Have humans indeed descended from Neanderthals? How did various species adapt to their living environments? Which genes are responsible for susceptibility to common diseases? Why do we need a different flu vaccine each year?

The answers to these and similar questions can be found through the study of biomedical data. The discipline that does this is called bioinformatics. Bioinformaticians develop tools to find interesting data patterns and support the understanding of biological processes. They analyze and compare protein sequences, search for genes, assemble genomes, compare species based on their genetic material, predict protein structures and find active parts of proteins, analyze gene expression, and reverse-engineer genetic networks. They can relate a genome to its phenotype and trace evolution back to its roots to find adaptations and changes in organisms.

The course is intended for students in computer science. No prior knowledge of molecular biology is required. We will introduce essential concepts from molecular biology and genetics, as they are needed to understand the associated computational tools. We will learn about sequence alignment, hidden Markov models, clustering, phylogenetic analysis, statistical testing, and some machine learning. The course will be practical and will include up to eight homework assignments on the analysis of real molecular biology data. Students are expected to have a background in the essentials of probability, statistics, and programming. We will use Python as the programming language.