How similar are living organisms? Did humans indeed descend from Neanderthals? How did various species adapt to their living environments? Which genes are responsible for susceptibility to common diseases? Why do we need a different flu vaccine each year?
The answers to these and similar questions can be found through the study of biomedical data. The discipline that does this is called bioinformatics. Bioinformaticians develop tools to find interesting patterns in data and to support the understanding of biological processes. They analyze and compare protein sequences, search for genes, assemble genomes, compare species based on their genetic material, predict protein structures and find active parts of proteins, analyze gene expression, and reverse-engineer genetic networks. They can relate the genome to its phenotype and can trace evolution back to its roots to find adaptations and changes in organisms.
The course is intended for students of computer science. No prior knowledge of molecular biology is required. We will introduce essential concepts from molecular biology and genetics as they are needed to understand the associated computational tools. We will learn about sequence alignment, hidden Markov models, clustering, phylogenetic analysis, statistical testing, and some machine learning. The course will be practical and will include up to eight homework assignments on the analysis of real molecular biology data. Students are expected to have a background in basic probability, statistics, and programming. We will use Python as the programming language.
Machine learning is used in industry, medicine, economics, and other fields for data analysis and knowledge discovery from databases, for data mining, for generating knowledge bases for expert systems, for learning to predict and recognize, for playing games, and for understanding natural language, handwriting, speech, images, etc. A basic principle of machine learning is describing (modelling) phenomena from data. The results of learning are rules, functions, relations, equations, and probability distributions. The trained models try to explain the data and can be used for decision making when the modelled process is observed in the future. The goal of the course is to present the theoretical foundations and basic principles of machine learning methods, basic machine learning algorithms, and their use in practice for knowledge discovery from data and for learning classification and regression models. Students will apply the theoretical knowledge to real-world problems from science and economics.
Overview of course contents:
Learning and the relation between learning and intelligence, ML basics, Advanced attribute evaluation measures, Advanced methods for estimating performance of ML, Advanced visualization methods, Combining ML algorithms, Bayesian learning, Calibration of probabilities, Explanation of individual predictions, Numerical ML methods, Artificial neural networks: RBF, Deep NN, Unsupervised learning: clustering, Association rules, Estimating the reliability of individual predictions, Text mining, Matrix factorization, Archetypal analysis, ML as data compression, Active learning, User profiling and recommendation systems, Inductive logic programming (ILP), Introduction to learning theory.
The practical part consists of solving problems, taking web quizzes, and completing the seminar work. A teaching assistant is available for consultations. The grade for the practical work is the grade for the seminar work. To pass the practical work, students must achieve at least 50% of the points in the web quizzes.
The final course grade consists of the practical work grade (50%) and the exam grade (50%). On the written exam, students need to achieve at least 50% of the points.
- Lecturer: Igor Kononenko