Students select a project theme and work in groups to complete the project. Students present their midterm progress and results. Students complete the project with a public presentation of their work.

Project themes are compiled by the lecturer from proposals by faculty members and industry.

Matrix and tensor algebra. Notation. Differentiation.

Theory. Gradient. Convexity. Strong convexity. Lipschitz continuity. Limits on convergence rate. Constrained optimization. Dual function. Dual problem. Strong duality. Slater’s condition. Karush-Kuhn-Tucker conditions.

Optimization methods. Gradient. Stochastic gradient. Conjugate gradient. Quasi-Newton. Subgradient. Proximal gradient. Accelerated gradient. Interior-point methods. ADMM. Adaptive gradient methods.
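As an illustration of the simplest method in this list, the following is a minimal sketch of fixed-step gradient descent. The objective f(x, y) = (x - 1)^2 + 2(y + 2)^2, the starting point, and the function names are assumptions chosen for the example, not course material.

```python
def grad(point):
    """Gradient of the illustrative objective f(x, y) = (x - 1)^2 + 2 * (y + 2)^2."""
    x, y = point
    return (2.0 * (x - 1.0), 4.0 * (y + 2.0))


def gradient_descent(start, step_size, n_steps):
    """Run fixed-step gradient descent and return the final iterate."""
    x, y = start
    for _ in range(n_steps):
        gx, gy = grad((x, y))
        x -= step_size * gx
        y -= step_size * gy
    return (x, y)


# The gradient of this f is Lipschitz with constant 4, so a step of 1/4 is a safe choice.
minimizer = gradient_descent(start=(0.0, 0.0), step_size=0.25, n_steps=50)
```

With this step size the iterates approach the minimizer (1, -2); a step larger than 2/L = 0.5 would make them diverge, which is where the Lipschitz constant from the theory block enters.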

Data. Summarizing data. Visualizing data. The fundamental problem of data analysis: uncertainty in our understanding of the data generating process.

Probability. The axiomatic, Bayesian and classical (frequentist) views of probability. Joint, marginal and conditional densities. Bayes' theorem.

Distributions. Common probability distributions. Distributions as a means for expressing probabilistic opinions. Distributions as data generators.

Fundamental statistical techniques. Monte Carlo integration. Bootstrap. Maximum likelihood estimation. Bayesian inference.
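Two of these techniques can be sketched together in a few lines. The example below estimates the integral of x^2 over [0, 1] by Monte Carlo and then uses the bootstrap to approximate the standard error of that estimate. The integrand, sample sizes, and variable names are assumptions made for illustration.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Monte Carlo integration: the integral of x^2 over [0, 1] equals E[U^2]
# for U ~ Uniform(0, 1), so the sample mean of U^2 estimates it.
n_samples = 2000
values = [rng.random() ** 2 for _ in range(n_samples)]
mc_estimate = statistics.fmean(values)  # the exact answer is 1/3

# Bootstrap: resample the values with replacement and recompute the mean each time.
# The spread of the resampled means approximates the standard error of mc_estimate.
n_resamples = 200
boot_means = [
    statistics.fmean(rng.choices(values, k=n_samples))
    for _ in range(n_resamples)
]
boot_std_error = statistics.stdev(boot_means)
```

The bootstrap standard error should be of the same order as the analytic value sqrt(Var(U^2) / n), which is roughly 0.007 for this sample size.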

Basic statistical tasks. Hypothesis testing vs Bayesian estimation.

The multivariate normal distribution. As a linear transformation. Linear regression. PCA.
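The linear-transformation view can be made concrete: if z has independent standard normal entries and L is a Cholesky factor of the covariance matrix, then x = mean + L z has the desired multivariate normal distribution. The sketch below does this for two dimensions; the mean, covariance values, and helper names are hypothetical.

```python
import math
import random


def cholesky_2x2(cov):
    """Lower-triangular L with L @ L.T == cov, for a 2x2 positive definite cov."""
    (a, b), (_, d) = cov
    l11 = math.sqrt(a)
    l21 = b / l11
    l22 = math.sqrt(d - l21 ** 2)
    return ((l11, 0.0), (l21, l22))


def sample_mvn_2d(mean, cov, n, rng):
    """Draw n samples of x = mean + L z, where z has independent N(0, 1) entries."""
    (l11, _), (l21, l22) = cholesky_2x2(cov)
    samples = []
    for _ in range(n):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        samples.append((mean[0] + l11 * z1, mean[1] + l21 * z1 + l22 * z2))
    return samples


rng = random.Random(0)
mean = (1.0, -1.0)
cov = ((4.0, 1.2), (1.2, 1.0))  # illustrative values
samples = sample_mvn_2d(mean, cov, n=20000, rng=rng)

# Empirical mean and covariance between the two coordinates;
# the covariance should be close to the target value 1.2.
m1 = sum(x for x, _ in samples) / len(samples)
m2 = sum(y for _, y in samples) / len(samples)
sample_cov12 = sum((x - m1) * (y - m2) for x, y in samples) / (len(samples) - 1)
```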

The course is an introductory overview of topics relevant to data science. The following topics will be presented to students through lectures by faculty members and guest lecturers from industry and research institutions:

Working with data. Getting. Processing. Storing. Cleaning. Summarizing. Visualizing.

Analytics. Prediction. Clustering. Statistical inference.

Business and social aspects. Privacy. Security. Ethics. Licensing. Intellectual property.

Best practices (tools). Programming, coding standards (Python). Versioning (GitHub). Reproducibility (Jupyter). Typesetting (LaTeX). Public repositories (arXiv, Zenodo).

Linear models. Linear regression. Linear discriminant analysis. Logistic regression. Gradient descent. Stochastic gradient descent.

The machine learning approach. Cost functions. Empirical risk minimization. Maximum likelihood estimation. Model evaluation. Cross-validation.

Feature selection. Search-based feature selection. Regularization.

Tree-based models. Decision trees. Random forest. Bagging. Gradient tree boosting.

Clustering. k-means. Expectation Maximization.
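For orientation, k-means alternates between assigning points to their nearest center and moving each center to the mean of its points (Lloyd's algorithm). The following is a minimal one-dimensional sketch; the data, initial centers, and function name are made up for the example.

```python
def k_means_1d(points, centers, n_iterations=10):
    """Lloyd's algorithm on one-dimensional data with given initial centers."""
    centers = list(centers)
    for _ in range(n_iterations):
        # Assignment step: attach each point to its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda j: abs(p - centers[j]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its cluster.
        # An empty cluster keeps its previous center.
        centers = [
            sum(c) / len(c) if c else old
            for c, old in zip(clusters, centers)
        ]
    return centers


data = [0.8, 1.0, 1.2, 4.8, 5.0, 5.2]  # made-up data with two obvious groups
final_centers = k_means_1d(data, centers=[0.0, 6.0])
```

On this data the centers settle near 1.0 and 5.0. Expectation maximization for a Gaussian mixture follows the same alternating pattern, with soft responsibilities in place of hard assignments.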

Non-linear regression. Basis functions. Splines. Support vector machines. Kernel trick.

Neural networks. Perceptron. Activation functions. Backpropagation.