Final project preparation guidelines and peer review
As peer review (and my review) will start on Monday, you have some more time to polish your repositories/projects. Some major remarks to keep in mind are the following:
- Comment on all the specifics of the algorithms you use or have designed (e.g. features, hyperparameters, ...). In some cases it is useful to include an image instead of a long description.
- When you include graphs/images, they must be readable and provide additional insight (e.g. an image with many overlapping lines or little space between them is not okay). When you report results, keep them in one table as much as possible, so that the reader can compare different configurations. It is also useful to bold the best results. Both images and tables should be self-contained, i.e. together with the caption they need to give the reader enough information to understand their meaning without reading the surrounding text.
- Keep your report concise and try not to submit a report longer than 8 pages. Also, make sure you follow the proposed template.
- Focus on reporting results using sensible measures. Try to find examples where your algorithm works well and examples where it fails entirely. Explain why, and justify the differences between the approaches you used. If previous work exists for your dataset, also include the best results of other researchers in your results table (even if your results are much lower).
- Your submitted work (repository and report) should be structured so that your colleagues are able to understand and re-run everything. Include all dependencies for your projects:
- If you used a non-public (or semi-public) dataset, do not include it in the repository; just provide your contact data or a protected link to download the data.
- Datasets that are available elsewhere should just be linked in your report/repository. If you performed additional transformations on datasets, the scripts for that should be available in the repository.
- Some of you used models that take a long time to train. You can include those models (maybe just the best one) in the repository, or host them elsewhere and link them.
- Machine translation groups must make their models available, as they take a long time to train!
- Lastly, check that your repositories are publicly available by Monday at 6am!
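For the dependency list, one common convention is to pin the exact package versions your experiments ran with, so reviewers can recreate the same environment before re-running anything. A minimal sketch (the `requirements.txt` workflow is an assumed convention, not a course requirement):

```shell
# Sketch (assumed convention, not part of the course instructions):
# capture the exact versions of every installed package into a file
# that reviewers can use to rebuild the environment.
pip freeze > requirements.txt   # pin every installed package version
head -n 3 requirements.txt      # quick look at the pinned list
```

A reviewer would then restore the environment with `pip install -r requirements.txt`.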
Please find the projects you need to review here: https://docs.google.com/spreadsheets/d/1iRAtijsrzY8I8l_ATmDF_JRMGx1e87-tWtVDINVt7-I/edit?usp=sharing.
Each group needs to review 2 projects on the same topic they have chosen. Please perform your peer review starting on Monday, May 24 at 6am and finish by 11:55pm.
Submit your peer review scores in the following Google Form: https://forms.gle/vk5PGS9BgpJfezRy8.
You will also be scored on your grading, depending on how much (allowing for some margin, of course) your grading differs from the assistant's grading.
The last submission is worth 30 points and consists of the following:
- 10 points: The repository includes a README.md with clear instructions on how to easily install prerequisites and run all the analysis. The code must be runnable, and the output must be similar to the results reported in the report. If your training takes a lot of time, include pre-trained models in your repository. The idea is not to run everything; run just some parts and check the code to get a feeling for whether the provided source code is runnable or something is missing.
- 20 points: The report follows the suggested structure (abstract, introduction, related work, methods, results and discussion, conclusion) and is at most roughly 8 pages long. The style is the same as defined in the project introduction slides. The report concisely describes data preprocessing, feature extraction, and approaches/algorithms (at least 3 approaches). Results and discussion present the results and comment on the performance of the algorithms. The discussion also points out where the algorithms do not work well and where they achieve good performance. In the conclusion the authors should point out the shortcomings of their approaches and give at least one idea for future improvement. When reading the report, you should get a feeling that the authors know what they are doing, selected sensible approaches, and critically explain and discuss their results. Assign points as follows:
- 5 points: abstract, introduction and related work
- 5 points: clear description of selected datasets and analysis
- 10 points: clear description of methods/algorithms used (at least 3) and adequate discussion of results.
In the review part of the form, write a justification of your scores.
Final defense will be organized via Zoom for lab sessions as follows:
- Tuesday, May 25 at 5pm: IMapBook project
- Wednesday, May 26 at 5pm: Offensive language exploratory analysis
- Thursday, May 27 at 12am: Offensive language identification
- Thursday, May 27 at 5pm: Machine translation
- * Machine translation groups will have a live session at the faculty (those who do not attend in person will be able to join via Zoom). You will get more information about this during next week.