2 results for objective and experiential knowledge

in AMS Tesi di Laurea - Alm@DL - Università di Bologna


Abstract:

Natural Language Processing (NLP) has improved dramatically over the last few years. Transformer architectures have achieved impressive results in almost every NLP task, including Text Classification, Machine Translation, and Language Generation. Over time, transformers have continued to improve thanks to larger corpora and bigger networks, reaching hundreds of billions of parameters. Training and deploying such large models has become prohibitively expensive, to the point that only large technology companies can afford to train them. Consequently, a great deal of research has been dedicated to reducing model size. In this thesis, we investigate the effects of Vocabulary Transfer and Knowledge Distillation for compressing large Language Models. The goal is to combine these two methodologies to compress models further without significant loss of performance. In particular, we designed different combination strategies and conducted a series of experiments on different vertical domains (medical, legal, news) and downstream tasks (Text Classification and Named Entity Recognition). Four methods involving Vocabulary Transfer (VIPI), with and without a Masked Language Modelling (MLM) step and with and without Knowledge Distillation, are compared against a baseline that assigns random vectors to new elements of the vocabulary. Results indicate that VIPI effectively transfers information from the original vocabulary and that MLM is beneficial. Vocabulary transfer and knowledge distillation are also orthogonal to one another and may be applied jointly; we recommend applying knowledge distillation first and vocabulary transfer afterwards. Finally, model performance under vocabulary transfer does not always show a consistent trend as the vocabulary size is reduced. Hence, the vocabulary size should be selected empirically by evaluation on the downstream task, much like hyperparameter tuning.

Abstract:

This thesis falls within the field of physics and mathematics education. The study was carried out using geometrical problems (in mathematics) and qualitative problems (in physics). We prepared four open-answer exercises for mathematics and three for physics. The test sample was drawn from two school stages: the end of middle school (third year, 8th grade) and the beginning of high school (second and third year, 10th and 11th grades respectively). High school students achieved the best results in almost every problem, but 10th grade students got the best overall results. Moreover, the first set of graphs shows a clear tendency not to even attempt qualitative problems, regardless of subject and grade. To improve students' problem-solving skills, it is worth investing in vertical learning and spiral curricula. It would also make sense to establish a stronger and clearer connection between physics and mathematical knowledge through an interdisciplinary approach.