2 results for tacit and explicit knowledge
in AMS Tesi di Laurea - Alm@DL - Università di Bologna
Abstract:
Natural Language Processing (NLP) has seen tremendous improvements over the last few years. Transformer architectures have achieved impressive results in almost every NLP task, such as Text Classification, Machine Translation, and Language Generation. Over time, transformers have continued to improve thanks to larger corpora and bigger networks, reaching hundreds of billions of parameters. Training and deploying such large models has become prohibitively expensive, so that only large technology companies can afford to train them. Therefore, a lot of research has been dedicated to reducing a model’s size. In this thesis, we investigate the effects of Vocabulary Transfer and Knowledge Distillation for compressing large Language Models. The goal is to combine these two methodologies to further compress models without significant loss of performance. In particular, we designed different combination strategies and conducted a series of experiments on different vertical domains (medical, legal, news) and downstream tasks (Text Classification and Named Entity Recognition). Four methods involving Vocabulary Transfer (VIPI), with and without a Masked Language Modelling (MLM) step and with and without Knowledge Distillation, are compared against a baseline that assigns random vectors to new elements of the vocabulary. Results indicate that VIPI effectively transfers information from the original vocabulary and that MLM is beneficial. Vocabulary transfer and knowledge distillation are also found to be orthogonal to one another and may be applied jointly. We recommend applying knowledge distillation first and vocabulary transfer afterwards. Finally, model performance under vocabulary transfer does not always show a consistent trend as the vocabulary size is reduced. Hence, the vocabulary size should be selected empirically by evaluation on the downstream task, similar to hyperparameter tuning.
Abstract:
To interpret patterns of biodiversity, it is important to understand the processes by which new species evolve and how closely related species remain reproductively isolated and ecologically differentiated. Divergence and differentiation can vary during speciation and can be observed at different stages. Groups of closely related taxa constitute important case studies for understanding the formation of species and new biodiversity. However, it is important to assess the divergence among them at different organismal levels and from an integrative perspective. For this purpose, this study used the brown seaweed genus Fucus as a model to study speciation, as it offers a good opportunity to study divergence at different stages. We investigated the divergence patterns in Fucus species from two marginal areas (the northern Baltic Sea and the Tjongspollen area), based on phenetic, phylogenetic and biological taxonomic criteria, characterised respectively by algal morphology, allele frequencies of five microsatellite loci and levels of secondary polyphenolic compounds called phlorotannins. The results showed divergence at the morphological and genetic levels to a certain extent but a complete lack of divergence at the biochemical level (i.e. constitutive phlorotannin production) in either the Baltic Sea or Norway. Morphological divergence was clearly evident in Tjongspollen (Norway) among putative taxa as identified in the field, and this divergence corresponds with their neutral genetic divergence. In the Baltic, there are to a certain extent some distinguishable patterns in the morphology of the Swedish and Finnish individuals according to locality, but not among putative taxa within localities. Likewise, these morphological patterns have genetic correspondence among localities but not within each locality. 
At the biochemical level, as measured by phlorotannin content, there was neither evidence of divergence in Norway or the Baltic Sea nor any discernible aggregation pattern among or within localities. Our study has contributed to further understanding of the Baltic Sea Fucus system and its intriguingly rapid and recent divergence, as well as of the Tjongspollen area systems, where formally undescribed individuals have been observed for the first time; in fact, they appear largely differentiated and may well warrant new species status. Climate change currently threatens peripheral ecosystems and biodiversity, so increased knowledge of the processes generating and maintaining biodiversity in those ecosystems seems particularly important and needed.