4 results for text analytic approaches
in AMS Tesi di Dottorato - Alm@DL - Università di Bologna
Abstract:
The aim of this Thesis is to investigate the possibility that observations of the epoch of reionization can probe not only the evolution of the IGM state, but also the cosmological background in which this process occurs. The history of IGM ionization is affected by the evolution of the sources of ionizing photons, which, under a structure formation paradigm determined by the hierarchical growth of matter fluctuations, depends strongly on the characteristics of the background universe. For the purpose of our investigation, we have analysed the reionization history in innovative cosmological frameworks that are still in agreement with recent observational tests based on the SNIa and CMB probes, comparing our results with the reionization scenario predicted by the commonly used LCDM cosmology. In particular, in this Thesis we have considered two alternative universes. The first is a flat universe dominated at late epochs by a dynamic dark energy component, characterized by an equation of state evolving in time. The second is an LCDM universe characterized by a primordial overdensity field with a non-Gaussian probability distribution. The reionization scenario has been investigated through semi-analytic approaches based on the hierarchical growth of matter fluctuations and on suitable assumptions concerning the ionization and recombination of the IGM. We make predictions for the evolution and distribution of the HII regions, and for the global features of reionization, that can be constrained by future observations. Finally, we briefly discuss the possible future prospects of this Thesis work.
Abstract:
The thesis addresses the topic of school innovation, a subject of constant attention from international organizations and national education systems because of its economic, social, and political implications, and aims to contribute to the systematic and analytical study of projects and experiences of comprehensive innovation of the learning environment. The concept of learning environment is examined from the different reference perspectives, with specific attention to the framework of the "Innovative Learning Environments" [ILE] project of the Organisation for Economic Co-operation and Development [OECD], which, from an explicitly holistic perspective, identifies the learning environment as the key to innovating education in the direction of twenty-first-century competences. The criteria in the project's reference framework were used to analyze the experience proposed as a case study, Scuola-Città Pestalozzi in Florence, chosen because in the 2011/2012 school year it put into practice precisely such a "design" for transforming the learning environment and, in particular, the characteristics of school time. The research, conducted with a qualitative methodology, was oriented toward bringing out the interpretations of the protagonists of the innovation under study: the analysis of the project and of all the documentation provided by the school yielded the outline for an exploratory focus group, through which the themes were selected for the semi-structured interviews with teachers (primary school and lower secondary school). As for the interpretation of the results, the interview transcripts were analyzed with a phenomenographic approach, by identifying textual units logically connected to pertinent conceptual categories.
The analysis of the empirical materials made it possible to identify interpretive categories concerning the nature and aims of the teaching/learning experiences, the organizational process, and sustainability. Among the implications of the research, those relating to the teaching role are considered particularly relevant.
Abstract:
Information is nowadays a key resource: machine learning and data mining techniques have been developed to extract high-level information from large amounts of data. As most data comes in the form of unstructured natural-language text, research on text mining is currently very active and addresses practical problems. Among these, text categorization deals with the automatic organization of large quantities of documents into predefined taxonomies of topic categories, possibly arranged in large hierarchies. In commonly proposed machine learning approaches, classifiers are automatically trained from pre-labeled documents: they can perform very accurate classification, but often require a sizable training set and notable computational effort. Methods for cross-domain text categorization have been proposed, which make it possible to leverage a set of labeled documents from one domain to classify those of another. Most methods use advanced statistical techniques, usually involving parameter tuning. A first contribution presented here is a method based on nearest centroid classification, where profiles of categories are generated from the known domain and then iteratively adapted to the unknown one. Despite being conceptually simple and having easily tuned parameters, this method achieves state-of-the-art accuracy on most benchmark datasets with fast running times. A second, deeper contribution involves the design of a domain-independent model to distinguish the degree and type of relatedness between arbitrary documents and topics, inferred from the different types of semantic relationships between their respective representative words, identified by specific search algorithms. The application of this model is tested on both flat and hierarchical text categorization, where it potentially allows the efficient addition of new categories during classification.
Results show that classification accuracy still requires improvement, but that models generated from one domain can be effectively reused in a different one.
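The nearest-centroid idea mentioned in this abstract can be illustrated with a minimal sketch. This is not the thesis's actual method: the whitespace tokenizer, the cosine-style similarity, the stopping rule, and all function names (`vectorize`, `centroid`, `cross_domain_classify`) are assumptions made for illustration. Category profiles are built from labeled source-domain documents, and are then repeatedly rebuilt from the target-domain documents currently assigned to each category until the assignments stop changing.

```python
import math
from collections import Counter, defaultdict


def vectorize(text):
    """Turn a text into an L2-normalized bag-of-words vector (a dict)."""
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {w: c / norm for w, c in counts.items()} if norm else {}


def dot(u, v):
    """Dot product of two sparse vectors; equals cosine for unit vectors."""
    return sum(x * v.get(w, 0.0) for w, x in u.items())


def centroid(vectors):
    """Sum sparse vectors and normalize (same direction as their mean)."""
    total = defaultdict(float)
    for vec in vectors:
        for w, x in vec.items():
            total[w] += x
    norm = math.sqrt(sum(x * x for x in total.values()))
    return {w: x / norm for w, x in total.items()} if norm else {}


def nearest(vec, profiles):
    """Label of the most similar profile; ties go to the first label alphabetically."""
    return max(sorted(profiles), key=lambda label: dot(vec, profiles[label]))


def cross_domain_classify(source_docs, source_labels, target_docs, max_iters=10):
    """Classify target_docs using profiles learned on the source domain,
    then iteratively adapted to the target domain (illustrative only)."""
    src = [vectorize(d) for d in source_docs]
    tgt = [vectorize(d) for d in target_docs]

    # Initial category profiles come from the labeled (known) domain.
    by_label = defaultdict(list)
    for vec, label in zip(src, source_labels):
        by_label[label].append(vec)
    profiles = {label: centroid(vs) for label, vs in by_label.items()}

    assigned = None
    for _ in range(max_iters):
        new_assigned = [nearest(v, profiles) for v in tgt]
        if new_assigned == assigned:  # assignments are stable: stop
            break
        assigned = new_assigned
        # Rebuild each profile from the target documents assigned to it;
        # a category with no assigned documents keeps its previous profile.
        groups = defaultdict(list)
        for vec, label in zip(tgt, assigned):
            groups[label].append(vec)
        for label, vs in groups.items():
            profiles[label] = centroid(vs)
    return assigned
```

In this sketch a target document that shares no vocabulary with the source domain can still be reached after adaptation, because the rebuilt profiles absorb target-domain words from documents that were classified in earlier rounds.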
Abstract:
The availability of a huge amount of source code from code archives and open-source projects opens up the possibility of merging the machine learning, programming languages, and software engineering research fields. This area is often referred to as Big Code: programming languages are treated in place of natural languages, and different features and patterns of code can be exploited to perform many useful tasks and build supportive tools. Among all the possible applications that can be developed within the area of Big Code, the work presented in this research thesis mainly focuses on two particular tasks: Programming Language Identification (PLI) and Software Defect Prediction (SDP) for source code. Programming language identification is commonly needed in program comprehension and is usually performed directly by developers. However, at large scale, such as in widely used archives (GitHub, Software Heritage), automation of this task is desirable. To this end, the problem is analyzed from different points of view (text-based and image-based learning approaches) and different models are created, paying particular attention to their scalability. Software defect prediction is a fundamental step in software development for improving quality and assuring the reliability of software products. In the past, defects were sought by manual inspection or by using automatic static and dynamic analyzers. Now, the automation of this task can be tackled using learning approaches that can speed up and improve the related procedures. Here, two models have been built and analyzed to detect some of the most common bugs and errors at different code granularity levels (file and method levels). The data used and the models' architectures are analyzed and described in detail. Quantitative and qualitative results are reported for both the PLI and SDP tasks, and differences from and similarities to related works are discussed.