7 results for Nearest Neighbour

in AMS Tesi di Dottorato - Alm@DL - Università di Bologna


Relevance:

60.00%

Abstract:

Machine learning comprises a series of techniques for the automatic extraction of meaningful information from large collections of noisy data. In many real-world applications, data is naturally represented in structured form. Since traditional methods in machine learning deal with vectorial information, they require an a priori form of preprocessing. Among the learning techniques for dealing with structured data, kernel methods are recognized to have a strong theoretical background and to be effective approaches. They do not require an explicit vectorial representation of the data in terms of features, but rely on a measure of similarity between any pair of objects of a domain, the kernel function. Designing fast and good kernel functions is a challenging problem. In the case of tree-structured data two issues become relevant: kernels for trees should not be sparse and should be fast to compute. The sparsity problem arises when, given a dataset and a kernel function, most structures of the dataset are completely dissimilar to one another. In those cases the classifier has too little information for making correct predictions on unseen data. In fact, it tends to produce a discriminating function that behaves like the nearest neighbour rule. Sparsity is likely to arise for some standard tree kernel functions, such as the subtree and subset tree kernels, when they are applied to datasets with node labels belonging to a large domain. A second drawback of using tree kernels is the time complexity required in both the learning and classification phases. Such complexity can sometimes prevent the application of kernels in scenarios involving large amounts of data. This thesis proposes three contributions for resolving the above issues of kernels for trees. A first contribution aims at creating kernel functions which adapt to the statistical properties of the dataset, thus reducing its sparsity with respect to traditional tree kernel functions.
Specifically, we propose to encode the input trees by an algorithm able to project the data onto a lower dimensional space with the property that similar structures are mapped similarly. By building kernel functions on the lower dimensional representation, we are able to perform inexact matchings between different inputs in the original space. A second contribution is the proposal of a novel kernel function based on the convolution kernel framework. A convolution kernel measures the similarity of two objects in terms of the similarities of their subparts. Most convolution kernels are based on counting the number of shared substructures, partially discarding information about their position in the original structure. The kernel function we propose is, instead, especially focused on this aspect. A third contribution is devoted to reducing the computational burden related to the calculation of a kernel function between a tree and a forest of trees, which is a typical operation in the classification phase and, for some algorithms, also in the learning phase. We propose a general methodology applicable to convolution kernels. Moreover, we show an instantiation of our technique when kernels such as the subtree and subset tree kernels are employed. In those cases, Directed Acyclic Graphs can be used to compactly represent shared substructures in different trees, thus reducing the computational burden and storage requirements.
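
As an illustration of the subtree kernel and the sparsity problem mentioned in the abstract, the following minimal sketch counts the complete subtrees (a node together with all its descendants) shared by two ordered, labelled trees. It is not the thesis's implementation: the nested-tuple tree encoding, the function names and the absence of a decay factor are assumptions made for the example, and labels are assumed not to contain parentheses.

```python
from collections import Counter


def subtree_signatures(tree, bag):
    """Record a canonical string for every complete subtree of `tree` in `bag`.

    A tree is a nested tuple (label, child_1, child_2, ...); this encoding is
    a hypothetical choice for the sketch. Returns the signature of `tree`.
    """
    label, *children = tree
    child_sigs = [subtree_signatures(child, bag) for child in children]
    signature = "(" + label + "".join(child_sigs) + ")"
    bag[signature] += 1
    return signature


def subtree_kernel(t1, t2):
    """Count pairs of identical complete subtrees between t1 and t2."""
    bag1, bag2 = Counter(), Counter()
    subtree_signatures(t1, bag1)
    subtree_signatures(t2, bag2)
    # Each shared signature contributes (occurrences in t1) * (occurrences in t2).
    return sum(count * bag2[sig] for sig, count in bag1.items())


if __name__ == "__main__":
    a = ("S", ("NP", ("D",), ("N",)), ("VP", ("V",)))
    b = ("S", ("NP", ("D",), ("N",)), ("VP", ("V",), ("NP", ("D",), ("N",))))
    c = ("X", ("Y",), ("Z",))  # shares no labels with a or b
    print(subtree_kernel(a, b), subtree_kernel(a, c))
```

When node labels come from a large domain, most pairs behave like `a` and `c`: the kernel value is zero, the Gram matrix is close to diagonal, and a kernel classifier has little to generalize from.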

Relevance:

10.00%

Abstract:

Tuber borchii (Ascomycota, order Pezizales) is a highly valued truffle sold in local markets in Italy. Despite its economic importance, knowledge of its distribution and population variation is scarce. The objective of this work was to investigate the evolutionary forces shaping the genetic structure of this fungus, using coalescent and phylogenetic methods to reconstruct the evolutionary history of populations in Italy. To assess population structure, 61 specimens were collected from 11 different Provinces of Italy. Sampling was stratified across hosts and habitats to maximize coverage in native oak and pine stands, and both mycorrhizae and fruiting bodies were collected. Samples were identified on the basis of anatomo-morphological characters. DNA was extracted and both multilocus (AFLP) and single-locus (18 loci from rDNA, nDNA, and mtDNA) approaches were used to look for polymorphisms. In screening AFLP profiles, both Jaccard and Dice coefficients of similarity were used to transform the binary matrix into a distance matrix and then to infer Neighbour-Joining trees. Though these are only preliminary examinations, the phylogenetic trees were fully concordant with those derived from single-locus analyses. Phylogenetic analyses of the nuclear loci were performed using maximum likelihood with PAUP, and a combined phylogenetic inference, using Bayesian estimation with all nuclear gene regions, was carried out. To reconstruct the evolutionary history, we estimated recurrent migration and migration across the history of the sample, and estimated the mutation and approximate age of mutations in each tree using SNAP Workbench. The combined phylogenetic tree obtained by Bayesian estimation suggests that there are two main haplotypes that are difficult to differentiate on the basis of morphology, ecological parameters and symbiotic host tree. Between these two lineages, which occur in sympatry within T. borchii populations, there is no evidence of recurrent migration.
However, migration over the history of the sample was asymmetrical, suggesting that isolation was a result of interrupted gene flow followed by range expansion. Low levels of divergence between the haplotypes indicate that there are likely to be two cryptic species within the T. borchii population sampled. Our results suggest that isolation between populations of T. borchii could have led to reproductive isolation between two lineages. This isolation is likely due to sympatric speciation caused by multiple colonization from different refugia or by a recent isolation. In attempting to determine whether these haplotypes represent separate species or a partition of the same species, we applied the Biological and Mechanistic Species Concepts. Nevertheless, further analyses are necessary to evaluate whether selection favoured pre-mating or post-mating isolation.
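
The step from binary AFLP profiles to a distance matrix mentioned in the abstract can be sketched as follows. This is an illustrative sketch only: the profile encoding (lists of 0/1 band scores), the function names and the `1 - similarity` conversion are assumptions, not details taken from the thesis, and the Neighbour-Joining step itself is not shown.

```python
def shared_bands(a, b):
    """Number of positions where both binary profiles score a band (1)."""
    return sum(1 for x, y in zip(a, b) if x and y)


def jaccard(a, b):
    """Jaccard similarity: shared bands / bands present in at least one profile."""
    either = sum(1 for x, y in zip(a, b) if x or y)
    return shared_bands(a, b) / either if either else 1.0


def dice(a, b):
    """Dice similarity: 2 * shared bands / total bands in both profiles."""
    total = sum(a) + sum(b)
    return 2 * shared_bands(a, b) / total if total else 1.0


def distance_matrix(profiles, similarity):
    """Pairwise distances, assuming distance = 1 - similarity."""
    return [[1.0 - similarity(p, q) for q in profiles] for p in profiles]


if __name__ == "__main__":
    # Hypothetical band-presence profiles for three specimens.
    profiles = [[1, 1, 0, 1], [1, 0, 0, 1], [0, 0, 1, 0]]
    for row in distance_matrix(profiles, jaccard):
        print(row)
```

The resulting symmetric matrix is the kind of input a Neighbour-Joining implementation expects; Dice weights shared bands more heavily than Jaccard, so the two coefficients can yield different branch lengths for the same data.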

Relevance:

10.00%

Abstract:

In the last decade, interest in submarine instability has grown, driven by the increasing exploitation of natural resources (primarily hydrocarbons), the emplacement of bottom-lying structures (cables and pipelines) and the development of coastal areas, whose infrastructures increasingly protrude into the sea. The great interest in this topic promoted a number of international projects such as: STEAM (Sediment Transport on European Atlantic Margins, 93-96), ENAM II (European North Atlantic Margin, 96-99), GITEC (Genesis and Impact of Tsunamis on the European Coast 92-95), STRATAFORM (STRATA FORmation on Margins, 95-01), Seabed Slope Process in Deep Water Continental Margin (Northwest Gulf of Mexico, 96-04), COSTA (Continental slope Stability, 00-05), EUROMARGINS (Slope Stability on Europe’s Passive Continental Margin), SPACOMA (04-07), EUROSTRATAFORM (European Margin Strata Formation), NGI's internal project SIP-8 (Offshore Geohazards), IGCP-511: Submarine Mass Movements and Their Consequences (05-09) and projects indirectly related to instability processes, such as TRANSFER (Tsunami Risk ANd Strategies For the European region, 06-09) or NEAREST (integrated observations from NEAR shore sourcES of Tsunamis: towards an early warning system, 06-09). In Italy, apart from a national project carried out within the activities of the National Group of Volcanology during 2000-2003, “Conoscenza delle parti sommerse dei vulcani italiani e valutazione del potenziale rischio vulcanico”, the study of submarine mass movements was underestimated until the landslide-tsunami events that affected Stromboli on December 30, 2002. This event made Italian institutions and the scientific community more aware of the hazard related to submarine landslides, mainly in light of the growing anthropization of coastal sectors, which increases the vulnerability of these areas to the consequences of such processes.
In this regard, two important national projects have recently been funded to study coastal instabilities (PRIN 24, 06-08) and to map the main submarine hazard features on continental shelves and upper slopes around most of the Italian coast (MaGIC Project). The study carried out in this Thesis aims at understanding these processes, with particular reference to the submerged flanks of Stromboli. The latter represent a natural laboratory in this regard, as several kinds of instability phenomena are present on the submerged flanks, affecting about 90% of the entire submerged area and often (strongly) influencing the morphological evolution of subaerial slopes, as witnessed by the event that occurred on 30 December 2002. Furthermore, each phenomenon is characterized by different pre-failure, failure and post-failure mechanisms, ranging from rock-falls to turbidity currents up to catastrophic sector collapses. The Thesis opens with three introductory chapters, covering a brief review of submarine instability phenomena and related hazard (chapter 1), a “bird’s-eye” view of methodologies and available datasets (chapter 2) and a short introduction to the evolution and the morpho-structural setting of the Stromboli edifice (chapter 3). The latter seems to play a major role in the development of large-scale sector collapses at Stromboli, as they occurred perpendicular to the orientation of the main volcanic rift axis (oriented in a NE-SW direction). The characterization of these events and their relationships with subsequent erosive-depositional processes represents the main focus of chapter 4 (Offshore evidence of large-scale lateral collapses on the eastern flank of Stromboli, Italy, due to structurally-controlled, bilateral flank instability) and chapter 5 (Lateral collapses and active sedimentary processes on the North-western flank of Stromboli Volcano), which consist of articles accepted for publication in an international journal (Marine Geology).
Moreover, these studies highlight the hazard related to these catastrophic events; several calamities (with more than 40000 casualties in the last two centuries alone) have been, in fact, the direct or indirect result of landslides affecting volcanic flanks, as observed at Oshima-Oshima (1741) and Unzen Volcano (1792) in Japan (Satake & Kato, 2001; Brantley & Scott, 1993), Krakatau (1883) in Indonesia (Self & Rampino, 1981), Ritter Island (1888), Sissano in Papua New Guinea (Ward & Day, 2003; Johnson, 1987; Tappin et al., 2001) and Mt St. Augustine (1883) in Alaska (Beget & Kienle, 1992). Flank landslides are also recognized as the most important and efficient mass-wasting process on volcanoes, contributing to the development of the edifices by widening their base and to the growth of a volcaniclastic apron at the foot of a volcano; a number of small and medium-scale erosive processes are also responsible for the carving of Stromboli's submarine flanks and the transport of debris towards the deeper areas. The characterization of features associated with these processes is the main focus of chapter 6; it is also important to highlight that some small-scale events are able to cause damage to coastal areas, as also witnessed by the recent events of Gioia Tauro (1978), Nice (1979) and Stromboli (2002). The hazard potential related to these phenomena is, in fact, very high, as they commonly occur at a higher frequency than large-scale collapses and are therefore more significant on human timescales. In the last chapter (chapter 7), a brief review and discussion of the instability processes identified on Stromboli's submerged flanks is presented; they are also compared with analogous processes recognized in other submerged areas in order to shed light on the main factors involved in their development. Finally, some applications of multibeam data to assess the hazard related to these phenomena are also discussed.

Relevance:

10.00%

Abstract:

The plain at the mouth of the Garigliano river (on the border between Lazio and Campania) was characterized, until recent times, by the presence of marshy areas and wetlands. The ongoing study seeks to reconstruct the evolution of the coastal environment by relating it to human presence, land management, historical events and climatic variations, using multiple methodologies typical of geoarchaeology. It is a multidisciplinary approach that seeks to bring together analyses typical of archaeology, ancient topography, geomorphology, geology and palaeobotany. Until the Iron Age the only trace of settlement comes from Monte d'Argento, an isolated rocky spur along the coast, located at the western edge of an underlying environment that appears to have been a closed marsh isolated from external sedimentary input. With the transition to the Iron Age an environmental change took place, with the end of the large marsh and the formation of a small lagoon partially connected to the sea. The arrival of the Romans at the end of the 3rd century BC marks the disappearance of the large centres of the Aurunci and the foundation of three colonies (Sessa Aurunca, Sinuessa, Minturno). Land organization works did not, however, affect the coastal wetlands, which were not reclaimed or used for agricultural purposes but kept their nature as small coastal lakes. This period is therefore characterized by a widespread distribution of settlements, based on small farms or installations linked to agricultural exploitation. Few archaeological areas have yielded materials later than the 2nd-3rd century AD. The city nevertheless remained inhabited until the 6th-7th century, when political instability and the spread of marshland must have made the area rather unsafe, encouraging a move towards the hilly areas.
A medieval settlement is attested only at Monte d'Argento, and a Saracen presence at the beginning of the 9th century is reported by literary sources, but there is as yet no archaeological documentation of it.

Relevance:

10.00%

Abstract:

This dissertation investigates corporate governance and dividend policy in banking. This topic has recently attracted the attention of numerous scholars all over the world and currently remains one of the most discussed topics in banking. The core of the dissertation consists of three papers. The first paper summarizes the main findings in the relevant field of study using meta-analysis. The second paper provides an empirical analysis of the effect of bank corporate governance on dividend payout. Finally, the third paper investigates empirically the effect of government bailouts during 2007-2010 on the corporate governance and dividend policy of banks. The dissertation uses a new hand-collected data set with information on corporate governance, ownership structure and compensation structure for a sample of listed banks from 15 European countries for the period 2005-2010. The empirical papers employ econometric approaches such as the Within-Group model, the difference-in-differences technique, and the propensity score matching method based on the Nearest Neighbor Matching estimator. The main empirical results may be summarized as follows. First, we provide evidence that CEO power and connection to government are associated with lower dividend payout ratios. This result supports the view that banking regulators are predominantly concerned about the safety of the bank, and that powerful bank CEOs can afford to distribute low payout ratios at the expense of minority shareholders. Next, we find that government bailouts during 2007-2010 changed the banks' ownership structure and helped to keep lending by bailed-out banks at the pre-crisis level. Finally, we provide robust evidence of increased control over the banks that received government money. These findings show the important role of government in overcoming the consequences of the banking crisis, and the high quality of governance of public bailouts in European countries.
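
The nearest-neighbour matching estimator named in the abstract can be illustrated with a minimal sketch. It assumes the propensity scores have already been estimated (for example by a logistic regression, not shown), uses one-to-one matching with replacement, and estimates the average treatment effect on the treated (ATT). The function name, the data layout and the numbers in the usage example are hypothetical and are not taken from the dissertation.

```python
def nearest_neighbour_att(treated, controls):
    """Average treatment effect on the treated via 1-nearest-neighbour matching.

    `treated` and `controls` are lists of (propensity_score, outcome) pairs.
    Each treated unit is matched, with replacement, to the control unit whose
    propensity score is closest.
    """
    if not treated or not controls:
        raise ValueError("both groups must be non-empty")
    differences = []
    for score, outcome in treated:
        match = min(controls, key=lambda unit: abs(unit[0] - score))
        differences.append(outcome - match[1])
    return sum(differences) / len(differences)


if __name__ == "__main__":
    # Made-up values: (propensity score, dividend payout ratio).
    bailed_out = [(0.80, 0.30), (0.40, 0.50)]
    not_bailed_out = [(0.75, 0.40), (0.35, 0.55), (0.10, 0.70)]
    print(nearest_neighbour_att(bailed_out, not_bailed_out))
```

Matching with replacement keeps every treated unit but may reuse the same control several times; practical applications usually add a caliper (a maximum allowed score distance) and check covariate balance after matching.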

Relevance:

10.00%

Abstract:

In Sub-Saharan Africa, non-democratic events such as civil wars and coups d'etat destroy economic development. This study investigates both domestic and spatial effects on the likelihood of civil wars and coups d'etat. For civil wars, a common research conclusion is that higher income growth helps to stop wars. This study adds a concern with ethnic fractionalization. IV-2SLS is applied to overcome the causality problem. The findings show that income growth significantly reduces both the number of wars and the degree of violence in highly ethnically fractionalized countries; otherwise there is a trade-off between the two. In countries with a few large ethnic groups, income growth reduces the number of wars but increases their level of violence. Policies promoting growth should therefore consider ethnic composition. This study also investigates the clustering and contagion of civil wars using spatial panel data models. The onset, incidence and end of civil conflicts spread across the network of neighboring countries, while peace, the end of conflicts, diffuses only to the nearest neighbor. There is evidence that neighboring income growth, without too much inequality, indirectly reduces the likelihood of civil wars. For coups d'etat, this study revisits their diffusion, both for all types of coups and for successful ones only. The results indicate the existence of both domestic and spatial determinants in different periods. Domestic income growth plays a major role in reducing the likelihood of a coup before the end of the Cold War, while spatial effects have a negative effect on it afterward. Results on the probability of a coup succeeding are similar. After the end of the Cold War, international organisations seriously promoted democracy, with pressure against coups d'etat, and this seems to have been effective. In sum, this study indicates the role of domestic ethnic fractionalization and the spread of neighboring effects in the likelihood of non-democratic events in a country. Policy implementation should take these factors into account.

Relevance:

10.00%

Abstract:

Information is nowadays a key resource: machine learning and data mining techniques have been developed to extract high-level information from large amounts of data. As most data comes in the form of unstructured text in natural languages, research on text mining is currently very active and deals with practical problems. Among these, text categorization deals with the automatic organization of large quantities of documents into predefined taxonomies of topic categories, possibly arranged in large hierarchies. In commonly proposed machine learning approaches, classifiers are automatically trained from pre-labeled documents: they can perform very accurate classification, but often require a substantial training set and notable computational effort. Methods for cross-domain text categorization have been proposed, making it possible to leverage a set of labeled documents from one domain to classify those of another. Most methods use advanced statistical techniques, usually involving the tuning of parameters. A first contribution presented here is a method based on nearest centroid classification, where profiles of categories are generated from the known domain and then iteratively adapted to the unknown one. Despite being conceptually simple and having easily tuned parameters, this method achieves state-of-the-art accuracy on most benchmark datasets with fast running times. A second, deeper contribution involves the design of a domain-independent model to distinguish the degree and type of relatedness between arbitrary documents and topics, inferred from the different types of semantic relationships between their respective representative words, identified by specific search algorithms. The application of this model is tested on both flat and hierarchical text categorization, where it potentially allows the efficient addition of new categories during classification.
Results show that classification accuracy still requires improvement, but models generated from one domain are shown to be effectively reusable in a different one.
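
The idea of nearest centroid classification with profiles iteratively adapted to an unlabelled domain can be sketched as below. This is a simplified reading of the abstract, not the thesis's algorithm: the sparse term-weight dictionaries, the cosine similarity, the plain re-averaging update and all function names are assumptions made for the example. A category that receives no target documents in some iteration simply drops out of the profiles in this sketch.

```python
import math
from collections import defaultdict


def cosine(u, v):
    """Cosine similarity between two sparse term-weight dictionaries."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def centroids(docs, labels):
    """Category profiles: the mean term-weight vector of each category."""
    sums = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(int)
    for doc, label in zip(docs, labels):
        counts[label] += 1
        for term, weight in doc.items():
            sums[label][term] += weight
    return {c: {t: w / counts[c] for t, w in vec.items()} for c, vec in sums.items()}


def classify(docs, profiles):
    """Assign each document to the category with the most similar profile."""
    return [max(sorted(profiles), key=lambda c: cosine(d, profiles[c])) for d in docs]


def adapt(source_docs, source_labels, target_docs, max_iters=10):
    """Build profiles on the labelled source domain, then repeatedly rebuild
    them from the target documents' predicted labels until these stop changing."""
    predicted = classify(target_docs, centroids(source_docs, source_labels))
    for _ in range(max_iters):
        updated = classify(target_docs, centroids(target_docs, predicted))
        if updated == predicted:
            break
        predicted = updated
    return predicted
```

In this sketch, a target document that shares no terms with the source domain can still end up in the right category once the profiles have absorbed the target domain's own vocabulary.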