924 results for Automatic Analysis of Multivariate Categorical Data Sets
Abstract:
Analysts, politicians and international players from all over the world regard China as one of the most powerful countries on the international stage, and as a country whose economic development can significantly affect the economies of the rest of the world. However, many aspects of this country have still to be investigated. The first is the fundamental role still played by Chinese rural areas in the general development of the country from a political, economic and social point of view. In particular, the way in which the rural areas have influenced the social stability of the whole country has been widely discussed, owing to their close relationship with the urban areas, to which most people from the countryside migrate in search of a job and a better life. In recent years many studies have focused on the urbanization phenomenon, with little interest in the living conditions in rural areas and in the deep changes that have occurred in some mainly agricultural provinces. An analysis of the level of infrastructure is one of the main ways to highlight the principal differences in living conditions between rural and urban areas. In this thesis, I first carried out the analysis using a multivariate statistical approach (Principal Component Analysis and Cluster Analysis) in order to define a new map of rural areas based on living conditions. In the second part I developed an index (the Living Conditions Index, LCI) using a Fuzzy Expert/Inference System. Finally, I compared this index with the results of the cluster analysis by drawing geographic maps. The data source is the second national agricultural census of China, carried out in 2006. In particular, I analysed the data that refer to villages but are aggregated at province level.
Abstract:
The subject of this thesis is the development of a gas chromatography (GC) system for non-methane hydrocarbons (NMHCs) and the measurement of samples within the project CARIBIC (Civil Aircraft for the Regular Investigation of the atmosphere Based on an Instrument Container, www.caribic-atmospheric.com). Air samples collected at cruising altitude from the upper troposphere and lowermost stratosphere contain hydrocarbons at low levels (ppt range), which imposes substantial demands on detection limits. Full automation made it possible to maintain constant conditions during sample processing and analysis. Additionally, automation allows overnight operation, thus saving time. A gas chromatograph using flame ionization detection (FID) together with the dual-column approach enables simultaneous detection with an almost equal carbon atom response for all hydrocarbons except ethyne. The first part of this thesis presents technical descriptions of the individual parts of the analytical system. Apart from the sample treatment and calibration procedures, the sample collector is described. The second part deals with the analytical performance of the GC system by discussing the tests that were made. Finally, results for the measurement flights are assessed in terms of data quality, and two flights are discussed in detail. Analytical performance is characterized using the detection limits and uncertainties for each compound, tests of calibration mixture conditioning and of the carbon dioxide trap to determine their influence on the analyses, and finally a comparison of the responses of calibrated substances during the period when the flight analyses were made. Comparison of both systems shows good agreement. However, because of the insufficient capacity of the CO2 trap, the signal of one column was suppressed by carbon dioxide breakthrough to such an extent that its results appeared to be unreliable.
Plausibility tests for the internal consistency of the given data sets are based on common patterns exhibited by tropospheric NMHCs. All tests show that samples from the first flights do not comply with the expected pattern. Additionally, detected alkene artefacts suggest potential problems with storage or contamination in all measurement flights. The last two flights, #130-133 and #166-169, comply with the tests; they are therefore analysed in detail. Samples were analysed in terms of their origin (troposphere vs. stratosphere, backward trajectories) and their aging (NMHC ratios), and detected plumes were compared to the chemical signatures of Asian outflows. The last chapter outlines future development of the presented system, with a focus on separation. An extensive appendix documents all important aspects of the dissertation, from the theoretical introduction through illustrations of sample treatment to overview diagrams for the measured flights.
Abstract:
Perfusion CT imaging of the liver has the potential to improve the evaluation of tumour angiogenesis. Quantitative parameters can be obtained by applying mathematical models to the Time Attenuation Curve (TAC). However, accurate quantification of perfusion parameters is still difficult because of, for example, the algorithms employed, the mathematical model, the patient's weight and cardiac output, and the acquisition system. In this thesis, new parameters and alternative methodologies for liver perfusion CT are presented in order to investigate the causes of variability in this technique. First, analyses were made to assess the variability related to the mathematical model used to compute arterial Blood Flow (BFa) values. Results were obtained by implementing algorithms based on the "maximum slope method" and the "dual input one compartment model". Statistical analysis on simulated data demonstrated that the two methods are not interchangeable. However, the slope method is always applicable in a clinical context. Then the variability related to TAC processing in the application of the slope method is analysed. Comparing the results with manual selection makes it possible to identify the best automatic algorithm for computing BFa. The consistency of a Standardized Perfusion Index (SPV) was evaluated and a simplified calibration procedure was proposed. Finally, the quantitative value of the perfusion map was analysed. The ROI approach and the map approach provide related values of BFa, which means that the pixel-by-pixel algorithm gives reliable quantitative results. In the pixel-by-pixel approach, too, the slope method gives better results. In conclusion, the development of new automatic algorithms for a consistent computation of BFa, and the analysis and definition of a simplified technique for computing the SPV parameter, represent an improvement in the field of liver perfusion CT analysis.
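As an illustration of the maximum slope method named above, the following is a minimal sketch and not the thesis' algorithm. It assumes the common simplified form in which perfusion is the steepest upslope of the tissue TAC divided by the peak arterial enhancement. The function name, the argument names and the toy curves are hypothetical, and the curves are assumed to be baseline-subtracted and sampled at the same time points.

```python
def max_slope_perfusion(times, tissue, arterial):
    """Simplified maximum-slope estimate (hypothetical helper).

    times    -- sample times in seconds, strictly increasing
    tissue   -- baseline-subtracted tissue enhancement (HU)
    arterial -- baseline-subtracted arterial input enhancement (HU)
    Returns the steepest tissue upslope divided by the arterial peak,
    i.e. flow per unit tissue volume per second.
    """
    if not (len(times) == len(tissue) == len(arterial)) or len(times) < 2:
        raise ValueError("need at least two samples and equal-length inputs")
    slopes = []
    for i in range(len(times) - 1):
        dt = times[i + 1] - times[i]
        if dt <= 0:
            raise ValueError("times must be strictly increasing")
        # Finite-difference slope between consecutive samples.
        slopes.append((tissue[i + 1] - tissue[i]) / dt)
    peak_arterial = max(arterial)
    if peak_arterial <= 0:
        raise ValueError("arterial curve shows no enhancement")
    return max(slopes) / peak_arterial


# Toy curves invented for illustration only; they are not data from the thesis.
times = [0.0, 1.0, 2.0, 3.0]
tissue = [0.0, 2.0, 6.0, 8.0]
arterial = [0.0, 50.0, 100.0, 60.0]
bfa = max_slope_perfusion(times, tissue, arterial)
```

A liver-specific implementation would typically restrict the upslope search to the arterial phase and smooth the TAC first; both steps are omitted here, and the choice of smoothing is exactly the kind of TAC processing whose variability the abstract discusses.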
Abstract:
This thesis concerns artificially intelligent natural language processing systems that are capable of learning the properties of lexical items (properties like verbal valency or inflectional class membership) autonomously while they are fulfilling the tasks for which they were deployed in the first place. Many of these tasks require a deep analysis of language input, which can be characterized as a mapping of utterances in a given input C to a set S of linguistically motivated structures with the help of linguistic information encoded in a grammar G and a lexicon L: G + L + C → S (1). The idea that underlies intelligent lexical acquisition systems is to modify this schematic formula in such a way that the system is able to exploit the information encoded in S to create a new, improved version of the lexicon: G + L + S → L' (2). Moreover, the thesis claims that a system can only be considered intelligent if it not only makes maximum use of the learning opportunities in C, but is also able to revise falsely acquired lexical knowledge. So, one of the central elements in this work is the formulation of a set of criteria for intelligent lexical acquisition systems subsumed under one paradigm: the Learn-Alpha design rule. The thesis describes the design and quality of a prototype for such a system, whose acquisition components have been developed from scratch and built on top of one of the state-of-the-art Head-driven Phrase Structure Grammar (HPSG) processing systems. The quality of this prototype is investigated in a series of experiments, in which the system is fed with extracts of a large English corpus. While the idea of using machine-readable language input to automatically acquire lexical knowledge is not new, we are not aware of a system that fulfills Learn-Alpha and is able to deal with large corpora.
To name four major challenges of constructing such a system: a) the high number of possible structural descriptions caused by highly underspecified lexical entries demands a parser with a very effective ambiguity management system, b) the automatic construction of concise lexical entries out of a bulk of observed lexical facts requires a special technique of data alignment, c) the reliability of these entries depends on the system's decision on whether it has seen 'enough' input and d) general properties of language might render some lexical features indeterminable if the system tries to acquire them with too high a precision. The cornerstone of this dissertation is the motivation and development of a general theory of automatic lexical acquisition that is applicable to every language and independent of any particular theory of grammar or lexicon. This work is divided into five chapters. The introductory chapter first contrasts three different and mutually incompatible approaches to (artificial) lexical acquisition: cue-based queries, head-lexicalized probabilistic context-free grammars and learning by unification. Then the Learn-Alpha design rule is postulated. The second chapter outlines the theory that underlies Learn-Alpha and sets out all the related notions and concepts required for a proper understanding of artificial lexical acquisition. Chapter 3 develops the prototyped acquisition method, called ANALYZE-LEARN-REDUCE, a framework which implements Learn-Alpha. The fourth chapter presents the design and results of a bootstrapping experiment conducted on this prototype: lexeme detection, learning of verbal valency, categorization into nominal count/mass classes, selection of prepositions and sentential complements, among others. The thesis concludes with a review of the conclusions and the motivation for further improvements, as well as proposals for future research on the automatic induction of lexical features.
Abstract:
Tractor rollovers represent a primary cause of death or serious injury in agriculture, and despite mandatory Roll-Over Protective Structures (ROPS), which have reduced the number of injuries, tractor accidents are still of great concern. Because tractors are versatile and widely used, many safety studies are concerned with their stability, but they often rely on controlled or laboratory tests. The evaluation of tractors working in the field, by contrast, is a very complex issue because rollover can be influenced by the interaction among operator, tractor and environment. Recent studies are oriented towards the evaluation of actual working conditions, developing prototypes for driver assistance and data acquisition. Currently these devices are produced and sold by manufacturers. A warning device was assessed in this study with the aim of evaluating its performance and collecting data on the variables influencing the dynamics of tractors in the field, by continuously monitoring the working conditions of tractors operating at the experimental farm of the University of Bologna. The device consists of accelerometers, a gyroscope, GSM/GPRS, GPS for geo-referencing and a transceiver for the automatic recognition of tractor-connected equipment. A microprocessor processes the data through a dedicated algorithm that requires data on the geometry of the tested tractor; it provides information on the level of risk for the operator in terms of probable loss of stability and suggests corrective measures to reduce the potential instability of the tractor.
Abstract:
The recent advent of next-generation sequencing technologies has revolutionized the way the genome is analysed. This innovation makes it possible to obtain deeper information at a lower cost and in less time, and provides data that are discrete measurements. One of the most important applications of these data is differential analysis, that is, investigating whether a gene exhibits a different expression level under two (or more) biological conditions (such as disease states, treatments received and so on). As for the statistical analysis, the final aim is statistical testing, and for modelling these data the Negative Binomial distribution is considered the most adequate one, especially because it allows for "overdispersion". However, the estimation of the dispersion parameter is a very delicate issue because little information is usually available for estimating it. Many strategies have been proposed, but they often result in procedures based on plug-in estimates, and in this thesis we show that this discrepancy between the estimation and the testing framework can lead to uncontrolled type I errors. We propose a mixture model that allows each gene to share information with other genes that exhibit similar variability. Afterwards, three consistent statistical tests are developed for differential expression analysis. We show that the proposed method improves the sensitivity of detecting differentially expressed genes with respect to the common procedures, since it comes closest to the nominal type I error rate while keeping high power. The method is finally illustrated on prostate cancer RNA-seq data.
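To make the notion of overdispersion concrete, here is a minimal sketch of a naive per-gene, method-of-moments dispersion estimate. It is the kind of plug-in estimate the abstract criticizes, not the proposed mixture model. It assumes the usual Negative Binomial parametrization in which the variance equals mu + phi * mu^2; the function name and the toy counts are hypothetical.

```python
from statistics import mean, variance


def moment_dispersion(counts):
    """Naive method-of-moments dispersion estimate for one gene (hypothetical helper).

    Assumes Var(Y) = mu + phi * mu**2, so phi = (s2 - mean) / mean**2,
    truncated at zero when the counts are not overdispersed.
    """
    m = mean(counts)
    if m <= 0:
        raise ValueError("mean count must be positive")
    s2 = variance(counts)  # unbiased sample variance (divides by n - 1)
    return max(0.0, (s2 - m) / (m * m))


# Toy read counts for a single gene, invented for illustration only.
phi_overdispersed = moment_dispersion([2, 4, 6, 8, 10])
phi_truncated = moment_dispersion([5, 5, 5, 5])
```

With only a handful of replicates per gene this estimate is very noisy, which is why sharing information across genes with similar variability, as the abstract proposes, is attractive.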
Abstract:
This work focuses on the study of saltwater intrusion in coastal aquifers, and in particular on the development of conceptual schemes to evaluate the risk associated with it. Saltwater intrusion depends on various natural and anthropogenic factors, both of which behave in a strongly random way and should be considered for optimal management of the territory and of water resources. Given the uncertainty of the problem parameters, the risk associated with salinization needs to be cast in a probabilistic framework. On the basis of a widely adopted sharp-interface formulation, key hydrogeological parameters are modelled as random variables, and global sensitivity analysis is used to determine their influence on the position of the saltwater interface. The analyses presented in this work rely on an efficient model reduction technique, based on Polynomial Chaos Expansion, which provides an accurate description of the model without great computational burden. When the assumptions of classical analytical models are not met, which occurs frequently in real case studies such as the area analysed in the present work, one can adopt data-driven techniques based on the analysis of the data characterizing the system under study. It follows that a model can be defined on the basis of connections between the system state variables, with only a limited number of assumptions about the "physical" behaviour of the system.
Abstract:
The work I carried out at the functional MR unit of the Policlinico S.Orsola-Malpighi, DIBINEM, focuses on the analysis of resting-state functional Magnetic Resonance Imaging (rs-fMRI) data using graph theory, with the aim of assessing possible differences in functional brain connectivity between a sample of patients with Nocturnal Frontal Lobe Epilepsy (NFLE) and a sample of healthy controls. Nocturnal frontal lobe epilepsy is a peculiar form of epilepsy characterized by seizures that occur almost exclusively during night-time sleep. These are marked by motor behaviours, mainly dystonic, often complex and sometimes with bizarre semiology. fMRI is an advanced neuroimaging technique that allows neuronal activity to be measured indirectly. All subjects were studied in resting-state conditions, that is, in relaxed wakefulness. In particular, I analysed the fMRI data with an approach that is innovative in the clinical-neurological field, namely graph theory. Graphs are defined as mathematical structures made up of nodes and links, and they are applied in many fields of study to model structures of different kinds. The construction of a brain graph for each study participant was the central part of this work. The goal was to define the functional connections between the different brain areas by means of a network. The modelling process made it possible to evaluate the neural graphs by computing topological parameters that characterize their structure and organization. The measures computed in this preliminary analysis did not reveal differences in global properties between the graphs of the patients and those of the controls.
Local alterations were instead found in the patients, compared with the controls, in areas of the deep grey matter, the limbic system and the frontal regions, which are among those hypothesized to be involved in the pathophysiology of this peculiar form of epilepsy.
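The graph construction described above can be sketched as follows. This is an illustrative assumption about one common pipeline (thresholding a region-by-region correlation matrix into a binary, undirected graph), not the procedure used in the thesis. The function names, the threshold and the toy matrix are hypothetical; node degree stands in for a local measure and edge density for a global one.

```python
def threshold_graph(corr, threshold):
    """Binary undirected adjacency matrix: link i-j when corr[i][j] >= threshold."""
    n = len(corr)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):  # skip the diagonal, so no self-loops
            if corr[i][j] >= threshold:
                adj[i][j] = adj[j][i] = 1
    return adj


def degrees(adj):
    """Local measure: number of links attached to each node."""
    return [sum(row) for row in adj]


def density(adj):
    """Global measure: fraction of possible links that are present."""
    n = len(adj)
    if n < 2:
        return 0.0
    edges = sum(sum(row) for row in adj) / 2  # each link is counted twice
    return edges / (n * (n - 1) / 2)


# Toy correlation matrix for three regions, invented for illustration only.
corr = [
    [1.0, 0.8, 0.1],
    [0.8, 1.0, 0.5],
    [0.1, 0.5, 1.0],
]
adj = threshold_graph(corr, threshold=0.4)
node_degrees = degrees(adj)
graph_density = density(adj)
```

Comparing per-node quantities such as the degree between groups corresponds to the local analysis mentioned in the abstract, while whole-graph quantities such as the density correspond to the global one.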
Abstract:
Applying measures derived from information theory provides a valid tool for quantifying some of the properties of complex systems. The same measures can be used in robotics to support the analysis and synthesis of robot control systems. This thesis analysed the correlation between some complexity measures and the ability of robots to successfully complete three different tasks. The results suggest that these complexity measures are a promising tool in robotics as well, but that their use can become difficult when they are applied to composite tasks.
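The abstract does not name the specific measures used. As an assumption, the sketch below shows the most basic information-theoretic quantity, the Shannon entropy of a discretized signal, on which many complexity measures are built. The function name and the toy sequences are hypothetical.

```python
from collections import Counter
from math import log2


def shannon_entropy(symbols):
    """Shannon entropy, in bits, of the empirical distribution of a symbol sequence."""
    n = len(symbols)
    if n == 0:
        raise ValueError("sequence must not be empty")
    counts = Counter(symbols)
    # H = sum over symbols of p * log2(1 / p), with p = count / n
    return sum((c / n) * log2(n / c) for c in counts.values())


# Toy discretized sensor readings, invented for illustration only.
h_alternating = shannon_entropy([0, 1, 0, 1])
h_constant = shannon_entropy([0, 0, 0, 0])
```

A constant signal carries no information and yields zero bits, while a signal that uses two symbols equally often yields one bit per sample.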
Abstract:
Purpose: To report an angiographic investigation of midterm atherosclerotic disease progression in below-the-knee (BTK) arteries of claudicants. Methods: Angiograms were performed in 58 consecutive claudicants (35 men; mean age 68.3±8.7 years) with endovascular treatment of femoropopliteal arteries in 58 limbs after a mean follow-up of 3.6±1.2 years. Angiograms were reviewed in consensus by 2 experienced readers blinded to clinical data. Progression of atherosclerosis in 4 BTK arterial segments (tibioperoneal trunk, anterior and posterior tibial arteries, and peroneal artery) was assessed according to the Bollinger score. The composite per calf Bollinger score represented the average of the 4 BTK arterial segment scores. The association of the Bollinger score with cardiovascular risk factors and gender was scrutinized. Results: A statistically significant increase in atherosclerotic burden was observed for the mean composite per calf Bollinger score (5.7±8.3 increase, 95% CI 3.5 to 7.9, p<0.0001), as well as for each single arterial segment analyzed. In multivariate linear regression analysis, diabetes mellitus was associated with a more pronounced progression of atherosclerotic burden in crural arteries (β: 5.6, p=0.035, 95% CI 0.398 to 10.806). Conclusion: Progression of infrapopliteal atherosclerotic lesions is common in claudicants during midterm follow-up. Presence of diabetes mellitus was confirmed as a major risk factor for more pronounced atherosclerotic BTK disease progression.
Abstract:
Background: There is an ongoing debate as to whether combined antiretroviral treatment (cART) during pregnancy is an independent risk factor for prematurity in HIV-1-infected women. Objective: The aim of the study was to examine (1) crude effects of different ART regimens on prematurity, (2) the association between duration of cART and duration of pregnancy, and (3) the role of possibly confounding risk factors for prematurity. Method: We analysed data from 1180 pregnancies prospectively collected by the Swiss Mother and Child HIV Cohort Study (MoCHiV) and the Swiss HIV Cohort Study (SHCS). Results: Odds ratios for prematurity in women receiving mono/dual therapy and cART were 1.8 [95% confidence interval (CI) 0.85–3.6] and 2.5 (95% CI 1.4–4.3), respectively, compared with women not receiving ART during pregnancy (P=0.004). In a subgroup of 365 pregnancies with comprehensive information on maternal clinical, demographic and lifestyle characteristics, there was no indication that maternal viral load, age, ethnicity or history of injecting drug use affected prematurity rates associated with the use of cART. Duration of cART before delivery was also not associated with duration of pregnancy. Conclusion: Our study indicates that confounding by maternal risk factors or duration of cART exposure is not a likely explanation for the effects of ART on prematurity in HIV-1-infected women.
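For readers unfamiliar with how a crude odds ratio and its confidence interval are obtained, here is a minimal sketch using the standard Wald interval on the log scale. The abstract does not state how its intervals were computed, so this is an assumption. The function name is hypothetical and the counts are invented; they are not the study's data.

```python
from math import exp, log, sqrt


def odds_ratio_wald(a, b, c, d, z=1.96):
    """Crude odds ratio with a Wald confidence interval from a 2x2 table.

    a -- exposed, outcome present      b -- exposed, outcome absent
    c -- unexposed, outcome present    d -- unexposed, outcome absent
    z -- normal quantile (1.96 gives an approximate 95% interval)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    se_log = sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # standard error of log(OR)
    lower = exp(log(odds_ratio) - z * se_log)
    upper = exp(log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Hypothetical counts for illustration only.
odds_ratio, lower, upper = odds_ratio_wald(20, 80, 10, 90)
```

An interval that excludes 1 indicates a statistically significant crude association at the chosen level; adjusting for confounders, as the study does in its subgroup analysis, would require a regression model rather than a single 2x2 table.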
Abstract:
Sinotubular junction dilation is one of the most frequent pathologies associated with aortic root incompetence. Hence, we create a finite element model considering the whole root geometry; then, starting from healthy valve models and referring to measures of pathological valves reported in the literature, we reproduce the pathology of the aortic root by imposing appropriate boundary conditions. After evaluating the virtual pathological process, we are able to correlate dimensions of non-functional valves with dimensions of competent valves. Such a relation could be helpful in recreating a competent aortic root and, in particular, it could provide useful information in advance in aortic valve sparing surgery.