948 results for Hierarchical partitioning analysis
Abstract:
Hierarchical multi-label classification is a complex classification task where the classes involved in the problem are hierarchically structured and each example may simultaneously belong to more than one class in each hierarchical level. In this paper, we extend our previous works, where we investigated a new local-based classification method that incrementally trains a multi-layer perceptron for each level of the classification hierarchy. Predictions made by a neural network in a given level are used as inputs to the neural network responsible for the prediction in the next level. We compare the proposed method with one state-of-the-art decision-tree induction method and two decision-tree induction methods, using several hierarchical multi-label classification datasets. We perform a thorough experimental analysis, showing that our method obtains competitive results to a robust global method regarding both precision and recall evaluation measures.
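The level-by-level chaining described in this abstract can be illustrated with a minimal sketch. It rests on several assumptions that are not from the paper: a single layer of logistic units stands in for each multi-layer perceptron, the original features are concatenated with the previous level's outputs, and the function names and toy data are hypothetical.

```python
import math

def train_logistic_layer(X, Y, epochs=500, lr=0.5):
    """Fit one logistic unit per label (assumed stand-in for an MLP)."""
    n_in, n_out = len(X[0]), len(Y[0])
    W = [[0.0] * (n_in + 1) for _ in range(n_out)]  # last weight is the bias
    for _ in range(epochs):
        for x, y in zip(X, Y):
            xb = list(x) + [1.0]
            for k in range(n_out):
                z = sum(w * v for w, v in zip(W[k], xb))
                p = 1.0 / (1.0 + math.exp(-z))
                for j in range(n_in + 1):
                    W[k][j] += lr * (y[k] - p) * xb[j]  # gradient step on log-loss
    return W

def predict_layer(W, x):
    """Return one probability per label for a single input vector."""
    xb = list(x) + [1.0]
    return [1.0 / (1.0 + math.exp(-sum(w * v for w, v in zip(row, xb))))
            for row in W]

def train_hierarchy(X, Y_levels):
    """Train one model per level; each level also sees the previous level's outputs."""
    models, inputs = [], [list(x) for x in X]
    for Y in Y_levels:
        W = train_logistic_layer(inputs, Y)
        models.append(W)
        # Assumption: next-level input = original features + current predictions.
        inputs = [list(x) + predict_layer(W, inp) for x, inp in zip(X, inputs)]
    return models

def predict_hierarchy(models, x, threshold=0.5):
    """Predict a binary label vector for every level of the hierarchy."""
    inp, out = list(x), []
    for W in models:
        probs = predict_layer(W, inp)
        out.append([int(p >= threshold) for p in probs])
        inp = list(x) + probs
    return out

# Hypothetical toy data: level 1 has classes (A, B), level 2 has (A1, A2, B1).
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
Y_level1 = [[0, 1], [0, 1], [1, 0], [1, 0]]
Y_level2 = [[0, 0, 1], [0, 0, 1], [0, 1, 0], [1, 0, 0]]
models = train_hierarchy(X, [Y_level1, Y_level2])
print(predict_hierarchy(models, [1, 1]))
```

The design point the sketch tries to show is that each level's model is trained only after the level above it, so errors and confidence from the upper level are visible to the lower one.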
Abstract:
The application of Concurrency Theory to Systems Biology is in its earliest stage of progress. The metaphor of cells as computing systems by Regev and Shapiro opened the way to the use of concurrent languages for the modelling of biological systems. The peculiar characteristics of these systems led to the design of many bio-inspired formalisms which achieve higher faithfulness and specificity. In this thesis we present pi@, an extremely simple and conservative extension of the pi-calculus that represents a keystone in this respect, thanks to its expressiveness. The pi@ calculus is obtained by adding polyadic synchronisation and priority to the pi-calculus, in order to achieve compartment semantics and atomicity of complex operations respectively. In its direct application to biological modelling, the stochastic variant of the calculus, Spi@, is shown to model consistently several phenomena such as the formation of molecular complexes, hierarchical subdivision of the system into compartments, inter-compartment reactions, and dynamic reorganisation of compartment structure consistent with volume variation. The pivotal role of pi@ is evidenced by its capability of encoding several bio-inspired formalisms in a compositional way, so that it represents the optimal core of a framework for the analysis and implementation of bio-inspired languages. In this respect, the encodings of BioAmbients, Brane Calculi and a variant of P Systems in pi@ are formalised. The conciseness of their translation into pi@ allows their indirect comparison by means of their encodings. Furthermore, it provides a ready-to-run implementation with minimal effort, whose correctness is guaranteed by the correctness of the respective encoding functions. Further important results of general validity are stated on the expressive power of priority. Several impossibility results are described, which clearly establish the superior expressiveness of prioritised languages and the problems that arise when attempting to provide a parallel implementation of them. To this end, a new setting in distributed computing (the last man standing problem) is singled out and exploited to prove the impossibility of providing a purely parallel implementation of priority by means of point-to-point or broadcast communication.
Abstract:
The aim of this thesis is to apply multilevel regression models in the context of household surveys. The hierarchical structure of this type of data is characterized by many small groups. In recent years, comparative and multilevel analyses in the field of perceived health have expanded. The purpose of this thesis is to develop a multilevel analysis with three levels of hierarchy for the Physical Component Summary outcome in order to: evaluate the magnitude of within- and between-group variance at each level (individual, household and municipality); explore which covariates affect perceived physical health at each level; compare model-based and design-based approaches in order to establish the informativeness of the sampling design; and estimate a quantile regression for hierarchical data. The target population is Italian residents aged 18 years and older. Our study shows a high degree of homogeneity among level-1 units belonging to the same group, with an intraclass correlation of 27% in a level-2 null model. Almost all variance is explained by level-1 covariates. In fact, in our model the explanatory variables with the greatest impact on the outcome are disability, inability to work, age and chronic diseases (18 pathologies). An additional analysis is performed using a novel procedure, the "Linear Quantile Mixed Model", also called "Multilevel Linear Quantile Regression". This allows us to describe the conditional distribution of the response more generally through the estimation of its quantiles, while accounting for the dependence among the observations. This is a major advantage of our models over classic multilevel regression. The median regression with random effects proves more efficient than the mean regression in representing the central tendency of the outcome. A more detailed analysis of the conditional distribution of the response at other quantiles highlights a differential effect of some covariates along the distribution.
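For readers unfamiliar with the intraclass correlation quoted above, the standard definition for a two-level null (intercept-only) model is sketched below. The symbols are chosen here for illustration and are not taken from the thesis.

```latex
\begin{align*}
y_{ij} &= \beta_0 + u_j + e_{ij}, \qquad
u_j \sim N(0, \sigma_u^2), \quad e_{ij} \sim N(0, \sigma_e^2), \\
\rho &= \frac{\sigma_u^2}{\sigma_u^2 + \sigma_e^2}.
\end{align*}
```

Here $y_{ij}$ is the outcome of individual $i$ in group $j$. Under this definition, an intraclass correlation of 27% means that roughly 27% of the total variance in the null model lies between groups rather than between individuals within a group.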
Abstract:
The quench characteristics of second-generation (2G) YBCO coated conductor (CC) tapes are of fundamental importance for the design and safe operation of superconducting cables and magnets based on this material. Their ability to transport high current densities at high temperatures, up to 77 K, and at very high fields, over 20 T, together with growing manufacturing know-how, which is reducing their cost, is driving the use of this innovative material in numerous applications, from high-field research magnets to motors, generators and cables. The aim of this Ph.D. thesis is the experimental analysis and numerical simulation of quench in HTS tapes and coils. A measurement facility for the characterization of superconducting tapes and coils was designed, assembled and tested. The facility consists of a cryostat, a cryocooler, a vacuum system, resistive and superconducting current leads and signal feedthroughs. In addition, the data acquisition system and the software for critical current and quench measurements were developed. A 2D model was developed using the finite element code COMSOL Multiphysics®. The problem of modeling the high aspect ratio of the tape is tackled by multiplying the tape thickness by a constant factor and compensating the heat and electrical balance equations by introducing a material anisotropy. The model was then validated against the results of a 1D quench model based on a non-linear electric circuit coupled to a thermal model of the tape, against literature measurements, and against critical current and quench measurements made in the cryogenic facility. Finally, the model was extended to the study of coils and windings through the definition of homogenized tape and stack properties. The procedure allows the definition of a multi-scale hierarchical model able to simulate the windings with different degrees of detail.
Abstract:
In many industries, for example the automotive industry, digital mock-ups are used to verify the design and function of a product on a virtual prototype. One use case is checking the safety clearances of individual components, known as clearance analysis. Engineers determine for specific components whether they maintain a prescribed safety distance from the surrounding components, both at rest and during motion. If components fall below the safety distance, their shape or position must be changed. For this it is important to know exactly which regions of the components violate the safety distance.
In this thesis we present a solution for the real-time computation of all regions between two geometric objects that fall below the safety distance. Each object is given as a set of primitives (e.g. triangles). For every point in time at which a transformation is applied to one of the objects, we compute the set of all primitives that fall below the safety distance and call it the set of all tolerance-violating primitives. We present a holistic solution that can be divided into the following three major topics.
In the first part of this thesis we investigate algorithms that check whether two triangles are tolerance-violating. We present several approaches to triangle-triangle tolerance tests and show that specialized tolerance tests perform significantly better than the distance computations used previously. The focus of our work is the development of a novel tolerance test that operates in dual space. In all our benchmarks for computing all tolerance-violating primitives, our dual-space approach consistently proves to be the fastest.
The second part of this thesis deals with data structures and algorithms for the real-time computation of all tolerance-violating primitives between two geometric objects. We develop a combined data structure composed of a flat hierarchical data structure and several uniform grids. To guarantee efficient running times, it is above all important to take the required safety distance into account sensibly in the design of the data structures and the query algorithms. We present solutions that quickly determine the set of primitive pairs to be tested. In addition, we develop strategies for recognizing primitives as tolerance-violating without computing an expensive primitive-primitive tolerance test. In our benchmarks we show that our solutions can compute, in real time, all tolerance-violating primitives between two complex geometric objects, each consisting of many hundreds of thousands of primitives.
In the third part we present a novel, memory-optimized data structure for managing the cell contents of the uniform grids used before. We call this data structure Shrubs. Previous approaches to the memory optimization of uniform grids relate mainly to hashing methods. These, however, do not reduce the memory consumption of the cell contents. In our use case, neighboring cells often have similar contents. Our approach is able to losslessly compress the memory requirement of the cell contents of a uniform grid, based on the redundant cell contents, to one fifth of the previous size and to decompress it at runtime.
Finally, we show how our solution for computing all tolerance-violating primitives can be applied in practice. Besides pure clearance analysis, we show applications to various path-planning problems.
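The role of the safety distance in the grid design can be illustrated with a small sketch. This is not the thesis's data structure: points stand in for triangles, a single uniform grid replaces the combined hierarchy, and the cell edge is simply set equal to the tolerance so that only the 27 neighboring cells need to be visited. All function names are hypothetical.

```python
import math
from collections import defaultdict
from itertools import product

def cell_of(point, cell):
    """Integer grid coordinates of a 3D point for a given cell edge length."""
    return tuple(math.floor(c / cell) for c in point)

def build_grid(points, cell):
    """Map each occupied cell to the indices of the points it contains."""
    grid = defaultdict(list)
    for i, p in enumerate(points):
        grid[cell_of(p, cell)].append(i)
    return grid

def tolerance_violations(points_a, points_b, tol):
    """Return sorted index pairs (i, j) whose points are closer than tol."""
    grid = build_grid(points_b, tol)  # assumption: cell edge == tolerance
    hits = []
    for i, p in enumerate(points_a):
        base = cell_of(p, tol)
        # Points closer than tol differ by at most one cell per axis.
        for off in product((-1, 0, 1), repeat=3):
            key = tuple(b + o for b, o in zip(base, off))
            for j in grid.get(key, ()):
                if math.dist(p, points_b[j]) < tol:
                    hits.append((i, j))
    return sorted(hits)

# Hypothetical toy input.
a = [(0.0, 0.0, 0.0), (5.0, 5.0, 5.0)]
b = [(0.5, 0.0, 0.0), (0.0, 0.0, 2.0), (5.0, 5.0, 5.9), (-0.3, -0.3, -0.3)]
print(tolerance_violations(a, b, 1.0))
```

Tying the cell size to the tolerance keeps the candidate set small: only primitives in adjacent cells are ever passed to the exact distance test.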
Abstract:
Delineating brain tumor boundaries from magnetic resonance images is an essential task for the analysis of brain cancer. We propose a fully automatic method for brain tissue segmentation, which combines Support Vector Machine (SVM) classification using multispectral intensities and textures with subsequent hierarchical regularization based on Conditional Random Fields (CRF). The CRF regularization introduces spatial constraints to the powerful SVM classification, which assumes voxels to be independent of their neighbors. The approach first separates healthy and tumor tissue; the two regions are then subclassified in a novel hierarchical way, healthy tissue into cerebrospinal fluid, white matter and gray matter, and tumor tissue into necrotic, active and edema regions. The hierarchical approach adds robustness and speed by allowing different levels of regularization to be applied at different stages. The method is fast and tailored to standard clinical acquisition protocols. It was assessed on 10 multispectral patient datasets, with results outperforming previous methods in terms of segmentation detail and computation time.
Abstract:
A central design challenge facing network planners is how to select a cost-effective network configuration that can provide uninterrupted service despite edge failures. In this paper, we study the Survivable Network Design (SND) problem, a core model underlying the design of such resilient networks that incorporates complex cost and connectivity trade-offs. Given an undirected graph with specified edge costs and (integer) connectivity requirements between pairs of nodes, the SND problem seeks the minimum cost set of edges that interconnects each node pair with at least as many edge-disjoint paths as the connectivity requirement of the nodes. We develop a hierarchical approach for solving the problem that integrates ideas from decomposition, tabu search, randomization, and optimization. The approach decomposes the SND problem into two subproblems, Backbone design and Access design, and uses an iterative multi-stage method for solving the SND problem in a hierarchical fashion. Since both subproblems are NP-hard, we develop effective optimization-based tabu search strategies that balance intensification and diversification to identify near-optimal solutions. To initiate this method, we develop two heuristic procedures that can yield good starting points. We test the combined approach on large-scale SND instances, and empirically assess the quality of the solutions vis-à-vis optimal values or lower bounds. On average, our hierarchical solution approach generates solutions within 2.7% of optimality even for very large problems (that cannot be solved using exact methods), and our results demonstrate that the performance of the method is robust for a variety of problems with different size and connectivity characteristics.
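The connectivity requirement in the SND definition can be checked for a candidate edge set with a unit-capacity maximum-flow computation, since the number of edge-disjoint paths between two nodes equals the maximum flow when every edge has capacity one. The sketch below shows only this feasibility check, not the paper's decomposition or tabu search; the function names and data are hypothetical.

```python
from collections import defaultdict, deque

def edge_disjoint_paths(edges, s, t):
    """Count edge-disjoint s-t paths in an undirected graph via max flow."""
    if s == t:
        raise ValueError("s and t must be distinct nodes")
    cap = defaultdict(int)
    adj = defaultdict(set)
    for u, v in edges:
        cap[(u, v)] += 1  # each undirected edge becomes two unit arcs
        cap[(v, u)] += 1
        adj[u].add(v)
        adj[v].add(u)
    flow = 0
    while True:
        # Breadth-first search for an augmenting path in the residual graph.
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v in adj[u]:
                if v not in parent and cap[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return flow
        # Push one unit of flow back along the path that was found.
        v = t
        while parent[v] is not None:
            u = parent[v]
            cap[(u, v)] -= 1
            cap[(v, u)] += 1
            v = u
        flow += 1

def is_feasible(edges, requirements):
    """Check every node pair against its required number of disjoint paths."""
    return all(edge_disjoint_paths(edges, s, t) >= r
               for (s, t), r in requirements.items())

def total_cost(edges, cost):
    """Sum the costs of the selected edges."""
    return sum(cost[e] for e in edges)

# Hypothetical example: a 4-cycle gives two edge-disjoint paths between 1 and 3.
cycle = [(1, 2), (2, 3), (3, 4), (4, 1)]
print(is_feasible(cycle, {(1, 3): 2}), total_cost(cycle, {e: 1.0 for e in cycle}))
```

A heuristic such as the tabu search mentioned in the abstract would call a check like this on each candidate solution before comparing costs.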
Abstract:
For smart applications, nodes in wireless multimedia sensor networks (WMSNs) have to make decisions based on sensed scalar physical measurements. A routing protocol must deliver multimedia content with quality-level support and be energy-efficient in large-scale networks. With this goal in mind, this paper proposes a smart Multi-hop hierarchical routing protocol for Efficient VIdeo communication (MEVI). MEVI combines an opportunistic scheme to create clusters, a cross-layer solution to select routes based on network conditions, and a smart solution to trigger multimedia transmission according to sensed data. Simulations were conducted to show the benefits of MEVI compared with the well-known Low-Energy Adaptive Clustering Hierarchy (LEACH) protocol. This paper includes an analysis of the signaling overhead, energy efficiency, and video quality.
Abstract:
Thirty microsatellite markers were analysed in 1426 goats from 45 traditional or rare breeds in 15 European and Middle Eastern countries. In all populations inbreeding was indicated by heterozygosity deficiency (mean FIS = 0.10). Genetic differentiation between breeds was moderate with a mean FST value of 0.07, but for most (c. 71%) northern and central European breeds, individuals could be assigned to their breeds with a success rate of more than 80%. Bayesian-based clustering analysis of allele frequencies and multivariate analysis revealed at least four discrete clusters: eastern Mediterranean (Middle East), central Mediterranean, western Mediterranean and central/northern Europe. About 41% of the genetic variability among the breeds could be explained by their geographical origin. A decrease in genetic diversity from the south-east to the north-west was accompanied by an increase in the level of differentiation at the breed level. These observations support the hypothesis that domestic livestock migrated from the Middle East towards western and northern Europe and indicate that breed formation was more systematic in north-central Europe than in the Middle East. We propose that breed differentiation and molecular diversity are independent criteria for conservation.
Nonparametric Inference Procedure For Percentiles of the Random Effect Distribution in Meta Analysis
Abstract:
Objective. To examine effects of primary care physicians (PCPs) and patients on the association between charges for primary care and specialty care in a point-of-service (POS) health plan. Data Source. Claims from 1996 for 3,308 adult male POS plan members, each of whom was assigned to one of the 50 family practitioner-PCPs with the largest POS plan member-loads. Study Design. A hierarchical multivariate two-part model was fitted using a Gibbs sampler to estimate PCPs' effects on patients' annual charges for two types of services, primary care and specialty care, the associations among PCPs' effects, and within-patient associations between charges for the two services. Adjusted Clinical Groups (ACGs) were used to adjust for case-mix. Principal Findings. PCPs with higher case-mix adjusted rates of specialist use were less likely to see their patients at least once during the year (estimated correlation: –.40; 95% CI: –.71, –.008) and provided fewer services to patients that they saw (estimated correlation: –.53; 95% CI: –.77, –.21). Ten of 11 PCPs whose case-mix adjusted effects on primary care charges were significantly less than or greater than zero (p < .05) had estimated, case-mix adjusted effects on specialty care charges that were of opposite sign (but not significantly different than zero). After adjustment for ACG and PCP effects, the within-patient, estimated odds ratio for any use of primary care given any use of specialty care was .57 (95% CI: .45, .73). Conclusions. PCPs and patients contributed independently to a trade-off between utilization of primary care and specialty care. The trade-off appeared to partially offset significant differences in the amount of care provided by PCPs. These findings were possible because we employed a hierarchical multivariate model rather than separate univariate models.
Abstract:
In this paper, we develop Bayesian hierarchical distributed lag models for estimating associations between daily variations in summer ozone levels and daily variations in cardiovascular and respiratory (CVDRESP) mortality counts for 19 large U.S. cities included in the National Morbidity Mortality Air Pollution Study (NMMAPS) for the period 1987-1994. At the first stage, we define a semi-parametric distributed lag Poisson regression model to estimate city-specific relative rates of CVDRESP mortality associated with short-term exposure to summer ozone. At the second stage, we specify a class of distributions for the true city-specific relative rates to estimate an overall effect by taking into account the variability within and across cities. We perform the calculations with respect to several random effects distributions (normal, Student's t, and mixture of normals), thus relaxing the common assumption of a two-stage normal-normal hierarchical model. We assess the sensitivity of the results to: 1) lag structure for ozone exposure; 2) degree of adjustment for long-term trends; 3) inclusion of other pollutants in the model; 4) heat waves; 5) random effects distributions; and 6) prior hyperparameters. On average across cities, we found that a 10 ppb increase in summer ozone level for every day in the previous week is associated with a 1.25 percent increase in CVDRESP mortality (95% posterior region: 0.47, 2.03). The relative rate estimates are also positive and statistically significant at lags 0, 1, and 2. We found that associations between summer ozone and CVDRESP mortality are sensitive to the confounding adjustment for PM_10, but are robust to: 1) the adjustment for long-term trends and for other gaseous pollutants (NO_2, SO_2, and CO); 2) the distributional assumptions at the second stage of the hierarchical model; and 3) the prior distributions on all unknown parameters. Bayesian hierarchical distributed lag models and their application to the NMMAPS data allow us to estimate an acute health effect associated with exposure to ambient air pollution over the previous few days, on average across several locations. The application of these methods and the systematic assessment of the sensitivity of findings to model assumptions provide important epidemiological evidence for future air quality regulations.
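The two-stage structure described in this abstract can be written compactly. The following is a sketch of the normal-normal special case only, with notation ($Y$, $x$, $\theta$, $\eta$, $\alpha$, $\tau$, $L$) chosen here for illustration rather than taken from the paper, and with the semi-parametric confounder adjustment left abstract.

```latex
\begin{align*}
Y_t^{c} &\sim \text{Poisson}(\mu_t^{c}), \\
\log \mu_t^{c} &= \sum_{\ell=0}^{L} \theta_\ell^{c}\, x_{t-\ell}^{c}
                  + \text{(confounder terms)}, \\
\eta^{c} &= \sum_{\ell=0}^{L} \theta_\ell^{c}, \\
\eta^{c} &\sim N(\alpha, \tau^{2}).
\end{align*}
```

Here $Y_t^{c}$ is the mortality count in city $c$ on day $t$, $x_{t-\ell}^{c}$ is the ozone level $\ell$ days earlier, and $\eta^{c}$ is the city-specific total effect over the lag window. If ozone is measured in ppb, a sustained 10 ppb increase on every day of the window corresponds to a percent change in mortality of $100\,(e^{10\,\eta^{c}} - 1)$; the overall effect replaces $\eta^{c}$ with $\alpha$. The alternative second-stage distributions mentioned in the abstract would replace the normal distribution in the last line.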
Abstract:
In evaluating the accuracy of diagnostic tests, it is common to apply two imperfect tests jointly or sequentially to a study population. In a recent meta-analysis of the accuracy of microsatellite instability testing (MSI) and traditional mutation analysis (MUT) in predicting germline mutations of the mismatch repair (MMR) genes, a Bayesian approach (Chen, Watson, and Parmigiani 2005) was proposed to handle missing data resulting from partial testing and the lack of a gold standard. In this paper, we demonstrate an improved estimation of the sensitivities and specificities of MSI and MUT by using a nonlinear mixed model and a Bayesian hierarchical model, both of which account for the heterogeneity across studies through study-specific random effects. The methods can be used to estimate the accuracy of two imperfect diagnostic tests in other meta-analyses when the prevalence of disease, the sensitivities and/or the specificities of the diagnostic tests are heterogeneous among studies. Furthermore, simulation studies have demonstrated the importance of carefully selecting appropriate random effects for the estimation of diagnostic accuracy measurements in this scenario.
Abstract:
Studies of diagnostic accuracy require more sophisticated methods for their meta-analysis than studies of therapeutic interventions. A number of different, and apparently divergent, methods for meta-analysis of diagnostic studies have been proposed, including two alternative approaches that are statistically rigorous and allow for between-study variability: the hierarchical summary receiver operating characteristic (ROC) model (Rutter and Gatsonis, 2001) and bivariate random-effects meta-analysis (van Houwelingen and others, 1993), (van Houwelingen and others, 2002), (Reitsma and others, 2005). We show that these two models are very closely related, and define the circumstances in which they are identical. We discuss the different forms of summary model output suggested by the two approaches, including summary ROC curves, summary points, confidence regions, and prediction regions.