311 results for OUTLIERS
Abstract:
Background: Sugarcane is an increasingly economically and environmentally important C4 grass, used for the production of sugar and bioethanol, a low-carbon emission fuel. Sugarcane originated from crosses of Saccharum species and is noted for its unique capacity to accumulate high amounts of sucrose in its stems. Environmental stresses severely limit sugarcane productivity worldwide. To investigate transcriptome changes in response to environmental inputs that alter yield, we used cDNA microarrays to profile the expression of 1,545 genes in plants subjected to drought, phosphate starvation, herbivory and N2-fixing endophytic bacteria. We also investigated the response to phytohormones (abscisic acid and methyl jasmonate). The arrayed elements correspond mostly to genes involved in signal transduction, hormone biosynthesis, transcription factors, novel genes and genes corresponding to unknown proteins. Results: Using an outlier-searching method, 179 genes with strikingly different expression levels were identified as differentially expressed in at least one of the treatments analysed. Self-Organizing Maps were used to cluster the expression profiles of 695 genes that showed a highly correlated expression pattern among replicates. The expression data for 22 genes were evaluated for 36 experimental data points by quantitative RT-PCR, indicating a validation rate of 80.5% using three biological experimental replicates. The SUCAST Database was created to provide public access to the data described in this work, linked to tissue expression profiling and the SUCAST gene category and sequence analysis. The SUCAST database also includes a categorization of the sugarcane kinome based on a phylogenetic grouping that included 182 undefined kinases. Conclusion: An extensive study of the sugarcane transcriptome was performed. Sugarcane genes responsive to phytohormones and to challenges sugarcane commonly deals with in the field were identified. Additionally, the protein kinases were annotated based on a phylogenetic approach. The experimental design and statistical analysis applied proved robust in unravelling genes associated with a diverse array of conditions, attributing novel functions to previously unknown or undefined genes. The data consolidated in the SUCAST database resource can guide further studies and be useful for the development of improved sugarcane varieties.
Abstract:
[EN] In this work, we describe an implementation of the variational method proposed by Brox et al. in 2004, which yields accurate optical flows with low running times. It has several benefits with respect to the method of Horn and Schunck: it is more robust to the presence of outliers, produces piecewise-smooth flow fields and can cope with constant brightness changes. This method relies on the brightness and gradient constancy assumptions, using the information of the image intensities and the image gradients to find correspondences. It also generalizes the use of continuous L1 functionals, which help mitigate the effect of outliers and create a Total Variation (TV) regularization. Additionally, it introduces a simple temporal regularization scheme that enforces a continuous temporal coherence of the flow fields.
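To illustrate why a continuous L1-type functional dampens the effect of outliers, the following minimal sketch compares a quadratic penalty with the smooth L1 approximation sqrt(s^2 + eps^2), one common choice for such functionals. The residual values and the name `charbonnier` are assumptions made for illustration; this is not the implementation described in the abstract.

```python
import math


def charbonnier(s_squared, eps=1e-3):
    """Smooth approximation of the L1 norm: sqrt(s^2 + eps^2)."""
    return math.sqrt(s_squared + eps * eps)


def quadratic(s_squared):
    """Classical quadratic penalty, which grows with the squared residual."""
    return s_squared


# Hypothetical data-term residuals; the last one plays the role of an outlier.
residuals = [0.1, 0.2, 10.0]

robust_cost = sum(charbonnier(r * r) for r in residuals)
quadratic_cost = sum(quadratic(r * r) for r in residuals)

# Share of the total cost contributed by the outlier under each penalty.
outlier_share_robust = charbonnier(10.0 ** 2) / robust_cost
outlier_share_quadratic = quadratic(10.0 ** 2) / quadratic_cost
```

Under the quadratic penalty the outlier contributes almost all of the cost, whereas under the L1-type penalty its contribution grows only linearly with the residual.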
Abstract:
The purpose of this Thesis is to develop a robust and powerful method to classify galaxies from large surveys, in order to establish and confirm the connections between the principal observational parameters of the galaxies (spectral features, colours, morphological indices), and help unveil the evolution of these parameters from $z \sim 1$ to the local Universe. Within the framework of the zCOSMOS-bright survey, and making use of its large database of objects ($\sim 10\,000$ galaxies in the redshift range $0 < z \lesssim 1.2$) and its great reliability in redshift and spectral property determinations, we first adopt and extend the \emph{classification cube method}, as developed by Mignoli et al. (2009), to exploit the bimodal properties of galaxies (spectral, photometric and morphological) separately, and then combine these three subclassifications. We use this classification method as a test for a newly devised statistical classification, based on Principal Component Analysis and the Unsupervised Fuzzy Partition clustering method (PCA+UFP), which is able to define the galaxy population by exploiting its natural global bimodality, considering up to 8 different properties simultaneously. The PCA+UFP analysis is a very powerful and robust tool to probe the nature and the evolution of galaxies in a survey. It allows the classification of galaxies to be defined with less uncertainty, and it has the flexibility to be adapted to different parameters: being a fuzzy classification, it avoids the problems of a hard classification such as the classification cube presented in the first part of the article. The PCA+UFP method can be easily applied to different datasets: it does not rely on the nature of the data, and for this reason it can be successfully employed with other observables (magnitudes, colours) or derived properties (masses, luminosities, SFRs, etc.). The agreement between the two classification cluster definitions is very high. ``Early'' and ``late'' type galaxies are well defined by the spectral, photometric and morphological properties, both when considering them separately and then combining the classifications (classification cube) and when treating them as a whole (PCA+UFP cluster analysis). Differences arise in the definition of outliers: the classification cube is much more sensitive to single measurement errors or misclassifications in one property than the PCA+UFP cluster analysis, in which errors are ``averaged out'' during the process. This method allowed us to observe the \emph{downsizing} effect taking place in the PC spaces: the migration from the blue cloud towards the red clump happens at higher redshifts for galaxies of larger mass. The determination of the transition mass $M_{\mathrm{cross}}$ is in good agreement with other values in the literature.
Abstract:
Improvement in the performance of the single-compartment maximum slope model due to the introduction of systems for eliminating outliers.
Abstract:
During the glacial phases, extensive solifluidal mass movements led to the formation of cover beds in the European low mountain ranges. These cover beds represent a mixture of different substrates, such as the underlying parent rock, aeolian deposits and local ore veins. The spatial extent of metal contamination caused by small-scale ore veins is increased by periglacial solifluction. The aims of the present study were (a) to clarify the relationship between relief properties and the characteristics of the solifluidal cover beds and soils, (b) to identify and quantify the contributions of the individual substrates to the parent material of soil formation, using trace element contents and lead isotope ratios as input data for mixing models, and (c) to investigate the spatial distribution of lead (Pb) in cover beds that have moved across lead ore veins, to calculate the transport distance of the ore-derived lead, and to determine the factors controlling that distance. Six transects in the south-eastern Rhenish Massif (Rheinisches Schiefergebirge), including the soils developed through periglacial solifluction, were investigated. The pedological field survey followed AG Boden (2005). Samples from O, A, B and C horizons were analysed for their trace element contents and, in part, for their 206Pb/207Pb isotope ratios. Besides petrography, the factors controlling the distribution and properties of periglacial cover beds are relief properties such as aspect, slope gradient, slope position and curvature. The relief analysis shows thin cover beds with a high coarse-fragment content on divergent, convex slope sections. On convergent, concave slope sections, cover-bed thickness increases markedly, together with increasing loess loam content and decreasing coarse-fragment content. Depending on relief properties and positions, the soil types range from acid Braunerden to Pseudogley-Parabraunerden. In addition, Holocene colluvia occur in rather atypical relief positions such as elongated, barely inclined slope sections or mid-slope areas. Except for Pb, the trace element contents are within the range of low background levels. Pb contents lie between 20 and 135 mg/kg. Decreasing trace element contents and isotope signatures (206Pb/207Pb isotope ratios) of Pb show that almost no Pb from atmospheric deposition was translocated into the B horizons. A principal component analysis (PCA) of the trace element contents identified four main substrate sources of the B horizons studied (clay slate, loess, Laacher See tephra [LST] and local Pb ore veins). A 3-component mixing model comprising clay slate, loess and LST explained the trace element contents of all 120 B-horizon samples except for 10 outliers. The mass contribution of the Pb ore to the substrate mixture is <0.1%. The spatial Pb distribution shows areas of local Pb content maxima associated with Pb ore veins located upslope. Using a 206Pb/207Pb isotope-ratio mixing model, 14 areas of elevated local Pb content maxima were identified that contain 76-100% ore-derived lead. With the aid of a geographic information system, the transport distances of the ore-derived lead were determined to be 30 to 110 m. The factors controlling the transport distance are silt concentration and vertical curvature. This study shows that relief properties and relief position have a decisive influence on the characteristics of cover beds and soils in the European low mountain ranges. Mixing models combined with trace element analyses and isotope ratios are an important tool for determining the contributions of the individual components in soil substrate mixtures. Moreover, local lead ore veins can raise the natural Pb contents of soils developed in periglacial cover beds of the last glaciation (Würm) at distances of more than 100 m.
Abstract:
Seventeen bones (sixteen cadaveric bones and one plastic bone) were used to validate a method for reconstructing a surface model of the proximal femur from 2D X-ray radiographs and a statistical shape model that was constructed from thirty training surface models. Unlike previously introduced validation studies, where surface-based distance errors were used to evaluate the reconstruction accuracy, here we propose to use errors measured based on clinically relevant morphometric parameters. For this purpose, a program was developed to robustly extract those morphometric parameters from the thirty training surface models (training population), from the seventeen surface models reconstructed from X-ray radiographs, and from the seventeen ground truth surface models obtained either by a CT-scan reconstruction method or by a laser-scan reconstruction method. A statistical analysis was then performed to classify the seventeen test bones into two categories: normal cases and outliers. This classification step depends on the measured parameters of the particular test bone. In case all parameters of a test bone were covered by the training population's parameter ranges, this bone is classified as normal bone, otherwise as outlier bone. Our experimental results showed that statistically there was no significant difference between the morphometric parameters extracted from the reconstructed surface models of the normal cases and those extracted from the reconstructed surface models of the outliers. Therefore, our statistical shape model based reconstruction technique can be used to reconstruct not only the surface model of a normal bone but also that of an outlier bone.
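The range-based classification rule described above can be sketched as follows. The parameter names and numeric values are hypothetical placeholders, not measurements from the study; only the rule itself (a bone is "normal" if every parameter lies within the training population's range, otherwise an "outlier") is taken from the abstract.

```python
def parameter_ranges(training):
    """Return {parameter: (min, max)} over a list of per-bone parameter dicts."""
    names = training[0].keys()
    return {
        name: (min(b[name] for b in training), max(b[name] for b in training))
        for name in names
    }


def classify(bone, ranges):
    """'normal' if every parameter lies inside its training range, else 'outlier'."""
    inside = all(low <= bone[name] <= high for name, (low, high) in ranges.items())
    return "normal" if inside else "outlier"


# Hypothetical training population (illustrative values only).
training = [
    {"neck_shaft_angle_deg": 125.0, "head_radius_mm": 22.0},
    {"neck_shaft_angle_deg": 130.0, "head_radius_mm": 24.5},
    {"neck_shaft_angle_deg": 135.0, "head_radius_mm": 26.0},
]
ranges = parameter_ranges(training)

label_a = classify({"neck_shaft_angle_deg": 128.0, "head_radius_mm": 23.0}, ranges)
label_b = classify({"neck_shaft_angle_deg": 142.0, "head_radius_mm": 23.0}, ranges)
```

A single out-of-range parameter is enough to label a bone as an outlier, which matches the "all parameters covered" condition stated in the abstract.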
Abstract:
Dimensional modeling, GT-Power in particular, has been used for two related purposes: to quantify and understand the inaccuracies of transient engine flow estimates that cause transient smoke spikes, and to improve empirical models of opacity or particulate matter used for engine calibration. Dimensional modeling indicated that exhaust gas recirculation flow rate was significantly underestimated and volumetric efficiency was overestimated by the electronic control module during the turbocharger lag period of an electronically controlled heavy duty diesel engine. Factoring in cylinder-to-cylinder variation, it has been shown that the fuel-Oxygen ratio estimated by the electronic control module was lower than actual by up to 35% during the turbocharger lag period but within 2% of actual elsewhere, thus hindering fuel-Oxygen ratio limit-based smoke control. The dimensional modeling of transient flow was enabled with a new method of simulating transient data in which the manifold pressures and exhaust gas recirculation system flow resistance, characterized as a function of exhaust gas recirculation valve position at each measured transient data point, were replicated by quasi-static or transient simulation to predict engine flows. Dimensional modeling was also used to transform the engine operating parameter model input space to a more fundamental lower dimensional space so that a nearest neighbor approach could be used to predict smoke emissions. This new approach, intended for engine calibration and control modeling, was termed the "nonparametric reduced dimensionality" approach. It was used to predict federal test procedure cumulative particulate matter within 7% of the measured value, based solely on steady-state training data. Very little correlation between the model inputs in the transformed space was observed as compared to the engine operating parameter space. This more uniform, smaller, shrunken model input space might explain how the nonparametric reduced dimensionality approach model could successfully predict federal test procedure emissions when roughly 40% of all transient points were classified as outliers as per the steady-state training data.
Abstract:
Over the past four decades, the number of democracies in the world has increased exponentially. This project considers how democracy and FDI affect economic growth as well as whether the impact of FDI depends on the level of democracy in a country. Thus, I explore two major research questions: 1) Whether increased FDI speeds up economic growth, controlling for political regime type, urbanization and other developmental indicators; and 2) Whether an increase in political freedom helps or hinders economic growth, and specifically whether the impact of FDI varies depending on the political regime in the recipient country. To examine these questions, this paper used data from 150 countries over a period between 1980 and 2010 and utilized several models, testing variables such as institutions, agglomerations, urbanization, FDI and type of political regime, among others, for their impact on economic growth. I found that FDI does have a positive impact on economic growth, and that this impact is often magnified when it interacts with other relevant factors. I also found that, after controlling for other variables, FDI inflows do not have a different impact on economic growth in autocracies than they do in democracies. This may be partially explained by autocratic outliers such as China and the OPEC states, which have recently experienced rapid export-led growth. This suggests that factors such as education could have a greater impact on a country's economic growth than does its political system.
Abstract:
A total knee arthroplasty performed with navigation results in more accurate component positioning with fewer outliers. It is not known whether image-based or image-free systems are preferable, or whether navigation of only one component leads to the same accuracy in leg alignment as navigation of both components. We evaluated the results of total knee arthroplasties performed with femoral navigation. We studied 90 knees in 88 patients who had conventional total knee arthroplasties, image-based total knee arthroplasties, or total knee arthroplasties with image-free navigation. We compared patients' perioperative times, component alignment accuracy, and short-term outcomes. The total surgical time was longer in the image-based total knee arthroplasty group (109 +/- 7 minutes) than in the image-free (101 +/- 17 minutes) and conventional total knee arthroplasty groups (87 +/- 20 minutes). The mechanical axis of the leg was within 3 degrees of neutral alignment, although the conventional total knee arthroplasty group showed more variance (10.6 degrees) than the navigated groups (5.8 degrees and 6.4 degrees, respectively). We found a positive correlation between femoral component malalignment and the total mechanical axis in the conventional group. Our results suggest image-based navigation is not necessary, and image-free femoral navigation may be sufficient for accurate component alignment.
Abstract:
The purpose of this study is to develop statistical methodology to facilitate indirect estimation of the concentration of antiretroviral drugs and viral loads in the prostate gland and the seminal vesicle. The differences in antiretroviral drug concentrations in these organs may lead to suboptimal concentrations in one gland compared to the other. Suboptimal levels of the antiretroviral drugs will not fully suppress the virus in that gland, will create a source of sexually transmissible virus, and will increase the chance of selecting for drug-resistant virus. This information may be useful in selecting an antiretroviral drug regimen that will achieve optimal concentrations in most of the male genital tract glands. Using fractionally collected semen ejaculates, Lundquist (1949) measured levels of surrogate markers in each fraction that are uniquely produced by specific male accessory glands. To determine the original glandular concentrations of the surrogate markers, Lundquist solved a simultaneous series of linear equations. This method has several limitations. In particular, it does not yield a unique solution, it does not address measurement error, and it disregards inter-subject variability in the parameters. To cope with these limitations, we developed a mechanistic latent variable model based on the physiology of the male genital tract and surrogate markers. We employ a Bayesian approach and perform a sensitivity analysis with regard to the distributional assumptions on the random effects and priors. The model and Bayesian approach are validated on experimental data where the concentration of a drug should be (biologically) differentially distributed between the two glands. In this example, the Bayesian model-based conclusions are found to be robust to model specification, and this hierarchical approach leads to more scientifically valid conclusions than the original methodology. In particular, unlike existing methods, the proposed model-based approach was not affected by a common form of outliers.
Abstract:
Constructing a 3D surface model from sparse-point data is a nontrivial task. Here, we report an accurate and robust approach for reconstructing a surface model of the proximal femur from sparse-point data and a dense-point distribution model (DPDM). The problem is formulated as a three-stage optimal estimation process. The first stage, affine registration, iteratively estimates a scale and a rigid transformation between the mean surface model of the DPDM and the sparse input points. The estimation results of the first stage are used to establish point correspondences for the second stage, statistical instantiation, which stably instantiates a surface model from the DPDM using a statistical approach. This surface model is then fed to the third stage, kernel-based deformation, which further refines the surface model. Outliers are handled by consistently employing the least trimmed squares (LTS) approach with a roughly estimated outlier rate in all three stages. If an optimal value of the outlier rate is preferred, we propose a hypothesis testing procedure to automatically estimate it. We present here our validations using four experiments: (1) a leave-one-out experiment, (2) an experiment evaluating the present approach for handling pathology, (3) an experiment evaluating the present approach for handling outliers, and (4) an experiment on reconstructing surface models of seven dry cadaver femurs using clinically relevant data without noise and with noise added. Our validation results demonstrate the robust performance of the present approach in handling outliers, pathology, and noise. An average 95-percentile error of 1.7-2.3 mm was found when the present approach was used to reconstruct surface models of the cadaver femurs from sparse-point data with noise added.
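As a rough illustration of the least trimmed squares idea with a given outlier rate, the sketch below keeps only the h smallest squared residuals, where h is derived from the assumed outlier rate, and refits on those. It estimates a one-dimensional location rather than a surface model; the function names, data, and iteration scheme are assumptions made for illustration and do not reproduce the three-stage procedure of the abstract.

```python
import math


def trimmed_cost(residuals, outlier_rate):
    """Sum of the h smallest squared residuals, h = ceil((1 - rate) * n)."""
    h = math.ceil((1.0 - outlier_rate) * len(residuals))
    squared = sorted(r * r for r in residuals)
    return sum(squared[:h])


def lts_location(values, outlier_rate, iterations=20):
    """Estimate a 1D location by repeatedly refitting on the h best-fitting points."""
    h = math.ceil((1.0 - outlier_rate) * len(values))
    estimate = sorted(values)[len(values) // 2]  # start from a central value
    for _ in range(iterations):
        # Keep the h points with the smallest squared residuals, then refit.
        kept = sorted(values, key=lambda v: (v - estimate) ** 2)[:h]
        new_estimate = sum(kept) / h
        if new_estimate == estimate:
            break
        estimate = new_estimate
    return estimate


# Hypothetical measurements; 8.0 plays the role of an outlier.
values = [1.0, 1.1, 0.9, 1.05, 0.95, 8.0]
robust = lts_location(values, outlier_rate=0.2)
plain_mean = sum(values) / len(values)
```

With these numbers the trimmed estimate stays near 1.0 while the ordinary mean is pulled towards the outlier.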
Abstract:
The similarity measure is one of the main factors that affect the accuracy of intensity-based 2D/3D registration of X-ray fluoroscopy to CT images. Information theory has been used to derive similarity measures for image registration, leading to the introduction of mutual information, an accurate similarity measure for multi-modal and mono-modal image registration tasks. However, it is known that the standard mutual information measure only takes intensity values into account without considering spatial information, and its robustness is questionable. Previous attempts to incorporate spatial information into mutual information either require computing the entropy of higher dimensional probability distributions or are not robust to outliers. In this paper, we show how to incorporate spatial information into mutual information without suffering from these problems. Using a variational approximation derived from the Kullback-Leibler bound, spatial information can be effectively incorporated into mutual information via energy minimization. The resulting similarity measure has a least-squares form and can be effectively minimized by a multi-resolution Levenberg-Marquardt optimizer. Experimental results are presented on datasets of two applications: (a) intra-operative patient pose estimation from a few (e.g. 2) calibrated fluoroscopic images, and (b) post-operative cup alignment estimation from a single X-ray radiograph with gonadal shielding.
Abstract:
Non-invasive pulse spectrophotometry to measure indocyanine green (ICG) elimination correlates well with the conventional invasive ICG clearance test. Nevertheless, the precision of this method remains unclear for any application, including small-for-size liver remnants. We therefore measured ICG plasma disappearance rate (PDR) during the anhepatic phase of orthotopic liver transplantation using pulse spectrophotometry. Measurements were done in 24 patients. The median PDR after exclusion of two outliers and two patients with inconstant signal was 1.55%/min (95% confidence interval [CI]=0.8-2.2). No correlation with patient age, gender, body mass, blood loss, administration of fresh frozen plasma, norepinephrine dose, postoperative albumin (serum), or difference in pre and post transplant body weight was detected. In conclusion, we found an ICG-PDR different from zero in the anhepatic phase, an overestimation that may arise in particular from a redistribution into the interstitial space. If ICG pulse spectrophotometry is used to measure functional hepatic reserve, the verified average difference from zero (1.55%/min) determined in our study needs to be taken into account.
Abstract:
Soil spectroscopy was applied to predict soil organic carbon (SOC) in the highlands of Ethiopia. Soil samples were acquired from Ethiopia's National Soil Testing Centre and by direct field sampling. The reflectance of samples was measured using a FieldSpec 3 diffuse reflectance spectrometer. Outliers and sample relations were evaluated using principal component analysis (PCA), and models were developed through partial least squares regression (PLSR). For nine watersheds sampled, 20% of the samples were set aside to test prediction and 80% were used to develop calibration models. Depending on the number of samples per watershed, cross validation or independent validation was used. The stability of models was evaluated using the coefficient of determination (R2), root mean square error (RMSE), and the ratio of performance to deviation (RPD). The R2 (%), RMSE (%), and RPD, respectively, for validation were Anjeni (88, 0.44, 3.05), Bale (86, 0.52, 2.7), Basketo (89, 0.57, 3.0), Benishangul (91, 0.30, 3.4), Kersa (82, 0.44, 2.4), Kola tembien (75, 0.44, 1.9), Maybar (84, 0.57, 2.5), Megech (85, 0.15, 2.6), and Wondo Genet (86, 0.52, 2.7), indicating that the models were stable. Models performed better for areas with high SOC values than areas with lower SOC values. Overall, soil spectroscopy performance ranged from very good to good.
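The three validation statistics named above can be computed as in the following sketch. It assumes common textbook definitions (R2 as 1 minus the ratio of residual to total sum of squares, and RPD as the sample standard deviation of the observed values divided by the RMSE); the abstract does not state which exact variants were used, and the SOC values are made up for illustration.

```python
import math
import statistics


def validation_metrics(observed, predicted):
    """Return (R2, RMSE, RPD) for paired observed and predicted values."""
    n = len(observed)
    mean_obs = sum(observed) / n
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    rmse = math.sqrt(ss_res / n)
    r2 = 1.0 - ss_res / ss_tot
    # Assumed definition: RPD = SD of reference values / RMSE of prediction.
    rpd = statistics.stdev(observed) / rmse
    return r2, rmse, rpd


# Hypothetical SOC values (%) for a small validation set.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
r2, rmse, rpd = validation_metrics(observed, predicted)
```

Because RPD scales the prediction error by the spread of the reference data, a watershed with a wide SOC range can show a high RPD even when its RMSE is not the smallest.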
Abstract:
Range expansions are extremely common, but have only recently begun to attract attention in terms of their genetic consequences. As populations expand, demes at the wave front experience strong genetic drift, which is expected to reduce genetic diversity and potentially cause ‘allele surfing’, where alleles may become fixed over a wide geographical area even if their effects are deleterious. Previous simulation models show that range expansions can generate very strong selective gradients on dispersal, reproduction, competition and immunity. To investigate the effects of range expansion on genetic diversity and adaptation, we studied the population genomics of the bank vole (Myodes glareolus) in Ireland. The bank vole was likely introduced in the late 1920s and is expanding its range at a rate of ~2.5 km/year. Using genotyping-by-sequencing, we genotyped 281 bank voles at 5979 SNP loci. Fourteen sample sites were arranged in three transects running from the introduction site to the wave front of the expansion. We found significant declines in genetic diversity along all three transects. However, there was no evidence that sites at the wave front had accumulated more deleterious mutations. We looked for outlier loci with strong correlations between allele frequency and distance from the introduction site, where the direction of correlation was the same in all three transects. Amongst these outliers, we found significant enrichment for genic SNPs, suggesting the action of selection. Candidates for selection included several genes with immunological functions and several genes that could influence behaviour.
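The outlier-locus criterion described above (a strong correlation between allele frequency and distance from the introduction site, with the same direction in all three transects) could be sketched as follows. The correlation threshold, the function names, and the frequencies are hypothetical; the abstract does not specify how "strong" was defined or which statistic was used.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def is_outlier_locus(distances, freqs_by_transect, threshold=0.9):
    """True if every transect shows |r| >= threshold with the same sign."""
    rs = [pearson(distances, freqs) for freqs in freqs_by_transect]
    strong = all(abs(r) >= threshold for r in rs)
    same_sign = all(r > 0 for r in rs) or all(r < 0 for r in rs)
    return strong and same_sign


# Hypothetical distances (km) and allele frequencies along three transects.
distances = [0.0, 10.0, 20.0, 30.0]
consistent = [[0.1, 0.2, 0.3, 0.4], [0.2, 0.25, 0.4, 0.5], [0.05, 0.2, 0.3, 0.45]]
mixed = [[0.1, 0.2, 0.3, 0.4], [0.5, 0.4, 0.25, 0.2], [0.05, 0.2, 0.3, 0.45]]

flag_consistent = is_outlier_locus(distances, consistent)
flag_mixed = is_outlier_locus(distances, mixed)
```

Requiring the same sign in all transects is what separates a repeated, potentially selective pattern from a strong correlation that arises by drift in a single transect.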