949 results for monotone missing data
Abstract:
Heterogeneous and incomplete datasets are common in many real-world visualisation applications. The probabilistic nature of the Generative Topographic Mapping (GTM), which was originally developed for complete continuous data, can be extended to model heterogeneous (i.e. containing both continuous and discrete values) and missing data. This paper describes and assesses the resulting model on both synthetic and real-world heterogeneous data with missing values.
Abstract:
A certain type of bacterial inclusion, known as a bacterial microcompartment, was recently identified and imaged through cryo-electron tomography. A 3D object reconstructed from single-axis, limited-angle tilt-series cryo-electron tomography contains missing regions; this is known as the missing wedge problem. Because of the missing regions in the reconstructed images, analyzing their 3D structures is challenging. Existing methods overcome this problem by aligning and averaging several similarly shaped objects. These schemes work well if the objects are symmetric and several objects with similar shapes and sizes are available. Since the bacterial inclusions studied here are not symmetric, are deformed, and show a wide range of shapes and sizes, the existing approaches are not appropriate. This research develops new statistical methods for analyzing geometric properties, such as volume, symmetry, aspect ratio and polyhedral structure, of these bacterial inclusions in the presence of missing data. These methods work with deformed, non-symmetric objects of varied shape and do not require multiple objects to handle the missing wedge problem. The developed methods and contributions include: (a) an improved method for manual image segmentation, (b) a new approach to 'complete' the segmented and reconstructed incomplete 3D images, (c) a polyhedral structural distance model to predict the polyhedral shapes of these microstructures, (d) a new shape descriptor for polyhedral shapes, named the polyhedron profile statistic, and (e) classifiers based on the Bayes classifier, linear discriminant analysis and support vector machines for supervised classification of incomplete polyhedral shapes. Finally, the predicted 3D shapes for these bacterial microstructures belong to the Johnson solids family, and these shapes, along with their other geometric properties, are important for a better understanding of their chemical and biological characteristics.
Abstract:
Estimates of HIV prevalence are important for policy in order to establish the health status of a country's population and to evaluate the effectiveness of population-based interventions and campaigns. However, participation rates in testing for surveillance conducted as part of household surveys, on which many of these estimates are based, can be low. HIV positive individuals may be less likely to participate because they fear disclosure, in which case estimates obtained using conventional approaches to deal with missing data, such as imputation-based methods, will be biased. We develop a Heckman-type simultaneous equation approach which accounts for non-ignorable selection, but unlike previous implementations, allows for spatial dependence and does not impose a homogeneous selection process on all respondents. In addition, our framework addresses the issue of separation, where for instance some factors are severely unbalanced and highly predictive of the response, which would ordinarily prevent model convergence. Estimation is carried out within a penalized likelihood framework where smoothing is achieved using a parametrization of the smoothing criterion which makes estimation more stable and efficient. We provide the software for straightforward implementation of the proposed approach, and apply our methodology to estimating national and sub-national HIV prevalence in Swaziland, Zimbabwe and Zambia.
Abstract:
Researchers frequently have to analyze scales in which some participants have failed to respond to some items. In this paper we focus on the exploratory factor analysis of multidimensional scales (i.e., scales that consist of a number of subscales) where each subscale is made up of a number of Likert-type items, and the aim of the analysis is to estimate participants' scores on the corresponding latent traits. We propose a new approach to deal with missing responses in such a situation that is based on (1) multiple imputation of non-responses and (2) simultaneous rotation of the imputed datasets. We applied the approach to a real dataset in which missing responses were artificially introduced following a real pattern of non-responses, and in a simulation study based on artificial datasets. The results show that our approach (specifically, Hot-Deck multiple imputation followed by Consensus Promin rotation) was able to successfully compute factor score estimates even for participants who have missing data.
Abstract:
Some factors complicate comparisons between linkage maps from different studies. This problem can be resolved if measures of precision, such as confidence intervals and frequency distributions, are associated with markers. We examined the precision of distances and ordering of microsatellite markers in the consensus linkage maps of chromosomes 1, 3 and 4 from two F(2) reciprocal Brazilian chicken populations, using bootstrap sampling. Single and consensus maps were constructed. The consensus map was compared with the International Consensus Linkage Map and with the whole genome sequence. Some loci showed segregation distortion and missing data, but this did not affect the analyses negatively. Several inversions and position shifts were detected, based on 95% confidence intervals and frequency distributions of loci. Some discrepancies in distances between loci and in ordering were due to chance, whereas others could be attributed to other effects, including reciprocal crosses, sampling error of the founder animals from the two populations, F(2) population structure, number of and distance between microsatellite markers, number of informative meioses, loci segregation patterns, and sex. In the Brazilian consensus GGA1, locus LEI1038 was in a position closer to the true genome sequence than in the International Consensus Map, whereas for GGA3 and GGA4, no such differences were found. Extending these analyses to the remaining chromosomes should facilitate comparisons and the integration of several available genetic maps, allowing meta-analyses for map construction and quantitative trait loci (QTL) mapping. The precision of the estimates of QTL positions and their effects would be increased with such information.
Abstract:
Background: Cerebral palsy (CP) patients have motor limitations that can affect functionality and abilities for activities of daily living (ADL). Health related quality of life and health status instruments validated for these patients do not directly address the concepts of functionality or ADL. The Child Health Assessment Questionnaire (CHAQ) seems to be a good instrument to address this dimension, but it has never been used for CP patients. The purpose of the study was to verify the psychometric properties of CHAQ applied to children and adolescents with CP. Methods: Parents or guardians of children and adolescents with CP, aged 5 to 18 years, answered the CHAQ. A healthy group of 314 children and adolescents was recruited during the validation of the CHAQ Brazilian-version. Data quality, reliability and validity were studied. The motor function was evaluated by the Gross Motor Function Measure (GMFM). Results: Ninety-six parents/guardians answered the questionnaire. The age of the patients ranged from 5 to 17.9 years (average: 9.3). The rate of missing data was low (< 9.3%). The floor effect was observed in two domains, being higher only in the visual analogue scales (<= 35.5%). The ceiling effect was significant in all domains and particularly high in patients with quadriplegia (81.8 to 90.9%) and extrapyramidal (45.4 to 91.0%). The Cronbach alpha coefficient ranged from 0.85 to 0.95. 
The validity was appropriate: for the discriminant validity the correlation of the disability index with the visual analogue scales was not significant; for the convergent validity CHAQ disability index had a strong correlation with the GMFM (0.77); for the divergent validity there was no correlation between GMFM and the pain and overall evaluation scales; for the criterion validity GMFM as well as CHAQ detected differences in the scores among the clinical types of CP (p < 0.01); for the construct validity, the patients' disability index score (mean: 2.16; SD: 0.72) was higher than that of the healthy group (mean: 0.12; SD: 0.23) (p < 0.01). Conclusion: CHAQ reliability and validity were adequate for this population. However, further studies are necessary to verify the influence of the ceiling effect on the responsiveness of the instrument.
Abstract:
When building genetic maps, it is necessary to choose from several marker ordering algorithms and criteria, and the choice is not always simple. In this study, we evaluate the efficiency of algorithms try (TRY), seriation (SER), rapid chain delineation (RCD), recombination counting and ordering (RECORD) and unidirectional growth (UG), as well as the criteria PARF (product of adjacent recombination fractions), SARF (sum of adjacent recombination fractions), SALOD (sum of adjacent LOD scores) and LHMC (likelihood through hidden Markov chains), used with the RIPPLE algorithm for error verification, in the construction of genetic linkage maps. A linkage map of a hypothetical diploid and monoecious plant species was simulated containing one linkage group and 21 markers with fixed distance of 3 cM between them. In all, 700 F(2) populations were randomly simulated with and 400 individuals with different combinations of dominant and co-dominant markers, as well as 10 and 20% of missing data. The simulations showed that, in the presence of co-dominant markers only, any combination of algorithm and criteria may be used, even for a reduced population size. In the case of a smaller proportion of dominant markers, any of the algorithms and criteria (except SALOD) investigated may be used. In the presence of high proportions of dominant markers and smaller samples (around 100), the probability of repulsion linkage increases between them and, in this case, use of the algorithms TRY and SER associated to RIPPLE with criterion LHMC would provide better results. Heredity (2009) 103, 494-502; doi:10.1038/hdy.2009.96; published online 29 July 2009
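The two simplest ordering criteria named in this abstract, SARF and PARF, can be illustrated with a minimal sketch. The marker names and recombination fractions below are hypothetical values chosen for illustration, not data from the study, and the helper names are assumptions. The sketch takes lower values of either criterion to indicate a better order, which is the usual convention for these two criteria.

```python
from math import prod


def adjacent_fractions(order, rf):
    """Recombination fractions between neighbouring markers in `order`."""
    return [rf[frozenset((a, b))] for a, b in zip(order, order[1:])]


def sarf(order, rf):
    """Sum of adjacent recombination fractions for a candidate order."""
    return sum(adjacent_fractions(order, rf))


def parf(order, rf):
    """Product of adjacent recombination fractions for a candidate order."""
    return prod(adjacent_fractions(order, rf))


# Hypothetical pairwise recombination fractions for three markers.
rf = {
    frozenset(("M1", "M2")): 0.03,
    frozenset(("M2", "M3")): 0.03,
    frozenset(("M1", "M3")): 0.06,
}

true_order = ["M1", "M2", "M3"]
wrong_order = ["M1", "M3", "M2"]

for name, order in (("true", true_order), ("wrong", wrong_order)):
    print(name, round(sarf(order, rf), 4), round(parf(order, rf), 6))
```

With these made-up values, the order that places M3 between M1 and M2 scores worse on both criteria because it makes the most distant pair adjacent.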
Abstract:
Recent attempts to explain the susceptibility of vertebrates to declines worldwide have largely focused on intrinsic factors such as body size, reproductive potential, ecological specialization, geographical range and phylogenetic longevity. Here, we use a database of 145 Australian marsupial species to test the effects of both intrinsic and extrinsic factors in a multivariate comparative approach. We model five intrinsic (body size, habitat specialization, diet, reproductive rate and range size) and four extrinsic (climate and range overlap with introduced foxes, sheep and rabbits) factors. We use quantitative measures of geographical range contraction as indices of decline. We also develop a new modelling approach of phylogenetically independent contrasts combined with imputation of missing values to deal simultaneously with phylogenetic structuring and missing data. One extrinsic variable, geographical range overlap with sheep, was the only consistent predictor of declines. Habitat specialization was independently but less consistently associated with declines. This suggests that extrinsic factors largely determine interspecific variation in extinction risk among Australian marsupials, and that the intrinsic factors that are consistently associated with extinction risk in other vertebrates are less important in this group. We conclude that recent anthropogenic changes have been profound enough to affect species on a continent-wide scale, regardless of their intrinsic biology.
Abstract:
Crown group Archosauria, which includes birds, dinosaurs, crocodylomorphs, and several extinct Mesozoic groups, is a primary division of the vertebrate tree of life. However, the higher-level phylogenetic relationships within Archosauria are poorly resolved and controversial, despite years of study. The phylogeny of crocodile-line archosaurs (Crurotarsi) is particularly contentious, and has been plagued by problematic taxon and character sampling. Recent discoveries and renewed focus on archosaur anatomy enable the compilation of a new dataset, which assimilates and standardizes character data pertinent to higher-level archosaur phylogeny, and is scored across the largest group of taxa yet analysed. This dataset includes 47 new characters (25% of total) and eight taxa that have yet to be included in an analysis, and total taxonomic sampling is more than twice that of any previous study. This analysis produces a well-resolved phylogeny, which recovers mostly traditional relationships within Avemetatarsalia, places Phytosauria as a basal crurotarsan clade, finds a close relationship between Aetosauria and Crocodylomorpha, and recovers a monophyletic Rauisuchia comprised of two major subclades. Support values are low, suggesting rampant homoplasy and missing data within Archosauria, but the phylogeny is highly congruent with stratigraphy. Comparison with alternative analyses identifies numerous scoring differences, but indicates that character sampling is the main source of incongruence. The phylogeny implies major missing lineages in the Early Triassic and may support a Carnian-Norian extinction event.
Abstract:
There are many methods for the analysis and design of embedded cantilever retaining walls. They involve various simplifications of the pressure distribution to allow calculation of the limiting equilibrium retained height and of the bending moment when the retained height is less than the limiting equilibrium value, i.e. the serviceability case. Recently, a new method for determining the serviceability earth pressure and bending moment has been proposed. This method makes an assumption defining the point of zero net pressure. This assumption implies that the passive pressure is not fully mobilised immediately below the excavation level. The finite element analyses presented in this paper examine the net pressure distribution on walls in which the retained height is less than the limiting equilibrium value. The study shows that for all practical walls, the earth pressure distributions on the front and back of the wall are at their limit values, Kp and Ka respectively, when the lumped factor of safety Fr is less than or equal to 2.0. A rectilinear net pressure distribution is proposed that is intuitively logical. It produces good predictions of the complete bending moment diagram for walls in the service configuration, and the proposed method gives results that agree closely with centrifuge model tests. The study shows that the method for determining the serviceability bending moment suggested by Padfield and Mair(1) in the CIRIA Report 104 gives excellent predictions of the maximum bending moment in practical cantilever walls. It provides the missing data that have been needed to verify and justify the CIRIA 104 method.
Abstract:
Dissertation submitted for the degree of Doctor in Electrical and Computer Engineering (Digital and Perceptional Systems) at Universidade Nova de Lisboa, Faculdade de Ciências e Tecnologia
Abstract:
In this paper we included a very broad representation of grass family diversity (84% of tribes and 42% of genera). Phylogenetic inference was based on three plastid DNA regions, rbcL, matK and trnL-F, using maximum parsimony and Bayesian methods. Our results resolved most of the subfamily relationships within the major clades (BEP and PACCMAD) that had previously been unclear, including, among others: (i) the BEP and PACCMAD sister relationship, (ii) the composition of clades and the sister relationship of Ehrhartoideae and Bambusoideae + Pooideae, (iii) the paraphyly of tribe Bambuseae, (iv) the position of Gynerium as sister to Panicoideae, and (v) the phylogenetic position of Micrairoideae. Allowing a relatively large amount of missing data enabled us to increase taxon sampling substantially in our analyses, from 107 to 295 taxa. However, bootstrap support and, to a lesser extent, Bayesian inference posterior probabilities were generally lower in analyses involving missing data than in those not including them. We produced a fully resolved phylogenetic summary tree for the grass family at subfamily level and indicated the most likely relationships of all included tribes in our analysis.
Abstract:
OBJECTIVES: To assess the prevalence and correlates of childhood and adolescent sexual and/or physical abuse (SPA) in bipolar I disorder (BDI) patients treated for a first episode of psychotic mania. METHODS: The Early Psychosis Prevention and Intervention Centre admitted 786 first-episode psychosis patients between 1998 and 2000. Data were collected from patients' files using a standardized questionnaire. A total of 704 files were available; 43 were excluded because of a nonpsychotic diagnosis at endpoint and 3 due to missing data regarding past stressful events. Among 658 patients with available data, 118 received a final diagnosis of BDI and were entered in this study. RESULTS: A total of 80% of patients had been exposed to stressful life events during childhood and adolescence and 24.9% to SPA; in particular, 29.8% of female patients had been exposed to sexual abuse. Patients who were exposed to SPA had poorer premorbid functioning, higher rates of forensic history, were less likely to live with family during treatment period, and were more likely to disengage from treatment. CONCLUSIONS: SPA is highly prevalent in BDI patients presenting with a first episode of psychotic mania; exposed patients have lower premorbid functional levels and poorer engagement with treatment. The context in which such traumas occur must be explored in order to determine whether early intervention strategies may contribute to diminish their prevalence. Specific psychological interventions must also be developed.
Abstract:
BACKGROUND: Recombinant human insulin-like growth factor I (rhIGF-I) is a possible disease modifying therapy for amyotrophic lateral sclerosis (ALS, which is also known as motor neuron disease (MND)). OBJECTIVES: To examine the efficacy of rhIGF-I in affecting disease progression, impact on measures of functional health status, prolonging survival and delaying the use of surrogates (tracheostomy and mechanical ventilation) to sustain survival in ALS. Occurrence of adverse events was also reviewed. SEARCH METHODS: We searched the Cochrane Neuromuscular Disease Group Specialized Register (21 November 2011), CENTRAL (2011, Issue 4), MEDLINE (January 1966 to November 2011) and EMBASE (January 1980 to November 2011) and sought information from the authors of randomised clinical trials and manufacturers of rhIGF-I. SELECTION CRITERIA: We considered all randomised controlled clinical trials involving rhIGF-I treatment of adults with definite or probable ALS according to the El Escorial Criteria. The primary outcome measure was change in Appel Amyotrophic Lateral Sclerosis Rating Scale (AALSRS) total score after nine months of treatment and secondary outcome measures were change in AALSRS at 1, 2, 3, 4, 5, 6, 7, 8, 9 months, change in quality of life (Sickness Impact Profile scale), survival and adverse events. DATA COLLECTION AND ANALYSIS: Each author independently graded the risk of bias in the included studies. The lead author extracted data and the other authors checked them. We generated some missing data by making ruler measurements of data in published graphs. We collected data about adverse events from the included trials. MAIN RESULTS: We identified three randomised controlled trials (RCTs) of rhIGF-I, involving 779 participants, for inclusion in the analysis. In a European trial (183 participants) the mean difference (MD) in change in AALSRS total score after nine months was -3.30 (95% confidence interval (CI) -8.68 to 2.08). 
In a North American trial (266 participants), the MD after nine months was -6.00 (95% CI -10.99 to -1.01). The combined analysis from both RCTs showed an MD after nine months of -4.75 (95% CI -8.41 to -1.09), a significant difference in favour of the treated group. The secondary outcome measures showed non-significant trends favouring rhIGF-I. There was an increased risk of injection site reactions with rhIGF-I (risk ratio 1.26, 95% CI 1.04 to 1.54). A second North American trial (330 participants) used a novel primary end point involving manual muscle strength testing. No differences were demonstrated between the treated and placebo groups in this study. All three trials were at high risk of bias. AUTHORS' CONCLUSIONS: Meta-analysis revealed a significant difference in favour of rhIGF-I treatment; however, the quality of the evidence from the two included trials was low. A third study showed no difference between treatment and placebo. There is no evidence for increase in survival with IGF1. All three included trials were at high risk of bias.
Abstract:
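The combined estimate reported in this abstract can be checked with a short sketch. The abstract does not state the pooling method, so the code assumes a fixed-effect inverse-variance model and recovers each trial's standard error from its 95% confidence interval using a normal approximation. The function names are illustrative only.

```python
from math import sqrt

Z95 = 1.96  # normal quantile assumed to underlie the reported 95% CIs


def se_from_ci(lower, upper, z=Z95):
    """Back-calculate a standard error from a symmetric confidence interval."""
    return (upper - lower) / (2 * z)


def pool_fixed_effect(estimates):
    """Inverse-variance pooling of (mean difference, CI lower, CI upper) tuples."""
    weights = [1 / se_from_ci(lo, hi) ** 2 for _, lo, hi in estimates]
    pooled = sum(w * md for w, (md, _, _) in zip(weights, estimates)) / sum(weights)
    pooled_se = sqrt(1 / sum(weights))
    return pooled, pooled - Z95 * pooled_se, pooled + Z95 * pooled_se


# Mean differences and 95% CIs as reported for the two pooled trials.
trials = [
    (-3.30, -8.68, 2.08),    # European trial
    (-6.00, -10.99, -1.01),  # North American trial
]

pooled_md, ci_low, ci_high = pool_fixed_effect(trials)
print(f"{pooled_md:.2f} ({ci_low:.2f} to {ci_high:.2f})")
```

Under these assumptions the sketch gives a pooled difference of about -4.75 with an interval of roughly -8.41 to -1.09, which agrees with the reported figures. The agreement suggests, but does not confirm, that a fixed-effect model was used.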
IPH has estimated and forecast clinical diagnosis rates of stroke among adults for the years 2010, 2015 and 2020. In the Republic of Ireland, the data are based on the Survey of Lifestyle, Attitudes and Nutrition (SLÁN) 2007. The data describe the number of adults who report that they have experienced doctor-diagnosed stroke in the previous 12 months. Data are available by age and sex for each Local Health Office of the Health Service Executive (HSE) in the Republic of Ireland. In Northern Ireland, the data are based on the Health and Social Wellbeing Survey 2005/06. The data describe the number of adults who report that they have experienced doctor-diagnosed stroke at any time in the past. Data are available by age and sex for each Local Government District in Northern Ireland. Clinical diagnosis rates in the Republic of Ireland relate to the previous 12 months and are not directly comparable with clinical diagnosis rates in Northern Ireland, which relate to any time in the past. The IPH estimated prevalence percentages may differ marginally from estimated prevalence percentages taken directly from the reference study. There are two reasons for this: 1) The IPH prevalence estimates relate to 2010 while the reference studies relate to earlier years (Northern Ireland Health and Social Wellbeing Survey 2005/06, Survey of Lifestyle, Attitudes and Nutrition 2007, Understanding Society 2009). Although we assume that the risk of the condition in each risk group does not change over time, the distribution of the number of people in the risk groups in the population changes over time (e.g. the population ages). This new distribution of the risk groups in the population means that the risk of the condition is weighted differently from the reference study, and this results in a different overall prevalence estimate. 2) The IPH prevalence estimates are based on a statistical model of the reference study. 
The model includes a number of explanatory variables to predict the risk of the condition. Therefore the model does not include records from the reference study that are missing data on these explanatory variables. A prevalence estimate for a condition taken directly from the reference study would include these records.
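The first reason above, that fixed group-level risks combined with a changing population mix give a different overall prevalence, can be illustrated with a small sketch. All numbers, age bands and names below are hypothetical and chosen only for illustration; they are not IPH or survey figures.

```python
def overall_prevalence(risk_by_group, population_share):
    """Weight each group's risk by that group's share of the population."""
    return sum(risk_by_group[g] * population_share[g] for g in risk_by_group)


# Hypothetical risk of the condition per age group, assumed constant over time.
risk = {"18-44": 0.002, "45-64": 0.010, "65+": 0.050}

# Hypothetical population shares in the reference-study year and in 2010.
share_reference = {"18-44": 0.55, "45-64": 0.30, "65+": 0.15}
share_2010 = {"18-44": 0.50, "45-64": 0.30, "65+": 0.20}

prev_reference = overall_prevalence(risk, share_reference)
prev_2010 = overall_prevalence(risk, share_2010)
print(f"reference year: {prev_reference:.4f}, 2010: {prev_2010:.4f}")
```

In this made-up case the overall prevalence rises from about 1.16% to 1.40% only because the oldest group makes up a larger share of the population, even though no group's risk has changed.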