18 results for METHODS: STATISTICAL
at University of Queensland eSpace - Australia
Abstract:
The H I Parkes All Sky Survey (HIPASS) is a blind extragalactic H I 21-cm emission-line survey covering the whole southern sky from declination -90° to +25°. The HIPASS catalogue (HICAT), containing 4315 H I-selected galaxies from the region south of declination +2°, is presented in Meyer et al. (Paper I). This paper describes in detail the completeness and reliability of HICAT, which are calculated from the recovery rate of synthetic sources and follow-up observations, respectively. HICAT is found to be 99 per cent complete at a peak flux of 84 mJy and an integrated flux of 9.4 Jy km s^-1. The overall reliability is 95 per cent, but rises to 99 per cent for sources with peak fluxes >58 mJy or integrated flux >8.2 Jy km s^-1. Expressions are derived for the uncertainties on the most important HICAT parameters: peak flux, integrated flux, velocity width and recessional velocity. The errors on HICAT parameters are dominated by the noise in the HIPASS data, rather than by the parametrization procedure.
Abstract:
An important and common problem in microarray experiments is the detection of genes that are differentially expressed in a given number of classes. As this problem concerns the selection of significant genes from a large pool of candidate genes, it needs to be carried out within the framework of multiple hypothesis testing. In this paper, we focus on the use of mixture models to handle the multiplicity issue. With this approach, a measure of the local false discovery rate is provided for each gene, and it can be implemented so that the implied global false discovery rate is bounded as with the Benjamini-Hochberg methodology based on tail areas. The latter procedure is too conservative, unless it is modified according to the prior probability that a gene is not differentially expressed. An attractive feature of the mixture model approach is that it provides a framework for the estimation of this probability and its subsequent use in forming a decision rule. The rule can also be formed to take the false negative rate into account.
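A minimal sketch of the local false discovery rate computed from a two-component mixture, in the spirit of the approach described above; the Gaussian null and alternative densities, the prior probability pi0 and the cut-off are illustrative assumptions, not the authors' fitted model:

    import numpy as np
    from scipy import stats

    # Test statistics z_i are modelled as pi0*f0(z) + (1 - pi0)*f1(z); the local
    # false discovery rate of gene i is pi0*f0(z_i) / f(z_i).
    rng = np.random.default_rng(0)
    z = np.concatenate([rng.normal(0.0, 1.0, 9000),   # genes not differentially expressed
                        rng.normal(2.5, 1.0, 1000)])  # differentially expressed genes

    pi0 = 0.9                            # assumed prior probability of a null gene
    f0 = stats.norm(0.0, 1.0).pdf(z)     # null density
    f1 = stats.norm(2.5, 1.0).pdf(z)     # alternative density (assumed known here)
    f = pi0 * f0 + (1.0 - pi0) * f1      # marginal mixture density

    local_fdr = pi0 * f0 / f             # per-gene local false discovery rate
    selected = local_fdr < 0.2           # example decision rule
    print(selected.sum(), "genes declared differentially expressed")
    print("implied global FDR:", local_fdr[selected].mean())

The last line illustrates the link mentioned in the abstract: the average local false discovery rate over the selected genes gives the implied global false discovery rate of the rule.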
Abstract:
We present a new algorithm for detecting intercluster galaxy filaments based upon the assumption that the orientations of constituent galaxies along such filaments are non-isotropic. We apply the algorithm to the 2dF Galaxy Redshift Survey catalogue and find that it readily detects many straight filaments between close cluster pairs. At large intercluster separations (> 15 h(-1) Mpc), we find that the detection efficiency falls quickly, as it also does with more complex filament morphologies. We explore the underlying assumptions and suggest that it is only in the case of close cluster pairs that we can expect galaxy orientations to be significantly correlated with filament direction.
Abstract:
We consider the statistical problem of catalogue matching from a machine learning perspective with the goal of producing probabilistic outputs, and using all available information. A framework is provided that unifies two existing approaches to producing probabilistic outputs in the literature, one based on combining distribution estimates and the other based on combining probabilistic classifiers. We apply both of these to the problem of matching the HI Parkes All Sky Survey radio catalogue with large positional uncertainties to the much denser SuperCOSMOS catalogue with much smaller positional uncertainties. We demonstrate the utility of probabilistic outputs by a controllable completeness and efficiency trade-off and by identifying objects that have high probability of being rare. Finally, possible biasing effects in the output of these classifiers are also highlighted and discussed.
Abstract:
An important aspect in manufacturing design is the distribution of geometrical tolerances so that an assembly functions with given probability, while minimising the manufacturing cost. This requires a complex search over a multidimensional domain, much of which leads to infeasible solutions and which can have many local minima. As well, Monte-Carlo methods are often required to determine the probability that the assembly functions as designed. This paper describes a genetic algorithm for carrying out this search and successfully applies it to two specific mechanical designs, enabling comparisons of a new statistical tolerancing design method with existing methods. (C) 2003 Elsevier Ltd. All rights reserved.
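A rough sketch of the kind of search described above: a simple genetic-style algorithm over tolerance vectors, with Monte Carlo used to estimate the probability that the assembly functions. The cost model, stack-up criterion and algorithm settings are assumed purely for illustration and are not the paper's designs:

    import numpy as np

    rng = np.random.default_rng(1)
    n_dims, pop_size, n_gen = 4, 40, 30
    target_prob = 0.99

    def cost(tol):
        # assumed cost model: tighter tolerances are more expensive
        return float(np.sum(1.0 / tol))

    def prob_functions(tol, n_mc=2000):
        # Monte Carlo estimate: dimensions vary normally within +/-tol (3-sigma);
        # the assembly "functions" if the stack-up stays inside an assumed 0.5 gap
        devs = rng.normal(0.0, tol / 3.0, size=(n_mc, tol.size))
        return float(np.mean(np.abs(devs.sum(axis=1)) < 0.5))

    def fitness(tol):
        # penalise designs whose probability of functioning falls below the target
        return cost(tol) + 1e3 * max(0.0, target_prob - prob_functions(tol))

    pop = rng.uniform(0.01, 0.3, size=(pop_size, n_dims))
    for _ in range(n_gen):
        scores = np.array([fitness(ind) for ind in pop])
        parents = pop[np.argsort(scores)[: pop_size // 2]]           # truncation selection
        children = parents * rng.lognormal(0.0, 0.1, parents.shape)  # mutation only
        pop = np.clip(np.vstack([parents, children]), 0.005, 0.5)

    best = min(pop, key=fitness)
    print("best tolerance vector:", np.round(best, 3), " cost:", round(cost(best), 2))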
Abstract:
This study has three main objectives. First, it develops a generalization of the EKS method commonly used in multilateral price comparisons. It is shown that the EKS system can be generalized so that weights can be attached to each of the link comparisons used in the EKS computations. These weights can account for differing levels of reliability of the underlying binary comparisons. Second, various reliability measures and corresponding weighting schemes are presented and their merits discussed. Third, these new methods are applied to an international data set of manufacturing prices from the ICOP project. Although the weighted EKS method is theoretically superior, its empirical impact appears to be generally small compared to the unweighted EKS; the impact is larger when the method is applied at lower levels of aggregation. Finally, the importance of using sector-specific PPPs in assessing relative levels of manufacturing productivity is indicated.
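A small sketch of the weighted EKS idea: the standard EKS index is a geometric mean of linked binary comparisons, and the generalization replaces the equal weights 1/M with reliability weights for each bridge country. The binary index matrix and the weights below are invented for illustration, not ICOP data:

    import numpy as np

    def eks(binary, weights=None):
        """Weighted EKS: out[j, k] = prod_l (binary[j, l] * binary[l, k]) ** w[l]."""
        m = binary.shape[0]
        w = np.ones(m) / m if weights is None else np.asarray(weights, float) / np.sum(weights)
        out = np.empty_like(binary)
        for j in range(m):
            for k in range(m):
                out[j, k] = np.prod((binary[j, :] * binary[:, k]) ** w)
        return out

    # three "countries"; binary[j, k] is a binary price index with binary[j, k] ~ 1 / binary[k, j]
    binary = np.array([[1.00, 1.20, 0.85],
                       [0.83, 1.00, 0.70],
                       [1.18, 1.43, 1.00]])
    print(eks(binary))                           # unweighted EKS (equal link weights)
    print(eks(binary, weights=[3.0, 1.0, 1.0]))  # links through country 0 deemed more reliable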
Abstract:
Genetic assignment methods use genotype likelihoods to draw inference about where individuals were or were not born, potentially allowing direct, real-time estimates of dispersal. We used simulated data sets to test the power and accuracy of Monte Carlo resampling methods in generating statistical thresholds for identifying F0 immigrants in populations with ongoing gene flow, and hence for providing direct, real-time estimates of migration rates. The identification of accurate critical values required that resampling methods preserved the linkage disequilibrium deriving from recent generations of immigrants and reflected the sampling variance present in the data set being analysed. A novel Monte Carlo resampling method taking into account these aspects was proposed and its efficiency was evaluated. Power and error were relatively insensitive to the frequency assumed for missing alleles. Power to identify F0 immigrants was improved by using large sample size (up to about 50 individuals) and by sampling all populations from which migrants may have originated. A combination of plotting genotype likelihoods and calculating mean genotype likelihood ratios (D_LR) appeared to be an effective way to predict whether F0 immigrants could be identified for a particular pair of populations using a given set of markers.
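A minimal illustration of the general Monte Carlo idea behind such thresholds (not the novel resampling method proposed in the paper): genotypes are simulated from the focal population's allele frequencies under Hardy-Weinberg equilibrium, and a lower quantile of their log genotype likelihoods is taken as the critical value for flagging putative F0 immigrants. The allele frequencies and locus numbers are made up:

    import numpy as np

    rng = np.random.default_rng(2)
    n_loci, n_alleles, n_sim = 10, 4, 10000
    freqs = rng.dirichlet(np.ones(n_alleles), size=n_loci)   # per-locus allele frequencies

    def simulate_genotypes(freqs, n):
        # draw two alleles per locus under Hardy-Weinberg equilibrium
        return np.stack([rng.choice(freqs.shape[1], size=(n, 2), p=freqs[l])
                         for l in range(freqs.shape[0])], axis=1)   # (n, n_loci, 2)

    def log_genotype_likelihood(g, freqs):
        ll = np.zeros(g.shape[0])
        for l in range(freqs.shape[0]):
            p1, p2 = freqs[l, g[:, l, 0]], freqs[l, g[:, l, 1]]
            het = g[:, l, 0] != g[:, l, 1]
            ll += np.log(np.where(het, 2.0 * p1 * p2, p1 * p2))
        return ll

    sims = simulate_genotypes(freqs, n_sim)
    critical = np.quantile(log_genotype_likelihood(sims, freqs), 0.01)  # alpha = 0.01
    print("flag as putative immigrant if log genotype likelihood <", round(critical, 2))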
Abstract:
The Accelerating Moment Release (AMR) preceding earthquakes with magnitude above 5 in Australia that occurred during the last 20 years was analyzed to test the Critical Point Hypothesis. Twelve earthquakes in the catalog were chosen based on a criterion for the number of nearby events. Results show that seven sequences with numerous events recorded leading up to the main earthquake exhibited accelerating moment release. Two occurred near in time and space to other earthquakes preceded by AMR. The remaining three sequences had very few events in the catalog, so the lack of AMR detected in the analysis may be related to catalog incompleteness. Spatio-temporal scanning of AMR parameters shows that 80% of the areas in which AMR occurred experienced large events. In areas of similar background seismicity with no large events, 10 out of 12 cases exhibit no AMR, and two others are false alarms where AMR was observed but no large event followed. The relationship between AMR and Load-Unload Response Ratio (LURR) was studied. Both methods predict similar critical region sizes; however, the critical point time using AMR is slightly earlier than the time of the critical point LURR anomaly.
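For reference, a brief sketch of the time-to-failure power law that AMR analyses typically fit to the cumulative Benioff strain, with wholly synthetic data; the functional form and curvature diagnostic are the standard ones from the critical point literature, not this paper's specific implementation:

    import numpy as np
    from scipy.optimize import curve_fit

    rng = np.random.default_rng(3)
    t_c = 2000.0                                    # assumed main-shock time

    def amr(t, A, B, m):
        # cumulative Benioff strain modelled as A - B * (t_c - t)**m; m < 1
        # indicates accelerating release as t approaches t_c
        return A - B * (t_c - t) ** m

    # synthetic cumulative Benioff strain generated from the law itself, plus noise
    t = np.sort(rng.uniform(1980.0, 1999.5, 80))
    cum = amr(t, 100.0, 30.0, 0.4) * (1.0 + 0.02 * rng.normal(size=t.size))

    params, _ = curve_fit(amr, t, cum, p0=(80.0, 20.0, 0.6), maxfev=20000)
    print("fitted exponent m =", round(params[2], 2))   # should recover roughly 0.4

    # a common AMR diagnostic compares the power-law fit to a straight line:
    # a curvature ratio (power-law RMS / linear RMS) well below 1 signals AMR
    lin = np.polyval(np.polyfit(t, cum, 1), t)
    c_ratio = np.sqrt(np.mean((cum - amr(t, *params)) ** 2)) / np.sqrt(np.mean((cum - lin) ** 2))
    print("curvature ratio C =", round(c_ratio, 2))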
Abstract:
Purpose: To evaluate the clinical features, treatment, and outcomes of a cohort of patients with ocular adnexal lymphoproliferative disease classified according to the World Health Organization modification of the Revised European-American Classification of Lymphoid neoplasms and to perform a robust statistical analysis of these data. Methods: Sixty-nine cases of ocular adnexal lymphoproliferative disease, seen in a tertiary referral center from 1992 to 2003, were included in the study. Lesions were classified by using the World Health Organization modification of the Revised European-American Classification of Lymphoid neoplasms. Outcome variables included disease-specific survival, relapse-free survival, local control, and distant control. Results: Stage IV disease at presentation, aggressive lymphoma histology, the presence of prior or concurrent systemic lymphoma at presentation, and bilateral adnexal disease were significant predictors for reduced disease-specific survival, local control, and distant control. Multivariate analysis found that aggressive histology and bilateral adnexal disease were associated with significantly reduced disease-specific survival. Conclusions: The typical presentation of adnexal lymphoproliferative disease is with a painless mass, swelling, or proptosis; however, pain and inflammation occurred in 20% and 30% of patients, respectively. Stage at presentation, tumor histology, primary or secondary status, and whether the process was unilateral or bilateral were significant variables for disease outcome. In this study, distant spread of lymphoma was lower in patients who received greater than 20 Gy of orbital radiotherapy.
Abstract:
Aims This paper presents the recommendations, developed from a 3-year consultation process, for a program of research to underpin the development of diagnostic concepts and criteria in the Substance Use Disorders section of the Diagnostic and Statistical Manual of Mental Disorders (DSM) and potentially the relevant section of the next revision of the International Classification of Diseases (ICD). Methods A preliminary list of research topics was developed at the DSM-V Launch Conference in 2004. This led to the presentation of articles on these topics at a specific Substance Use Disorders Conference in February 2005, at the end of which a preliminary list of research questions was developed. This was further refined through an iterative process involving conference participants over the following year. Results Research questions have been placed into four categories: (1) questions that could be addressed immediately through secondary analyses of existing data sets; (2) items likely to require position papers to propose criteria or more focused questions with a view to subsequent analyses of existing data sets; (3) issues that could be proposed for literature reviews, but with a lower probability that these might progress to a data analytic phase; and (4) suggestions or comments that might not require immediate action, but that could be considered by the DSM-V and ICD-11 revision committees as part of their deliberations. Conclusions A broadly based research agenda for the development of diagnostic concepts and criteria for substance use disorders is presented.
Abstract:
Aims This paper describes the background to the establishment of the Substance Use Disorders Workgroup, which was charged with developing the research agenda for the development of the next edition of the Diagnostic and Statistical Manual of Mental Disorders (DSM). It summarizes 18 articles that were commissioned to inform that process. Methods A preliminary list of research topics, developed at the DSM-V Launch Conference in 2004, led to the identification of topics that were the subject of formal presentations and detailed discussion at the Substance Use Disorders Conference in February 2005. Results The 18 articles presented in this supplement examine: (1) categorical versus dimensional diagnoses; (2) the neurobiological basis of substance use disorders; (3) social and cultural perspectives; (4) the crosswalk between DSM-IV and the International Classification of Diseases Tenth Revision (ICD-10); (5) comorbidity of substance use disorders and mental health disorders; (6) subtypes of disorders; (7) issues in adolescence; (8) substance-specific criteria; (9) the place of non-substance addictive disorders; and (10) the available research resources. Conclusions In the final paper, a broadly based research agenda for the development of diagnostic concepts and criteria for substance use disorders is presented.
Abstract:
Objective: This paper compares four techniques used to assess change in neuropsychological test scores before and after coronary artery bypass graft surgery (CABG), and includes a rationale for the classification of a patient as overall impaired. Methods: A total of 55 patients were tested before and after surgery on the MicroCog neuropsychological test battery. A matched control group underwent the same testing regime to generate test–retest reliabilities and practice effects. Two techniques designed to assess statistical change were used: the Reliable Change Index (RCI), modified for practice, and the Standardised Regression-based (SRB) technique. These were compared against two fixed cutoff techniques (standard deviation and 20% change methods). Results: The incidence of decline across test scores varied markedly depending on which technique was used to describe change. The SRB method identified more patients as declined on most measures. In comparison, the two fixed cutoff techniques displayed relatively reduced sensitivity in the detection of change. Conclusions: Overall change in an individual can be described provided the investigators choose a rational cutoff based on the likely spread of scores due to chance. A cutoff value of ≥20% of the test scores used provided an acceptable probability based on the number of tests commonly encountered. Investigators must also choose a test battery that minimises shared variance among test scores.
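A short sketch of one of the change criteria compared above, the Reliable Change Index adjusted for the control group's practice effect; the reliability, standard deviations and practice effect are hypothetical numbers, not MicroCog norms:

    import math

    def rci_practice(pre, post, practice_effect, sd_pre, sd_post, r_xx):
        # standard errors of measurement from the control group's test-retest data
        se_pre = sd_pre * math.sqrt(1.0 - r_xx)
        se_post = sd_post * math.sqrt(1.0 - r_xx)
        se_diff = math.sqrt(se_pre ** 2 + se_post ** 2)
        # observed change, corrected for the mean practice effect, in SE units
        return ((post - pre) - practice_effect) / se_diff

    z = rci_practice(pre=95.0, post=86.0, practice_effect=3.0,
                     sd_pre=10.0, sd_post=10.0, r_xx=0.80)
    print("z =", round(z, 2), "->", "reliable decline" if z <= -1.645 else "no reliable change")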
Abstract:
Objective: To devise more effective physical activity interventions, the mediating mechanisms yielding behavioral change need to be identified. The Baron-Kenny method is most commonly used, but has low statistical power and may not identify mechanisms of behavioral change in small-to-medium size studies. More powerful statistical tests are available. Study Design and Setting: Inactive adults (N = 52) were randomized to either a print or a print-plus-telephone intervention. Walking and exercise-related social support were assessed at baseline, after the intervention, and 4 weeks later. The Baron-Kenny and three alternative methods of mediational analysis (Freedman-Schatzkin; MacKinnon et al.; bootstrap method) were used to examine the effects of social support on initial behavior change and maintenance. Results: A significant mediational effect of social support on initial behavior change was indicated by the MacKinnon et al., bootstrap, and, marginally, Freedman-Schatzkin methods, but not by the Baron-Kenny method. No significant mediational effect of social support on maintenance of walking was found. Conclusions: Methodologically rigorous intervention studies to identify mediators of change in physical activity are costly and labor intensive, and may not be feasible with large samples. The use of statistically powerful tests of mediational effects in small-scale studies can inform the development of more effective interventions. (C) 2006 Elsevier Inc. All rights reserved.
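A compact sketch of the bootstrap test of the mediated (indirect) effect a*b, one of the alternatives to the Baron-Kenny steps mentioned above; the variables and effect sizes are simulated stand-ins for the intervention, social support and walking data:

    import numpy as np

    rng = np.random.default_rng(4)
    n = 52
    x = rng.integers(0, 2, n).astype(float)     # intervention arm (print vs print-plus-telephone)
    m = 0.5 * x + rng.normal(size=n)            # mediator: change in social support
    y = 0.6 * m + 0.1 * x + rng.normal(size=n)  # outcome: change in walking

    def indirect_effect(x, m, y):
        a = np.polyfit(x, m, 1)[0]                               # path a: x -> m
        design = np.column_stack([m, x, np.ones_like(x)])
        b = np.linalg.lstsq(design, y, rcond=None)[0][0]         # path b: m -> y given x
        return a * b

    boot = np.array([indirect_effect(x[idx], m[idx], y[idx])
                     for idx in (rng.integers(0, n, n) for _ in range(2000))])
    lo, hi = np.percentile(boot, [2.5, 97.5])
    print(f"indirect effect a*b = {indirect_effect(x, m, y):.3f}, 95% bootstrap CI [{lo:.3f}, {hi:.3f}]")
    # the mediated effect is deemed significant when the bootstrap CI excludes zero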
Abstract:
Traditional vegetation mapping methods use high-cost, labour-intensive aerial photography interpretation. This approach can be subjective and is limited by factors such as the extent of remnant vegetation, and the differing scale and quality of aerial photography over time. An alternative approach is proposed which integrates a data model, a statistical model and an ecological model using sophisticated Geographic Information Systems (GIS) techniques and rule-based systems to support fine-scale vegetation community modelling. This approach is based on a more realistic representation of vegetation patterns with transitional gradients from one vegetation community to another. Arbitrary, and often unrealistic, sharp boundaries can be imposed on the model by the application of statistical methods. This GIS-integrated multivariate approach is applied to the problem of vegetation mapping in the complex vegetation communities of the Innisfail Lowlands in the Wet Tropics bioregion of Northeastern Australia. The paper presents the full cycle of this vegetation modelling approach, including sampling sites, variable selection, model selection, model implementation, internal model assessment, model prediction assessments, integration of discrete vegetation community models to generate a composite pre-clearing vegetation map, independent data set model validation and scale assessments of model predictions. An accurate pre-clearing vegetation map of the Innisfail Lowlands was generated (r^2 = 0.83) through GIS integration of 28 separate statistical models. This modelling approach has good potential for wider application, including provision of vital information for conservation planning and management; a scientific basis for rehabilitation of disturbed and cleared areas; and a viable method for the production of adequate vegetation maps for conservation and forestry planning of poorly-studied areas. (c) 2006 Elsevier B.V. All rights reserved.
Abstract:
In empirical studies of Evolutionary Algorithms, it is usually desirable to evaluate and compare algorithms using as many different parameter settings and test problems as possible, in order to have a clear and detailed picture of their performance. Unfortunately, the total number of experiments required may be very large, which often makes such research work computationally prohibitive. In this paper, the application of a statistical method called racing is proposed as a general-purpose tool to reduce the computational requirements of large-scale experimental studies in evolutionary algorithms. Experimental results are presented that show that racing typically requires only a small fraction of the cost of an exhaustive experimental study.
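A toy sketch of the racing idea: candidate parameter settings are evaluated problem by problem, and a paired test drops any candidate that is already significantly worse than the current best, so the exhaustive design is never completed for clearly inferior settings. The evaluate function below is a stand-in for actually running an evolutionary algorithm, and the candidate values are assumed:

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(5)

    def evaluate(candidate, instance):
        # placeholder for "run the EA with configuration `candidate` on test problem `instance`"
        return rng.normal(loc=candidate, scale=1.0)     # lower scores are better here

    results = {0.0: [], 0.2: [], 1.0: [], 2.0: []}      # assumed candidate configurations
    n_evals = 0
    for instance in range(100):
        for c in results:
            results[c].append(evaluate(c, instance))
            n_evals += 1
        if instance < 10:                               # gather a few results before testing
            continue
        best = min(results, key=lambda c: np.mean(results[c]))
        for c in list(results):
            if c == best:
                continue
            _, p = stats.ttest_rel(results[c], results[best])
            if p < 0.05 and np.mean(results[c]) > np.mean(results[best]):
                del results[c]                          # drop a significantly worse candidate

    print("surviving candidates:", sorted(results))
    print("evaluations used:", n_evals, "of a possible", 4 * 100)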