337 resultados para statistical classification


Relevância:

20.00% 20.00%

Publicador:

Resumo:

We consider the development of statistical models for prediction of constituent concentration of riverine pollutants, which is a key step in load estimation from frequent flow rate data and less frequently collected concentration data. We consider how to capture the impacts of past flow patterns via the average discounted flow (ADF) which discounts the past flux based on the time lapsed - more recent fluxes are given more weight. However, the effectiveness of ADF depends critically on the choice of the discount factor which reflects the unknown environmental cumulating process of the concentration compounds. We propose to choose the discount factor by maximizing the adjusted R-2 values or the Nash-Sutcliffe model efficiency coefficient. The R2 values are also adjusted to take account of the number of parameters in the model fit. The resulting optimal discount factor can be interpreted as a measure of constituent exhaustion rate during flood events. To evaluate the performance of the proposed regression estimators, we examine two different sampling scenarios by resampling fortnightly and opportunistically from two real daily datasets, which come from two United States Geological Survey (USGS) gaging stations located in Des Plaines River and Illinois River basin. The generalized rating-curve approach produces biased estimates of the total sediment loads by -30% to 83%, whereas the new approaches produce relatively much lower biases, ranging from -24% to 35%. This substantial improvement in the estimates of the total load is due to the fact that predictability of concentration is greatly improved by the additional predictors.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Power calculation and sample size determination are critical in designing environmental monitoring programs. The traditional approach based on comparing the mean values may become statistically inappropriate and even invalid when substantial proportions of the response values are below the detection limits or censored because strong distributional assumptions have to be made on the censored observations when implementing the traditional procedures. In this paper, we propose a quantile methodology that is robust to outliers and can also handle data with a substantial proportion of below-detection-limit observations without the need of imputing the censored values. As a demonstration, we applied the methods to a nutrient monitoring project, which is a part of the Perth Long-Term Ocean Outlet Monitoring Program. In this example, the sample size required by our quantile methodology is, in fact, smaller than that by the traditional t-test, illustrating the merit of our method.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

A review was carried out of the radiographs of twenty-five infants with birth weights under 1000 G, who survived for more than twenty-eight days; eighteen of these had enough suitable films for a survey of the progressive bone changes which occur in these infants, including estimation of humeral cortical cross-sectional area. The incidence of the changes has been assessed and a typical progression of radiographic appearances has been shown, with a suggested system of staging. All infants showed some loss of bone mineral, with frank changes of rickets occurring in forty-four percent. Aetiological factors are mainly concerned with the difficulty of supplying and ensuring absorption of sufficient bone mineral (calcium and phosphate) and vitamin D. Liver immaturity may be another factor. Disease states additional to prematurity accentuate the problem. Rib fractures occurring around 80–90 days post-nataEy commonly draw attention to the bone disorder and are probably the major clinical factor of importance; there is a high incidence of associated lung disease of uncertain pathology. Attention is drawn to possible confusion with other bone disorders in the post-natal period.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Being able to accurately predict the risk of falling is crucial in patients with Parkinson’s dis- ease (PD). This is due to the unfavorable effect of falls, which can lower the quality of life as well as directly impact on survival. Three methods considered for predicting falls are decision trees (DT), Bayesian networks (BN), and support vector machines (SVM). Data on a 1-year prospective study conducted at IHBI, Australia, for 51 people with PD are used. Data processing are conducted using rpart and e1071 packages in R for DT and SVM, con- secutively; and Bayes Server 5.5 for the BN. The results show that BN and SVM produce consistently higher accuracy over the 12 months evaluation time points (average sensitivity and specificity > 92%) than DT (average sensitivity 88%, average specificity 72%). DT is prone to imbalanced data so needs to adjust for the misclassification cost. However, DT provides a straightforward, interpretable result and thus is appealing for helping to identify important items related to falls and to generate fallers’ profiles.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Post-traumatic stress disorder (PTSD) is a debilitating psychiatric disorder that has a major impact on the ability to function effectively in daily life. PTSD may develop as a response to exposure to an event or events perceived as potentially harmful or life-threatening. It has high prevalence rates in the community, especially among vulnerable groups such as military personnel or those in emergency services. Despite extensive research in this field, the underlying mechanisms of the disorder remain largely unknown. The identification of risk factors for PTSD has posed a particular challenge as there can be delays in onset of the disorder, and most people who are exposed to traumatic events will not meet diagnostic criteria for PTSD. With the advent of the Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition (DSM V), the classification for PTSD has changed from an anxiety disorder into the category of stress- and trauma-related disorders. This has the potential to refocus PTSD research on the nature of stress and the stress response relationship. This review focuses on some of the important findings from psychological and biological research based on early models of stress and resilience. Improving our understanding of PTSD by investigating both genetic and psychological risk and coping factors that influence stress response, as well as their interaction, may provide a basis for more effective and earlier intervention.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Objective Death certificates provide an invaluable source for cancer mortality statistics; however, this value can only be realised if accurate, quantitative data can be extracted from certificates – an aim hampered by both the volume and variable nature of certificates written in natural language. This paper proposes an automatic classification system for identifying cancer related causes of death from death certificates. Methods Detailed features, including terms, n-grams and SNOMED CT concepts were extracted from a collection of 447,336 death certificates. These features were used to train Support Vector Machine classifiers (one classifier for each cancer type). The classifiers were deployed in a cascaded architecture: the first level identified the presence of cancer (i.e., binary cancer/nocancer) and the second level identified the type of cancer (according to the ICD-10 classification system). A held-out test set was used to evaluate the effectiveness of the classifiers according to precision, recall and F-measure. In addition, detailed feature analysis was performed to reveal the characteristics of a successful cancer classification model. Results The system was highly effective at identifying cancer as the underlying cause of death (F-measure 0.94). The system was also effective at determining the type of cancer for common cancers (F-measure 0.7). Rare cancers, for which there was little training data, were difficult to classify accurately (F-measure 0.12). Factors influencing performance were the amount of training data and certain ambiguous cancers (e.g., those in the stomach region). The feature analysis revealed a combination of features were important for cancer type classification, with SNOMED CT concept and oncology specific morphology features proving the most valuable. Conclusion The system proposed in this study provides automatic identification and characterisation of cancers from large collections of free-text death certificates. This allows organisations such as Cancer Registries to monitor and report on cancer mortality in a timely and accurate manner. In addition, the methods and findings are generally applicable beyond cancer classification and to other sources of medical text besides death certificates.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this presentation, I reflect upon the global landscape surrounding the governance and classification of media content, at a time of rapid change in media platforms and services for content production and distribution, and contested cultural and social norms. I discuss the tensions and contradictions arising in the relationship between national, regional and global dimensions of media content distribution, as well as the changing relationships between state and non-state actors. These issues will be explored through consideration of issues such as: recent debates over film censorship; the review of the National Classification Scheme conducted by the Australian Law Reform Commission; online controversies such as the future of the Reddit social media site; and videos posted online by the militant group ISIS.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Background The purpose of this presentation is to outline the relevance of the categorization of the load regime data to assess the functional output and usage of the prosthesis of lower limb amputees. The objectives are • To highlight the need for categorisation of activities of daily living • To present a categorization of load regime applied on residuum, • To present some descriptors of the four types of activity that could be detected, • To provide an example the results for a case. Methods The load applied on the osseointegrated fixation of one transfemoral amputee was recorded using a portable kinetic system for 5 hours. The load applied on the residuum was divided in four types of activities corresponding to inactivity, stationary loading, localized locomotion and directional locomotion as detailed in previously publications. Results The periods of directional locomotion, localized locomotion, and stationary loading occurred 44%, 34%, and 22% of recording time and each accounted for 51%, 38%, and 12% of the duration of the periods of activity, respectively. The absolute maximum force during directional locomotion, localized locomotion, and stationary loading was 19%, 15%, and 8% of the body weight on the anteroposterior axis, 20%, 19%, and 12% on the mediolateral axis, and 121%, 106%, and 99% on the long axis. A total of 2,783 gait cycles were recorded. Discussion Approximately 10% more gait cycles and 50% more of the total impulse than conventional analyses were identified. The proposed categorization and apparatus have the potential to complement conventional instruments, particularly for difficult cases.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Convex potential minimisation is the de facto approach to binary classification. However, Long and Servedio [2008] proved that under symmetric label noise (SLN), minimisation of any convex potential over a linear function class can result in classification performance equivalent to random guessing. This ostensibly shows that convex losses are not SLN-robust. In this paper, we propose a convex, classification-calibrated loss and prove that it is SLN-robust. The loss avoids the Long and Servedio [2008] result by virtue of being negatively unbounded. The loss is a modification of the hinge loss, where one does not clamp at zero; hence, we call it the unhinged loss. We show that the optimal unhinged solution is equivalent to that of a strongly regularised SVM, and is the limiting solution for any convex potential; this implies that strong l2 regularisation makes most standard learners SLN-robust. Experiments confirm the unhinged loss’ SLN-robustness.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Introduction: It is unclear whether patients diagnosed according to International Classification of Headache Disorders criteria for migraine with aura (MA) and migraine without aura (MO) experience distinct disorders or whether their migraine subtypes are genetically related. Aim: Using a novel gene-based (statistical) approach, we aimed to identify individual genes and pathways associated both with MA and MO. Methods: Gene-based tests were performed using genome-wide association summary statistic results from the most recent International Headache Genetics Consortium study comparing 4505 MA cases with 34,813 controls and 4038 MO cases with 40,294 controls. After accounting for non-independence of gene-based test results, we examined the significance of the proportion of shared genes associated with MA and MO. Results: We found a significant overlap in genes associated with MA and MO. Of the total 1514 genes with a nominally significant gene-based p value (pgene-based ≤ 0.05) in the MA subgroup, 107 also produced pgene-based ≤ 0.05 in the MO subgroup. The proportion of overlapping genes is almost double the empirically derived null expectation, producing significant evidence of gene-based overlap (pleiotropy) (pbinomial-test = 1.5 × 10–4). Combining results across MA and MO, six genes produced genome-wide significant gene-based p values. Four of these genes (TRPM8, UFL1, FHL5 and LRP1) were located in close proximity to previously reported genome-wide significant SNPs for migraine, while two genes, TARBP2 and NPFF separated by just 259 bp on chromosome 12q13.13, represent a novel risk locus. The genes overlapping in both migraine types were enriched for functions related to inflammation, the cardiovascular system and connective tissue. Conclusions: Our results provide novel insight into the likely genes and biological mechanisms that underlie both MA and MO, and when combined with previous data, highlight the neuropeptide FF-amide peptide encoding gene (NPFF) as a novel candidate risk gene for both types of migraine.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Context: Identifying susceptibility genes for schizophrenia may be complicated by phenotypic heterogeneity, with some evidence suggesting that phenotypic heterogeneity reflects genetic heterogeneity. Objective: To evaluate the heritability and conduct genetic linkage analyses of empirically derived, clinically homogeneous schizophrenia subtypes. Design: Latent class and linkage analysis. Setting: Taiwanese field research centers. Participants: The latent class analysis included 1236 Han Chinese individuals with DSM-IV schizophrenia. These individuals were members of a large affected-sibling-pair sample of schizophrenia (606 ascertained families), original linkage analyses of which detected a maximum logarithm of odds (LOD) of 1.8 (z = 2.88) on chromosome 10q22.3. Main Outcome Measures: Multipoint exponential LOD scores by latent class assignment and parametric heterogeneity LOD scores. Results: Latent class analyses identified 4 classes, with 2 demonstrating familial aggregation. The first (LC2) described a group with severe negative symptoms, disorganization, and pronounced functional impairment, resembling “deficit schizophrenia.” The second (LC3) described a group with minimal functional impairment, mild or absent negative symptoms, and low disorganization. Using the negative/deficit subtype, we detected genome-wide significant linkage to 1q23-25 (LOD = 3.78, empiric genome-wide P = .01). This region was not detected using the DSM-IV schizophrenia diagnosis, but has been strongly implicated in schizophrenia pathogenesis by previous linkage and association studies.Variants in the 1q region may specifically increase risk for a negative/deficit schizophrenia subtype. Alternatively, these results may reflect increased familiality/heritability of the negative class, the presence of multiple 1q schizophrenia risk genes, or a pleiotropic 1q risk locus or loci, with stronger genotype-phenotype correlation with negative/deficit symptoms. Using the second familial latent class, we identified nominally significant linkage to the original 10q peak region. Conclusion: Genetic analyses of heritable, homogeneous phenotypes may improve the power of linkage and association studies of schizophrenia and thus have relevance to the design and analysis of genome-wide association studies.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The past decade has brought a proliferation of statistical genetic (linkage) analysis techniques, incorporating new methodology and/or improvement of existing methodology in gene mapping, specifically targeted towards the localization of genes underlying complex disorders. Most of these techniques have been implemented in user-friendly programs and made freely available to the genetics community. Although certain packages may be more 'popular' than others, a common question asked by genetic researchers is 'which program is best for me?'. To help researchers answer this question, the following software review aims to summarize the main advantages and disadvantages of the popular GENEHUNTER package.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Environmental changes have put great pressure on biological systems leading to the rapid decline of biodiversity. To monitor this change and protect biodiversity, animal vocalizations have been widely explored by the aid of deploying acoustic sensors in the field. Consequently, large volumes of acoustic data are collected. However, traditional manual methods that require ecologists to physically visit sites to collect biodiversity data are both costly and time consuming. Therefore it is essential to develop new semi-automated and automated methods to identify species in automated audio recordings. In this study, a novel feature extraction method based on wavelet packet decomposition is proposed for frog call classification. After syllable segmentation, the advertisement call of each frog syllable is represented by a spectral peak track, from which track duration, dominant frequency and oscillation rate are calculated. Then, a k-means clustering algorithm is applied to the dominant frequency, and the centroids of clustering results are used to generate the frequency scale for wavelet packet decomposition (WPD). Next, a new feature set named adaptive frequency scaled wavelet packet decomposition sub-band cepstral coefficients is extracted by performing WPD on the windowed frog calls. Furthermore, the statistics of all feature vectors over each windowed signal are calculated for producing the final feature set. Finally, two well-known classifiers, a k-nearest neighbour classifier and a support vector machine classifier, are used for classification. In our experiments, we use two different datasets from Queensland, Australia (18 frog species from commercial recordings and field recordings of 8 frog species from James Cook University recordings). The weighted classification accuracy with our proposed method is 99.5% and 97.4% for 18 frog species and 8 frog species respectively, which outperforms all other comparable methods.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This paper presents an effective classification method based on Support Vector Machines (SVM) in the context of activity recognition. Local features that capture both spatial and temporal information in activity videos have made significant progress recently. Efficient and effective features, feature representation and classification plays a crucial role in activity recognition. For classification, SVMs are popularly used because of their simplicity and efficiency; however the common multi-class SVM approaches applied suffer from limitations including having easily confused classes and been computationally inefficient. We propose using a binary tree SVM to address the shortcomings of multi-class SVMs in activity recognition. We proposed constructing a binary tree using Gaussian Mixture Models (GMM), where activities are repeatedly allocated to subnodes until every new created node contains only one activity. Then, for each internal node a separate SVM is learned to classify activities, which significantly reduces the training time and increases the speed of testing compared to popular the `one-against-the-rest' multi-class SVM classifier. Experiments carried out on the challenging and complex Hollywood dataset demonstrates comparable performance over the baseline bag-of-features method.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Surveying threatened and invasive species to obtain accurate population estimates is an important but challenging task that requires a considerable investment in time and resources. Estimates using existing ground-based monitoring techniques, such as camera traps and surveys performed on foot, are known to be resource intensive, potentially inaccurate and imprecise, and difficult to validate. Recent developments in unmanned aerial vehicles (UAV), artificial intelligence and miniaturized thermal imaging systems represent a new opportunity for wildlife experts to inexpensively survey relatively large areas. The system presented in this paper includes thermal image acquisition as well as a video processing pipeline to perform object detection, classification and tracking of wildlife in forest or open areas. The system is tested on thermal video data from ground based and test flight footage, and is found to be able to detect all the target wildlife located in the surveyed area. The system is flexible in that the user can readily define the types of objects to classify and the object characteristics that should be considered during classification.