946 results for statistical classification


Relevance: 20.00%

Abstract:

This paper presents the site classification of the Bangalore Mahanagar Palike (BMP) area using geophysical data, and the evaluation of spectral acceleration at ground level using a probabilistic approach. Site classification has been carried out using experimental data from the shallow geophysical method of Multichannel Analysis of Surface Waves (MASW). One-dimensional (1-D) MASW surveys have been carried out at 58 locations and the respective velocity profiles obtained. The average shear wave velocity over 30 m depth (Vs(30)) has been calculated and used for the site classification of the BMP area as per NEHRP (National Earthquake Hazards Reduction Program). Based on the Vs(30) values, the major part of the BMP area can be classified as "site class D" and "site class C". A smaller portion of the study area, in and around Lalbagh Park, is classified as "site class B". Further, probabilistic seismic hazard analysis has been carried out to map the seismic hazard in terms of spectral acceleration (Sa) at rock and at ground level, considering the site classes and the six seismogenic sources identified. The mean annual rate of exceedance and the cumulative probability hazard curve for Sa have been generated. The quantified hazard values in terms of spectral acceleration for short and long periods are mapped for rock and for site classes C and D, with 10% probability of exceedance in 50 years, on a grid size of 0.5 km. In addition, the Uniform Hazard Response Spectrum (UHRS) at surface level has been developed for 5% damping and 10% probability of exceedance in 50 years for rock and for site classes C and D. These spectral accelerations and uniform hazard spectra can be used to assess the design force for important structures and to develop the design spectrum.
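The Vs(30)-to-site-class step described above reduces to a simple threshold lookup. A minimal sketch, using the standard NEHRP boundary velocities (roughly 180, 360, 760 and 1500 m/s); these boundaries are assumed from the NEHRP provisions, not taken from this paper:

```python
def nehrp_site_class(vs30):
    """Map Vs(30), the average shear wave velocity over the top 30 m (m/s),
    to a NEHRP site class. Boundary values are the standard NEHRP ones
    (an assumption; the paper does not list them)."""
    if vs30 > 1500.0:
        return "A"   # hard rock
    if vs30 > 760.0:
        return "B"   # rock
    if vs30 > 360.0:
        return "C"   # very dense soil / soft rock
    if vs30 > 180.0:
        return "D"   # stiff soil
    return "E"       # soft soil
```

Under these boundaries, the bulk of the BMP measurements would have fallen in the 180-760 m/s range corresponding to classes C and D.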

Relevance: 20.00%

Abstract:

The majority of Australian weeds are exotic plant species that were intentionally introduced for a variety of horticultural and agricultural purposes. A border weed risk assessment system (WRA) was implemented in 1997 in order to reduce the high economic costs and massive environmental damage associated with introducing serious weeds. We review the behaviour of this system with regard to eight years of data collected from the assessment of species proposed for importation or held within genetic resource centres in Australia. From a taxonomic perspective, species from the Chenopodiaceae and Poaceae were most likely to be rejected, and those from the Arecaceae and Flacourtiaceae were most likely to be accepted. Dendrogram analysis and classification and regression tree (TREE) models were also used to analyse the data. The latter revealed that a small subset of the 35 variables assessed was highly associated with the outcome of the original assessment. The TREE model examining all of the data contained just five variables: unintentional human dispersal, congeneric weed, weed elsewhere, tolerates or benefits from mutilation, cultivation or fire, and reproduction by vegetative propagation. It gave the same outcome as the full WRA model for 71% of species. Weed elsewhere was not the first splitting variable in this model, indicating that the WRA has a capacity for capturing species that have no history of weediness. A reduced TREE model (in which human-mediated variables had been removed) contained four variables: broad climate suitability, reproduction in less than or equal to 1 year, self-fertilisation, and tolerates or benefits from mutilation, cultivation or fire. It yielded the same outcome as the full WRA model for 65% of species. Data inconsistencies and the relative importance of questions are discussed, and some recommendations are made for improving the use of the system.
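A classification tree of the kind fitted here can be sketched with scikit-learn. The binary predictors and outcomes below are invented stand-ins for three of the WRA questions, not the study's data:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Hypothetical binary answers to WRA-style questions (synthetic data):
# columns: [weed elsewhere, congeneric weed, vegetative propagation]
X = np.array([[1, 1, 1],
              [1, 0, 1],
              [1, 1, 0],
              [0, 1, 0],
              [0, 0, 0],
              [0, 0, 1]])
y = np.array([1, 1, 1, 0, 0, 0])  # 1 = reject, 0 = accept

# A small CART-style tree, analogous in spirit to the TREE models used
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)
```

The fitted tree's first split reveals which question carries most information, which is how the study could observe that "weed elsewhere" was not the top splitting variable.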

Relevance: 20.00%

Abstract:

To facilitate marketing and export, the Australian macadamia industry requires accurate crop forecasts. Each year, two levels of crop prediction are produced for this industry. The first is an overall longer-term forecast based on tree census data from growers in the Australian Macadamia Society (AMS). This data set currently accounts for around 70% of total production, and is supplemented by our best estimates for non-AMS orchards. Given these total tree numbers, average yields per tree are needed to complete the long-term forecasts. Yields from regional variety trials were initially used, but were found to be consistently higher than the average yields growers were obtaining. Hence, a statistical model was developed using growers' historical yields, also taken from the AMS database. This model accounted for the effects of tree age, variety, year, region and tree spacing, and explained 65% of the total variation in the yield-per-tree data. The second level of crop prediction is an annual climate adjustment of these overall long-term estimates, taking into account the expected effects on production of the previous year's climate. This adjustment is based on relative historical yields, measured as the percentage deviance between expected and actual production. The dominant climatic variables are observed temperature, evaporation, solar radiation and modelled water stress. Initially, a number of alternative statistical models showed good agreement within the historical data, with jack-knife cross-validation R² values of 96% or better. However, forecasts varied quite widely between these alternative models. Exploratory multivariate analyses and nearest-neighbour methods were used to investigate these differences. For 2001-2003, the overall forecasts were in the right direction (when compared with the long-term expected values), but were over-estimates. In 2004 the forecast was well under the observed production, and in 2005 the revised models produced a forecast within 5.1% of the actual production. Over the first five years of forecasting, the absolute deviance for the climate-adjustment models averaged 10.1%, just outside the targeted objective of 10%.
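The climate adjustment is scored on "percentage deviance between expected and actual production". The abstract does not give the formula, so the sketch below assumes the usual definition of a signed relative deviation, and of its absolute average over years (the reported 10.1% figure):

```python
def percentage_deviance(expected, actual):
    """Signed percentage deviance of actual production from the
    long-term expected production (assumed definition)."""
    return 100.0 * (actual - expected) / expected

def mean_absolute_deviance(pairs):
    """Average absolute deviance over (expected, actual) pairs of years,
    the style of summary behind the reported five-year 10.1% average."""
    return sum(abs(percentage_deviance(e, a)) for e, a in pairs) / len(pairs)
```

For example, a year with actual production 5% above the long-term expectation contributes a deviance of +5.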

Relevance: 20.00%

Abstract:

The recently introduced generalized pencil of Sudarshan, which gives an exact ray picture of wave optics, is analysed in some situations of interest to wave optics. A relationship between ray dispersion and statistical inhomogeneity of the field is obtained. A paraxial approximation which preserves the rectilinear propagation character of the generalized pencils is presented. Under this approximation the pencils can be computed directly from the field conditions on a plane, without the need to compute the cross-spectral density function in the entire space as an intermediate quantity. The paraxial results are illustrated with examples. The pencils are shown to exhibit an interesting scaling behaviour in the far zone. This scaling leads to a natural generalization of the Fraunhofer range criterion and of the classical van Cittert-Zernike theorem to planar sources of arbitrary state of coherence. The recently derived results of radiometry with partially coherent sources are shown to be simple consequences of this scaling.

Relevance: 20.00%

Abstract:

Being able to accurately predict the risk of falling is crucial in patients with Parkinson's disease (PD), given the unfavourable effects of falls, which can lower quality of life as well as directly impact survival. Three methods considered for predicting falls are decision trees (DT), Bayesian networks (BN), and support vector machines (SVM). Data from a 1-year prospective study conducted at IHBI, Australia, on 51 people with PD are used. Data processing was conducted using the rpart and e1071 packages in R for DT and SVM, respectively, and Bayes Server 5.5 for the BN. The results show that BN and SVM produce consistently higher accuracy over the 12-month evaluation time points (average sensitivity and specificity > 92%) than DT (average sensitivity 88%, average specificity 72%). DT is sensitive to imbalanced data and so needs adjustment for the misclassification cost. However, DT provides a straightforward, interpretable result and is thus appealing for helping to identify important items related to falls and to generate fallers' profiles.
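The sensitivity and specificity figures quoted above come from a binary confusion matrix. A minimal sketch (in Python rather than the R packages the study used), with 1 denoting a faller:

```python
def sensitivity_specificity(y_true, y_pred):
    """Sensitivity = fraction of fallers correctly flagged;
    specificity = fraction of non-fallers correctly cleared.
    Labels: 1 = faller, 0 = non-faller."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    return tp / (tp + fn), tn / (tn + fp)
```

Reporting both numbers matters precisely because of the class imbalance noted for the DT: a classifier can score high sensitivity while clearing few non-fallers, which shows up as low specificity.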

Relevance: 20.00%

Abstract:

Post-traumatic stress disorder (PTSD) is a debilitating psychiatric disorder that has a major impact on the ability to function effectively in daily life. PTSD may develop as a response to exposure to an event or events perceived as potentially harmful or life-threatening. It has high prevalence rates in the community, especially among vulnerable groups such as military personnel or those in emergency services. Despite extensive research in this field, the underlying mechanisms of the disorder remain largely unknown. The identification of risk factors for PTSD has posed a particular challenge, as there can be delays in the onset of the disorder, and most people who are exposed to traumatic events will not meet diagnostic criteria for PTSD. With the advent of the Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition (DSM-5), the classification of PTSD has changed from an anxiety disorder to the category of trauma- and stressor-related disorders. This has the potential to refocus PTSD research on the nature of stress and the stress-response relationship. This review focuses on some of the important findings from psychological and biological research based on early models of stress and resilience. Improving our understanding of PTSD by investigating both genetic and psychological risk and coping factors that influence the stress response, as well as their interaction, may provide a basis for more effective and earlier intervention.

Relevance: 20.00%

Abstract:

Objective: Death certificates provide an invaluable source for cancer mortality statistics; however, this value can only be realised if accurate, quantitative data can be extracted from certificates, an aim hampered by both the volume and the variable nature of certificates written in natural language. This paper proposes an automatic classification system for identifying cancer-related causes of death from death certificates. Methods: Detailed features, including terms, n-grams and SNOMED CT concepts, were extracted from a collection of 447,336 death certificates. These features were used to train Support Vector Machine classifiers (one classifier for each cancer type). The classifiers were deployed in a cascaded architecture: the first level identified the presence of cancer (i.e., binary cancer/no-cancer) and the second level identified the type of cancer (according to the ICD-10 classification system). A held-out test set was used to evaluate the effectiveness of the classifiers according to precision, recall and F-measure. In addition, detailed feature analysis was performed to reveal the characteristics of a successful cancer classification model. Results: The system was highly effective at identifying cancer as the underlying cause of death (F-measure 0.94). The system was also effective at determining the type of cancer for common cancers (F-measure 0.7). Rare cancers, for which there was little training data, were difficult to classify accurately (F-measure 0.12). Factors influencing performance were the amount of training data and certain ambiguous cancers (e.g., those in the stomach region). The feature analysis revealed that a combination of features was important for cancer type classification, with SNOMED CT concept and oncology-specific morphology features proving the most valuable. Conclusion: The system proposed in this study provides automatic identification and characterisation of cancers from large collections of free-text death certificates. This allows organisations such as Cancer Registries to monitor and report on cancer mortality in a timely and accurate manner. In addition, the methods and findings are generally applicable beyond cancer classification and to other sources of medical text besides death certificates.
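The cascaded architecture can be sketched with scikit-learn. The toy certificates and labels below are invented for illustration, and the study's features (terms, n-grams, SNOMED CT concepts) are replaced by plain TF-IDF over the raw text:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC

# Toy free-text causes of death (hypothetical; the study used 447,336
# real certificates and richer feature sets).
texts = ["metastatic lung carcinoma", "carcinoma of the stomach",
         "ischaemic heart disease", "lung cancer with pneumonia",
         "myocardial infarction", "gastric carcinoma"]
is_cancer = [1, 1, 0, 1, 0, 1]
cancer_type = ["lung", "stomach", None, "lung", None, "stomach"]

vec = TfidfVectorizer()
X = vec.fit_transform(texts)

# Level 1: binary cancer / no-cancer over all certificates.
level1 = LinearSVC().fit(X, is_cancer)

# Level 2: cancer type, trained only on the cancer-positive rows.
rows = [i for i, c in enumerate(is_cancer) if c == 1]
level2 = LinearSVC().fit(X[rows], [cancer_type[i] for i in rows])

def classify(text):
    x = vec.transform([text])
    if level1.predict(x)[0] == 0:
        return None               # level 1 says: not cancer-related
    return level2.predict(x)[0]   # level 2: type label (toy stand-in for ICD-10)
```

Gating the type classifier behind the binary classifier means the rarer, harder per-type models only ever see certificates already judged cancer-related.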

Relevance: 20.00%

Abstract:

The use of near infrared (NIR) hyperspectral imaging and hyperspectral image analysis for distinguishing between hard, intermediate and soft maize kernels from inbred lines was evaluated. NIR hyperspectral images of two sets (12 and 24 kernels) of whole maize kernels were acquired using a Spectral Dimensions MatrixNIR camera with a spectral range of 960-1662 nm and a sisuChema SWIR (short wave infrared) hyperspectral pushbroom imaging system with a spectral range of 1000-2498 nm. Exploratory principal component analysis (PCA) was used on absorbance images to remove background, bad pixels and shading. On the cleaned images, PCA could be used effectively to find histological classes, including glassy (hard) and floury (soft) endosperm. PCA illustrated a distinct difference between glassy and floury endosperm, with two distinguishable clusters along principal component (PC) three on the MatrixNIR and PC two on the sisuChema. Subsequently, partial least squares discriminant analysis (PLS-DA) was applied to build a classification model. The PLS-DA model from the MatrixNIR image (12 kernels) resulted in a root mean square error of prediction (RMSEP) value of 0.18. This was repeated on the MatrixNIR image of the 24 kernels, which also resulted in an RMSEP of 0.18. The sisuChema image yielded an RMSEP value of 0.29. The reproducible results obtained with the different data sets indicate that the method proposed in this paper has real potential for future classification uses.
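RMSEP, the figure of merit quoted for the PLS-DA models (0.18 and 0.29), has a standard definition; a minimal sketch under that assumption:

```python
import math

def rmsep(y_true, y_pred):
    """Root mean square error of prediction: the square root of the
    mean squared difference between predicted and reference values
    on an independent prediction set."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)
```

In a PLS-DA setting the reference values are the coded class memberships (e.g. 0/1), so an RMSEP of 0.18 indicates predictions falling close to the correct class codes.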

Relevance: 20.00%

Abstract:

In this presentation, I reflect upon the global landscape surrounding the governance and classification of media content, at a time of rapid change in media platforms and services for content production and distribution, and of contested cultural and social norms. I discuss the tensions and contradictions arising in the relationship between national, regional and global dimensions of media content distribution, as well as the changing relationships between state and non-state actors. These tensions are explored through consideration of recent debates over film censorship; the review of the National Classification Scheme conducted by the Australian Law Reform Commission; online controversies such as the future of the Reddit social media site; and videos posted online by the militant group ISIS.

Relevance: 20.00%

Abstract:

Background: The purpose of this presentation is to outline the relevance of categorising load regime data for assessing the functional output and usage of the prosthesis of lower limb amputees. The objectives are:
• to highlight the need for categorisation of activities of daily living;
• to present a categorisation of the load regime applied on the residuum;
• to present some descriptors of the four types of activity that could be detected;
• to provide an example of the results for one case.
Methods: The load applied on the osseointegrated fixation of one transfemoral amputee was recorded using a portable kinetic system for 5 hours. The load applied on the residuum was divided into four types of activity, corresponding to inactivity, stationary loading, localized locomotion and directional locomotion, as detailed in previous publications.
Results: The periods of directional locomotion, localized locomotion and stationary loading occupied 44%, 34% and 22% of the recording time, and accounted for 51%, 38% and 12% of the duration of the periods of activity, respectively. The absolute maximum force during directional locomotion, localized locomotion and stationary loading was 19%, 15% and 8% of body weight on the anteroposterior axis, 20%, 19% and 12% on the mediolateral axis, and 121%, 106% and 99% on the long axis. A total of 2,783 gait cycles were recorded.
Discussion: Approximately 10% more gait cycles and 50% more of the total impulse were identified than with conventional analyses. The proposed categorisation and apparatus have the potential to complement conventional instruments, particularly for difficult cases.
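The two kinds of time-share figures reported (share of recording time and share of active time) can be derived from per-category durations. A sketch, assuming durations per activity type are available and treating every category except inactivity as active:

```python
def activity_shares(durations):
    """durations: activity name -> recorded duration (seconds).
    Returns, per activity, a pair: (percentage of total recording time,
    percentage of active time). Everything except 'inactivity' counts
    as active (an assumption matching the four-category scheme)."""
    total = sum(durations.values())
    active = sum(v for k, v in durations.items() if k != "inactivity")
    return {k: (100.0 * v / total,
                100.0 * v / active if k != "inactivity" else 0.0)
            for k, v in durations.items()}
```

With hypothetical durations, an activity taking a quarter of the recording but half of the active time yields the pair (25.0, 50.0), mirroring the two percentage series quoted in the Results.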

Relevance: 20.00%

Abstract:

A project to allow the resource assessment of tidal wetland vegetation of western Cape York Peninsula, in north Queensland, was undertaken as part of the long-term assessment of the coastal fisheries resources of Queensland. The project incorporated a littoral invertebrate fauna component. Extending from May 1993 to December 1994, fieldwork was undertaken in May 1993, November 1993 and April 1994. The aims of this project were to:
• obtain baseline information on the distribution of marine plants of western Cape York Peninsula;
• commence a preliminary assessment of the littoral invertebrate fauna and their habitat requirements, with a view to extending knowledge of their biogeographic affinities;
• perform a biogeographic classification of the tidal wetlands at meso and local scales for marine conservation planning;
• evaluate the conservation values of the areas investigated from the viewpoint of fisheries productivity and as habitat for important or threatened species.
Dataset URL Link: Queensland Coastal Wetlands Resources Mapping data. [Dataset]

Relevance: 20.00%

Abstract:

Convex potential minimisation is the de facto approach to binary classification. However, Long and Servedio [2008] proved that under symmetric label noise (SLN), minimisation of any convex potential over a linear function class can result in classification performance equivalent to random guessing. This ostensibly shows that convex losses are not SLN-robust. In this paper, we propose a convex, classification-calibrated loss and prove that it is SLN-robust. The loss avoids the Long and Servedio [2008] result by virtue of being negatively unbounded. The loss is a modification of the hinge loss, where one does not clamp at zero; hence, we call it the unhinged loss. We show that the optimal unhinged solution is equivalent to that of a strongly regularised SVM, and is the limiting solution for any convex potential; this implies that strong ℓ2 regularisation makes most standard learners SLN-robust. Experiments confirm the unhinged loss's SLN-robustness.
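The abstract describes the unhinged loss as the hinge loss with the clamp at zero removed. A sketch on the margin z = y·f(x) (the margin convention is an assumption; the two losses agree wherever the hinge is active, i.e. for z < 1):

```python
def hinge(z):
    """Hinge loss on the margin z = y * f(x), clamped at zero."""
    return max(0.0, 1.0 - z)

def unhinged(z):
    """Unhinged loss: the hinge loss without the clamp at zero,
    hence linear in z and negatively unbounded for large margins."""
    return 1.0 - z
```

The missing clamp is exactly what makes the loss negatively unbounded, the property the paper uses to escape the Long and Servedio [2008] impossibility result.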

Relevance: 20.00%

Abstract:

The Indo-West Pacific (IWP), from South Africa in the western Indian Ocean to the western Pacific Ocean, contains some of the most biologically diverse marine habitats on earth, including the greatest biodiversity of chondrichthyan fishes. The region encompasses various densities of human habitation leading to contrasts in the levels of exploitation experienced by chondrichthyans, which are targeted for local consumption and export. The demersal chondrichthyan, the zebra shark, Stegostoma fasciatum, is endemic to the IWP and has two current regional International Union for the Conservation of Nature (IUCN) Red List classifications that reflect differing levels of exploitation: ‘Least Concern’ and ‘Vulnerable’. In this study, we employed mitochondrial ND4 sequence data and 13 microsatellite loci to investigate the population genetic structure of 180 zebra sharks from 13 locations throughout the IWP to test the concordance of IUCN zones with demographic units that have conservation value. Mitochondrial and microsatellite data sets from samples collected throughout northern Australia and Southeast Asia concord with the regional IUCN classifications. However, we found evidence of genetic subdivision within these regions, including subdivision between locations connected by habitat suitable for migration. Furthermore, parametric FST analyses and Bayesian clustering analyses indicated that the primary genetic break within the IWP is not represented by the IUCN classifications but rather is congruent with the Indonesian throughflow current. Our findings indicate that recruitment to areas of high exploitation from nearby healthy populations in zebra sharks is likely to be minimal, and that severe localized depletions are predicted to occur in zebra shark populations throughout the IWP region.

Relevance: 20.00%

Abstract:

Genotype-environment interactions (GEI) limit genetic gain for complex traits such as tolerance to drought. Characterization of the crop environment is an important step in understanding GEI. A modelling approach is proposed here to characterize drought-related environmental stresses both broadly (large geographic area, long-term period) and locally (field experiment), enabling breeders to analyse their experimental trials with regard to the broad population of environments that they target. Water-deficit patterns experienced by wheat crops were determined for drought-prone north-eastern Australia, using the APSIM crop model to account for the interactions of crops with their environment (e.g. feedback of plant growth on water depletion). Simulations based on more than 100 years of historical climate data were conducted for representative locations, soils and management systems, for a check cultivar, Hartog. The three main environment types identified differed in their patterns of simulated water stress around flowering and during grain-filling. Over the entire region, the terminal drought-stress pattern was most common (50% of production environments), followed by a flowering stress (24%), although the frequencies of occurrence of the three types varied greatly across regions, years and management. This environment classification was applied to 16 trials relevant to late-stage testing in a breeding programme. The incorporation of the independently determined environment types in a statistical analysis assisted interpretation of the GEI for yield among the 18 representative genotypes by reducing the relative effect of GEI compared with genotypic variance, and helped to identify opportunities to improve breeding and germplasm-testing strategies for this region.

Relevance: 20.00%

Abstract:

Microarrays have a wide range of applications in the biomedical field. From the beginning, arrays have mostly been utilized in cancer research, including classification of tumors into different subgroups and identification of clinical associations. In the microarray format, a collection of small features, such as different oligonucleotides, is attached to a solid support. The advantage of microarray technology is the ability to simultaneously measure changes in the levels of multiple biomolecules. Because many diseases, including cancer, are complex, involving an interplay between various genes and environmental factors, the detection of only a single marker molecule is usually insufficient for determining disease status. Thus, a technique that simultaneously collects information on multiple molecules allows better insights into a complex disease. Since microarrays can be custom-manufactured or obtained from a number of commercial providers, understanding data quality and comparability between different platforms is important to enable the use of the technology in areas beyond basic research. When standardized, integrated array data could ultimately help to offer a complete profile of the disease, illuminating mechanisms and genes behind disorders as well as facilitating disease diagnostics. In the first part of this work, we aimed to elucidate the comparability of gene expression measurements from different oligonucleotide and cDNA microarray platforms. We compared three different gene expression microarrays: a commercial oligonucleotide microarray, and commercial and custom-made cDNA microarrays. The filtered gene expression data from the commercial platforms correlated better across experiments (r=0.78-0.86) than the expression data between the custom-made and either of the two commercial platforms (r=0.62-0.76). Although the results from the different platforms correlated reasonably well, combining and comparing the measurements was not straightforward. The clone errors on the custom-made array, and annotation and technical differences between the platforms, introduced variability in the data. In conclusion, the different gene expression microarray platforms provided results sufficiently concordant for the research setting, but the variability represents a challenge for developing diagnostic applications for the microarrays. In the second part of the work, we performed an integrated high-resolution microarray analysis of gene copy number and expression in 38 laryngeal and oral tongue squamous cell carcinoma cell lines and primary tumors. Our aim was to pinpoint genes whose expression was impacted by changes in copy number. The data revealed that amplifications in particular had a clear impact on gene expression. Across the genome, 14-32% of genes in the highly amplified regions (copy number ratio >2.5) showed associated overexpression. The impact of decreased copy number on gene underexpression was less clear. Using statistical analysis across the samples, we systematically identified hundreds of genes for which an increased copy number was associated with increased expression. For example, our data implied that FADD and PPFIA1 were frequently overexpressed at the 11q13 amplicon in HNSCC. The 11q13 amplicon, including known oncogenes such as CCND1 and CTTN, is well characterized in different types of cancer, but the roles of FADD and PPFIA1 remain obscure. Taken together, the integrated microarray analysis revealed a number of known as well as novel target genes in altered regions in HNSCC. The identified genes provide a basis for functional validation and may eventually lead to the identification of novel candidates for targeted therapy in HNSCC.