833 resultados para Data Warehouse
Resumo:
The wide range of contributing factors and circumstances surrounding crashes on road curves suggest that no single intervention can prevent these crashes. This paper presents a novel methodology, based on data mining techniques, to identify contributing factors and the relationship between them. It identifies contributing factors that influence the risk of a crash. Incident records, described using free text, from a large insurance company were analysed with rough set theory. Rough set theory was used to discover dependencies among data, and reasons using the vague, uncertain and imprecise information that characterised the insurance dataset. The results show that male drivers, who are between 50 and 59 years old, driving during evening peak hours are involved with a collision, had a lowest crash risk. Drivers between 25 and 29 years old, driving from around midnight to 6 am and in a new car has the highest risk. The analysis of the most significant contributing factors on curves suggests that drivers with driving experience of 25 to 42 years, who are driving a new vehicle have the highest crash cost risk, characterised by the vehicle running off the road and hitting a tree. This research complements existing statistically based tools approach to analyse road crashes. Our data mining approach is supported with proven theory and will allow road safety practitioners to effectively understand the dependencies between contributing factors and the crash type with the view to designing tailored countermeasures.
Resumo:
A data-driven background dataset refinement technique was recently proposed for SVM based speaker verification. This method selects a refined SVM background dataset from a set of candidate impostor examples after individually ranking examples by their relevance. This paper extends this technique to the refinement of the T-norm dataset for SVM-based speaker verification. The independent refinement of the background and T-norm datasets provides a means of investigating the sensitivity of SVM-based speaker verification performance to the selection of each of these datasets. Using refined datasets provided improvements of 13% in min. DCF and 9% in EER over the full set of impostor examples on the 2006 SRE corpus with the majority of these gains due to refinement of the T-norm dataset. Similar trends were observed for the unseen data of the NIST 2008 SRE.
Resumo:
1. Ecological data sets often use clustered measurements or use repeated sampling in a longitudinal design. Choosing the correct covariance structure is an important step in the analysis of such data, as the covariance describes the degree of similarity among the repeated observations. 2. Three methods for choosing the covariance are: the Akaike information criterion (AIC), the quasi-information criterion (QIC), and the deviance information criterion (DIC). We compared the methods using a simulation study and using a data set that explored effects of forest fragmentation on avian species richness over 15 years. 3. The overall success was 80.6% for the AIC, 29.4% for the QIC and 81.6% for the DIC. For the forest fragmentation study the AIC and DIC selected the unstructured covariance, whereas the QIC selected the simpler autoregressive covariance. Graphical diagnostics suggested that the unstructured covariance was probably correct. 4. We recommend using DIC for selecting the correct covariance structure.
Resumo:
This report provides an introduction to our analyses of secondary data with respect to violent acts and incidents relating to males living in rural settings in Australia. It clarifies important aspects of our overall approach primarily by concentrating on three elements that required early scoping and resolution. Firstly, a wide and inclusive view of violence which encompasses measures of violent acts and incidents and also data identifying risk taking behaviour and the consequences of violence is outlined and justified. Secondly, the classification used to make comparisons between the city and the bush together with associated caveats is outlined. The third element discussed is in relation to national injury data. Additional commentary resulting from exploration, examination and analyses of secondary data is published online in five subsequent reports in this series.
Resumo:
This report focuses on our examination of extant data which have been sourced with respect to self-harm and suicide in Australia. Moreover, specific areas of concern regarding elevated rates of suicide for rural males and data anomalies which emerged during our examination of these data are discussed. Additional commentary resulting from exploration, examination and analyses of secondary data is published online in complementary reports in this series.
Resumo:
This report focuses on our examination of extant data which have been sourced with respect to intentional violence perpetrated or experienced by males in regional and remote Australia. The nature of intentional violent acts can be physical, sexual or psychological or involve deprivation or neglect. We have presented under the headings of: self-harm including suicide; homicide; assault, sexual assault and the threat of assault; child abuse; other family and intimate partner violence; harassment, stalking and bullying; alcohol related social violence; and animal abuse. State variations in interpersonal violence are also presented. Additional commentary resulting from exploration, examination and analyses of secondary data is published online in complementary reports in this series.
Resumo:
This report focuses on our examination of extant data which have been sourced with respect to unintentional serious and violent injuries to males living in regional and remote Australia. Such injuries typically might be caused by, for example, transport accidents, occupational exposures and hazards, burns and so on. Thus unintentional violent incidents cause physical trauma the consequences of which can sometimes lead to chronic conditions including psychological harm or substance abuse. Additional commentary resulting from exploration, examination and analyses of secondary data is published online in complementary reports in this series.
Resumo:
This report focuses on our examination of extant data which have been sourced with respect to personally and socially risky behaviour associated with males living in regional and remote Australia . The AIHW (2008: PHE 97:89) defines personally risky behaviour, on the one hand, as working, swimming, boating, driving or operating hazardous machinery while intoxicated with alcohol or an illicit drug. Socially risky behaviour, on the other hand, is defined as creating a public disturbance, damaging property, stealing or verbally or physically abusing someone while intoxicated with alcohol or an illicit drug. Additional commentary resulting from exploration, examination and analyses of secondary data is published online in complementary reports in this series.
Resumo:
This report considers extant data which have been sourced with respect to some of the consequences of violent acts and incidents and risky behaviour for males living in regional and remote Australia . This has been collated and presented under the headings: juvenile offenders; long-term health consequences; anxiety and repression; and other chronic disabilities. Additional commentary resulting from exploration, examination and analyses of secondary data is published online in complementary reports in this series.
Resumo:
The ability to forecast machinery failure is vital to reducing maintenance costs, operation downtime and safety hazards. Recent advances in condition monitoring technologies have given rise to a number of prognostic models for forecasting machinery health based on condition data. Although these models have aided the advancement of the discipline, they have made only a limited contribution to developing an effective machinery health prognostic system. The literature review indicates that there is not yet a prognostic model that directly models and fully utilises suspended condition histories (which are very common in practice since organisations rarely allow their assets to run to failure); that effectively integrates population characteristics into prognostics for longer-range prediction in a probabilistic sense; which deduces the non-linear relationship between measured condition data and actual asset health; and which involves minimal assumptions and requirements. This work presents a novel approach to addressing the above-mentioned challenges. The proposed model consists of a feed-forward neural network, the training targets of which are asset survival probabilities estimated using a variation of the Kaplan-Meier estimator and a degradation-based failure probability density estimator. The adapted Kaplan-Meier estimator is able to model the actual survival status of individual failed units and estimate the survival probability of individual suspended units. The degradation-based failure probability density estimator, on the other hand, extracts population characteristics and computes conditional reliability from available condition histories instead of from reliability data. The estimated survival probability and the relevant condition histories are respectively presented as “training target” and “training input” to the neural network. The trained network is capable of estimating the future survival curve of a unit when a series of condition indices are inputted. Although the concept proposed may be applied to the prognosis of various machine components, rolling element bearings were chosen as the research object because rolling element bearing failure is one of the foremost causes of machinery breakdowns. Computer simulated and industry case study data were used to compare the prognostic performance of the proposed model and four control models, namely: two feed-forward neural networks with the same training function and structure as the proposed model, but neglected suspended histories; a time series prediction recurrent neural network; and a traditional Weibull distribution model. The results support the assertion that the proposed model performs better than the other four models and that it produces adaptive prediction outputs with useful representation of survival probabilities. This work presents a compelling concept for non-parametric data-driven prognosis, and for utilising available asset condition information more fully and accurately. It demonstrates that machinery health can indeed be forecasted. The proposed prognostic technique, together with ongoing advances in sensors and data-fusion techniques, and increasingly comprehensive databases of asset condition data, holds the promise for increased asset availability, maintenance cost effectiveness, operational safety and – ultimately – organisation competitiveness.
Resumo:
Established Monte Carlo user codes BEAMnrc and DOSXYZnrc permit the accurate and straightforward simulation of radiotherapy experiments and treatments delivered from multiple beam angles. However, when an electronic portal imaging detector (EPID) is included in these simulations, treatment delivery from non-zero beam angles becomes problematic. This study introduces CTCombine, a purpose-built code for rotating selected CT data volumes, converting CT numbers to mass densities, combining the results with model EPIDs and writing output in a form which can easily be read and used by the dose calculation code DOSXYZnrc. The geometric and dosimetric accuracy of CTCombine’s output has been assessed by simulating simple and complex treatments applied to a rotated planar phantom and a rotated humanoid phantom and comparing the resulting virtual EPID images with the images acquired using experimental measurements and independent simulations of equivalent phantoms. It is expected that CTCombine will be useful for Monte Carlo studies of EPID dosimetry as well as other EPID imaging applications.
Resumo:
Recent studies have shown that delusion-like experiences (DLEs) are common among general populations. This study investigates whether the prevalence of these experiences are linked to the embracing of New Age thought. Logistic regression analyses were performed using data derived from a large community sample of young adults (N = 3777). Belief in a spiritual or higher power other than God was found to be significantly associated with endorsement of 16 of 19 items from Peters et al. (1999b) Delusional Inventory following adjustment for a range of potential confounders, while belief in God was associated with endorsement of four items. A New Age conception of the divine appears to be strongly associated with a wide range of DLEs. Further research is needed to determine a causal link between New Age philosophy and DLEs (e.g. thought disturbance, suspiciousness, and delusions of grandeur).
Resumo:
This paper explores a method of comparative analysis and classification of data through perceived design affordances. Included is discussion about the musical potential of data forms that are derived through eco-structural analysis of musical features inherent in audio recordings of natural sounds. A system of classification of these forms is proposed based on their structural contours. The classifications include four primitive types; steady, iterative, unstable and impulse. The classification extends previous taxonomies used to describe the gestural morphology of sound. The methods presented are used to provide compositional support for eco-structuralism.
Resumo:
Definition of disease phenotype is a necessary preliminary to research into genetic causes of a complex disease. Clinical diagnosis of migraine is currently based on diagnostic criteria developed by the International Headache Society. Previously, we examined the natural clustering of these diagnostic symptoms using latent class analysis (LCA) and found that a four-class model was preferred. However, the classes can be ordered such that all symptoms progressively intensify, suggesting that a single continuous variable representing disease severity may provide a better model. Here, we compare two models: item response theory and LCA, each constructed within a Bayesian context. A deviance information criterion is used to assess model fit. We phenotyped our population sample using these models, estimated heritability and conducted genome-wide linkage analysis using Merlin-qtl. LCA with four classes was again preferred. After transformation, phenotypic trait values derived from both models are highly correlated (correlation = 0.99) and consequently results from subsequent genetic analyses were similar. Heritability was estimated at 0.37, while multipoint linkage analysis produced genome-wide significant linkage to chromosome 7q31-q33 and suggestive linkage to chromosomes 1 and 2. We argue that such continuous measures are a powerful tool for identifying genes contributing to migraine susceptibility.