188 resultados para Statistical inference
Resumo:
Concerns regarding groundwater contamination with nitrate and the long-term sustainability of groundwater resources have prompted the development of a multi-layered three dimensional (3D) geological model to characterise the aquifer geometry of the Wairau Plain, Marlborough District, New Zealand. The 3D geological model which consists of eight litho-stratigraphic units has been subsequently used to synthesise hydrogeological and hydrogeochemical data for different aquifers in an approach that aims to demonstrate how integration of water chemistry data within the physical framework of a 3D geological model can help to better understand and conceptualise groundwater systems in complex geological settings. Multivariate statistical techniques(e.g. Principal Component Analysis and Hierarchical Cluster Analysis) were applied to groundwater chemistry data to identify hydrochemical facies which are characteristic of distinct evolutionary pathways and a common hydrologic history of groundwaters. Principal Component Analysis on hydrochemical data demonstrated that natural water-rock interactions, redox potential and human agricultural impact are the key controls of groundwater quality in the Wairau Plain. Hierarchical Cluster Analysis revealed distinct hydrochemical water quality groups in the Wairau Plain groundwater system. Visualisation of the results of the multivariate statistical analyses and distribution of groundwater nitrate concentrations in the context of aquifer lithology highlighted the link between groundwater chemistry and the lithology of host aquifers. The methodology followed in this study can be applied in a variety of hydrogeological settings to synthesise geological, hydrogeological and hydrochemical data and present them in a format readily understood by a wide range of stakeholders. This enables a more efficient communication of the results of scientific studies to the wider community.
Resumo:
A wireless sensor network system must have the ability to tolerate harsh environmental conditions and reduce communication failures. In a typical outdoor situation, the presence of wind can introduce movement in the foliage. This motion of vegetation structures causes large and rapid signal fading in the communication link and must be accounted for when deploying a wireless sensor network system in such conditions. This thesis examines the fading characteristics experienced by wireless sensor nodes due to the effect of varying wind speed in a foliage obstructed transmission path. It presents extensive measurement campaigns at two locations with the approach of a typical wireless sensor networks configuration. The significance of this research lies in the varied approaches of its different experiments, involving a variety of vegetation types, scenarios and the use of different polarisations (vertical and horizontal). Non–line of sight (NLoS) scenario conditions investigate the wind effect based on different vegetation densities including that of the Acacia tree, Dogbane tree and tall grass. Whereas the line of sight (LoS) scenario investigates the effect of wind when the grass is swaying and affecting the ground-reflected component of the signal. Vegetation type and scenarios are envisaged to simulate real life working conditions of wireless sensor network systems in outdoor foliated environments. The results from the measurements are presented in statistical models involving first and second order statistics. We found that in most of the cases, the fading amplitude could be approximated by both Lognormal and Nakagami distribution, whose m parameter was found to depend on received power fluctuations. Lognormal distribution is known as the result of slow fading characteristics due to shadowing. This study concludes that fading caused by variations in received power due to wind in wireless sensor networks systems are found to be insignificant. There is no notable difference in Nakagami m values for low, calm, and windy wind speed categories. It is also shown in the second order analysis, the duration of the deep fades are very short, 0.1 second for 10 dB attenuation below RMS level for vertical polarization and 0.01 second for 10 dB attenuation below RMS level for horizontal polarization. Another key finding is that the received signal strength for horizontal polarisation demonstrates more than 3 dB better performances than the vertical polarisation for LoS and near LoS (thin vegetation) conditions and up to 10 dB better for denser vegetation conditions.
Resumo:
In this paper, we apply a simulation based approach for estimating transmission rates of nosocomial pathogens. In particular, the objective is to infer the transmission rate between colonised health-care practitioners and uncolonised patients (and vice versa) solely from routinely collected incidence data. The method, using approximate Bayesian computation, is substantially less computer intensive and easier to implement than likelihood-based approaches we refer to here. We find through replacing the likelihood with a comparison of an efficient summary statistic between observed and simulated data that little is lost in the precision of estimated transmission rates. Furthermore, we investigate the impact of incorporating uncertainty in previously fixed parameters on the precision of the estimated transmission rates.
Resumo:
Background This paper presents a novel approach to searching electronic medical records that is based on concept matching rather than keyword matching. Aim The concept-based approach is intended to overcome specific challenges we identified in searching medical records. Method Queries and documents were transformed from their term-based originals into medical concepts as defined by the SNOMED-CT ontology. Results Evaluation on a real-world collection of medical records showed our concept-based approach outperformed a keyword baseline by 25% in Mean Average Precision. Conclusion The concept-based approach provides a framework for further development of inference based search systems for dealing with medical data.
Resumo:
Glacial cycles during the Pleistocene reduced sea levels and created new land connections in northern Australia, where many currently isolated rivers also became connected via an extensive paleo-lake system, 'Lake Carpentaria'. However, the most recent period during which populations of freshwater species were connected by gene flow across Lake Carpentaria is debated: various 'Lake Carpentaria hypotheses' have been proposed. Here, we used a statistical phylogeographic approach to assess the timing of past population connectivity across the Carpentaria region in the obligate freshwater fish, Glossamia aprion. Results for this species indicate that the most recent period of genetic exchange across the Carpentaria region coincided with the mid- to late Pleistocene, a result shown previously for other freshwater and diadromous species. Based on these findings and published studies for various freshwater, diadromous and marine species, we propose a set of 'Lake Carpentaria' hypotheses to explain past population connectivity in aquatic species: (1) strictly freshwater species had widespread gene flow in the mid- to late Pleistocene before the last glacial maximum; (2) marine species were subdivided into eastern and western populations by land during Pleistocene glacial phases; and (3) past connectivity in diadromous species reflects the relative strength of their marine affinity.
Resumo:
With increasing rate of shipping traffic, the risk of collisions in busy and congested port waters is likely to rise. However, due to low collision frequencies in port waters, it is difficult to analyze such risk in a sound statistical manner. A convenient approach of investigating navigational collision risk is the application of the traffic conflict techniques, which have potential to overcome the difficulty of obtaining statistical soundness. This study aims at examining port water conflicts in order to understand the characteristics of collision risk with regard to vessels involved, conflict locations, traffic and kinematic conditions. A hierarchical binomial logit model, which considers the potential correlations between observation-units, i.e., vessels, involved in the same conflicts, is employed to evaluate the association of explanatory variables with conflict severity levels. Results show higher likelihood of serious conflicts for vessels of small gross tonnage or small overall length. The probability of serious conflict also increases at locations where vessels have more varied headings, such as traffic intersections and anchorages; becoming more critical at night time. Findings from this research should assist both navigators operating in port waters as well as port authorities overseeing navigational management.
Resumo:
The discovery of protein variation is an important strategy in disease diagnosis within the biological sciences. The current benchmark for elucidating information from multiple biological variables is the so called “omics” disciplines of the biological sciences. Such variability is uncovered by implementation of multivariable data mining techniques which come under two primary categories, machine learning strategies and statistical based approaches. Typically proteomic studies can produce hundreds or thousands of variables, p, per observation, n, depending on the analytical platform or method employed to generate the data. Many classification methods are limited by an n≪p constraint, and as such, require pre-treatment to reduce the dimensionality prior to classification. Recently machine learning techniques have gained popularity in the field for their ability to successfully classify unknown samples. One limitation of such methods is the lack of a functional model allowing meaningful interpretation of results in terms of the features used for classification. This is a problem that might be solved using a statistical model-based approach where not only is the importance of the individual protein explicit, they are combined into a readily interpretable classification rule without relying on a black box approach. Here we incorporate statistical dimension reduction techniques Partial Least Squares (PLS) and Principal Components Analysis (PCA) followed by both statistical and machine learning classification methods, and compared them to a popular machine learning technique, Support Vector Machines (SVM). Both PLS and SVM demonstrate strong utility for proteomic classification problems.
Resumo:
Effective, statistically robust sampling and surveillance strategies form an integral component of large agricultural industries such as the grains industry. Intensive in-storage sampling is essential for pest detection, Integrated Pest Management (IPM), to determine grain quality and to satisfy importing nation’s biosecurity concerns, while surveillance over broad geographic regions ensures that biosecurity risks can be excluded, monitored, eradicated or contained within an area. In the grains industry, a number of qualitative and quantitative methodologies for surveillance and in-storage sampling have been considered. Primarily, research has focussed on developing statistical methodologies for in storage sampling strategies concentrating on detection of pest insects within a grain bulk, however, the need for effective and statistically defensible surveillance strategies has also been recognised. Interestingly, although surveillance and in storage sampling have typically been considered independently, many techniques and concepts are common between the two fields of research. This review aims to consider the development of statistically based in storage sampling and surveillance strategies and to identify methods that may be useful for both surveillance and in storage sampling. We discuss the utility of new quantitative and qualitative approaches, such as Bayesian statistics, fault trees and more traditional probabilistic methods and show how these methods may be used in both surveillance and in storage sampling systems.
Resumo:
Quality oriented management systems and methods have become the dominant business and governance paradigm. From this perspective, satisfying customers’ expectations by supplying reliable, good quality products and services is the key factor for an organization and even government. During recent decades, Statistical Quality Control (SQC) methods have been developed as the technical core of quality management and continuous improvement philosophy and now are being applied widely to improve the quality of products and services in industrial and business sectors. Recently SQC tools, in particular quality control charts, have been used in healthcare surveillance. In some cases, these tools have been modified and developed to better suit the health sector characteristics and needs. It seems that some of the work in the healthcare area has evolved independently of the development of industrial statistical process control methods. Therefore analysing and comparing paradigms and the characteristics of quality control charts and techniques across the different sectors presents some opportunities for transferring knowledge and future development in each sectors. Meanwhile considering capabilities of Bayesian approach particularly Bayesian hierarchical models and computational techniques in which all uncertainty are expressed as a structure of probability, facilitates decision making and cost-effectiveness analyses. Therefore, this research investigates the use of quality improvement cycle in a health vii setting using clinical data from a hospital. The need of clinical data for monitoring purposes is investigated in two aspects. A framework and appropriate tools from the industrial context are proposed and applied to evaluate and improve data quality in available datasets and data flow; then a data capturing algorithm using Bayesian decision making methods is developed to determine economical sample size for statistical analyses within the quality improvement cycle. Following ensuring clinical data quality, some characteristics of control charts in the health context including the necessity of monitoring attribute data and correlated quality characteristics are considered. To this end, multivariate control charts from an industrial context are adapted to monitor radiation delivered to patients undergoing diagnostic coronary angiogram and various risk-adjusted control charts are constructed and investigated in monitoring binary outcomes of clinical interventions as well as postintervention survival time. Meanwhile, adoption of a Bayesian approach is proposed as a new framework in estimation of change point following control chart’s signal. This estimate aims to facilitate root causes efforts in quality improvement cycle since it cuts the search for the potential causes of detected changes to a tighter time-frame prior to the signal. This approach enables us to obtain highly informative estimates for change point parameters since probability distribution based results are obtained. Using Bayesian hierarchical models and Markov chain Monte Carlo computational methods, Bayesian estimators of the time and the magnitude of various change scenarios including step change, linear trend and multiple change in a Poisson process are developed and investigated. The benefits of change point investigation is revisited and promoted in monitoring hospital outcomes where the developed Bayesian estimator reports the true time of the shifts, compared to priori known causes, detected by control charts in monitoring rate of excess usage of blood products and major adverse events during and after cardiac surgery in a local hospital. The development of the Bayesian change point estimators are then followed in a healthcare surveillances for processes in which pre-intervention characteristics of patients are viii affecting the outcomes. In this setting, at first, the Bayesian estimator is extended to capture the patient mix, covariates, through risk models underlying risk-adjusted control charts. Variations of the estimator are developed to estimate the true time of step changes and linear trends in odds ratio of intensive care unit outcomes in a local hospital. Secondly, the Bayesian estimator is extended to identify the time of a shift in mean survival time after a clinical intervention which is being monitored by riskadjusted survival time control charts. In this context, the survival time after a clinical intervention is also affected by patient mix and the survival function is constructed using survival prediction model. The simulation study undertaken in each research component and obtained results highly recommend the developed Bayesian estimators as a strong alternative in change point estimation within quality improvement cycle in healthcare surveillances as well as industrial and business contexts. The superiority of the proposed Bayesian framework and estimators are enhanced when probability quantification, flexibility and generalizability of the developed model are also considered. The empirical results and simulations indicate that the Bayesian estimators are a strong alternative in change point estimation within quality improvement cycle in healthcare surveillances. The superiority of the proposed Bayesian framework and estimators are enhanced when probability quantification, flexibility and generalizability of the developed model are also considered. The advantages of the Bayesian approach seen in general context of quality control may also be extended in the industrial and business domains where quality monitoring was initially developed.
Resumo:
The gross overrepresentation of Indigenous peoples in prison populations suggests that sentencing may be a discriminatory process. Using findings from recent (1991–2011) multivariate statistical sentencing analyses from the United States, Canada, and Australia, we review the 3 key hypotheses advanced as plausible explanations for baseline sentencing discrepancies between Indigenous and non-Indigenous adult criminal defendants: (a) differential involvement, (b) negative discrimination, and (c) positive discrimination. Overall, the prior research shows strong support for the differential involvement thesis and some support for the discrimination theses (positive and negative). We argue that where discrimination is found, it may be explained by the lack of a more complete set of control variables in researchers’ multivariate models and/or differing political and social contexts.
Resumo:
Certain statistic and scientometric features of articles published in the journal “International Research in Geographical and Environmental Education” are examined in this paper, for the period 1992-2009, by applying nonparametric statistics and Shannon’s entropy (diversity) formula. The main findings of this analysis are: a) after 2004 the research priorities of researchers in geographical and environmental education seem to have changed, b) “teacher education” has been the most recurrent theme throughout these 18 years, followed by “values & attitudes” and “inquiry & problem solving” c) the themes “GIS” and “Sustainability” were the most “stable” throughout the 18 years, meaning that they maintained their ranks as publication priorities more than other themes, d) citations of IRGEE increase annually, e) the average thematic diversity of articles published during the period 1992-2009 is 82.7% of the maximum thematic diversity (very high), meaning that the Journal has the capacity to attract a wide readership for the 10 themes it has successfully covered throughout the 18 years of its publication.
Resumo:
The Clarence-Moreton Basin (CMB) covers approximately 26000 km2 and is the only sub-basin of the Great Artesian Basin (GAB) in which there is flow to both the south-west and the east, although flow to the south-west is predominant. In many parts of the basin, including catchments of the Bremer, Logan and upper Condamine Rivers in southeast Queensland, the Walloon Coal Measures are under exploration for Coal Seam Gas (CSG). In order to assess spatial variations in groundwater flow and hydrochemistry at a basin-wide scale, a 3D hydrogeological model of the Queensland section of the CMB has been developed using GoCAD modelling software. Prior to any large-scale CSG extraction, it is essential to understand the existing hydrochemical character of the different aquifers and to establish any potential linkage. To effectively use the large amount of water chemistry data existing for assessment of hydrochemical evolution within the different lithostratigraphic units, multivariate statistical techniques were employed.
Resumo:
Cancer poses an undeniable burden to the health and wellbeing of the Australian community. In a recent report commissioned by the Australian Institute for Health and Welfare(AIHW, 2010), one in every two Australians on average will be diagnosed with cancer by the age of 85, making cancer the second leading cause of death in 2007, preceded only by cardiovascular disease. Despite modest decreases in standardised combined cancer mortality over the past few decades, in part due to increased funding and access to screening programs, cancer remains a significant economic burden. In 2010, all cancers accounted for an estimated 19% of the country's total burden of disease, equating to approximately $3:8 billion in direct health system costs (Cancer Council Australia, 2011). Furthermore, there remains established socio-economic and other demographic inequalities in cancer incidence and survival, for example, by indigenous status and rurality. Therefore, in the interests of the nation's health and economic management, there is an immediate need to devise data-driven strategies to not only understand the socio-economic drivers of cancer but also facilitate the implementation of cost-effective resource allocation for cancer management...