999 resultados para Statistical investigations
Resumo:
This thesis investigates profiling and differentiating customers through the use of statistical data mining techniques. The business application of our work centres on examining individuals’ seldomly studied yet critical consumption behaviour over an extensive time period within the context of the wireless telecommunication industry; consumption behaviour (as oppose to purchasing behaviour) is behaviour that has been performed so frequently that it become habitual and involves minimal intentions or decision making. Key variables investigated are the activity initialised timestamp and cell tower location as well as the activity type and usage quantity (e.g., voice call with duration in seconds); and the research focuses are on customers’ spatial and temporal usage behaviour. The main methodological emphasis is on the development of clustering models based on Gaussian mixture models (GMMs) which are fitted with the use of the recently developed variational Bayesian (VB) method. VB is an efficient deterministic alternative to the popular but computationally demandingMarkov chainMonte Carlo (MCMC) methods. The standard VBGMMalgorithm is extended by allowing component splitting such that it is robust to initial parameter choices and can automatically and efficiently determine the number of components. The new algorithm we propose allows more effective modelling of individuals’ highly heterogeneous and spiky spatial usage behaviour, or more generally human mobility patterns; the term spiky describes data patterns with large areas of low probability mixed with small areas of high probability. Customers are then characterised and segmented based on the fitted GMM which corresponds to how each of them uses the products/services spatially in their daily lives; this is essentially their likely lifestyle and occupational traits. Other significant research contributions include fitting GMMs using VB to circular data i.e., the temporal usage behaviour, and developing clustering algorithms suitable for high dimensional data based on the use of VB-GMM.
Resumo:
It is important to promote a sustainable development approach to ensure that economic, environmental and social developments are maintained in balance. Sustainable development and its implications are not just a global concern, it also affects Australia. In particular, rural Australian communities are facing various economic, environmental and social challenges. Thus, the need for sustainable development in rural regions is becoming increasingly important. To promote sustainable development, proper frameworks along with the associated tools optimised for the specific regions, need to be developed. This will ensure that the decisions made for sustainable development are evidence based, instead of subjective opinions. To address these issues, Queensland University of Technology (QUT), through an Australian Research Council (ARC) linkage grant, has initiated research into the development of a Rural Statistical Sustainability Framework (RSSF) to aid sustainable decision making in rural Queensland. This particular branch of the research developed a decision support tool that will become the integrating component of the RSSF. This tool is developed on the web-based platform to allow easy dissemination, quick maintenance and to minimise compatibility issues. The tool is developed based on MapGuide Open Source and it follows the three-tier architecture: Client tier, Web tier and the Server tier. The developed tool is interactive and behaves similar to a familiar desktop-based application. It has the capability to handle and display vector-based spatial data and can give further visual outputs using charts and tables. The data used in this tool is obtained from the QUT research team. Overall the tool implements four tasks to help in the decision-making process. These are the Locality Classification, Trend Display, Impact Assessment and Data Entry and Update. The developed tool utilises open source and freely available software and accounts for easy extensibility and long-term sustainability.
Resumo:
This paper argues for a renewed focus on statistical reasoning in the beginning school years, with opportunities for children to engage in data modelling. Some of the core components of data modelling are addressed. A selection of results from the first data modelling activity implemented during the second year (2010; second grade) of a current longitudinal study are reported. Data modelling involves investigations of meaningful phenomena, deciding what is worthy of attention (identifying complex attributes), and then progressing to organising, structuring, visualising, and representing data. Reported here are children's abilities to identify diverse and complex attributes, sort and classify data in different ways, and create and interpret models to represent their data.
Resumo:
The research objectives of this thesis were to contribute to Bayesian statistical methodology by contributing to risk assessment statistical methodology, and to spatial and spatio-temporal methodology, by modelling error structures using complex hierarchical models. Specifically, I hoped to consider two applied areas, and use these applications as a springboard for developing new statistical methods as well as undertaking analyses which might give answers to particular applied questions. Thus, this thesis considers a series of models, firstly in the context of risk assessments for recycled water, and secondly in the context of water usage by crops. The research objective was to model error structures using hierarchical models in two problems, namely risk assessment analyses for wastewater, and secondly, in a four dimensional dataset, assessing differences between cropping systems over time and over three spatial dimensions. The aim was to use the simplicity and insight afforded by Bayesian networks to develop appropriate models for risk scenarios, and again to use Bayesian hierarchical models to explore the necessarily complex modelling of four dimensional agricultural data. The specific objectives of the research were to develop a method for the calculation of credible intervals for the point estimates of Bayesian networks; to develop a model structure to incorporate all the experimental uncertainty associated with various constants thereby allowing the calculation of more credible credible intervals for a risk assessment; to model a single day’s data from the agricultural dataset which satisfactorily captured the complexities of the data; to build a model for several days’ data, in order to consider how the full data might be modelled; and finally to build a model for the full four dimensional dataset and to consider the timevarying nature of the contrast of interest, having satisfactorily accounted for possible spatial and temporal autocorrelations. This work forms five papers, two of which have been published, with two submitted, and the final paper still in draft. The first two objectives were met by recasting the risk assessments as directed, acyclic graphs (DAGs). In the first case, we elicited uncertainty for the conditional probabilities needed by the Bayesian net, incorporated these into a corresponding DAG, and used Markov chain Monte Carlo (MCMC) to find credible intervals, for all the scenarios and outcomes of interest. In the second case, we incorporated the experimental data underlying the risk assessment constants into the DAG, and also treated some of that data as needing to be modelled as an ‘errors-invariables’ problem [Fuller, 1987]. This illustrated a simple method for the incorporation of experimental error into risk assessments. In considering one day of the three-dimensional agricultural data, it became clear that geostatistical models or conditional autoregressive (CAR) models over the three dimensions were not the best way to approach the data. Instead CAR models are used with neighbours only in the same depth layer. This gave flexibility to the model, allowing both the spatially structured and non-structured variances to differ at all depths. We call this model the CAR layered model. Given the experimental design, the fixed part of the model could have been modelled as a set of means by treatment and by depth, but doing so allows little insight into how the treatment effects vary with depth. Hence, a number of essentially non-parametric approaches were taken to see the effects of depth on treatment, with the model of choice incorporating an errors-in-variables approach for depth in addition to a non-parametric smooth. The statistical contribution here was the introduction of the CAR layered model, the applied contribution the analysis of moisture over depth and estimation of the contrast of interest together with its credible intervals. These models were fitted using WinBUGS [Lunn et al., 2000]. The work in the fifth paper deals with the fact that with large datasets, the use of WinBUGS becomes more problematic because of its highly correlated term by term updating. In this work, we introduce a Gibbs sampler with block updating for the CAR layered model. The Gibbs sampler was implemented by Chris Strickland using pyMCMC [Strickland, 2010]. This framework is then used to consider five days data, and we show that moisture in the soil for all the various treatments reaches levels particular to each treatment at a depth of 200 cm and thereafter stays constant, albeit with increasing variances with depth. In an analysis across three spatial dimensions and across time, there are many interactions of time and the spatial dimensions to be considered. Hence, we chose to use a daily model and to repeat the analysis at all time points, effectively creating an interaction model of time by the daily model. Such an approach allows great flexibility. However, this approach does not allow insight into the way in which the parameter of interest varies over time. Hence, a two-stage approach was also used, with estimates from the first-stage being analysed as a set of time series. We see this spatio-temporal interaction model as being a useful approach to data measured across three spatial dimensions and time, since it does not assume additivity of the random spatial or temporal effects.
Resumo:
This paper argues for a renewed focus on statistical reasoning in the elementary school years, with opportunities for children to engage in data modeling. Data modeling involves investigations of meaningful phenomena, deciding what is worthy of attention, and then progressing to organizing, structuring, visualizing, and representing data. Reported here are some findings from a two-part activity (Baxter Brown’s Picnic and Planning a Picnic) implemented at the end of the second year of a current three-year longitudinal study (grade levels 1-3). Planning a Picnic was also implemented in a grade 7 class to provide an opportunity for the different age groups to share their products. Addressed here are the grade 2 children’s predictions for missing data in Baxter Brown’s Picnic, the questions posed and representations created by both grade levels in Planning a Picnic, and the metarepresentational competence displayed in the grade levels’ sharing of their products for Planning a Picnic.
Resumo:
Purpose: James Clerk Maxwell is usually recognized as being the first, in 1854, to consider using inhomogeneous media in optical systems. However, some fifty years earlier Thomas Young, stimulated by his interest in the optics of the eye and accommodation, had already modeled some applications of gradient-index optics. These applications included using an axial gradient to provide spherical aberration-free optics and a spherical gradient to describe the optics of the atmosphere and the eye lens. We evaluated Young’s contributions. Method: We attempted to derive Young’s equations for axial and spherical refractive index gradients. Raytracing was used to confirm accuracy of formula. Results: We did not confirm Young’s equation for the axial gradient to provide aberration-free optics, but derived a slightly different equation. We confirmed the correctness of his equations for deviation of rays in a spherical gradient index and for the focal length of a lens with a nucleus of fixed index surrounded by a cortex of reducing index towards the edge. Young claimed that the equation for focal length applied to a lens with part of the constant index nucleus of the sphere removed, such that the loss of focal length was a quarter of the thickness removed, but this is not strictly correct. Conclusion: Young’s theoretical work in gradient-index optics received no acknowledgement from either his contemporaries or later authors. While his model of the eye lens is not an accurate physiological description of the human lens, with the index reducing least quickly at the edge, it represented a bold attempt to approximate the characteristics of the lens. Thomas Young’s work deserves wider recognition.
Resumo:
Purpose. To create a binocular statistical eye model based on previously measured ocular biometric data. Methods. Thirty-nine parameters were determined for a group of 127 healthy subjects (37 male, 90 female; 96.8% Caucasian) with an average age of 39.9 ± 12.2 years and spherical equivalent refraction of −0.98 ± 1.77 D. These parameters described the biometry of both eyes and the subjects' age. Missing parameters were complemented by data from a previously published study. After confirmation of the Gaussian shape of their distributions, these parameters were used to calculate their mean and covariance matrices. These matrices were then used to calculate a multivariate Gaussian distribution. From this, an amount of random biometric data could be generated, which were then randomly selected to create a realistic population of random eyes. Results. All parameters had Gaussian distributions, with the exception of the parameters that describe total refraction (i.e., three parameters per eye). After these non-Gaussian parameters were omitted from the model, the generated data were found to be statistically indistinguishable from the original data for the remaining 33 parameters (TOST [two one-sided t tests]; P < 0.01). Parameters derived from the generated data were also significantly indistinguishable from those calculated with the original data (P > 0.05). The only exception to this was the lens refractive index, for which the generated data had a significantly larger SD. Conclusions. A statistical eye model can describe the biometric variations found in a population and is a useful addition to the classic eye models.
Resumo:
Concerns regarding groundwater contamination with nitrate and the long-term sustainability of groundwater resources have prompted the development of a multi-layered three dimensional (3D) geological model to characterise the aquifer geometry of the Wairau Plain, Marlborough District, New Zealand. The 3D geological model which consists of eight litho-stratigraphic units has been subsequently used to synthesise hydrogeological and hydrogeochemical data for different aquifers in an approach that aims to demonstrate how integration of water chemistry data within the physical framework of a 3D geological model can help to better understand and conceptualise groundwater systems in complex geological settings. Multivariate statistical techniques(e.g. Principal Component Analysis and Hierarchical Cluster Analysis) were applied to groundwater chemistry data to identify hydrochemical facies which are characteristic of distinct evolutionary pathways and a common hydrologic history of groundwaters. Principal Component Analysis on hydrochemical data demonstrated that natural water-rock interactions, redox potential and human agricultural impact are the key controls of groundwater quality in the Wairau Plain. Hierarchical Cluster Analysis revealed distinct hydrochemical water quality groups in the Wairau Plain groundwater system. Visualisation of the results of the multivariate statistical analyses and distribution of groundwater nitrate concentrations in the context of aquifer lithology highlighted the link between groundwater chemistry and the lithology of host aquifers. The methodology followed in this study can be applied in a variety of hydrogeological settings to synthesise geological, hydrogeological and hydrochemical data and present them in a format readily understood by a wide range of stakeholders. This enables a more efficient communication of the results of scientific studies to the wider community.
Resumo:
A wireless sensor network system must have the ability to tolerate harsh environmental conditions and reduce communication failures. In a typical outdoor situation, the presence of wind can introduce movement in the foliage. This motion of vegetation structures causes large and rapid signal fading in the communication link and must be accounted for when deploying a wireless sensor network system in such conditions. This thesis examines the fading characteristics experienced by wireless sensor nodes due to the effect of varying wind speed in a foliage obstructed transmission path. It presents extensive measurement campaigns at two locations with the approach of a typical wireless sensor networks configuration. The significance of this research lies in the varied approaches of its different experiments, involving a variety of vegetation types, scenarios and the use of different polarisations (vertical and horizontal). Non–line of sight (NLoS) scenario conditions investigate the wind effect based on different vegetation densities including that of the Acacia tree, Dogbane tree and tall grass. Whereas the line of sight (LoS) scenario investigates the effect of wind when the grass is swaying and affecting the ground-reflected component of the signal. Vegetation type and scenarios are envisaged to simulate real life working conditions of wireless sensor network systems in outdoor foliated environments. The results from the measurements are presented in statistical models involving first and second order statistics. We found that in most of the cases, the fading amplitude could be approximated by both Lognormal and Nakagami distribution, whose m parameter was found to depend on received power fluctuations. Lognormal distribution is known as the result of slow fading characteristics due to shadowing. This study concludes that fading caused by variations in received power due to wind in wireless sensor networks systems are found to be insignificant. There is no notable difference in Nakagami m values for low, calm, and windy wind speed categories. It is also shown in the second order analysis, the duration of the deep fades are very short, 0.1 second for 10 dB attenuation below RMS level for vertical polarization and 0.01 second for 10 dB attenuation below RMS level for horizontal polarization. Another key finding is that the received signal strength for horizontal polarisation demonstrates more than 3 dB better performances than the vertical polarisation for LoS and near LoS (thin vegetation) conditions and up to 10 dB better for denser vegetation conditions.
Resumo:
Based on the molecular dynamics (MD) method, the single-crystalline copper nanowire with different surface defects is investigated through tension simulation. For comparison, the MD tension simulations of perfect nanowire are firstly carried out under different temperatures, strain rates, and sizes. It has concluded that the surface-volume ratio significantly affects the mechanical properties of nanowire. The surface defects on nanowires are then systematically studied in considering different defect orientation and distribution. It is found that the Young’s modulus is insensitive of surface defects. However, the yield strength and yield point show a significant decrease due to the different defects. Different defects are observed to serve as a dislocation source.
Creativity in policing: building the necessary skills to solve complex and protracted investigations
Resumo:
Despite an increased focus on proactive policing in recent years, criminal investigation is still perhaps the most important task of any law enforcement agency. As a result, the skills required to carry out a successful investigation or to be an ‘effective detective’ have been subjected to much attention and debate (Smith and Flanagan, 2000; Dean, 2000; Fahsing and Gottschalk, 2008:652). Stelfox (2008:303) states that “The service’s capacity to carry out investigations comprises almost entirely the expertise of investigators.” In this respect, Dean (2000) highlighted the need to profile criminal investigators in order to promote further understanding of the cognitive approaches they take to the process of criminal investigation. As a result of his research, Dean (2000) produced a theoretical framework of criminal investigation, which included four disparate cognitive or ‘thinking styles’. These styles were the ‘Method’, ‘Challenge’, ‘Skill’ and ‘Risk’. While the Method and Challenge styles deal with adherence to Standard Operating Procedures (SOPs) and the internal ‘drive’ that keeps an investigator going, the Skill and Risk styles both tap on the concept of creativity in policing. It is these two latter styles that provide the focus for this paper. This paper presents a brief discussion on Dean’s (2000) Skill and Risk styles before giving an overview of the broader literature on creativity in policing. The potential benefits of a creative approach as well as some hurdles which need to be overcome when proposing the integration of creativity within the policing sector are then discussed. Finally, the paper concludes by proposing further research into Dean’s (2000) skill and risk styles and also by stressing the need for significant changes to the structure and approach of the traditional policing organisation before creativity in policing is given the status it deserves.
Resumo:
Internal autopsies are invasive and result in the mutilation of the deceased person’s body. They are expensive and pose occupational health and safety risks. Accordingly, they should only be done for good cause. However, until recently, “full” internal autopsies have usually been undertaken in most coroners’ cases. There is a growing trend against this practice but it is meeting resistance from some pathologists who argue that any decision as to the extent of the autopsy should rest with them. This paper examines the origins of the coronial system to place in context the current approach to a death investigation and to review the debate about the role of an internal autopsy in the coronial system.
Resumo:
Glacial cycles during the Pleistocene reduced sea levels and created new land connections in northern Australia, where many currently isolated rivers also became connected via an extensive paleo-lake system, 'Lake Carpentaria'. However, the most recent period during which populations of freshwater species were connected by gene flow across Lake Carpentaria is debated: various 'Lake Carpentaria hypotheses' have been proposed. Here, we used a statistical phylogeographic approach to assess the timing of past population connectivity across the Carpentaria region in the obligate freshwater fish, Glossamia aprion. Results for this species indicate that the most recent period of genetic exchange across the Carpentaria region coincided with the mid- to late Pleistocene, a result shown previously for other freshwater and diadromous species. Based on these findings and published studies for various freshwater, diadromous and marine species, we propose a set of 'Lake Carpentaria' hypotheses to explain past population connectivity in aquatic species: (1) strictly freshwater species had widespread gene flow in the mid- to late Pleistocene before the last glacial maximum; (2) marine species were subdivided into eastern and western populations by land during Pleistocene glacial phases; and (3) past connectivity in diadromous species reflects the relative strength of their marine affinity.
Resumo:
With increasing rate of shipping traffic, the risk of collisions in busy and congested port waters is likely to rise. However, due to low collision frequencies in port waters, it is difficult to analyze such risk in a sound statistical manner. A convenient approach of investigating navigational collision risk is the application of the traffic conflict techniques, which have potential to overcome the difficulty of obtaining statistical soundness. This study aims at examining port water conflicts in order to understand the characteristics of collision risk with regard to vessels involved, conflict locations, traffic and kinematic conditions. A hierarchical binomial logit model, which considers the potential correlations between observation-units, i.e., vessels, involved in the same conflicts, is employed to evaluate the association of explanatory variables with conflict severity levels. Results show higher likelihood of serious conflicts for vessels of small gross tonnage or small overall length. The probability of serious conflict also increases at locations where vessels have more varied headings, such as traffic intersections and anchorages; becoming more critical at night time. Findings from this research should assist both navigators operating in port waters as well as port authorities overseeing navigational management.