924 results for Multivariate statistics
Abstract:
Basic mathematical skills are critical to a student’s ability to successfully undertake an introductory statistics course. Yet in business education this vitally important area of mathematics and statistics education is under-researched. The question therefore arises as to what level of mathematical skill a typical business studies student will possess as they enter the tertiary environment, and whether there are any common deficiencies that we can identify with a view to tackling the problem. This paper will focus on a study designed to measure the level of mathematical ability of first year business students. The results provide timely insight into a growing problem faced by many tertiary educators in this field.
Abstract:
“World food security … is at its lowest in half a century,” wrote Julian Cribb FTSE, a well-known consultant in science communication and founding editor of www.sciencealert.com.au, in the lead article in the 2008 ATSE Focus magazine issue entitled “Food for the world: the nation’s challenge”. Food security continues to be a key national and international concern and it is pleasing to see this issue of Focus again exploring aspects of the topic with the aim of continuing to raise awareness of issues and influencing relevant policy decisions. Statistics (or statistical science, more broadly) has been critical to the information and decision-making value chain needed to optimise agriculture and the food supply chain. The key steps are most often addressed by multidisciplinary research groups including statisticians in collaboration with life and physical scientists, agri-industry personnel and other relevant stakeholders.
Abstract:
In this paper we propose a new multivariate GARCH model with time-varying conditional correlation structure. The time-varying conditional correlations change smoothly between two extreme states of constant correlations according to a predetermined or exogenous transition variable. An LM test is derived to test the constancy of correlations, and LM and Wald tests are derived to test the hypothesis of partially constant correlations. Analytical expressions for the test statistics and the required derivatives are provided to make computations feasible. An empirical example based on daily return series of five frequently traded stocks in the S&P 500 stock index completes the paper.
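The smooth change between two constant-correlation states can be illustrated with a small sketch. It assumes a logistic transition function and blends two fixed correlation matrices by the resulting weight. The function names, the parameters `gamma` and `c`, and the example matrices are hypothetical and are not taken from the paper.

```python
import math

def logistic_transition(s, gamma, c):
    """Weight in (0, 1): gamma > 0 sets the transition speed, c its location.
    The logistic form is an assumption made for this sketch."""
    return 1.0 / (1.0 + math.exp(-gamma * (s - c)))

def blended_correlation(r_low, r_high, s, gamma=2.0, c=0.0):
    """Convex combination of two constant correlation matrices,
    weighted by the transition variable s."""
    g = logistic_transition(s, gamma, c)
    n = len(r_low)
    return [[(1.0 - g) * r_low[i][j] + g * r_high[i][j] for j in range(n)]
            for i in range(n)]

# Hypothetical extreme states for two assets.
R_LOW = [[1.0, 0.2], [0.2, 1.0]]
R_HIGH = [[1.0, 0.8], [0.8, 1.0]]

if __name__ == "__main__":
    for s in (-3.0, 0.0, 3.0):
        print(s, blended_correlation(R_LOW, R_HIGH, s)[0][1])
```

Because the weight lies strictly between 0 and 1, every blended matrix is a convex combination of two valid correlation matrices and so remains a valid correlation matrix.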
Abstract:
Australia has a significantly higher suicide rate than England. Rather than accepting that this ‘statistical fact’ is a direct reflection of some positivist truth, this paper begins with the premise that how suicide is counted depends upon what counts as suicide. This study involves semi-structured interviews with coroners both in Australia and England, as well as observations at inquests. Important differences between the two coronial systems include: first, quite different logics of operation; second, the burden of proof for reaching a finding of suicide is significantly higher in England; and third, the presence of family members at English inquests results in far greater pressure being brought to bear upon coroners. These combined factors result in a reduced likelihood of English coroners reaching a finding of suicide. The conclusions are twofold. First, this research supports existing criticisms of comparative suicide statistics. Second, this research adds theoretical weight to criticisms of positivist analyses of social phenomena.
Abstract:
Interpolation techniques for spatial data have been applied frequently in various fields of geosciences. Although most conventional interpolation methods assume that it is sufficient to use first- and second-order statistics to characterize random fields, researchers have now realized that these methods cannot always provide reliable interpolation results, since geological and environmental phenomena tend to be very complex, exhibiting non-Gaussian distributions and/or non-linear inter-variable relationships. This paper proposes a new approach to the interpolation of spatial data, which can be applied with great flexibility. Suitable cross-variable higher-order spatial statistics are developed to measure the spatial relationship between the random variable at an unsampled location and those in its neighbourhood. Given the computed cross-variable higher-order spatial statistics, the conditional probability density function (CPDF) is approximated via polynomial expansions and is then used to determine the interpolated value at the unsampled location as an expectation. In addition, the uncertainty associated with the interpolation is quantified by constructing prediction intervals of interpolated values. The proposed method is applied to a mineral deposit dataset, and the results demonstrate that it outperforms kriging methods in uncertainty quantification. The introduction of the cross-variable higher-order spatial statistics noticeably improves the quality of the interpolation since it enriches the information that can be extracted from the observed data, and this benefit is substantial when working with data that are sparse or have non-trivial dependence structures.
Abstract:
Yield in cultivated cotton (Gossypium spp.) is affected by the number and distribution of fibres initiated on the seed surface but, apart from simple statistical summaries, little has been done to assess this phenotype quantitatively. Here we use two types of spatial statistics to describe and quantify differences in patterning of cotton ovule fibre initials (FI). The following five different species of Gossypium were analysed: G. hirsutum L., G. barbadense L., G. arboreum, G. raimondii Ulbrich. and G. trilobum (DC.) Skovsted. Scanning electron micrographs of FIs were taken on the day of anthesis. Cell centres for fibre and epidermal cells were digitised and analysed by spatial statistics methods appropriate for marked point processes and tessellations. Results were consistent with previously published reports of fibre number and spacing. However, it was shown that the spatial distributions of FIs in all of the species examined exhibit regularity, and are not completely random as previously implied. The regular arrangement indicates FIs do not appear independently of each other and we surmise there may be some form of mutual inhibition specifying fibre-initial development. It is concluded that genetic control of FIs differs from that of stomata, another well-studied plant idioblast. Since spatial statistics show clear species differences in the distribution of FIs within this genus, they provide a useful method for phenotyping cotton. © CSIRO 2007.
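One simple way to see what "regularity" means for a point pattern is the Clark-Evans nearest-neighbour index. This is a swapped-in illustrative statistic, not the marked point process or tessellation methods used in the study. The sketch below applies no edge correction, and the grid of points is made-up data.

```python
import math

def clark_evans_index(points, area):
    """Ratio of the observed mean nearest-neighbour distance to the value
    expected under complete spatial randomness (0.5 / sqrt(density)).
    Values above 1 suggest regular spacing, values below 1 clustering.
    No edge correction is applied in this sketch."""
    n = len(points)
    total = 0.0
    for i, (xi, yi) in enumerate(points):
        total += min(math.hypot(xi - xj, yi - yj)
                     for j, (xj, yj) in enumerate(points) if j != i)
    observed = total / n
    expected = 0.5 / math.sqrt(n / area)
    return observed / expected

# Hypothetical, perfectly regular pattern: a 10 x 10 unit grid in a window of area 100.
grid = [(float(i), float(j)) for i in range(10) for j in range(10)]

if __name__ == "__main__":
    print(clark_evans_index(grid, 100.0))
```

For the unit grid every nearest neighbour is at distance 1 while the random expectation is 0.5, so the index is well above 1, which is the signature of a regular rather than random arrangement.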
Abstract:
The Galilee and Eromanga basins are sub-basins of the Great Artesian Basin (GAB). In this study, a multivariate statistical approach (hierarchical cluster analysis, principal component analysis and factor analysis) is carried out to identify hydrochemical patterns and assess the processes that control hydrochemical evolution within key aquifers of the GAB in these basins. The results of the hydrochemical assessment are integrated into a 3D geological model (previously developed) to support the analysis of spatial patterns of hydrochemistry, and to identify the hydrochemical and hydrological processes that control hydrochemical variability. In this area of the GAB, the hydrochemical evolution of groundwater is dominated by evapotranspiration near the recharge area resulting in a dominance of the Na–Cl water types. This is shown conceptually using two selected cross-sections which represent discrete groundwater flow paths from the recharge areas to the deeper parts of the basins. With increasing distance from the recharge area, a shift towards a dominance of carbonate (e.g. Na–HCO3 water type) has been observed. The assessment of hydrochemical changes along groundwater flow paths highlights how aquifers are separated in some areas, and how mixing between groundwater from different aquifers occurs elsewhere controlled by geological structures, including between GAB aquifers and coal bearing strata of the Galilee Basin. The results of this study suggest that distinct hydrochemical differences can be observed within the previously defined Early Cretaceous–Jurassic aquifer sequence of the GAB. A revision of the two previously recognised hydrochemical sequences is being proposed, resulting in three hydrochemical sequences based on systematic differences in hydrochemistry, salinity and dominant hydrochemical processes. 
The integrated approach presented in this study, which combines different complementary multivariate statistical techniques with a detailed assessment of the geological framework of these sedimentary basins, can be adopted in other complex multi-aquifer systems to assess hydrochemical evolution and its geological controls.
Abstract:
Three core components in developing children’s understanding and appreciation of data — establish a context, pose and answer statistical questions, represent and interpret data — lay the foundation for the fourth component: use data to enhance existing context.
Abstract:
The majority of sugar mill locomotives are equipped with GPS devices from which locomotive position data is stored. Locomotive run information (e.g. start times, run destinations and activities) is electronically stored in software called TOTools. The latest software development allows TOTools to interpret historical GPS information by combining this data with run information recorded in TOTools and geographic information from a GIS application called MapInfo. As a result, TOTools is capable of summarising run activity details such as run start and finish times and shunt activities with great accuracy. This paper presents 15 reports developed to summarise run activities and speed information. The reports will be of use pre-season to assist in developing the next year's schedule and for determining priorities for investment in the track infrastructure. They will also be of benefit during the season to closely monitor locomotive run performance against the existing schedule.
Abstract:
Experts are increasingly being called upon to quantify their knowledge, particularly in situations where data is not yet available or of limited relevance. In many cases this involves asking experts to estimate probabilities. For example, experts in ecology or related fields might be called upon to estimate probabilities of incidence or abundance of species, and how they relate to environmental factors. Although many ecologists undergo some training in statistics at undergraduate and postgraduate levels, this does not necessarily focus on interpretations of probabilities. More accurate elicitation can be obtained by training experts prior to elicitation, and if necessary tailoring elicitation to address the expert’s strengths and weaknesses. Here we address the first step of diagnosing conceptual understanding of probabilities. We refer to the psychological literature which identifies several common biases or fallacies that arise during elicitation. These form the basis for developing a diagnostic questionnaire, as a tool for supporting accurate elicitation, particularly when several experts or elicitors are involved. We report on a qualitative assessment of results from a pilot of this questionnaire. These results raise several implications for training experts, not only prior to elicitation, but more strategically by targeting them whilst still undergraduate or postgraduate students.
Abstract:
This paper presents an efficient noniterative method for distribution state estimation using conditional multivariate complex Gaussian distribution (CMCGD). In the proposed method, the mean and standard deviation (SD) of the state variables are obtained in one step considering load uncertainties, measurement errors, and load correlations. In this method, first the bus voltages, branch currents, and injection currents are represented by MCGD using direct load flow and a linear transformation. Then, the mean and SD of bus voltages, or other states, are calculated using CMCGD and estimation of variance method. The mean and SD of pseudo measurements, as well as spatial correlations between pseudo measurements, are modeled based on the historical data for different levels of load duration curve. The proposed method can handle load uncertainties without using time-consuming approaches such as Monte Carlo. Simulation results of two case studies, a six-bus network and a realistic 747-bus distribution network, show the effectiveness of the proposed method in terms of speed, accuracy, and quality against the conventional approach.
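The one-step character of Gaussian conditioning can be shown with a minimal sketch. It is a simplification, assuming a real-valued bivariate Gaussian rather than the complex multivariate case treated in the paper. The function name and the numbers are hypothetical.

```python
import math

def condition_bivariate_gaussian(mu_x, mu_y, var_x, var_y, cov_xy, y_obs):
    """Mean and SD of x given an observation y = y_obs, for a real
    bivariate Gaussian (a simplification of the complex multivariate case)."""
    gain = cov_xy / var_y                      # regression coefficient of x on y
    cond_mean = mu_x + gain * (y_obs - mu_y)   # prior mean shifted by the measurement residual
    cond_var = var_x - gain * cov_xy           # variance reduced by the shared information
    return cond_mean, math.sqrt(cond_var)

if __name__ == "__main__":
    # Hypothetical state x (prior mean 1, variance 4) and measurement y (mean 0, variance 1).
    mean, sd = condition_bivariate_gaussian(1.0, 0.0, 4.0, 1.0, 1.0, y_obs=2.0)
    print(mean, sd)
```

The result is available in closed form, with no sampling or iteration, which is the property that makes this kind of approach attractive compared with Monte Carlo simulation.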
Abstract:
Several genetic variants are thought to influence white matter (WM) integrity, measured with diffusion tensor imaging (DTI). Voxel-based methods can test genetic associations, but heavy multiple comparisons corrections are required to adjust for searching the whole brain and for all genetic variants analyzed. Thus, genetic associations are hard to detect even in large studies. Using a recently developed multi-SNP analysis, we examined the joint predictive power of a group of 18 cholesterol-related single nucleotide polymorphisms (SNPs) on WM integrity, measured by fractional anisotropy. To boost power, we limited the analysis to brain voxels that showed significant associations with total serum cholesterol levels. From this space, we identified two genes with effects that replicated in individual voxel-wise analyses of the whole brain. Multivariate analyses of genetic variants on a reduced anatomical search space may help to identify SNPs with strongest effects on the brain from a broad panel of genes.
Abstract:
This review is focused on the impact of chemometrics for resolving data sets collected from investigations of the interactions of small molecules with biopolymers. These samples have been analyzed with various instrumental techniques, such as fluorescence, ultraviolet–visible spectroscopy, and voltammetry. The impact of two powerful and demonstrably useful multivariate methods for resolution of complex data, multivariate curve resolution–alternating least squares (MCR–ALS) and parallel factor analysis (PARAFAC), is highlighted through analysis of applications involving the interactions of small molecules with the biopolymers serum albumin and deoxyribonucleic acid. The outcomes illustrated that significant information extracted by the chemometric methods was unattainable by simple, univariate data analysis. In addition, although the techniques used to collect data were confined to ultraviolet–visible spectroscopy, fluorescence spectroscopy, circular dichroism, and voltammetry, data profiles produced by other techniques may also be processed. Topics considered include binding sites and modes, cooperative and competitive small molecule binding, kinetics and thermodynamics of ligand binding, and the folding and unfolding of biopolymers. Applications of the MCR–ALS and PARAFAC methods reviewed were primarily published between 2008 and 2013.
Abstract:
The practice of statistics is the focus of the world in which professional statisticians live. To understand meaningfully what this practice is about, students need to engage in it themselves. Acknowledging the limitations of a genuine classroom setting, this study attempted to expose four classes of year 5 students (n=91) to an authentic experience of the practice of statistics. Setting an overall context of people’s habits that are considered environmentally friendly, the students sampled their class and set criteria for being environmentally friendly based on questions from the Australian Bureau of Statistics CensusAtSchool site. They then analysed the data and made decisions, acknowledging their degree of certainty, about three populations based on their criteria: their class, year 5 students in their school and year 5 students in Australia. The next step was to collect a random sample the size of their class from an Australian Bureau of Statistics ‘population’, analyse it and again make a decision about Australian year 5 students. At the end, they suggested what further research they might do. The analysis of students’ responses gives insight into primary students’ capacity to appreciate and understand decision making, and to participate in the practice of statistics, a topic that has received very little attention in the literature. Based on the total possible score of 23 from student workbook entries, 80% of students achieved a score of at least 11.