957 results for "Statistical approach"
Abstract:
Citation information: Armstrong RA, Davies LN, Dunne MCM & Gilmartin B. Statistical guidelines for clinical studies of human vision. Ophthalmic Physiol Opt 2011, 31, 123-136. doi: 10.1111/j.1475-1313.2010.00815.x ABSTRACT: Statistical analysis of data can be complex, and different statisticians may disagree as to the correct approach, leading to conflict between authors, editors, and reviewers. The objective of this article is to provide some statistical advice for contributors to optometric and ophthalmic journals, to provide advice specifically relevant to clinical studies of human vision, and to recommend statistical analyses that could be used in a variety of circumstances. When submitting an article in which quantitative data are reported, authors should clearly describe the statistical procedures they have used and justify each stage of the analysis. This is especially important if more complex or 'non-standard' analyses have been carried out. The article begins with some general comments relating to data analysis concerning sample size and 'power', hypothesis testing, parametric and non-parametric variables, 'bootstrap methods', one- and two-tail testing, and the Bonferroni correction. More specific advice is then given with reference to particular statistical procedures that can be used on a variety of types of data. Where relevant, examples of correct statistical practice are given with reference to recently published articles in the optometric and ophthalmic literature.
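The Bonferroni correction mentioned above is easy to illustrate. The following minimal sketch, using hypothetical p-values rather than anything from the article, shows both the adjusted significance threshold and the equivalent adjusted p-values.

```python
# Minimal sketch of the Bonferroni correction: the per-comparison significance
# level is divided by the number of comparisons so that the family-wise error
# rate stays at alpha. The p-values below are hypothetical.
p_values = [0.012, 0.049, 0.003, 0.210]   # p-values from 4 independent tests (illustrative)
alpha = 0.05
m = len(p_values)

adjusted_alpha = alpha / m                 # Bonferroni-corrected threshold
significant = [p < adjusted_alpha for p in p_values]

# Equivalently, report Bonferroni-adjusted p-values, capped at 1.0
adjusted_p = [min(p * m, 1.0) for p in p_values]
print(adjusted_alpha, significant, adjusted_p)
```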
Abstract:
This thesis presents an analysis of the stability of complex distribution networks, focusing on stability against cascading failures. We propose a spin (binary) model based on concepts of statistical mechanics, and test macroscopic properties of distribution networks with respect to various topological structures and distributions of microparameters. The equilibrium properties of the systems are obtained in a statistical mechanics framework by application of the replica method, and we demonstrate the validity of this approach by comparison with Monte Carlo simulations. We analyse the network properties in terms of phase diagrams and find both qualitative and quantitative dependence of the network properties on the network structure and macroparameters. The structure of the phase diagrams points to the existence of a phase transition and the presence of stable and metastable states in the system. We also present an analysis of robustness against overloading in distribution networks. We propose a model that describes a distribution process in a network; it incorporates the currents between any connected hubs, local constraints in the form of Kirchhoff's law, and a global optimization criterion. The flow of currents in the system is driven by the consumption. We study two principal variants of the model: infinite and finite link capacity. The key properties are the distributions of currents in the system. We again use a statistical mechanics framework to describe the currents in terms of macroscopic parameters, applying the replica method to obtain observable properties. We are able to assess the criticality of the level of demand with respect to the available resources and the architecture of the network. Furthermore, the parts of the system where critical currents may emerge can be identified, which in turn provides a characteristic description of how overloading spreads through the system.
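As a purely illustrative companion to the Monte Carlo comparison mentioned above, the sketch below simulates a binary (spin) model on a sparse random graph using Metropolis dynamics. The graph, coupling and temperature are assumptions for illustration, not the thesis's actual model.

```python
# Hypothetical Metropolis Monte Carlo for a binary (spin) model on a sparse
# random graph; parameters are illustrative only.
import numpy as np

rng = np.random.default_rng(0)
N, c = 200, 4                       # number of nodes, mean degree (assumed)
A = (rng.random((N, N)) < c / N).astype(int)   # Erdos-Renyi adjacency
A = np.triu(A, 1)
A = A + A.T                          # symmetric, zero diagonal

J, beta = 1.0, 0.8                  # coupling and inverse temperature (assumed)
s = rng.choice([-1, 1], size=N)     # spins: +1 operational, -1 failed

for step in range(20000):
    i = rng.integers(N)
    dE = 2 * J * s[i] * (A[i] @ s)  # energy change from flipping spin i
    if dE <= 0 or rng.random() < np.exp(-beta * dE):
        s[i] = -s[i]

print("magnetisation:", s.mean())   # a macroscopic order parameter
```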
Abstract:
Exploratory analysis of petroleum geochemical data seeks to find common patterns that help distinguish between different source rocks, oils and gases, and to explain their source, maturity and any intra-reservoir alteration. At the outset, however, one is typically faced with (a) a large matrix of samples, each with a range of molecular and isotopic properties, (b) a spatially and temporally unrepresentative sampling pattern, (c) noisy data and (d) often, a large number of missing values. This inhibits analysis using conventional statistical methods. Typically, visualisation methods such as principal components analysis are used, but these methods cannot easily deal with missing data, nor can they capture non-linear structure in the data. One approach to discovering complex, non-linear structure in the data is through the use of linked plots, or brushing, while ignoring the missing data. In this paper we introduce a complementary approach based on a non-linear probabilistic model. Generative topographic mapping enables the visualisation of the effects of very many variables on a single plot, while also dealing with missing data. We show how generative topographic mapping also provides an optimal method with which to replace missing values in two geochemical datasets, particularly where a large proportion of the data is missing.
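Generative topographic mapping is not part of the standard statistical toolkits, so the sketch below only shows the conventional baseline the paper aims to improve upon: imputing missing geochemical measurements and projecting the samples to two dimensions with PCA. It uses scikit-learn and synthetic data, and is not the paper's method.

```python
# Baseline sketch (not the paper's GTM): impute missing values, then project
# samples to 2-D with PCA for visualisation. Data are synthetic stand-ins.
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer
from sklearn.decomposition import PCA

rng = np.random.default_rng(1)
X = rng.normal(size=(60, 12))                  # hypothetical samples x properties
X[rng.random(X.shape) < 0.2] = np.nan          # ~20% missing values

X_filled = IterativeImputer(random_state=0).fit_transform(X)
coords = PCA(n_components=2).fit_transform(X_filled)
print(coords[:3])                              # 2-D coordinates for plotting
```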
Abstract:
We consider a variation of the prototype combinatorial optimization problem known as graph colouring. Our optimization goal is to colour the vertices of a graph with a fixed number of colours so as to maximize the number of different colours present in the set of nearest neighbours of each given vertex. This problem, which we pictorially call palette-colouring, has recently been addressed as a basic example of a problem arising in the context of distributed data storage. Even though it has not been proved to be NP-complete, random search algorithms find the problem hard to solve. Heuristics based on a naive belief propagation algorithm have been observed to work quite well under certain conditions. In this paper, we build upon that result, working out the correct belief propagation algorithm, which must take into account the many-body nature of the constraints present in this problem. This method improves on the naive belief propagation approach at the cost of increased computational effort. We also investigate the emergence of a satisfiable-to-unsatisfiable 'phase transition' as a function of the vertex mean degree, for different ensembles of sparse random graphs in the large size ('thermodynamic') limit.
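The objective itself is easy to state in code. The hypothetical sketch below scores a colouring by the number of distinct colours in each vertex's neighbourhood and improves it with a naive greedy local search, standing in for the random-search and belief-propagation heuristics discussed above.

```python
# Palette-colouring objective plus a naive greedy local search (illustrative
# stand-in for the paper's heuristics). Graph and colour count are toy values.
import random

def palette_score(adj, colours):
    """Total number of distinct colours seen in the neighbourhoods of all vertices."""
    return sum(len({colours[u] for u in adj[v]}) for v in adj)

def greedy_palette(adj, q, sweeps=100, seed=0):
    rng = random.Random(seed)
    colours = {v: rng.randrange(q) for v in adj}
    for _ in range(sweeps):
        v = rng.choice(list(adj))
        # recolour v with whichever colour most increases the global score
        colours[v] = max(range(q), key=lambda c: palette_score(adj, {**colours, v: c}))
    return colours

adj = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1, 3], 3: [1, 2]}   # toy adjacency lists
colours = greedy_palette(adj, q=3)
print(colours, palette_score(adj, colours))
```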
Abstract:
The objective of Total Productive Maintenance (TPM) is to maximise plant and equipment effectiveness, create a sense of ownership among operators, and promote continuous improvement through small group activities involving production, engineering and maintenance personnel. This paper describes and analyses a case study of TPM implementation at a newspaper printing house in Singapore. Rather than adopting more conventional implementation methods, such as employing consultants or running a project with external training, a unique approach was adopted based on Action Research, using a spiral of cycles of planning, acting, observing and reflecting. An Action Research team of company personnel was specially formed to undertake the necessary fieldwork, and subsequently assisted with administering the resulting action plan. The main sources of maintenance and operational data were interviews with shop-floor workers, participative observation, and reviews conducted with members of the team. Content analysis using appropriate statistical techniques was used to test the significance of changes in performance between the start and completion of the TPM programme. The paper identifies the characteristics associated with the Action Research method when used to implement TPM and discusses the applicability of the approach in related industries and processes.
Abstract:
We present a data-based statistical study of the effects of seasonal variations on the growth rates of gastro-intestinal (GI) parasitic infection in livestock. This growth rate is estimated through the variation in the number of eggs per gram (EPG) of faeces in animals. In accordance with earlier studies, our analysis also shows that rainfall is the dominant variable in determining EPG infection rates, compared with other macro-parameters such as temperature and humidity. Our statistical analysis clearly indicates an oscillatory dependence of EPG levels on rainfall fluctuations. The monsoon period recorded the highest infection, at least 2.5 times greater than the next most infected period (summer). A least-squares fit of the EPG versus rainfall data indicates an approach towards a super-diffusive (i.e. root-mean-square displacement growing faster than the square root of the elapsed time, as obtained for simple diffusion) infection growth pattern for low-rainfall regimes (technically defined as zeroth-level dependence) that is remarkably augmented in large-rainfall zones. Our analysis further indicates that for low fluctuations in temperature (true of the bulk data), the EPG level saturates beyond a critical value of rainfall, a threshold that is expected to indicate the onset of the non-linear regime. The probability density functions (PDFs) of the EPG data show oscillatory behaviour in the large-rainfall regime (greater than 500 mm), the frequency of oscillation once again being determined by the ambient wetness (rainfall and humidity). Data recorded over three pilot projects, spanning three measures of rainfall and humidity, bear testimony to the universality of this statistical argument. © 2013 Chattopadhyay and Bandyopadhyay.
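To make the fitting step concrete, here is a hedged sketch, on synthetic data rather than the study's field records, of a least-squares power-law fit of EPG against rainfall; an exponent above 0.5 would point towards the super-diffusive growth described above.

```python
# Least-squares power-law fit of EPG versus rainfall on synthetic data.
import numpy as np
from scipy.optimize import curve_fit

def power_law(r, a, alpha):
    return a * np.power(r, alpha)

rng = np.random.default_rng(2)
rainfall = np.linspace(10, 600, 40)                      # mm (hypothetical)
epg = 5.0 * rainfall**0.7 * rng.lognormal(0, 0.1, 40)    # synthetic EPG counts

(a, alpha), _ = curve_fit(power_law, rainfall, epg, p0=(1.0, 0.5))
print(f"fitted exponent alpha = {alpha:.2f}")            # > 0.5 suggests super-diffusive growth
```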
Abstract:
Membrane proteins, which constitute approximately 20% of most genomes, are poorly tractable targets for experimental structure determination, so analysis by prediction and modelling makes an important contribution to their ongoing study. Membrane proteins form two main classes: alpha-helical and beta-barrel transmembrane proteins. Using a method based on Bayesian Networks, which provide a flexible and powerful framework for statistical inference, we addressed alpha-helical topology prediction. This method has accuracies of 77.4% for prokaryotic proteins and 61.4% for eukaryotic proteins. The method described here represents an important advance in the computational determination of membrane protein topology and offers a useful, complementary tool for the analysis of membrane proteins for a range of applications.
Abstract:
Membrane proteins, which constitute approximately 20% of most genomes, form two main classes: alpha-helical and beta-barrel transmembrane proteins. Using methods based on Bayesian Networks, a powerful approach to statistical inference, we have sought to address beta-barrel topology prediction. The beta-barrel topology predictor reports individual strand accuracies of 88.6%. The method outlined here represents a potentially important advance in the computational determination of membrane protein topology.
Abstract:
In this letter, we propose an analytical approach to modeling uplink intercell interference (ICI) in hexagonal-grid-based orthogonal frequency division multiple access (OFDMA) cellular networks. The key idea is that the uplink ICI from each individual cell is approximated by a lognormal distribution whose statistical parameters are determined analytically. Accordingly, the aggregate uplink ICI is approximated by another lognormal distribution whose statistical parameters can be determined from those of the individual cells using the Fenton-Wilkinson method. Analytic expressions for the uplink ICI are derived for two traditional frequency reuse schemes, namely integer frequency reuse with factor 1 (IFR-1) and factor 3 (IFR-3). Uplink fractional power control and lognormal shadowing are modeled. System performance in terms of signal to interference plus noise ratio (SINR) and spectrum efficiency is also derived. The proposed model has been validated by simulations. © 2013 IEEE.
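The Fenton-Wilkinson step can be sketched directly: matching the first two moments of a sum of independent lognormals gives the parameters of the approximating lognormal. The per-cell parameters below are illustrative, not values from the letter.

```python
# Fenton-Wilkinson moment-matching approximation for a sum of independent
# lognormal interference terms; per-cell parameters are illustrative.
import numpy as np

def fenton_wilkinson(mu, sigma):
    """Given per-term lognormal parameters (natural-log domain), return
    (mu_Z, sigma_Z) of the lognormal approximating their independent sum."""
    mu, sigma = np.asarray(mu, float), np.asarray(sigma, float)
    means = np.exp(mu + sigma**2 / 2)                              # E[X_i]
    variances = (np.exp(sigma**2) - 1) * np.exp(2 * mu + sigma**2)  # Var[X_i]
    m1 = means.sum()                                               # E[sum X_i]
    m2 = variances.sum() + m1**2                                   # E[(sum X_i)^2]
    sigma_z2 = np.log(m2 / m1**2)
    mu_z = np.log(m1) - sigma_z2 / 2
    return mu_z, np.sqrt(sigma_z2)

# Aggregate ICI from six hypothetical neighbouring cells
mu_z, sigma_z = fenton_wilkinson(mu=[-2.0] * 6, sigma=[1.0] * 6)
print(mu_z, sigma_z)
```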
Abstract:
Natural language understanding aims to specify a computational model that maps sentences to their semantic meaning representations. In this paper, we propose a novel framework for training statistical models without using expensive, fully annotated data. In particular, the input to our framework is a set of sentences labelled with abstract semantic annotations. These annotations encode the underlying embedded semantic structural relations without explicit word/semantic tag alignment. The proposed framework can automatically induce derivation rules that map sentences to their semantic meaning representations. The learning framework is applied to two statistical models: conditional random fields (CRFs) and hidden Markov support vector machines (HM-SVMs). Our experimental results on the DARPA communicator data show that both CRFs and HM-SVMs outperform the baseline approach, the previously proposed hidden vector state (HVS) model, which is also trained on abstract semantic annotations. In addition, the proposed framework shows superior performance to two other baseline approaches, a hybrid framework combining HVS and HM-SVMs and discriminative training of HVS, with relative error reductions in F-measure of about 25% and 15%, respectively.
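The framework itself learns from abstract annotations, which is beyond a few lines of code, but a conventional fully supervised CRF for semantic slot labelling, the kind of statistical model the framework is applied to, can be sketched as below. It assumes the sklearn-crfsuite package; the sentence, features and tags are hypothetical.

```python
# Fully supervised CRF sequence labelling sketch (not the paper's weakly
# supervised framework). Requires sklearn-crfsuite; data are hypothetical.
import sklearn_crfsuite

def features(sentence):
    # toy per-token feature dicts
    return [{"word": w.lower(), "is_first": i == 0} for i, w in enumerate(sentence)]

X_train = [features("flights from boston to denver".split())]
y_train = [["O", "O", "FROM_CITY", "O", "TO_CITY"]]        # hypothetical slot tags

crf = sklearn_crfsuite.CRF(algorithm="lbfgs", max_iterations=50)
crf.fit(X_train, y_train)
print(crf.predict([features("flights from boston to denver".split())]))
```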
Abstract:
The computational mechanics approach has been applied to the orientational behavior of water molecules in a molecular dynamics simulated water–Na+ system. The distinctly different statistical complexity of water molecules in the bulk and in the first solvation shell of the ion is demonstrated. It is shown that the molecules undergo more complex orientational motion when surrounded by other water molecules than when constrained by the electric field of the ion. However, the spatial coordinates of the oxygen atom show the opposite behavior, in that the complexity is higher for the solvation-shell molecules. This provides new information about the dynamics of water molecules in the solvation shell, additional to that given by traditional methods of analysis.
Abstract:
We report results of an experimental study, complemented by detailed statistical analysis of the experimental data, on the development of a more effective control method for drug delivery using a pH-sensitive acrylic polymer. New copolymers based on acrylic acid and a fatty acid derived from dodecyl castor oil, and a tercopolymer based on methyl methacrylate, acrylic acid and acrylamide, were prepared using this new approach. The water-swelling characteristics of the fatty acid-acrylic acid copolymer and the tercopolymer in acid and alkali solutions have been studied by a step-change method. The antibiotic cephalosporin and paracetamol have also been incorporated into the polymer blend through dissolution, with the release of the antibiotic being evaluated in bacterial strain media and buffer solution. Our results show that the rate of release of paracetamol is affected by the pH and also by the nature of the polymer blend. Our experimental data have then been statistically analysed to quantify the dependence of the polymer decay rates on the pH of the relevant polymer solvents. The time evolution of the polymer decay rates indicates a marked transition from a linear to a strictly non-linear regime, depending on whether the chosen sample is a general copolymer (linear) or a tercopolymer (non-linear). Non-linear data extrapolation techniques have been used to make probabilistic predictions about the variation in weight percentages of retained polymers at all future times, thereby quantifying the efficacy of the new method of drug delivery.
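As a hedged illustration of the extrapolation step, the sketch below fits a simple first-order decay to synthetic retained-weight data and extrapolates it forward; the functional form, rate constant and data are assumptions, not the study's measurements.

```python
# Fit a first-order decay to synthetic retained-weight data and extrapolate.
import numpy as np
from scipy.optimize import curve_fit

def retained(t, w0, k):
    return w0 * np.exp(-k * t)        # assumed first-order decay of retained weight %

t = np.array([0, 1, 2, 4, 8, 12], dtype=float)           # hours (hypothetical)
w = np.array([100, 91, 83, 70, 52, 40], dtype=float)      # retained weight % (synthetic)

(w0, k), cov = curve_fit(retained, t, w, p0=(100.0, 0.1))
t_future = 24.0
print(f"k = {k:.3f} /h, predicted retention at {t_future:.0f} h: {retained(t_future, w0, k):.1f}%")
```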
Abstract:
The accurate identification of T-cell epitopes remains a principal goal of bioinformatics within immunology. As the immunogenicity of peptide epitopes is dependent on their binding to major histocompatibility complex (MHC) molecules, the prediction of binding affinity is a prerequisite to the reliable prediction of epitopes. The iterative self-consistent (ISC) partial-least-squares (PLS)-based additive method is a recently developed bioinformatic approach for predicting class II peptide−MHC binding affinity. The ISC−PLS method overcomes many of the conceptual difficulties inherent in the prediction of class II peptide−MHC affinity, such as the binding of a mixed population of peptide lengths due to the open-ended class II binding site. The method has applications in both the accurate prediction of class II epitopes and the manipulation of affinity for heteroclitic and competitor peptides. The method is applied here to six class II mouse alleles (I-Ab, I-Ad, I-Ak, I-As, I-Ed, and I-Ek) and included peptides up to 25 amino acids in length. A series of regression equations highlighting the quantitative contributions of individual amino acids at each peptide position was established. The initial model for each allele exhibited only moderate predictivity. Once the set of selected peptide subsequences had converged, the final models exhibited a satisfactory predictive power. Convergence was reached between the 4th and 17th iterations, and the leave-one-out cross-validation statistical terms - q2, SEP, and NC - ranged between 0.732 and 0.925, 0.418 and 0.816, and 1 and 6, respectively. The non-cross-validated statistical terms r2 and SEE ranged between 0.98 and 0.995 and 0.089 and 0.180, respectively. The peptides used in this study are available from the AntiJen database (http://www.jenner.ac.uk/AntiJen). The PLS method is available commercially in the SYBYL molecular modeling software package. The resulting models, which can be used for accurate T-cell epitope prediction, will be made freely available online (http://www.jenner.ac.uk/MHCPred).
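Below is a minimal sketch, on random stand-in data rather than real peptide descriptors, of the kind of PLS regression with leave-one-out cross-validation and a q2 statistic that underlies the ISC-PLS additive method; it uses scikit-learn and is purely illustrative.

```python
# PLS regression with leave-one-out cross-validation and a q2 statistic,
# on synthetic stand-in data (not real peptide-MHC binding descriptors).
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.model_selection import LeaveOneOut, cross_val_predict

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 20))                        # stand-in per-position descriptors
y = X[:, :3].sum(axis=1) + rng.normal(0, 0.3, 40)    # synthetic binding affinities

pls = PLSRegression(n_components=3)                  # NC would be chosen by cross-validation
y_loo = cross_val_predict(pls, X, y, cv=LeaveOneOut()).ravel()

press = np.sum((y - y_loo) ** 2)
ss = np.sum((y - y.mean()) ** 2)
q2 = 1 - press / ss                                  # cross-validated predictive power
print(f"q2 = {q2:.3f}")
```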
Abstract:
When Recurrent Neural Networks (RNNs) are to be used as pattern recognition systems, the problem to be considered is how to impose prescribed prototype vectors ξ^1, ξ^2, ..., ξ^p as fixed points. In the classical approach, the synaptic matrix W is interpreted as a sort of sign-correlation matrix of the prototypes. The weak point of this approach is that it does not provide appropriate tools to deal efficiently with the correlation between the state vectors and the prototype vectors: the capacity of the net is very poor because one can only know whether a given vector is adequately correlated with the prototypes, not its exact degree of correlation. The interest of our approach lies precisely in the fact that it provides these tools. In this paper, a geometrical view of the dynamics of states is presented, in which a fixed point is viewed as a point in the Euclidean plane R2. The retrieval procedure is analysed through the statistical frequency distribution of the prototypes. The capacity of the net is improved and the spurious states are reduced. In order to clarify and corroborate the theoretical results, an application is presented together with the formal theory.
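A small, hypothetical sketch of the classical sign-correlation construction referred to above: prototypes are stored as a sum of outer products and a state is retrieved by repeated sign updates. The prototypes and dimension are illustrative only.

```python
# Classical "sign correlation" (Hebbian) storage and retrieval, Hopfield-style.
import numpy as np

prototypes = np.array([[ 1, -1,  1, -1,  1,  1],
                       [-1, -1,  1,  1, -1,  1]])   # xi^1, xi^2 in {-1,+1}^6 (illustrative)
W = prototypes.T @ prototypes                        # correlation-type synaptic matrix
np.fill_diagonal(W, 0)                               # no self-connections

def recall(s, steps=10):
    s = s.copy()
    for _ in range(steps):
        s = np.where(W @ s >= 0, 1, -1)              # synchronous sign update
    return s

noisy = np.array([1, -1, -1, -1, 1, 1])              # corrupted version of xi^1
print(recall(noisy))                                 # ideally converges back to xi^1
```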
Abstract:
Motivation: Within bioinformatics, the textual alignment of amino acid sequences has long dominated the determination of similarity between proteins, with all that implies for shared structure, function and evolutionary descent. Despite the relative success of modern-day sequence alignment algorithms, so-called alignment-free approaches offer a complementary means of determining and expressing similarity, with potential benefits in certain key applications, such as regression analysis of protein structure-function studies, where alignment-based similarity has performed poorly. Results: Here, we offer a fresh, statistical physics-based perspective focusing on the question of alignment-free comparison, in the process adapting results from "first passage probability distributions" to summarize statistics of ensemble-averaged amino acid propensity values. In this paper, we introduce and elaborate this approach.