987 results for statistical methodology
Abstract:
The field of natural language processing (NLP) has seen a dramatic shift in both research direction and methodology in the past several years. In the past, most work in computational linguistics tended to focus on purely symbolic methods. Recently, more and more work is shifting toward hybrid methods that combine new empirical corpus-based methods, including the use of probabilistic and information-theoretic techniques, with traditional symbolic methods. This work is made possible by the recent availability of linguistic databases that add rich linguistic annotation to corpora of natural language text. Already, these methods have led to a dramatic improvement in the performance of a variety of NLP systems with similar improvement likely in the coming years. This paper focuses on these trends, surveying in particular three areas of recent progress: part-of-speech tagging, stochastic parsing, and lexical semantics.
Abstract:
National Highway Traffic Safety Administration, Washington, D.C.
Abstract:
Properties of computing Boolean circuits composed of noisy logical gates are studied using the statistical physics methodology. A formula-growth model that gives rise to random Boolean functions is mapped onto a spin system, which facilitates the study of their typical behavior in the presence of noise. Bounds on their performance, derived in the information theory literature for specific gates, are straightforwardly retrieved, generalized and identified as the corresponding macroscopic phase transitions. The framework is employed for deriving results on error-rates at various function-depths and function sensitivity, and their dependence on the gate-type and noise model used. These are difficult to obtain via the traditional methods used in this field.
Abstract:
Networking encompasses a variety of tasks related to the communication of information on networks; it has a substantial economic and societal impact on a broad range of areas including transportation systems, wired and wireless communications and a range of Internet applications. As transportation and communication networks become increasingly complex, the ever-increasing demand for congestion control, higher traffic capacity, quality of service, robustness and reduced energy consumption requires new tools and methods to meet these conflicting requirements. The new methodology should help to gain a better understanding of the properties of networking systems at the macroscopic level, as well as to develop new principled optimization and management algorithms at the microscopic level. Methods of statistical physics seem best placed to provide new approaches as they have been developed specifically to deal with nonlinear large-scale systems. This review aims at presenting an overview of tools and methods that have been developed within the statistical physics community and that can be readily applied to address the emerging problems in networking. These include diffusion processes, methods from disordered systems and polymer physics, and probabilistic inference, which have direct relevance to network routing, file and frequency distribution, the exploration of network structures and vulnerability, and various other practical networking applications. © 2013 IOP Publishing Ltd.
Abstract:
Background: Age-related macular disease is the leading cause of blind registration in the developed world. One aetiological hypothesis involves oxidation, and the intrinsic vulnerability of the retina to damage via this process. This has prompted interest in the role of antioxidants, particularly the carotenoids lutein and zeaxanthin, in the prevention and treatment of this eye disease. Methods: The aim of this randomised controlled trial is to determine the effect of a nutritional supplement containing lutein, vitamins A, C and E, zinc, and copper on measures of visual function in people with and without age-related macular disease. Outcome measures are distance and near visual acuity, contrast sensitivity, colour vision, macular visual field, glare recovery, and fundus photography. Randomisation is achieved via a random number generator, and masking is achieved by third-party coding of the active and placebo containers. Data collection will take place at nine and 18 months, and statistical analysis will employ Student's t test. Discussion: A paucity of treatment modalities for age-related macular disease has prompted research into the development of prevention strategies. A positive effect on normals may be indicative of a role of nutritional supplementation in preventing or delaying onset of the condition. An observed benefit in the age-related macular disease group may indicate a potential role of supplementation in prevention of progression, or even a degree of reversal of the visual effects caused by this condition.
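As an illustration of the planned analysis, the following is a minimal sketch of a two-sample Student's t statistic with pooled variance. It assumes an unpaired comparison between the active and placebo groups, which the abstract does not specify; the function name `student_t` and the sample values are hypothetical and are not trial data.

```python
from math import sqrt
from statistics import fmean, variance


def student_t(group_a, group_b):
    """Two-sample Student's t statistic with pooled variance (equal variances assumed).

    Returns the t statistic and the degrees of freedom.
    """
    na, nb = len(group_a), len(group_b)
    df = na + nb - 2
    # Pooled estimate of the common variance from the two sample variances.
    pooled = ((na - 1) * variance(group_a) + (nb - 1) * variance(group_b)) / df
    t = (fmean(group_a) - fmean(group_b)) / sqrt(pooled * (1 / na + 1 / nb))
    return t, df


# Hypothetical outcome scores, made up purely for illustration.
active = [1.0, 2.0, 3.0]
placebo = [2.0, 3.0, 4.0]
t_stat, dof = student_t(active, placebo)
print(t_stat, dof)
```

The resulting statistic would then be compared with the t distribution for the returned degrees of freedom to obtain a p-value.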
Abstract:
Methods for the calculation of complexity have been investigated as a possible alternative for the analysis of the dynamics of molecular systems. “Computational mechanics” is the approach chosen to describe emergent behavior in molecular systems that evolve in time. A novel algorithm has been developed for symbolization of a continuous physical trajectory of a dynamic system. A method for calculating statistical complexity has been implemented and tested on representative systems. It is shown that the computational mechanics approach is suitable for analyzing the dynamic complexity of molecular systems and offers new insight into the process.
Abstract:
2000 Mathematics Subject Classification: 62H30
Abstract:
Recently, the occurrence of multiple events in static tests has been investigated by checking the statistical distribution of the difference between the addresses of the words containing bitflips. That method has been successfully applied to Field Programmable Gate Arrays (FPGAs) and the original authors indicate that it is also valid for SRAMs. This paper presents a modified methodology that is based on checking the XORed addresses with bitflips, rather than on the difference. Irradiation tests on CMOS 130 & 90 nm SRAMs with 14-MeV neutrons have been performed to validate this methodology. Results in high-altitude environments are also presented and cross-checked with theoretical predictions. In addition, this methodology has also been used to detect modifications in the organization of said memories. Theoretical predictions have been validated with actual data provided by the manufacturer.
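One plausible reading of the XOR-based check can be sketched as follows: XOR every pair of addresses that contain bitflips and look for XOR values that repeat more often than chance would suggest, since cells upset by the same particle tend to sit at a fixed logical offset from each other. This is only an illustrative sketch, not the authors' procedure; the function names, the threshold `min_count` and the address list are hypothetical.

```python
from collections import Counter
from itertools import combinations


def xor_histogram(addresses):
    """Count how often each XOR value occurs over all pairs of bitflip addresses."""
    return Counter(a ^ b for a, b in combinations(addresses, 2))


def repeated_xor_values(addresses, min_count=3):
    """Return (xor_value, count) pairs seen at least `min_count` times, most frequent first."""
    hist = xor_histogram(addresses)
    return [(value, count) for value, count in hist.most_common() if count >= min_count]


# Hypothetical addresses: three pairs that differ only in bit 0, plus two isolated flips.
flipped = [0x0010, 0x0011, 0x0A40, 0x0A41, 0x1F02, 0x1F03, 0x0333, 0x2C58]
print(repeated_xor_values(flipped, min_count=3))
```

In this made-up data set the XOR value 1 stands out because three address pairs share it, which is the kind of signature that would be flagged as candidate multiple events. A real analysis would set the threshold from the expected count under purely random single events.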
Abstract:
This research is funded by UK Medical Research Council grant number MR/L011115/1. We would like to thank the 105 experts in behaviour change who have committed their time and offered their expertise for study 2 of this research. We are also very grateful to all those who sent us peer-reviewed behaviour change intervention descriptions for study 1. Finally, we would like to thank Dr. Emma Beard and Dr. Dan Dediu for their statistical input, and all the researchers, particularly Holly Walton, who have assisted in the coding of papers for study 1.
Abstract:
This work presents a computational code, called MOMENTS, developed for use in process control to determine a characteristic transfer function for industrial units when radiotracer techniques are applied to study the unit's performance. The methodology is based on measuring the residence time distribution (RTD) function and calculating the first and second temporal moments of the tracer data obtained by two NaI scintillation detectors positioned to register the complete movement of the tracer inside the unit. A nonlinear regression technique is used to fit various mathematical models, and a statistical test is used to select the best result for the transfer function. With the MOMENTS code, twelve different models can be used to fit a curve and calculate technical parameters of the unit.
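The moment calculation described above can be illustrated with a short sketch. It is not the MOMENTS code: it assumes that each detector yields a sampled count curve, normalises the curve by its area, and uses the trapezoidal rule to obtain the first moment (mean time) and the second central moment (variance). The function names and the detector readings are hypothetical.

```python
def trapezoid(y, x):
    """Integrate sampled values y over x with the trapezoidal rule."""
    return sum((x[i + 1] - x[i]) * (y[i + 1] + y[i]) / 2 for i in range(len(x) - 1))


def temporal_moments(times, counts):
    """Return (mean time, variance) of a tracer curve normalised by its area."""
    area = trapezoid(counts, times)
    mean = trapezoid([t * c for t, c in zip(times, counts)], times) / area
    var = trapezoid([(t - mean) ** 2 * c for t, c in zip(times, counts)], times) / area
    return mean, var


# Hypothetical detector readings (arbitrary units), for illustration only.
times = [0, 1, 2, 3, 4, 5, 6]
inlet = [0, 1, 2, 1, 0, 0, 0]
outlet = [0, 0, 0, 1, 2, 1, 0]

mean_in, var_in = temporal_moments(times, inlet)
mean_out, var_out = temporal_moments(times, outlet)

# For a linear system, the differences between outlet and inlet moments
# characterise the unit itself.
print(mean_out - mean_in, var_out - var_in)
```

The differences of the moments between the two detectors give a model-free estimate of the mean residence time and the spread added by the unit, which could then serve as starting values for fitting a specific transfer-function model.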
Abstract:
In an effort to achieve greater consistency and comparability in state‐wide seat belt use reporting, the National Highway Traffic Safety Administration (NHTSA) issued new requirements in 2011 for observing and reporting future seat belt use. The requirements included the involvement of a qualified statistician in the sampling and weighting portions of the process as well as a variety of operational details. The Iowa Governor’s Traffic Safety Bureau contracted with Iowa State University’s Survey & Behavioral Research Services (SBRS) in 2011 to develop the study design and data collection plan for the State of Iowa annual survey that would meet the new requirements of the NHTSA. A seat belt survey plan for Iowa was developed by SBRS with statistical expertise provided by Zhengyuan Zhu, Ph.D., Associate Professor of Statistics at Iowa State University. The Iowa plan was submitted to NHTSA in December of 2011 and official approval was received on March 19, 2012.
Abstract:
This dissertation proposes statistical methods to formulate, estimate and apply complex transportation models. Two main problems are part of the analyses conducted and presented in this dissertation. The first method solves an econometric problem and is concerned with the joint estimation of models that contain both discrete and continuous decision variables. The use of ordered models along with a regression is proposed and their effectiveness is evaluated with respect to unordered models. Procedures to calculate and optimize the log-likelihood functions of both discrete-continuous approaches are derived, and difficulties associated with the estimation of unordered models are explained. Numerical approximation methods based on the Genz algorithm are implemented in order to solve the multidimensional integral associated with the unordered modeling structure. The problems deriving from the lack of smoothness of the probit model around the maximum of the log-likelihood function, which makes the optimization and the calculation of standard deviations very difficult, are carefully analyzed. A methodology to perform out-of-sample validation in the context of a joint model is proposed. Comprehensive numerical experiments have been conducted on both simulated and real data. In particular, the discrete-continuous models are estimated and applied to vehicle ownership and use models on data extracted from the 2009 National Household Travel Survey. The second part of this work offers a comprehensive statistical analysis of free-flow speed distribution; the method is applied to data collected on a sample of roads in Italy. A linear mixed model that includes speed quantiles in its predictors is estimated. Results show that there is no road effect in the analysis of free-flow speeds, which is particularly important for model transferability.
A very general framework to predict random effects with few observations and incomplete access to model covariates is formulated and applied to predict the distribution of free-flow speed quantiles. The speed distribution of most road sections is successfully predicted; jack-knife estimates are calculated and used to explain why some sections are poorly predicted. Overall, this work contributes to the literature in transportation modeling by proposing econometric model formulations for discrete-continuous variables, more efficient methods for the calculation of multivariate normal probabilities, and random effects models for free-flow speed estimation that take into account the survey design. All methods are rigorously validated on both real and simulated data.
Abstract:
Betacyanins are betalain pigments that display a red-violet colour, which has been reported to be three times stronger than the red-violet dye produced by anthocyanins [1]. The applications of betacyanins cover a wide range of matrices, mainly as additives or ingredients in the food industry, cosmetics, pharmaceuticals and livestock feed. Although less commonly used than anthocyanins and carotenoids, betacyanins are stable between pH 3 and 7 and are suitable for colouring low-acid matrices. In addition, betacyanins have been reported to display interesting medicinal properties as powerful antioxidant and chemopreventive compounds in either in vitro or in vivo models [2]. Betacyanins are obtained mainly from the red beet of the Beta vulgaris plant (between 10 and 20 mg per 100 g of pulp), but alternative primary sources are needed [3]. In addition, independently of the source used, the effect of the variables that affect the extraction of betacyanins has not been properly described and quantified. Therefore, the aim of this study was to identify and optimize the conditions that maximize betacyanin extraction using the tepals of Gomphrena globosa L. flowers as an alternative source. Assisted by the statistical technique of response surface methodology, an experimental design was developed for testing the significant explanatory variables of the extraction (time, temperature, solid-liquid ratio and ethanol-water ratio). The identification was performed using high-performance liquid chromatography coupled with a photodiode array detector and mass spectrometry with electrospray ionization (HPLC-PDA-MS/ESI), and the response was measured by quantifying these compounds using HPLC-PDA. Afterwards, a response surface analysis was performed to evaluate the results. The major betacyanin compounds identified were gomphrenin II and III and isogomphrenin II and III. The highest total betacyanin content was obtained under the following conditions: 45 min of extraction time, 35 °C, a solid-liquid ratio of 35 g/L and 25% ethanol. These values would not have been found without optimizing the extraction conditions, which moreover showed trends contrary to those described in the scientific literature. More specifically, for the time and temperature variables, an increase in both values (relative to those commonly used in the literature) considerably improved the betacyanin extraction yield without showing any sign of degradation.
Abstract:
Ergosterol, a molecule with high commercial value, is the most abundant mycosterol in Agaricus bisporus L. To replace conventional extraction techniques (e.g. Soxhlet), the present study reports the optimal ultrasound-assisted extraction conditions for ergosterol. Preliminary tests showed that solvent, time and ultrasound power altered the extraction efficiency. Using response surface methodology, models were developed to investigate the experimental conditions that maximize the extraction efficiency. All statistical criteria demonstrated the validity of the proposed models. Overall, ultrasound-assisted extraction with ethanol at 375 W for 15 min proved to be as efficient as the Soxhlet extraction, yielding 671.5 ± 0.5 mg ergosterol/100 g dw. However, with n-hexane, extracts of higher purity (mg ergosterol/g extract) were obtained. Finally, the removal of the saponification step was proposed, which simplifies the extraction process and makes it more feasible to transfer to industry.
Abstract:
Statistical approaches to study extreme events require, by definition, long time series of data. In many scientific disciplines, these series are often subject to variations at different temporal scales that affect the frequency and intensity of their extremes. Therefore, the assumption of stationarity is violated and alternative methods to conventional stationary extreme value analysis (EVA) must be adopted. Using the example of environmental variables subject to climate change, in this study we introduce the transformed-stationary (TS) methodology for non-stationary EVA. This approach consists of (i) transforming a non-stationary time series into a stationary one, to which the stationary EVA theory can be applied, and (ii) reverse transforming the result into a non-stationary extreme value distribution. As a transformation, we propose and discuss a simple time-varying normalization of the signal and show that it enables a comprehensive formulation of non-stationary generalized extreme value (GEV) and generalized Pareto distribution (GPD) models with a constant shape parameter. A validation of the methodology is carried out on time series of significant wave height, residual water level, and river discharge, which show varying degrees of long-term and seasonal variability. The results from the proposed approach are comparable with the results from (a) a stationary EVA on quasi-stationary slices of non-stationary series and (b) the established method for non-stationary EVA. However, the proposed technique comes with advantages in both cases. For example, in contrast to (a), the proposed technique uses the whole time horizon of the series for the estimation of the extremes, allowing for a more accurate estimation of large return levels. Furthermore, with respect to (b), it decouples the detection of non-stationary patterns from the fitting of the extreme value distribution. 
As a result, the steps of the analysis are simplified and intermediate diagnostics are possible. In particular, the transformation can be carried out by means of simple statistical techniques such as low-pass filters based on the running mean and the standard deviation, and the fitting procedure is a stationary one with a few degrees of freedom and is easy to implement and control. An open-source MATLAB toolbox has been developed to cover this methodology, which is available at https://github.com/menta78/tsEva/ (Mentaschi et al., 2016).
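The two transformation steps can be sketched in a few lines. The example below is only an illustration of the idea and is not taken from the tsEva toolbox: it assumes a centred running window for the mean and standard deviation, omits the stationary GEV/GPD fit, and uses hypothetical names (`running_stats`, `to_stationary`, `to_nonstationary`) and a synthetic series.

```python
from statistics import fmean, pstdev


def running_stats(series, half_window):
    """Running mean and standard deviation over a centred window (truncated at the edges)."""
    n = len(series)
    means, stds = [], []
    for i in range(n):
        window = series[max(0, i - half_window): min(n, i + half_window + 1)]
        means.append(fmean(window))
        stds.append(pstdev(window))
    return means, stds


def to_stationary(series, half_window):
    """Step (i): normalise the series by its time-varying mean and standard deviation.

    Assumes every window has non-zero spread; a constant window would divide by zero.
    """
    means, stds = running_stats(series, half_window)
    normalised = [(y - m) / s for y, m, s in zip(series, means, stds)]
    return normalised, means, stds


def to_nonstationary(stationary_level, means, stds):
    """Step (ii): map a level estimated on the stationary series back to each time step."""
    return [m + s * stationary_level for m, s in zip(means, stds)]


# Synthetic series: a linear trend plus an alternating fluctuation (illustration only).
signal = [0.1 * i + (1.0 if i % 2 else -1.0) for i in range(50)]
x, trend, spread = to_stationary(signal, half_window=10)

# A stationary extreme value fit on x would go here; 2.0 stands in for a
# return level obtained from that fit.
levels = to_nonstationary(2.0, trend, spread)
print(levels[0], levels[-1])
```

Because the stationary fit yields a single level, the time dependence of the final result comes entirely from the running mean and standard deviation, which is what keeps the detection of non-stationarity separate from the extreme value fit.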