939 results for Statistical Language Model
Abstract:
Doctoral Dissertation for PhD degree in Industrial and Systems Engineering
Abstract:
This paper proposes and validates a model-driven software engineering technique for spreadsheets. The technique that we envision builds on the embedding of spreadsheet models under a widely used spreadsheet system. This means that we enable the creation and evolution of spreadsheet models under a spreadsheet system. More precisely, we embed ClassSheets, a visual language with a syntax similar to the one offered by common spreadsheets, that was created with the aim of specifying spreadsheets. Our embedding allows models and their conforming instances to be developed under the same environment. In practice, this convenient environment enhances evolution steps at the model level while the corresponding instance is automatically co-evolved. Finally, we have designed and conducted an empirical study with human users in order to assess our technique in production environments. The results of this study are promising and suggest that productivity gains are realizable under our model-driven spreadsheet development setting.
Abstract:
Master's dissertation in Statistics
Abstract:
Over recent decades, the development and use of Geographic Information Systems (GIS) and Satellite Positioning Systems (GPS) have been promoted with the aim of improving the productive efficiency of various extensive cropping systems in agronomic, economic and environmental terms. These new technologies make it possible to measure the spatial variability of site properties, such as apparent electrical conductivity and other terrain attributes, as well as their effect on the spatial distribution of yields. Site-specific management can then be applied within fields to improve the efficiency of agrochemical input use, environmental protection and the sustainability of rural life. At present, a wide range of precision-agriculture technology is available for capturing spatial variation across sites within a field. Making optimal use of the large volume of data produced by precision-agriculture machinery depends heavily on the ability to explore the information on the complex interactions underlying productive results. The spatial covariation of site properties and crop yield has been studied using classical geostatistical models based on the theory of regionalized variables. New developments in contemporary statistical models, notably linear mixed models, are promising tools for handling spatially correlated data. Moreover, given the multivariate nature of the many variables recorded at each site, multivariate analysis techniques could provide valuable information for visualizing and exploiting georeferenced data.
Understanding the agronomic basis of the complex interactions that occur at the scale of production fields is now possible with these new technologies. The objectives of this project are: (i) to develop methodological strategies, based on combining multivariate analysis and geostatistical techniques, for classifying within-field sites and studying the interdependencies between site variables and yield; (ii) to propose alternative mixed models, based on spatial correlation functions for the error terms, that make it possible to explore patterns of spatial correlation in within-field yields and soil properties at the delimited sites.
Over the last decades, the use and development of Geographical Information Systems (GIS) and Satellite Positioning Systems (GPS) have been strongly promoted in cropping systems. Such technologies allow measuring the spatial variability of site properties, including electrical conductivity and other soil features, as well as their impact on the spatial variability of yields. Therefore, site-specific management could be applied to improve the efficiency of agrochemical use, environmental protection, and the sustainability of rural life. Currently, there is a wide range of technological resources for capturing spatial variation across sites within a field. However, the optimal use of data coming from precision agriculture machinery strongly depends on the capability to explore the information about the complex interactions underlying productive outputs. The covariation between spatial soil properties and yields from georeferenced data has been treated graphically or with standard geostatistical approaches. New statistical modeling capabilities from the mixed linear model framework are promising for dealing with correlated data such as those produced by precision agriculture.
Moreover, by exploiting the multivariate nature of the data collected at each site, several multivariate statistical approaches could be crucial tools for analyzing georeferenced data. Understanding the basis of complex interactions at the scale of the production field is now within reach through the use of these new techniques. Our main objectives are: (1) to develop new statistical strategies, based on the complementarity of geostatistics and multivariate methods, useful for classifying sites within fields grown with grain crops and analyzing the interrelationships among several soil and yield variables; (2) to propose mixed linear models to predict yield according to spatial soil variability and to build contour maps to promote a more sustainable agriculture.
Abstract:
Univariate statistical control charts, such as the Shewhart chart, do not satisfy the requirements for process monitoring on a high-volume automated fuel cell manufacturing line. This is because of the number of variables that require monitoring. Because the process is high volume, the elevated risk of false alarms can present problems if univariate methods are used. Multivariate statistical methods are discussed as an alternative for process monitoring and control. The research presented is conducted on a manufacturing line which evaluates the performance of a fuel cell. It has three stages of production assembly that contribute to the final end-product performance. The product performance is assessed by power and energy measurements, taken at various time points throughout the discharge testing of the fuel cell. The multivariate techniques identified in the literature review are evaluated using individual and batch observations. Modern techniques using multivariate control charts based on Hotelling's T2 are compared to other multivariate methods, such as Principal Components Analysis (PCA). The latter, PCA, was identified as the most suitable method. Control charts, such as scores, T2 and DModX charts, are constructed from the PCA model. Diagnostic procedures using contribution plots, for out-of-control points detected with these control charts, are also discussed. These plots enable the investigator to perform root cause analysis. Multivariate batch techniques are compared to individual observations typically seen on continuous processes. Recommendations for the introduction of multivariate techniques that would be appropriate for most high-volume processes are also covered.
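As an illustration of the multivariate monitoring idea mentioned in this abstract, the following is a minimal sketch of Hotelling's T2 for just two process variables, written in plain Python. The function name `hotelling_t2` and the sample readings are hypothetical and are not taken from the dissertation; a real application would involve more variables, a PCA model and proper control limits.

```python
def hotelling_t2(data, points=None):
    """Hotelling's T2 of 2-D points relative to a reference sample.

    `data` is a list of (x, y) reference observations (hypothetical values).
    If `points` is None, the reference observations themselves are scored.
    """
    n = len(data)
    if n < 3:
        raise ValueError("need at least three reference observations")
    mx = sum(x for x, _ in data) / n
    my = sum(y for _, y in data) / n
    # Entries of the sample covariance matrix (divisor n - 1).
    sxx = sum((x - mx) ** 2 for x, _ in data) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in data) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in data) / (n - 1)
    det = sxx * syy - sxy ** 2
    if det <= 0:
        raise ValueError("covariance matrix is singular")
    if points is None:
        points = data
    scores = []
    for x, y in points:
        dx, dy = x - mx, y - my
        # Quadratic form d' S^-1 d, using the closed-form 2x2 inverse.
        scores.append((syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det)
    return scores


# Made-up (power, energy) readings used purely for illustration.
readings = [(1.0, 2.1), (1.2, 2.3), (0.9, 1.8), (1.1, 2.2), (1.3, 2.6), (1.0, 1.9)]
in_sample = hotelling_t2(readings)
suspect = hotelling_t2(readings, [(1.0, 2.9)])[0]
```

A point such as `(1.0, 2.9)` can look unremarkable on each univariate chart yet score a large T2 because it breaks the correlation between the two variables, which is the motivation for multivariate charts.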
Abstract:
Ever since the appearance of the ARCH model [Engle (1982a)], an impressive array of variance specifications belonging to the same class of models has emerged [e.g. Bollerslev's (1986) GARCH; Nelson's (1990) EGARCH]. This recent field has seen very successful developments. Nevertheless, several empirical studies seem to show that the performance of such models is not always adequate [Boulier (1992)]. In this paper we propose a new specification: the Quadratic Moving Average Conditional Heteroskedasticity (QMACH) model. Its statistical properties, such as the kurtosis and the symmetry, as well as two estimators (Method of Moments and Maximum Likelihood) are studied. Two statistical tests are presented: the first tests for homoskedasticity and the second discriminates between the ARCH and QMACH specifications. A Monte Carlo study is presented in order to illustrate some of the theoretical results. An empirical study is undertaken for the DM-US exchange rate.
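For readers unfamiliar with the baseline that this abstract compares against, here is a minimal simulation sketch of a standard ARCH(1) process. It is not the QMACH specification, whose functional form the abstract does not give; the function name `simulate_arch1` and the parameter values are assumptions chosen for illustration.

```python
import random


def simulate_arch1(n, omega=0.2, alpha=0.5, seed=0):
    """Simulate n observations from ARCH(1): h_t = omega + alpha * e_{t-1}^2.

    omega and alpha are illustrative values, not estimates from any data set.
    Returns the list of shocks e_t and the list of conditional variances h_t.
    """
    if omega <= 0 or not 0 <= alpha < 1:
        raise ValueError("need omega > 0 and 0 <= alpha < 1")
    rng = random.Random(seed)
    h = omega / (1.0 - alpha)  # start at the unconditional variance
    shocks, variances = [], []
    for _ in range(n):
        e = (h ** 0.5) * rng.gauss(0.0, 1.0)  # e_t = sqrt(h_t) * z_t, z_t ~ N(0, 1)
        shocks.append(e)
        variances.append(h)
        h = omega + alpha * e * e  # conditional variance for the next period
    return shocks, variances


shocks, variances = simulate_arch1(500)
```

Large shocks raise the next period's conditional variance, which produces the volatility clustering that this class of models is designed to capture.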
Abstract:
One of the main implications of the efficient market hypothesis (EMH) is that expected future returns on financial assets are not predictable if investors are risk neutral. In this paper we argue that financial time series offer more information than this hypothesis seems to imply. In particular, we postulate that runs of very large returns can be predictable for small time periods. In order to show this, we propose a TAR(3,1)-GARCH(1,1) model that is able to describe two different types of extreme events: a first type generated by large uncertainty regimes where runs of extremes are not predictable, and a second type where extremes come from isolated dread/joy events. This model is new in the literature on nonlinear processes. Its novelty resides in two features that make it different from previous TAR methodologies: the regimes are motivated by the occurrence of extreme values, and the threshold variable is defined by the shock affecting the process in the preceding period. In this way the model is able to uncover dependence and clustering of extremes in high as well as in low volatility periods. The model is tested with data on General Motors stock prices corresponding to two crises that had a substantial impact on financial markets worldwide: the Black Monday of October 1987 and September 11th, 2001. By analyzing the periods around these crises we find evidence of statistical significance of our model, and thereby of predictability of extremes, for September 11th but not for Black Monday. These findings support the hypotheses of a big negative event producing runs of negative returns in the first case, and of the burst of a worldwide stock market bubble in the second example. JEL classification: C12; C15; C22; C51. Keywords and Phrases: asymmetries, crises, extreme values, hypothesis testing, leverage effect, nonlinearities, threshold models
Abstract:
PURPOSE: The purpose of this study was to develop a mathematical model (sine model, SIN) to describe fat oxidation kinetics as a function of the relative exercise intensity [% of maximal oxygen uptake (%VO2max)] during graded exercise and to determine the exercise intensity (Fatmax) that elicits maximal fat oxidation (MFO) and the intensity at which fat oxidation becomes negligible (Fatmin). This model included three independent variables (dilatation, symmetry, and translation) that incorporated primary expected modulations of the curve because of training level or body composition. METHODS: Thirty-two healthy volunteers (17 women and 15 men) performed a graded exercise test on a cycle ergometer, with 3-min stages and 20-W increments. Substrate oxidation rates were determined using indirect calorimetry. SIN was compared with measured values (MV) and with other methods currently used [i.e., the RER method (MRER) and third-order polynomial curves (P3)]. RESULTS: There was no significant difference in the fitting accuracy between SIN and P3 (P = 0.157), whereas MRER was less precise than SIN (P < 0.001). Fatmax (44 ± 10% VO2max) and MFO (0.37 ± 0.16 g·min⁻¹) determined using SIN were significantly correlated with MV, P3, and MRER (P < 0.001). The variable of dilatation was correlated with Fatmax, Fatmin, and MFO (r = 0.79, r = 0.67, and r = 0.60, respectively, P < 0.001). CONCLUSIONS: The SIN model presents the same precision as other methods currently used in the determination of Fatmax and MFO but in addition allows calculation of Fatmin. Moreover, the three independent variables are directly related to the main expected modulations of the fat oxidation curve. SIN, therefore, seems to be an appropriate tool for analyzing fat oxidation kinetics obtained during graded exercise.
Abstract:
Acute and chronic respiratory failure is one of the major and potentially life-threatening features in individuals with myotonic dystrophy type 1 (DM1). Despite several clinical demonstrations showing respiratory problems in DM1 patients, the mechanisms are still not completely understood. This study was designed to investigate whether the DMSXL transgenic mouse model for DM1 exhibits respiratory disorders and, if so, to identify the pathological changes underlying these respiratory problems. Using pressure plethysmography, we assessed the breathing function in control mice and DMSXL mice generated after large expansions of the CTG repeat in successive generations of DM1 transgenic mice. Statistical analysis of breathing function measurements revealed a significant decrease in the most relevant respiratory parameters in DMSXL mice, indicating impaired respiratory function. Histological and morphometric analysis showed pathological changes in the diaphragmatic muscle of DMSXL mice, characterized by an increase in the percentage of type I muscle fibers, the presence of central nuclei, partial denervation of end-plates (EPs) and a significant reduction in their size, shape complexity and density of acetylcholine receptors, all of which reflect a possible breakdown in communication between the diaphragmatic muscle fibers and the nerve terminals. Diaphragm muscle abnormalities were accompanied by an accumulation of mutant DMPK RNA foci in muscle fiber nuclei. Moreover, in DMSXL mice, the unmyelinated phrenic afferents are significantly fewer in number. In these mice, no significant neuronopathy was detected in either cervical phrenic motor neurons or brainstem respiratory neurons. Because EPs are involved in the transmission of action potentials and the unmyelinated phrenic afferents exert a modulating influence on the respiratory drive, the pathological alterations affecting these structures might underlie the respiratory impairment detected in DMSXL mice. 
Understanding mechanisms of respiratory deficiency should guide pharmaceutical and clinical research towards better therapy for the respiratory deficits associated with DM1.
Abstract:
In this paper we propose a novel empirical extension of the standard market microstructure order flow model. The main idea is that heterogeneity of beliefs in the foreign exchange market can cause model instability and such instability has not been fully accounted for in the existing empirical literature. We investigate this issue using two different data sets and focusing on out-of-sample forecasts. Forecasting power is measured using standard statistical tests and, additionally, using an alternative approach based on measuring the economic value of forecasts after building a portfolio of assets. We find there is substantial economic value in conditioning on the proposed models.
Abstract:
This paper considers the instrumental variable regression model when there is uncertainty about the set of instruments, exogeneity restrictions, the validity of identifying restrictions and the set of exogenous regressors. This uncertainty can result in a huge number of models. To avoid statistical problems associated with standard model selection procedures, we develop a reversible jump Markov chain Monte Carlo algorithm that allows us to do Bayesian model averaging. The algorithm is very flexible and can be easily adapted to analyze any of the different priors that have been proposed in the Bayesian instrumental variables literature. We show how to calculate the probability of any relevant restriction (e.g. the posterior probability that over-identifying restrictions hold) and discuss diagnostic checking using the posterior distribution of discrepancy vectors. We illustrate our methods in a returns-to-schooling application.
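The core arithmetic of Bayesian model averaging can be sketched for a small, fully enumerated set of models. This is a deliberate simplification: the paper uses a reversible jump MCMC algorithm precisely because the model space is too large to enumerate. The function names and the numbers below are hypothetical.

```python
import math


def posterior_model_probs(log_marginal_liks, prior_probs=None):
    """Posterior model probabilities from log marginal likelihoods.

    Assumes every candidate model can be enumerated (unlike the RJMCMC setting).
    A uniform prior over models is used when `prior_probs` is None.
    """
    k = len(log_marginal_liks)
    if prior_probs is None:
        prior_probs = [1.0 / k] * k
    log_w = [lml + math.log(p) for lml, p in zip(log_marginal_liks, prior_probs)]
    top = max(log_w)  # subtract the maximum for numerical stability
    weights = [math.exp(v - top) for v in log_w]
    total = sum(weights)
    return [w / total for w in weights]


def averaged_estimate(estimates, probs):
    """Model-averaged point estimate: sum of estimate_k * P(model_k | data)."""
    return sum(e * p for e, p in zip(estimates, probs))


# Two hypothetical models whose marginal likelihoods are in the ratio 1 : 3.
probs = posterior_model_probs([math.log(1.0), math.log(3.0)])
bma = averaged_estimate([0.0, 1.0], probs)
```

The posterior probability of a restriction is obtained in the same way, by summing the probabilities of all models in which the restriction holds.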
Abstract:
The paper proposes a general model that will encompass trade and social benefits of a common language, a preference for a variety of languages, the fundamental role of translators, an emotional attachment to maternal language, and the threat that globalization poses to the vast majority of languages. With respect to people’s emotional attachment, the model considers minorities to suffer losses from the subordinate status of their language. In addition, the model treats the threat to minority language as coming from the failure of the parents in the minority to transmit their maternal language (durably) to their children. Some familiar results occur. In particular, we encounter the usual social inefficiencies of decentralized solutions to language learning when the sole benefits of the learning are communicative benefits (though translation intervenes). However, these social inefficiencies assume a totally different air when the consumer gains of variety are brought in. One fundamental aim of the paper is to bring together contributions to the economics of language from labor economics, network externalities and international trade that are typically treated separately.
Abstract:
This paper introduces a state space approach to explain the dynamics of rent growth, expected returns and the Price-Rent ratio in housing markets. According to the present value model, movements in the price-to-rent ratio should be matched by movements in expected returns and expected rent growth. The state space framework assumes that both variables follow an autoregressive process of order one. The model is applied to the US and UK housing markets, which yields series of the latent variables given the behaviour of the Price-Rent ratio. Resampling techniques and bootstrapped likelihood ratios show that expected returns tend to be highly persistent compared to rent growth. The filtered expected returns series is considered in a simple predictability model of excess returns, with high statistical predictability evidenced for the UK. Overall, it is found that the present value model tends to have strong statistical predictability in the UK housing markets.
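To make the link between AR(1) persistence and the Price-Rent ratio concrete, here is a small sketch that assumes a Campbell-Shiller style log-linear present value relation, in which the deviation of the log price-rent ratio from its mean is the discounted sum of expected rent growth minus expected returns. The abstract does not state the exact specification, so the log-linearisation, the discount coefficient `rho = 0.97` and all function names are assumptions.

```python
def ar1_forecast(x_t, phi, horizon):
    """Forecast of a zero-mean AR(1) deviation `horizon` steps ahead."""
    return x_t * phi ** horizon


def present_value_loading(phi, rho=0.97):
    """Sum over j >= 0 of (rho * phi)^j, i.e. 1 / (1 - rho * phi).

    rho is an assumed log-linearisation constant, not a value from the paper.
    """
    if abs(rho * phi) >= 1:
        raise ValueError("need |rho * phi| < 1 for the sum to converge")
    return 1.0 / (1.0 - rho * phi)


def log_price_rent_deviation(g_dev, mu_dev, phi_g, phi_mu, rho=0.97):
    """Deviation of the log price-rent ratio from its mean.

    g_dev: current deviation of expected rent growth from its mean
    mu_dev: current deviation of expected returns from its mean
    """
    return (present_value_loading(phi_g, rho) * g_dev
            - present_value_loading(phi_mu, rho) * mu_dev)


# Hypothetical persistence values: expected returns far more persistent than rent growth.
load_g = present_value_loading(0.2)
load_mu = present_value_loading(0.9)
deviation = log_price_rent_deviation(0.01, 0.01, phi_g=0.2, phi_mu=0.9)
```

Under these assumed values, a shock of the same size moves the ratio much more when it hits the persistent expected-return component, which is why persistence matters for interpreting movements in the Price-Rent ratio.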
Abstract:
How far has English already spread? How much further can we expect it to go? In response to the first question, this chapter tries to identify the areas of life where English already serves as a lingua franca in the world (more or less) and those where the language faces sharp competition and does not threaten to marginalize the other major languages. The former areas of life are international safety, the internal business of international organizations, internal communication within the international news industry, international sports and science. The latter areas are the press, television, the internet, publishing and international trade. As to the second question, about the future prospects of English, the chapter argues that the advance of English will depend heavily on the motives to learn the other major languages in the world as well. Based on the empirical evidence, the same model applies to the incentives to learn English and these other languages. On the important topic of welfare, the cultural market is the only one where it is arguable that the progress of English has gone too far. English dominance in the song, the cinema and the best-seller is indeed extraordinary and difficult to reconcile with the evidence of popular attachments to home languages, which is otherwise strong and apparent.