87 results for Statistical Language Model
in Consorci de Serveis Universitaris de Catalunya (CSUC), Spain
Abstract:
A comparative study of the two published Catalan translations of Charles Dickens's David Copperfield: the first by Josep Carner, made in 1930 but published in 1964, and the second by Joan Sellent, from 2003. The analysis shows that both translations, which are excellent, reflect a singular and rather rapid evolution of the language model that the translators convey to their readers, mirroring the complex history of the Catalan language in the twentieth century, which has yet to be written.
Abstract:
The protein shells, or capsids, of nearly all spherelike viruses adopt icosahedral symmetry. In the present Letter, we propose a statistical thermodynamic model for viral self-assembly. We find that icosahedral symmetry is not expected for viral capsids constructed from structurally identical protein subunits and that this symmetry requires (at least) two internal switching configurations of the protein. Our results indicate that icosahedral symmetry is not a generic consequence of free energy minimization but requires optimization of internal structural parameters of the capsid proteins.
Abstract:
Based on the experience of the Maria Montessori Teaching Resources Center in Alghero, the Algherese adaptation of 'Tintín al país de l'or negre' (Barcelona, 1995), coordinated by the author of the present article, constitutes one of the most successful publications in terms of creating and using a useful Algherese language model for schools. On the basis of criteria presented in the paper 'L'ensenyament del català a l'Alguer i la qüestió del model de llengua' at a meeting of the Philological Section of the Institut d'Estudis Catalans in 2000, the model employed applies Catalan spelling standards but, at the same time, is more closely aligned with the Catalan spoken in Alghero from a morphological, lexical and syntactic point of view. The article reflects on two possible methods of codification, for example of Algherese, put forward by the sociolinguist Enrico Chessa, who in the process considered the disparities and differences between spelling and phonetic transcription.
Abstract:
This work is a study and analysis of the data obtained by extracting the loanwords found in two issues of the Catalan magazine Time Out Barcelona. The aim is to explore the topic of borrowing in greater depth and to draw some conclusions about the types of borrowed units and how they are treated.
Abstract:
A comparative study of the two published Catalan translations of Charles Dickens’ David Copperfield, the first translated by Josep Carner in 1930 but not published until 1964, and the second by Joan Sellent in 2003. The analysis shows that both translations, which are magnificent, reflect the unique evolution of the language model that the translators pass on to their readers, an evolution shaped by the complex history of the Catalan language in the 20th century, which is still to be written.
Abstract:
The performance of the SAOP potential for the calculation of NMR chemical shifts was evaluated. SAOP results show considerable improvement with respect to previous potentials, such as VWN or BP86, at least for the carbon, nitrogen, oxygen, and fluorine chemical shifts. Furthermore, a few NMR calculations carried out on third-period atoms (S, P, and Cl) improved when using the SAOP potential.
Abstract:
In this article we present a hybrid approach for automatic summarization of Spanish medical texts. Many systems perform automatic summarization using statistics or linguistics, but only a few combine both techniques. Our idea is that a good summary requires the linguistic aspects of texts, but should also benefit from the advantages of statistical techniques. We have integrated the Cortex (Vector Space Model) and Enertex (statistical physics) systems coupled with the Yate term extractor, and the Disicosum system (linguistics). We have compared these systems and then integrated them in a hybrid approach. Finally, we have applied this hybrid system to a corpus of medical articles and evaluated its performance, obtaining good results.
Abstract:
Within the European project Hydroptimet (INTERREG IIIB-MEDOCC programme), a limited area model (LAM) intercomparison is performed for intense events that caused extensive damage to people and territory. As the comparison is limited to single case studies, the work is not meant to provide a measure of the different models' skill, but to identify the key model factors that help produce a good forecast of this kind of meteorological phenomenon. This work focuses on the Spanish flash-flood event also known as the "Montserrat-2000" event. The study is performed using forecast data from seven operational LAMs, placed at the partners' disposal via the Hydroptimet ftp site, and observed data from the Catalonia rain gauge network. To improve the event analysis, satellite rainfall estimates have also been considered. For statistical evaluation of quantitative precipitation forecasts (QPFs), several non-parametric skill scores based on contingency tables have been used. Furthermore, for each model run it has been possible to identify the regions of Catalonia affected by misses and false alarms using contingency table elements. Moreover, the standard "eyeball" analysis of forecast and observed precipitation fields has been supported by a state-of-the-art diagnostic method, contiguous rain area (CRA) analysis. This method makes it possible to quantify the spatial shift forecast error and to identify the error sources that affected each model's forecasts. High-resolution modelling and domain size seem to play a key role in providing a skillful forecast. Further work is needed to support this statement, including verification using a wider observational data set.
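As a rough illustration of the contingency-table verification mentioned in this abstract, the following Python sketch computes a few widely used categorical scores: probability of detection (POD), false alarm ratio (FAR), critical success index (CSI) and frequency bias. The abstract does not say which scores the study used, so the choice of scores, the function names, the 10 mm threshold and the sample values are all assumptions made for illustration.

```python
import math


def contingency_table(forecast, observed, threshold):
    """Count hits, misses, false alarms and correct negatives for one threshold."""
    hits = misses = false_alarms = correct_negatives = 0
    for f, o in zip(forecast, observed):
        forecast_yes = f >= threshold
        observed_yes = o >= threshold
        if forecast_yes and observed_yes:
            hits += 1
        elif observed_yes:
            misses += 1            # event observed but not forecast
        elif forecast_yes:
            false_alarms += 1      # event forecast but not observed
        else:
            correct_negatives += 1
    return hits, misses, false_alarms, correct_negatives


def _ratio(numerator, denominator):
    """Return numerator/denominator, or NaN when the score is undefined."""
    return numerator / denominator if denominator else math.nan


def skill_scores(hits, misses, false_alarms):
    """Categorical scores derived from the contingency-table counts."""
    return {
        "POD": _ratio(hits, hits + misses),
        "FAR": _ratio(false_alarms, hits + false_alarms),
        "CSI": _ratio(hits, hits + misses + false_alarms),
        "BIAS": _ratio(hits + false_alarms, hits + misses),
    }


# Hypothetical rainfall totals (mm) at six gauges; not data from the study.
forecast_mm = [12.0, 0.5, 30.0, 3.0, 0.0, 25.0]
observed_mm = [15.0, 11.0, 28.0, 0.0, 0.2, 4.0]

hits, misses, false_alarms, correct_negatives = contingency_table(
    forecast_mm, observed_mm, threshold=10.0
)
scores = skill_scores(hits, misses, false_alarms)
print(hits, misses, false_alarms, correct_negatives, scores)
```

Gauges counted as misses or false alarms can then be mapped back to their locations, which is one plausible way to identify the regions affected by each error type.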
Abstract:
A configurational model for silicon oxide damaged after a high-dose ion implantation of a nonreactive species is presented. Based on statistics of silicon-centered tetrahedra, the model takes into account not only the closest environment of a given silicon atom, but also the second neighborhood, so it is specified whether the oxygen attached to one given silicon is bridging two tetrahedra or not. The frequencies and intensities of infrared vibrational bands have been calculated by averaging over the distributions and these results are in agreement with the ones obtained from infrared experimental spectra. Likewise, the chemical shifts obtained from x-ray photoelectron spectroscopy (XPS) analysis are similar to the reported values for the charge-transfer model of SiOx compounds.
Abstract:
Ever since the appearance of the ARCH model [Engle (1982a)], an impressive array of variance specifications belonging to the same class of models has emerged [e.g. Bollerslev's (1986) GARCH; Nelson's (1990) EGARCH]. This recent domain has seen very successful developments. Nevertheless, several empirical studies seem to show that the performance of such models is not always appropriate [Boulier (1992)]. In this paper we propose a new specification: the Quadratic Moving Average Conditional Heteroskedasticity (QMACH) model. Its statistical properties, such as kurtosis and symmetry, as well as two estimators (Method of Moments and Maximum Likelihood), are studied. Two statistical tests are presented: the first tests for homoskedasticity and the second discriminates between the ARCH and QMACH specifications. A Monte Carlo study is presented to illustrate some of the theoretical results. An empirical study is undertaken for the DM-US exchange rate.
Abstract:
One of the main implications of the efficient market hypothesis (EMH) is that expected future returns on financial assets are not predictable if investors are risk neutral. In this paper we argue that financial time series offer more information than this hypothesis seems to supply. In particular, we postulate that runs of very large returns can be predictable for small time periods. To show this, we propose a TAR(3,1)-GARCH(1,1) model that is able to describe two different types of extreme events: a first type generated by large uncertainty regimes where runs of extremes are not predictable, and a second type where extremes come from isolated dread/joy events. This model is new in the literature on nonlinear processes. Its novelty lies in two features that make it different from previous TAR methodologies: the regimes are motivated by the occurrence of extreme values, and the threshold variable is defined by the shock affecting the process in the preceding period. In this way the model is able to uncover dependence and clustering of extremes in high as well as in low volatility periods. The model is tested with data from General Motors stock prices corresponding to two crises that had a substantial impact on financial markets worldwide: Black Monday of October 1987 and September 11th, 2001. By analyzing the periods around these crises we find evidence of statistical significance of our model, and thereby of predictability of extremes, for September 11th but not for Black Monday. These findings support the hypotheses of a big negative event producing runs of negative returns in the first case, and of the burst of a worldwide stock market bubble in the second example.
JEL classification: C12; C15; C22; C51
Keywords and Phrases: asymmetries, crises, extreme values, hypothesis testing, leverage effect, nonlinearities, threshold models
Abstract:
This paper explores the earnings return to Catalan knowledge for public and private workers in Catalonia. In doing so, we allow for a double simultaneous selection process. We consider, on the one hand, the non-random allocation of workers into one sector or another, and on the other, the potential self-selection into Catalan proficiency. In addition, when correcting the earnings equations, we take into account the correlation between the two selectivity rules. Our findings suggest that the apparent higher language return for public sector workers is entirely accounted for by selection effects, whereas knowledge of Catalan has a significant positive return in the private sector, which is somewhat higher when the selection processes are taken into account.
Abstract:
This paper examines the impact of ethnic divisions on conflict. The analysis relies on a theoretical model of conflict (Esteban and Ray, 2010) in which equilibrium conflict is shown to be accurately described by a linear function of just three distributional indices of ethnic diversity: the Gini coefficient, the Hirschman-Herfindahl fractionalization index, and a measure of polarization. Based on a dataset constructed by James Fearon and data from Ethnologue on ethno-linguistic groups and the "linguistic distances" between them, we compute the three distribution indices. Our results show that ethnic polarization is a highly significant correlate of conflict. Fractionalization is also significant in some of the statistical exercises, but the Gini coefficient never is. In particular, inter-group distances computed from language and embodied in polarization measures turn out to be extremely important correlates of ethnic conflict.
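The abstract names the three distributional indices but does not give their formulas. The sketch below assumes the forms commonly used in this literature for group population shares n_i and inter-group distances d_ij: fractionalization F = sum_i n_i (1 - n_i), a distance-weighted Gini G = sum_i sum_j n_i n_j d_ij, and polarization P = sum_i sum_j n_i^2 n_j d_ij. These formulas, the function names and the sample shares and distances are assumptions for illustration, not values taken from the paper.

```python
def fractionalization(shares):
    """Hirschman-Herfindahl fractionalization: sum_i n_i * (1 - n_i)."""
    return sum(n * (1.0 - n) for n in shares)


def gini(shares, distances):
    """Distance-weighted Gini (assumed form): sum_i sum_j n_i * n_j * d_ij."""
    k = len(shares)
    return sum(
        shares[i] * shares[j] * distances[i][j]
        for i in range(k)
        for j in range(k)
    )


def polarization(shares, distances):
    """Polarization (assumed form): sum_i sum_j n_i^2 * n_j * d_ij."""
    k = len(shares)
    return sum(
        shares[i] ** 2 * shares[j] * distances[i][j]
        for i in range(k)
        for j in range(k)
    )


# Hypothetical population shares of three groups (they sum to 1).
shares = [0.5, 0.3, 0.2]
# Hypothetical distances: 0 within a group, 1 between any two groups.
distances = [
    [0.0, 1.0, 1.0],
    [1.0, 0.0, 1.0],
    [1.0, 1.0, 0.0],
]

f_index = fractionalization(shares)
g_index = gini(shares, distances)
p_index = polarization(shares, distances)
print(f_index, g_index, p_index)
```

With all between-group distances set to 1, the assumed Gini form reduces to fractionalization, so the two only differ once the distances carry information, for example linguistic distances.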
Abstract:
We explore in depth the validity of a recently proposed scaling law for earthquake inter-event time distributions in the case of Southern California, using the waveform cross-correlation catalog of Shearer et al. Two statistical tests are used: on the one hand, the standard two-sample Kolmogorov-Smirnov test is in agreement with the scaling of the distributions. On the other hand, the one-sample Kolmogorov-Smirnov statistic complemented with Monte Carlo simulation of the inter-event times, as done by Clauset et al., supports the validity of the gamma distribution as a simple model of the scaling function appearing in the scaling law, for rescaled inter-event times above 0.01, except for the largest data set (magnitude greater than 2). A discussion of these results is provided.
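To make the two-sample comparison concrete, here is a minimal Python sketch of the Kolmogorov-Smirnov statistic applied to rescaled inter-event times. It assumes that rescaling means dividing each sample by its own mean inter-event time, which the abstract does not spell out. The function names and sample values are hypothetical, and the sketch returns only the statistic D, not a p-value.

```python
from bisect import bisect_right


def rescale(times):
    """Divide inter-event times by their sample mean (assumed rescaling)."""
    mean = sum(times) / len(times)
    return [t / mean for t in times]


def ks_two_sample_statistic(sample_a, sample_b):
    """Largest absolute gap between the two empirical distribution functions."""
    a_sorted = sorted(sample_a)
    b_sorted = sorted(sample_b)
    n, m = len(a_sorted), len(b_sorted)
    d = 0.0
    # Both empirical CDFs are step functions, so the maximum gap
    # is attained at one of the observed values.
    for x in a_sorted + b_sorted:
        cdf_a = bisect_right(a_sorted, x) / n
        cdf_b = bisect_right(b_sorted, x) / m
        d = max(d, abs(cdf_a - cdf_b))
    return d


# Hypothetical inter-event times for two magnitude thresholds (arbitrary units).
times_low = [1.0, 2.0, 3.0, 4.0]
times_high = [10.0, 20.0, 30.0, 40.0]

d_raw = ks_two_sample_statistic(times_low, times_high)
d_rescaled = ks_two_sample_statistic(rescale(times_low), rescale(times_high))
print(d_raw, d_rescaled)
```

In this toy case the raw samples are completely separated, while the rescaled samples coincide, which is the behaviour a scaling law predicts. In practice a library routine such as `scipy.stats.ks_2samp` also returns a p-value.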