85 resultados para Statistical Language Model
Resumo:
The chapter examines how far medieval economic crises can be identified by analysing the residuals from a simultaneous equation model of the medieval English economy. High inflation, falls in gross domestic product and large intermittent changes in wage rates are all considered as potential indicators of crisis. Potential causal factors include bad harvests, wars and political instability. The chapter suggests that crises arose when a combination of different problems overwhelmed the capacity of government to address them. It may therefore be a mistake to look for a single cause of any crisis. The coincidence of separate problems is a more plausible explanation of many crises.
Resumo:
Second language acquisition researchers often face particular challenges when attempting to generalize study findings to the wider learner population. For example, language learners constitute a heterogeneous group, and it is not always clear how a study’s findings may generalize to other individuals who may differ in terms of language background and proficiency, among many other factors. In this paper, we provide an overview of how mixed-effects models can be used to help overcome these and other issues in the field of second language acquisition. We provide an overview of the benefits of mixed-effects models and a practical example of how mixed-effects analyses can be conducted. Mixed-effects models provide second language researchers with a powerful statistical tool in the analysis of a variety of different types of data.
Resumo:
Background: Concerted evolution is normally used to describe parallel changes at different sites in a genome, but it is also observed in languages where a specific phoneme changes to the same other phoneme in many words in the lexicon—a phenomenon known as regular sound change. We develop a general statistical model that can detect concerted changes in aligned sequence data and apply it to study regular sound changes in the Turkic language family. Results: Linguistic evolution, unlike the genetic substitutional process, is dominated by events of concerted evolutionary change. Our model identified more than 70 historical events of regular sound change that occurred throughout the evolution of the Turkic language family, while simultaneously inferring a dated phylogenetic tree. Including regular sound changes yielded an approximately 4-fold improvement in the characterization of linguistic change over a simpler model of sporadic change, improved phylogenetic inference, and returned more reliable and plausible dates for events on the phylogenies. The historical timings of the concerted changes closely follow a Poisson process model, and the sound transition networks derived from our model mirror linguistic expectations. Conclusions: We demonstrate that a model with no prior knowledge of complex concerted or regular changes can nevertheless infer the historical timings and genealogical placements of events of concerted change from the signals left in contemporary data. Our model can be applied wherever discrete elements—such as genes, words, cultural trends, technologies, or morphological traits—can change in parallel within an organism or other evolving group.
Resumo:
This chapter has two goals: (a) to discuss the Spanish-Portuguese interface in current formal language acquisition research and (b) to highlight the contributions of this language pairing in the emerging field of formal third language acquisition. The authors discuss two L3 acquisition studies (Montrul, Dias, & Santos, 2011; Giancaspro, Halloran, & Iverson, in press) examining Differential Object Marking, a morphological case marker present in Spanish but not in Portuguese, arguing that the results show how data from Spanish-English bilinguals learning Brazilian Portuguese as an L3 illuminate the deterministic role of structural and typological similarity in linguistic transfer. The data provide supportive evidence for only one of three existing L3 transfer models: the Typological Proximity Model (Rothman, 2010, 2011, 2013).
Resumo:
The present study investigates the parsing of pre-nominal relative clauses (RCs) in children for the first time with a realtime methodology that reveals moment-to-moment processing patterns as the sentence unfolds. A self-paced listening experiment with Turkish-speaking children (aged 5–8) and adults showed that both groups display a sign of processing cost both in subject and object RCs at different points through the flow of the utterance when integrating the cues that are uninformative (i.e., ambiguous in function) and that are structurally and probabilistically unexpected. Both groups show a processing facilitation as soon as the morphosyntactic dependencies are completed and parse the unbounded dependencies rapidly using the morphosyntactic cues rather than waiting for the clause-final filler. These findings show that five-year-old children show similar patterns to adults in processing the morphosyntactic cues incrementally and in forming expectations about the rest of the utterance on the basis of the probabilistic model of their language.
Resumo:
A statistical-dynamical downscaling method is used to estimate future changes of wind energy output (Eout) of a benchmark wind turbine across Europe at the regional scale. With this aim, 22 global climate models (GCMs) of the Coupled Model Intercomparison Project Phase 5 (CMIP5) ensemble are considered. The downscaling method uses circulation weather types and regional climate modelling with the COSMO-CLM model. Future projections are computed for two time periods (2021–2060 and 2061–2100) following two scenarios (RCP4.5 and RCP8.5). The CMIP5 ensemble mean response reveals a more likely than not increase of mean annual Eout over Northern and Central Europe and a likely decrease over Southern Europe. There is some uncertainty with respect to the magnitude and the sign of the changes. Higher robustness in future changes is observed for specific seasons. Except from the Mediterranean area, an ensemble mean increase of Eout is simulated for winter and a decreasing for the summer season, resulting in a strong increase of the intra-annual variability for most of Europe. The latter is, in particular, probable during the second half of the 21st century under the RCP8.5 scenario. In general, signals are stronger for 2061–2100 compared to 2021–2060 and for RCP8.5 compared to RCP4.5. Regarding changes of the inter-annual variability of Eout for Central Europe, the future projections strongly vary between individual models and also between future periods and scenarios within single models. This study showed for an ensemble of 22 CMIP5 models that changes in the wind energy potentials over Europe may take place in future decades. However, due to the uncertainties detected in this research, further investigations with multi-model ensembles are needed to provide a better quantification and understanding of the future changes.
Resumo:
The predictability of high impact weather events on multiple time scales is a crucial issue both in scientific and socio-economic terms. In this study, a statistical-dynamical downscaling (SDD) approach is applied to an ensemble of decadal hindcasts obtained with the Max-Planck-Institute Earth System Model (MPI-ESM) to estimate the decadal predictability of peak wind speeds (as a proxy for gusts) over Europe. Yearly initialized decadal ensemble simulations with ten members are investigated for the period 1979–2005. The SDD approach is trained with COSMO-CLM regional climate model simulations and ERA-Interim reanalysis data and applied to the MPI-ESM hindcasts. The simulations for the period 1990–1993, which was characterized by several windstorm clusters, are analyzed in detail. The anomalies of the 95 % peak wind quantile of the MPI-ESM hindcasts are in line with the positive anomalies in reanalysis data for this period. To evaluate both the skill of the decadal predictability system and the added value of the downscaling approach, quantile verification skill scores are calculated for both the MPI-ESM large-scale wind speeds and the SDD simulated regional peak winds. Skill scores are predominantly positive for the decadal predictability system, with the highest values for short lead times and for (peak) wind speeds equal or above the 75 % quantile. This provides evidence that the analyzed hindcasts and the downscaling technique are suitable for estimating wind and peak wind speeds over Central Europe on decadal time scales. The skill scores for SDD simulated peak winds are slightly lower than those for large-scale wind speeds. This behavior can be largely attributed to the fact that peak winds are a proxy for gusts, and thus have a higher variability than wind speeds. The introduced cost-efficient downscaling technique has the advantage of estimating not only wind speeds but also estimates peak winds (a proxy for gusts) and can be easily applied to large ensemble datasets like operational decadal prediction systems.
Resumo:
With the development of convection-permitting numerical weather prediction the efficient use of high resolution observations in data assimilation is becoming increasingly important. The operational assimilation of these observations, such as Dopplerradar radial winds, is now common, though to avoid violating the assumption of un- correlated observation errors the observation density is severely reduced. To improve the quantity of observations used and the impact that they have on the forecast will require the introduction of the full, potentially correlated, error statistics. In this work, observation error statistics are calculated for the Doppler radar radial winds that are assimilated into the Met Office high resolution UK model using a diagnostic that makes use of statistical averages of observation-minus-background and observation-minus-analysis residuals. This is the first in-depth study using the diagnostic to estimate both horizontal and along-beam correlated observation errors. By considering the new results obtained it is found that the Doppler radar radial wind error standard deviations are similar to those used operationally and increase as the observation height increases. Surprisingly the estimated observation error correlation length scales are longer than the operational thinning distance. They are dependent on both the height of the observation and on the distance of the observation away from the radar. Further tests show that the long correlations cannot be attributed to the use of superobservations or the background error covariance matrix used in the assimilation. The large horizontal correlation length scales are, however, in part, a result of using a simplified observation operator.
Resumo:
Models for which the likelihood function can be evaluated only up to a parameter-dependent unknown normalizing constant, such as Markov random field models, are used widely in computer science, statistical physics, spatial statistics, and network analysis. However, Bayesian analysis of these models using standard Monte Carlo methods is not possible due to the intractability of their likelihood functions. Several methods that permit exact, or close to exact, simulation from the posterior distribution have recently been developed. However, estimating the evidence and Bayes’ factors for these models remains challenging in general. This paper describes new random weight importance sampling and sequential Monte Carlo methods for estimating BFs that use simulation to circumvent the evaluation of the intractable likelihood, and compares them to existing methods. In some cases we observe an advantage in the use of biased weight estimates. An initial investigation into the theoretical and empirical properties of this class of methods is presented. Some support for the use of biased estimates is presented, but we advocate caution in the use of such estimates.
Resumo:
This paper examines the relationship between language, culture, and identity in a corpus of gay personal ads collected from two publications in Hong Kong in the three years before the 1997 transition of sovereignty. Gay personal ads are seen äs an "island of discourse," whose marginal nature is reflected in the use of language and in turn reflects issues of marginalization in the larger social context. Using Fairclough's (1992, 1993) three- dimensional model for critical discourse analysis, an attempt is made to uncover the relationship between text structure and issues ofpower/ideology in the society that produces the texts. On the level of text, it was found that structural components, particularly the degree of grammatical elaboration, differ according to the stated race or cultural background of the authors and their targets. On the level of discourse practice, authors were found to appropriate a variety of "voices"from the larger culture arena, the use of which amplifies or limits the participation of particular classes of individuals. Finally, on the level of social practice, the ads were found to reflect and re-create both the racial stereotypes and heterosexist ideology found in the dominant culture.