989 results for Text similarity measures
Abstract:
Flow and turbulence above urban terrain is more complex than above rural terrain, due to the different momentum and heat transfer characteristics that are affected by the presence of buildings (e.g. pressure variations around buildings). The applicability of similarity theory (as developed over rural terrain) is tested using observations of flow from a sonic anemometer located at 190.3 m height in London, U.K. using about 6500 h of data. Turbulence statistics—dimensionless wind speed and temperature, standard deviations and correlation coefficients for momentum and heat transfer—were analysed in three ways. First, turbulence statistics were plotted as a function only of a local stability parameter z/Λ (where Λ is the local Obukhov length and z is the height above ground); the σ_i/u_* values (i = u, v, w) for neutral conditions are 2.3, 1.85 and 1.35 respectively, similar to canonical values. Second, analysis of urban mixed-layer formulations during daytime convective conditions over London was undertaken, showing that atmospheric turbulence at high altitude over large cities might not behave dissimilarly from that over rural terrain. Third, correlation coefficients for heat and momentum were analyzed with respect to local stability. The results give confidence in using the framework of local similarity for turbulence measured over London, and perhaps other cities. However, the following caveats for our data are worth noting: (i) the terrain is reasonably flat, (ii) building heights vary little over a large area, and (iii) the sensor height is above the mean roughness sublayer depth.
Abstract:
Research on the production of relative clauses (RCs) has shown that in English, although children start using intransitive RCs at an earlier age, more complex, bi-propositional object RCs appear later (Hamburger & Crain, 1982; Diessel and Tomasello, 2005), and children use resumptive pronouns both in acceptable and unacceptable ways (McKee, McDaniel, & Snedeker, 1998; McKee & McDaniel, 2001). To date, it is unclear whether or not the same picture emerges in Turkish, a language with an SOV word-order and overt case marking. Some studies suggested that subject RCs are more frequent in adults and children (Slobin, 1986) and yield a better performance than object RCs (Özcan, 1996), but others reported the opposite pattern (Ekmekçi, 1990). Our study addresses this issue in Turkish children and adults, and uses participants’ errors to account for the emerging asymmetry between subject and object RCs. 37 5-to-8 year old monolingual Turkish children and 23 adult controls participated in a novel elicitation task involving cards, each consisting of four different pictures (see Figure 1). There were two sets of cards, one for the participant and one for the researcher. The former had animals with accessories (e.g., a hat) whereas the latter had no accessories. Participants were instructed to hold their card without showing it to the researcher and describe the animals with particular accessories. This prompted the use of subject and object RCs. The researcher had to identify the animals in her card (see Figure 2). A preliminary repeated measures ANOVA with the factor Group (pre-school, primary-school children) showed no differences between the groups in the use of RCs (p>.1), who were therefore collapsed into one for further analyses. 
A repeated measures ANOVA with the factors Group (children, adults) and RC-Type (Subject, Object) showed that children used fewer RCs than adults (F(1,58)=7.54, p<.01), and both groups used fewer object than subject RCs (F(1,58)=22.46, p<.001), but there was no Group by RC-Type interaction (see Figure 3). A similar ANOVA on the rate of grammatical RCs showed a main effect of Group (F(1,58)=77.25, p<.001), a main effect of RC-Type (F(1,58)=66.33, p<.001), and an interaction of Group by RC-Type (F(1,58)=64.6, p<.001) (see Figure 4). Children made more errors than adults in object RCs (F(1,58)=87.01, p<.001), and children made more errors in object compared to subject RCs (F(1,36)=106.35, p<.001), but adults did not show this asymmetry. The error analysis revealed that children systematically avoided the object-relativizing morpheme –DIK, which requires possessive agreement with the genitive-marked subject. They also used resumptive pronouns and resumptive full-DPs in the extraction site similarly to English children (see Figure 5). These findings are in line with Slobin (1986) and Özcan (1996). Children’s errors suggest that they avoid morphosyntactic complexity of object RCs and try to preserve the canonical word order by inserting resumptive pronouns in the extraction site. Finally, cross-linguistic similarity in the acquisition of RCs in typologically different languages suggests a higher accessibility of subject RCs both at the structural (Keenan and Comrie, 1977) and conceptual level (Bock and Warren, 1986).
Abstract:
This study examines the relation between corporate social performance and stock returns in the UK. We closely evaluate the interactions between social and financial performance with a set of disaggregated social performance indicators for environment, employment, and community activities instead of using an aggregate measure. While scores on a composite social performance indicator are negatively related to stock returns, we find the poor financial reward offered by such firms is attributable to their good social performance on the environment and, to a lesser extent, the community aspects. Considerable abnormal returns are available from holding a portfolio of the socially least desirable stocks. These relationships between social and financial performance can be rationalized by multi-factor models for explaining the cross-sectional variation in returns, but not by industry effects.
Abstract:
Traditionally, the measure of risk used in portfolio optimisation models is the variance. However, alternative measures of risk have many theoretical and practical advantages and it is peculiar therefore that they are not used more frequently. This may be because of the difficulty in deciding which measure of risk is best and any attempt to compare different risk measures may be a futile exercise until a common risk measure can be identified. To overcome this, another approach is considered, comparing the portfolio holdings produced by different risk measures, rather than the risk return trade-off. In this way we can see whether the risk measures used produce asset allocations that are essentially the same or very different. The results indicate that the portfolio compositions produced by different risk measures vary quite markedly from measure to measure. These findings have a practical consequence for the investor or fund manager because they suggest that the choice of model depends very much on the individual’s attitude to risk rather than any theoretical and/or practical advantages of one model over another.
Abstract:
Using UK equity index data, this paper considers the impact of news on time varying measures of beta, the usual measure of undiversifiable risk. The empirical model implies that beta depends on news about the market and news about the sector. The asymmetric response of beta to news about the market is consistent across all sectors considered. Recent research is divided as to whether abnormalities in equity returns arise from changes in expected returns in an efficient market or over-reactions to new information. The evidence suggests that such abnormalities may be due to changes in expected returns caused by time-variation and asymmetry in beta.
Abstract:
The recent increase in short messaging system (SMS) text messaging, often using abbreviated, non-conventional ‘textisms’ (e.g. ‘2nite’), in school-aged children has raised fears of negative consequences of such technology for literacy. The current research used a paradigm developed by Dixon and Kaminska, who showed that exposure to phonetically plausible misspellings (e.g. ‘recieve’) negatively affected subsequent spelling performance, though this was true only with adults, not children. The current research extends this work to directly investigate the effects of exposure to textisms, misspellings and correctly spelled words on adults’ spelling. Spelling of a set of key words was assessed both before and after an exposure phase where participants read the same key words, presented either as textisms (e.g. ‘2nite’), correctly spelled (e.g. ‘tonight’) or misspelled (e.g. ‘tonite’) words. Analysis showed that scores decreased from pre- to post-test following exposure to misspellings, whereas performance improved following exposure to correctly spelled words and, interestingly, to textisms. Data suggest that exposure to textisms, unlike misspellings, had a positive effect on adults’ spelling. These findings are interpreted in light of other recent research suggesting a positive relationship between texting and some literacy measures in school-aged children.
Abstract:
Non-Gaussian/non-linear data assimilation is becoming an increasingly important area of research in the Geosciences as the resolution and non-linearity of models are increased and more and more non-linear observation operators are being used. In this study, we look at the effect of relaxing the assumption of a Gaussian prior on the impact of observations within the data assimilation system. Three different measures of observation impact are studied: the sensitivity of the posterior mean to the observations, mutual information and relative entropy. The sensitivity of the posterior mean is derived analytically when the prior is modelled by a simplified Gaussian mixture and the observation errors are Gaussian. It is found that the sensitivity is a strong function of the value of the observation and proportional to the posterior variance. Similarly, relative entropy is found to be a strong function of the value of the observation. However, the errors in estimating these two measures using a Gaussian approximation to the prior can differ significantly. This hampers conclusions about the effect of the non-Gaussian prior on observation impact. Mutual information does not depend on the value of the observation and is seen to be close to its Gaussian approximation. These findings are illustrated with the particle filter applied to the Lorenz ’63 system. This article is concluded with a discussion of the appropriateness of these measures of observation impact for different situations.
Abstract:
In this study two new measures of lexical diversity are tested for the first time on French. The usefulness of these measures, MTLD (McCarthy and Jarvis 2010 and this volume) and HD-D (McCarthy and Jarvis 2007), in predicting different aspects of language proficiency is assessed and compared with D (Malvern and Richards 1997; Malvern, Richards, Chipere and Durán 2004) and Maas (1972) in analyses of stories told by two groups of learners (n=41) of two different proficiency levels and one group of native speakers of French (n=23). The importance of careful lemmatization in studies of lexical diversity which involve highly inflected languages is also demonstrated. The paper shows that the measures of lexical diversity under study are valid proxies for language ability in that they explain up to 62 percent of the variance in French C-test scores, and up to 33 percent of the variance in a measure of complexity. The paper also provides evidence that dependence on segment size continues to be a problem for the measures of lexical diversity discussed in this paper. The paper concludes that limiting the range of text lengths or even keeping text length constant is the safest option in analysing lexical diversity.
Abstract:
Many different performance measures have been developed to evaluate field predictions in meteorology. However, a researcher or practitioner encountering a new or unfamiliar measure may have difficulty in interpreting its results, which may lead to them avoiding new measures and relying on those that are familiar. In the context of evaluating forecasts of extreme events for hydrological applications, this article aims to promote the use of a range of performance measures. Some types of performance measures are introduced in order to demonstrate a six-step approach to tackling a new measure. Using the example of the European Centre for Medium-Range Weather Forecasts (ECMWF) ensemble precipitation predictions for the Danube floods of July and August 2002, the article shows how to use new performance measures with this approach and how to choose between different performance measures based on their suitability for the task at hand. Copyright © 2008 Royal Meteorological Society
Abstract:
Drought characterisation is an intrinsically spatio-temporal problem. A limitation of previous approaches to characterisation is that they discard much of the spatio-temporal information by reducing events to a lower-order subspace. To address this, an explicit 3-dimensional (longitude, latitude, time) structure-based method is described in which drought events are defined by a spatially and temporally coherent set of points displaying standardised precipitation below a given threshold. Geometric methods can then be used to measure similarity between individual drought structures. Groupings of these similarities provide an alternative to traditional methods for extracting recurrent space-time signals from geophysical data. The explicit consideration of structure encourages the construction of summary statistics which relate to the event geometry. Example measures considered are the event volume, centroid, and aspect ratio. The utility of a 3-dimensional approach is demonstrated by application to the analysis of European droughts (15°W to 35°E, and 35°N to 70°N) for the period 1901–2006. Large-scale structure is found to be abundant, with 75 events identified lasting for more than 3 months and spanning at least 0.5 × 10⁶ km². Near-complete dissimilarity is seen between the individual drought structures, and little or no regularity is found in the time evolution of even the most spatially similar drought events. The spatial distribution of the event centroids and the time evolution of the geographic cross-sectional areas strongly suggest that large area, sustained droughts result from the combination of multiple small area (∼10⁶ km²) short duration (∼3 months) events. The small events are not found to occur independently in space. This leads to the hypothesis that local water feedbacks play an important role in the aggregation process.
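The structure-based definition in this abstract can be illustrated with a small sketch. It is not the authors' method: it assumes a regular (time, latitude, longitude) grid, treats "coherent" as face-connected neighbours, counts volume in grid cells rather than km² × months, and uses a hypothetical threshold and made-up precipitation values.

```python
from collections import deque

# Six face-neighbours in (time, lat, lon); this connectivity is an assumption.
STEPS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]

def find_events(field, threshold):
    """Group below-threshold cells of field[t][y][x] into connected 3-D events."""
    nt, ny, nx = len(field), len(field[0]), len(field[0][0])
    seen, events = set(), []
    for t in range(nt):
        for y in range(ny):
            for x in range(nx):
                if (t, y, x) in seen or field[t][y][x] >= threshold:
                    continue
                # Breadth-first flood fill from this unvisited dry cell.
                queue, event = deque([(t, y, x)]), []
                seen.add((t, y, x))
                while queue:
                    ct, cy, cx = queue.popleft()
                    event.append((ct, cy, cx))
                    for dt, dy, dx in STEPS:
                        n = (ct + dt, cy + dy, cx + dx)
                        inside = 0 <= n[0] < nt and 0 <= n[1] < ny and 0 <= n[2] < nx
                        if inside and n not in seen and field[n[0]][n[1]][n[2]] < threshold:
                            seen.add(n)
                            queue.append(n)
                events.append(event)
    return events

def volume(event):
    """Event volume as a count of grid cells (a stand-in for area x duration)."""
    return len(event)

def centroid(event):
    """Mean (time, lat, lon) index of the cells in an event."""
    return tuple(sum(cell[i] for cell in event) / len(event) for i in range(3))

# Made-up standardised precipitation: 2 time steps on a 2 x 3 grid.
field = [
    [[-1.5, -1.2, 0.3], [0.1, 0.4, 0.2]],
    [[-1.1, 0.0, 0.5], [0.2, 0.3, -2.0]],
]
events = find_events(field, threshold=-1.0)
```

On a real latitude-longitude grid the cells do not have equal area, so a faithful volume would weight each cell by its area.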
Abstract:
Background: Exposure to solar ultraviolet-B (UV-B) radiation is a major source of vitamin D3. Chemistry climate models project decreases in ground-level solar erythemal UV over the current century. It is unclear what impact this will have on vitamin D status at the population level. The purpose of this study was to measure the association between ground-level solar UV-B and serum concentrations of 25-hydroxyvitamin D (25(OH)D) using a secondary analysis of the 2007 to 2009 Canadian Health Measures Survey (CHMS). Methods: Blood samples collected from individuals aged 12 to 79 years sampled across Canada were analyzed for 25(OH)D (n=4,398). Solar UV-B irradiance was calculated for the 15 CHMS collection sites using the Tropospheric Ultraviolet and Visible Radiation Model. Multivariable linear regression was used to evaluate the association between 25(OH)D and solar UV-B adjusted for other predictors and to explore effect modification. Results: Cumulative solar UV-B irradiance averaged over 91 days (91-day UV-B) prior to blood draw correlated significantly with 25(OH)D. Independent of other predictors, a 1 kJ/m² increase in 91-day UV-B was associated with a significant 0.5 nmol/L (95% CI 0.3-0.8) increase in mean 25(OH)D (P = 0.0001). The relationship was stronger among younger individuals and those spending more time outdoors. Based on current projections of decreases in ground-level solar UV-B, we predict less than a 1 nmol/L decrease in mean 25(OH)D for the population. Conclusions: In Canada, cumulative exposure to ambient solar UV-B has a small but significant association with 25(OH)D concentrations. Public health messages to improve vitamin D status should target safe sun exposure with sunscreen use, and also enhanced dietary and supplemental intake and maintenance of a healthy body weight.
Abstract:
This paper introduces a novel approach for free-text keystroke dynamics authentication which incorporates the use of the keyboard’s key-layout. The method extracts timing features from specific key-pairs. The Euclidean distance is then utilized to find the level of similarity between a user’s profile data and his/her test data. The results obtained from this method are reasonable for free-text authentication while maintaining the maximum level of user relaxation. Moreover, it has been proven in this study that flight time yields better authentication results when compared with dwell time. In particular, the results were obtained with only one training sample for the purpose of practicality and ease of real life application.
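The comparison step described in this abstract might look roughly like the sketch below. It is an illustration under stated assumptions, not the paper's implementation: the key-pair names and timing values are hypothetical, each key-pair is summarised by its mean flight time, and only key-pairs present in both the profile and the test sample are compared.

```python
import math

def mean_timings(samples):
    """Average the timing values (milliseconds) recorded for each key-pair."""
    return {pair: sum(values) / len(values) for pair, values in samples.items() if values}

def euclidean_distance(profile, test):
    """Euclidean distance over the key-pairs that occur in both feature sets."""
    shared = sorted(set(profile) & set(test))
    if not shared:
        raise ValueError("profile and test data share no key-pairs")
    return math.sqrt(sum((profile[p] - test[p]) ** 2 for p in shared))

# Hypothetical flight times (ms) for a stored profile and two test samples.
profile = mean_timings({"th": [110, 130], "he": [90, 110], "in": [150]})
genuine = mean_timings({"th": [123], "he": [104], "er": [80]})
impostor = mean_timings({"th": [180], "he": [160]})

d_genuine = euclidean_distance(profile, genuine)    # smaller distance, more similar
d_impostor = euclidean_distance(profile, impostor)  # larger distance, less similar
```

A smaller distance indicates greater similarity to the profile. Any acceptance threshold would have to be chosen empirically; the abstract does not state one.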
Abstract:
Treffers-Daller and Korybski propose to operationalize language dominance on the basis of measures of lexical diversity, as computed, in this particular study, on transcripts of stories told by Polish-English bilinguals in each of their languages. They compute four different Indices of Language Dominance (ILD) on the basis of two different measures of lexical diversity, the Index of Guiraud (Guiraud, 1954) and HD-D (McCarthy & Jarvis, 2007). They compare simple indices, which are based on subtracting scores from one language from scores for another language, to more complex indices based on the formula Birdsong borrowed from the field of handedness, namely the ratio of (Difference in Scores) / (Sum of Scores). Positive scores on each of these Indices of Language Dominance mean that informants are more English-dominant and negative scores that they are more Polish-dominant. The authors address the difficulty of comparing scores across languages by carefully lemmatizing the data. Following Flege, Mackay and Piske (2002) they also look into the validity of these indices by investigating to what extent they can predict scores on other, independently measured variables. They use correlations and regression analysis for this, which has the advantage that the dominance indices are used as continuous variables and arbitrary cut-off points between balanced and dominant bilinguals need not be chosen. However, they also show how the computation of z-scores can help facilitate a discussion about the appropriateness of different cut-off points across different data sets and measurement scales in those cases where researchers consider it necessary to make categorial distinctions between balanced and dominant bilinguals. Treffers-Daller and Korybski correlate the ILD scores with four other variables, namely Length of Residence in the UK, attitudes towards English and life in the UK, frequency of usage of English at home and frequency of code-switching.
They found that the indices correlated significantly with most of these variables, but there were clear differences between the Guiraud-based indices and the HDD-based indices. In a regression analysis three of the measures were also found to be significant predictors of English language usage at home. They conclude that the correlations and the regression analyses lend strong support to the validity of their approach to language dominance.
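The two kinds of index described in this abstract can be sketched as follows. This is an illustration only: the lemma lists are made up, the Index of Guiraud is taken in its usual form (types divided by the square root of tokens), and the English-minus-Polish direction is an assumption inferred from the stated sign convention (positive means English-dominant).

```python
import math
import statistics

def guiraud(lemmas):
    """Index of Guiraud: number of types over the square root of the number of tokens."""
    if not lemmas:
        raise ValueError("empty text")
    return len(set(lemmas)) / math.sqrt(len(lemmas))

def simple_ild(english, polish):
    """Simple dominance index: English score minus Polish score."""
    return english - polish

def ratio_ild(english, polish):
    """Ratio dominance index: (difference in scores) / (sum of scores)."""
    return (english - polish) / (english + polish)

def z_scores(values):
    """Standardise index values using the sample standard deviation."""
    mean, sd = statistics.mean(values), statistics.stdev(values)
    return [(v - mean) / sd for v in values]

# Hypothetical lemmatized transcripts for one informant.
english = "the dog chase the cat and the cat run".split()  # 9 tokens, 6 types
polish = "pies kot dom kot".split()                        # 4 tokens, 3 types

g_en, g_pl = guiraud(english), guiraud(polish)
simple = simple_ild(g_en, g_pl)  # positive, so English-dominant on this index
ratio = ratio_ild(g_en, g_pl)    # bounded between -1 and 1 for positive scores
```

The ratio form normalises the difference by overall lexical diversity, which is one reason it may be easier to compare across informants than the raw difference.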