926 results for UNIVARIATE


Abstract:

In this dissertation, I present an overall methodological framework for studying linguistic alternations, focusing specifically on lexical variation in denoting a single meaning, that is, synonymy. As the practical example, I employ the synonymous set of the four most common Finnish verbs denoting THINK, namely ajatella, miettiä, pohtia and harkita ‘think, reflect, ponder, consider’. As a continuation of previous work, I describe in considerable detail the extension of statistical methods from dichotomous linguistic settings (e.g., Gries 2003; Bresnan et al. 2007) to polytomous ones, that is, settings with more than two possible alternative outcomes. The applied statistical methods are arranged into a succession of stages of increasing complexity, proceeding from univariate via bivariate to multivariate techniques. As the central multivariate method, I argue for the use of polytomous logistic regression and demonstrate its practical application to the studied phenomenon, thus extending the work by Bresnan et al. (2007), who applied simple (binary) logistic regression to a dichotomous structural alternation in English. The results of the various statistical analyses confirm that a wide range of contextual features across different categories are indeed associated with the use and selection of the studied THINK lexemes; however, a substantial share of these features is not exemplified in current Finnish lexicographical descriptions. The multivariate analysis results indicate that the semantic classifications of syntactic argument types are on average the most distinctive feature category, followed by overall semantic characterizations of the verb chains, and then syntactic argument types alone, with morphological features pertaining to the verb chain and extra-linguistic features relegated to the last position.
In terms of overall performance of the multivariate analysis and modeling, the prediction accuracy seems to reach a ceiling at a Recall rate of roughly two-thirds of the sentences in the research corpus. The analysis of these results suggests a limit to what can be explained and determined within the immediate sentential context and applying the conventional descriptive and analytical apparatus based on currently available linguistic theories and models. The results also support Bresnan’s (2007) and others’ (e.g., Bod et al. 2003) probabilistic view of the relationship between linguistic usage and the underlying linguistic system, in which only a minority of linguistic choices are categorical, given the known context – represented as a feature cluster – that can be analytically grasped and identified. Instead, most contexts exhibit degrees of variation as to their outcomes, resulting in proportionate choices over longer stretches of usage in texts or speech.
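As a rough illustration of how a polytomous (multinomial) logistic model turns contextual features into outcome probabilities, the following minimal sketch uses the standard softmax formulation. The feature names and coefficient values are hypothetical and are not taken from the dissertation, which may use a different variant of polytomous regression.

```python
import math

def polytomous_probabilities(features, weights):
    """Return P(outcome | features) for each outcome under a multinomial logit model.

    features: dict mapping feature name -> value (1.0 if present in the context)
    weights:  dict mapping outcome -> dict of feature name -> coefficient
    """
    scores = {
        outcome: sum(coefs.get(name, 0.0) * value for name, value in features.items())
        for outcome, coefs in weights.items()
    }
    # Subtract the largest score before exponentiating for numerical stability.
    top = max(scores.values())
    exps = {outcome: math.exp(score - top) for outcome, score in scores.items()}
    total = sum(exps.values())
    return {outcome: value / total for outcome, value in exps.items()}

# Hypothetical coefficients, for illustration only.
WEIGHTS = {
    "ajatella": {"intercept": 0.8, "human_agent": 0.3},
    "miettiä": {"intercept": 0.2, "indirect_question": 0.9},
    "pohtia": {"intercept": 0.0, "indirect_question": 0.6, "group_agent": 0.7},
    "harkita": {"intercept": -0.3, "activity_patient": 1.1},
}

# One sentence context represented as a cluster of binary features.
context = {"intercept": 1.0, "indirect_question": 1.0, "group_agent": 1.0}
probs = polytomous_probabilities(context, WEIGHTS)
predicted = max(probs, key=probs.get)
```

The model assigns a probability to every lexeme rather than a single categorical answer, which matches the probabilistic view described in the abstract: a context can favor one verb while still leaving room for the others.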

Abstract:

Motivation: Recently, many univariate and several multivariate approaches have been suggested for testing differential expression of gene sets between different phenotypes. However, despite a wealth of literature studying their performance on simulated and real biological data, there is still a need to quantify their relative performance when they are testing different null hypotheses.

Results: In this article, we compare the performance of univariate and multivariate tests on both simulated and biological data. In the simulation study we demonstrate that high correlations equally affect the power of both univariate and multivariate tests. In addition, for most of them the power is similarly affected by the dimensionality of the gene set and by the percentage of genes in the set whose expression changes between the two phenotypes. The application of different test statistics to biological data reveals that three statistics (sum of squared t-tests, Hotelling's T², N-statistic), testing different null hypotheses, find some common but also some complementary differentially expressed gene sets under specific settings. This demonstrates that, because of their complementary null hypotheses, each test projects onto different aspects of the data, and that for the analysis of biological data it is beneficial to use all three tests simultaneously instead of focusing exclusively on just one.
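To make the univariate gene-set statistic concrete, here is a minimal sketch of a sum of squared t-statistics. It assumes a Welch two-sample t-statistic per gene and uses made-up expression values; the article may define the per-gene statistic differently.

```python
import math
from statistics import mean, variance

def t_statistic(a, b):
    """Welch two-sample t-statistic for one gene (an assumed choice of t-test)."""
    standard_error = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error

def sum_of_squared_t(group_a, group_b):
    """Sum the squared per-gene t-statistics over all genes in a gene set.

    group_a, group_b: dict mapping gene name -> list of expression values
    measured in each of the two phenotypes.
    """
    return sum(t_statistic(group_a[gene], group_b[gene]) ** 2 for gene in group_a)

# Toy expression values for a two-gene set (hypothetical data).
phenotype_a = {"gene1": [1.0, 2.0, 3.0], "gene2": [1.0, 2.0, 3.0]}
phenotype_b = {"gene1": [2.0, 3.0, 4.0], "gene2": [1.0, 2.0, 3.0]}

score = sum_of_squared_t(phenotype_a, phenotype_b)
```

Each gene contributes independently to the score, so the statistic ignores correlations between genes. A multivariate statistic such as Hotelling's T² uses the covariance between genes, which is one reason the two kinds of test can flag different gene sets.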

Abstract:

We present methods for detecting phase synchronization of two unidirectionally coupled, self-sustained noisy oscillators from a signal of the driven oscillator alone. One method detects soft phase locking; another hard phase locking. Both are applied to the problem of detecting phase synchronization in von Karman vortex flow meters.

Abstract:

The assimilation of observations with a forecast is often heavily influenced by the description of the error covariances associated with the forecast. When a temperature inversion is present at the top of the boundary layer (BL), a significant part of the forecast error may be described as a vertical positional error (as opposed to amplitude error normally dealt with in data assimilation). In these cases, failing to account for positional error explicitly is shown to result in an analysis for which the inversion structure is erroneously weakened and degraded. In this article, a new assimilation scheme is proposed to explicitly include the positional error associated with an inversion. This is done through the introduction of an extra control variable to allow position errors in the a priori to be treated simultaneously with the usual amplitude errors. This new scheme, referred to as the ‘floating BL scheme’, is applied to the one-dimensional (vertical) variational assimilation of temperature. The floating BL scheme is tested with a series of idealised experiments and with real data from radiosondes. For each idealised experiment, the floating BL scheme gives an analysis which has the inversion structure and position in agreement with the truth, and outperforms the assimilation which accounts only for forecast amplitude error. When the floating BL scheme is used to assimilate a large sample of radiosonde data, its ability to give an analysis with an inversion height in better agreement with that observed is confirmed. However, it is found that the use of Gaussian statistics is an inappropriate description of the error statistics of the extra control variable. This problem is alleviated by incorporating a non-Gaussian description of the new control variable in the new scheme. Anticipated challenges in implementing the scheme operationally are discussed towards the end of the article.
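For orientation, the conventional variational cost function that treats only amplitude errors can be written as below. The notation is the usual textbook one and is an assumption here, since the abstract gives no formulas: x is the temperature profile, x_b the forecast (a priori), B the forecast error covariance, y the observations, H the observation operator and R the observation error covariance.

```latex
J(\mathbf{x}) =
  \tfrac{1}{2}\,(\mathbf{x}-\mathbf{x}_b)^{\mathsf T}\,\mathbf{B}^{-1}\,(\mathbf{x}-\mathbf{x}_b)
  + \tfrac{1}{2}\,\bigl(\mathbf{y}-H(\mathbf{x})\bigr)^{\mathsf T}\,\mathbf{R}^{-1}\,\bigl(\mathbf{y}-H(\mathbf{x})\bigr)
```

According to the abstract, the floating BL scheme extends this setup with an extra control variable for the position of the inversion. The exact form of the extended cost function is not stated in the abstract and is not reproduced here.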

Abstract:

Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq)

Abstract:

The aim of this study was to analyze birth weight (BW) and weights adjusted to 205 (W205), 365 (W365) and 550 (W550) days of age in beef buffaloes from Brazil, using two approaches: parametric, assuming a normal distribution, and non-parametric, using a kernel function, and thus to estimate the genetic, environmental and phenotypic correlations among the traits. Records of 5,169 animals at birth (BW), 3,792 at 205 days (W205), 3,883 at 365 days (W365) and 1,524 at 550 days of age (W550) were used. The birth weight distribution departed clearly from the normal distribution, whereas W205, W365 and W550 were normally distributed. Birth weight showed weak genetic, environmental and phenotypic associations with the other weight measurements. On the other hand, the weights at 205, 365 and 550 days of age showed high genetic correlations.
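The contrast between a parametric (normal) and a non-parametric (kernel) description of a trait can be sketched with a simple Gaussian kernel density estimate. This is a generic illustration with hypothetical birth weights and a normal-reference rule-of-thumb bandwidth; it is not the estimation procedure used in the study.

```python
import math
from statistics import stdev

def gaussian_kernel(u):
    """Standard normal density, used as the smoothing kernel."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def bandwidth_rule_of_thumb(sample):
    """Normal-reference rule of thumb: h = 1.06 * s * n^(-1/5)."""
    return 1.06 * stdev(sample) * len(sample) ** (-0.2)

def kernel_density(x, sample, bandwidth):
    """Kernel density estimate at x: the average of kernels centred on each observation."""
    total = sum(gaussian_kernel((x - xi) / bandwidth) for xi in sample)
    return total / (len(sample) * bandwidth)

# Hypothetical birth weights in kg (not data from the study).
birth_weights = [30.0, 31.0, 33.0, 34.0, 35.0, 36.0, 36.0, 37.0, 38.0, 40.0, 41.0, 45.0]

h = bandwidth_rule_of_thumb(birth_weights)
density_at_36 = kernel_density(36.0, birth_weights, h)
```

Unlike a fitted normal curve, the kernel estimate follows the shape of the data, so it can represent skewness or multiple modes of the kind that would make a normality assumption questionable for a trait such as birth weight.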

Abstract:

In this article, we evaluate the performance of the T² chart based on the principal components (PC chart) and the simultaneous univariate control charts based on the original variables (SU X̄ charts) or based on the principal components (SUPC charts). The main reason to consider the PC chart lies in the dimensionality reduction. However, depending on the disturbance and on the way the original variables are related, the chart is very slow in signaling, except when all variables are negatively correlated and the principal component is wisely selected. Comparing the SU X̄, the SUPC and the T² charts, we conclude that the SU X̄ charts (SUPC charts) have a better overall performance when the variables are positively (negatively) correlated. We also develop the expression to obtain the power of two S² charts designed for monitoring the covariance matrix. These joint S² charts are, in the majority of the cases, more efficient than the generalized variance |S| chart.
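For readers unfamiliar with the T² chart, the sketch below computes the statistic for a bivariate sample mean on the original variables, assuming the in-control mean and covariance matrix are known. The parameter values and the false-alarm rate of 0.0027 are illustrative assumptions, and this is the plain T² chart rather than the principal-component version studied in the article.

```python
import math

def t2_statistic(xbar, mu0, sigma, n):
    """T2 = n * (xbar - mu0)' * inverse(sigma) * (xbar - mu0) for two variables."""
    d1, d2 = xbar[0] - mu0[0], xbar[1] - mu0[1]
    (a, b), (c, d) = sigma
    det = a * d - b * c
    # Quadratic form written out with the closed-form inverse of a 2x2 matrix.
    quad = (d * d1 * d1 - (b + c) * d1 * d2 + a * d2 * d2) / det
    return n * quad

def control_limit_two_variables(alpha):
    """Upper chi-square quantile with 2 degrees of freedom, which equals -2 * ln(alpha)."""
    return -2.0 * math.log(alpha)

mu0 = (0.0, 0.0)                    # assumed in-control mean
sigma = ((1.0, 0.5), (0.5, 1.0))    # assumed covariance, positively correlated
limit = control_limit_two_variables(0.0027)

same_direction = t2_statistic((1.0, 1.0), mu0, sigma, n=4)       # both means shift up
opposite_direction = t2_statistic((1.0, -1.0), mu0, sigma, n=4)  # means shift apart

signals = [value > limit for value in (same_direction, opposite_direction)]
```

With positively correlated variables, a shift that moves the two means in opposite directions produces a much larger T² than an equal-sized shift in the same direction. This shows why the performance of a multivariate chart depends on both the disturbance and the correlation structure.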