906 results for General Linear Methods
Abstract:
A novel near-infrared spectroscopy (NIRS) method has been researched and developed for the simultaneous analysis of the chemical components and associated properties of mint (Mentha haplocalyx Briq.) tea samples. The common analytes were total polysaccharide content, total flavonoid content, total phenolic content, and total antioxidant activity. To resolve the NIRS data matrix for such analyses, least squares support vector machines was found to be the best chemometrics method for prediction, although it was closely followed by the radial basis function/partial least squares model. Interestingly, the commonly used partial least squares method was unsatisfactory in this case. Additionally, principal component analysis and hierarchical cluster analysis were able to distinguish the mint samples according to their four geographical provinces of origin, and this was further facilitated by the chemometrics classification methods K-nearest neighbors, linear discriminant analysis, and partial least squares discriminant analysis. In general, given the potential savings in sampling and analysis time as well as in the costs of special analytical reagents required for the standard individual methods, NIRS offered a very attractive alternative for the simultaneous analysis of mint samples.
Abstract:
Stability analyses have been widely used to better understand the mechanism of traffic jam formation. In this paper, we consider the impact of cooperative systems (a.k.a. connected vehicles) on traffic dynamics and, more precisely, on flow stability. Cooperative systems are emerging technologies enabling communication between vehicles and/or with the infrastructure. In a distributed communication framework, equipped vehicles are able to send and receive information to/from other equipped vehicles. Here, the effects of cooperative traffic are modeled through a general bilateral multianticipative car-following law that improves cooperative drivers' perception of their surrounding traffic conditions within a given communication range. Linear stability analyses are performed for a broad class of car-following models. They point out different stability conditions in both multianticipative and nonmultianticipative situations. To better understand what happens in unstable conditions, information on the shock wave structure is studied in the weakly nonlinear regime by means of the reductive perturbation method. The shock wave equation is obtained for generic car-following models by deriving the Korteweg-de Vries equations. We then derive traffic-state-dependent conditions for the sign of the solitary wave (soliton) amplitude. This analytical result is verified through simulations. Simulation results confirm the validity of the speed estimate. The variation of the soliton amplitude as a function of the communication range is provided. The performed linear and weakly nonlinear analyses help justify the potential benefits of vehicle-integrated communication systems and provide new insights supporting the future implementation of cooperative systems.
Abstract:
Background: Qualitative research is increasingly being recognised as a vital aspect of primary healthcare research. Teaching and learning how to conduct qualitative research is especially important for general practitioners and other clinicians in the professional educational setting. This article examines a case study of postgraduate professional education in qualitative research for clinicians, for the purpose of enabling a robust discussion around teaching and learning in medicine and the health sciences. Method: A series of three workshops was delivered for primary healthcare academics. The workshops were evaluated using a quantitative survey and qualitative free-text responses to enable descriptive analyses. Results: Participants found qualitative philosophy and theory the most difficult areas to engage with, and qualitative coding and analysis the easiest to learn. Discussion: Key elements for successful teaching were identified, including the use of adult learning principles, the value of an experienced facilitator and an awareness of the impact of clinical subcultures on learning.
Abstract:
INTRODUCTION: Although the high heritability of BMD variation has long been established, few genes have been conclusively shown to affect the variation of BMD in the general population. Extreme truncate selection has been proposed as a more powerful alternative to unselected cohort designs in quantitative trait association studies. We sought to test these theoretical predictions in studies of the bone densitometry measures BMD, BMC, and femoral neck area, by investigating their association with members of the Wnt pathway, some of which have previously been shown to be associated with BMD in much larger cohorts, in a moderate-sized extreme truncate selected cohort (absolute value BMD Z-scores = 1.5-4.0; n = 344). MATERIALS AND METHODS: Ninety-six tag single-nucleotide polymorphisms (SNPs) lying in 13 Wnt signaling pathway genes were selected to tag common genetic variation (minor allele frequency [MAF] > 5% with an r² > 0.8) within 5 kb of all exons of these genes. The genes studied included LRP1, LRP5, LRP6, Wnt3a, Wnt7b, Wnt10b, SFRP1, SFRP2, DKK1, DKK2, FZD7, WISP3, and SOST. Three hundred forty-four cases with either high or low BMD were genotyped by Illumina GoldenGate microarray SNP genotyping methods. Association was tested either by the Cochran-Armitage test for dichotomous variables or by linear regression for quantitative traits. RESULTS: Strong association was shown with LRP5, polymorphisms of which have previously been shown to influence total hip BMD (minimum p = 0.0006). In addition, polymorphisms of the Wnt antagonist, SFRP1, were significantly associated with BMD and BMC (minimum p = 0.00042). Previously reported associations of LRP1, LRP6, and SOST with BMD were confirmed. Two other Wnt pathway genes, Wnt3a and DKK2, also showed nominal association with BMD. CONCLUSIONS: This study shows that polymorphisms of multiple members of the Wnt pathway are associated with BMD variation.
Furthermore, this study shows in a practical trial that study designs involving extreme truncate selection and moderate sample sizes can robustly identify genes of relevant effect sizes involved in BMD variation in the general population. This has implications for the design of future genome-wide studies of quantitative bone phenotypes relevant to osteoporosis.
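The trend test named in the abstract above can be illustrated with a minimal sketch. It assumes additive genotype scores (0, 1, 2) and uses hypothetical genotype counts; neither the counts nor the function name `cochran_armitage` come from the study, and the variance uses the common N (rather than N - 1) form.

```python
from math import sqrt
from statistics import NormalDist


def cochran_armitage(cases, controls, scores=(0, 1, 2)):
    """Return (z, two-sided p) for a trend in case proportion across ordered groups."""
    totals = [a + b for a, b in zip(cases, controls)]
    n = sum(totals)
    p_case = sum(cases) / n
    # Trend statistic: score-weighted excess of observed over expected cases.
    t_stat = sum(s * (r - m * p_case) for s, r, m in zip(scores, cases, totals))
    # Variance of the statistic under the null hypothesis of no trend.
    s1 = sum(m * s for s, m in zip(scores, totals))
    s2 = sum(m * s * s for s, m in zip(scores, totals))
    var = p_case * (1 - p_case) * (s2 - s1 * s1 / n)
    z = t_stat / sqrt(var)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical genotype counts (AA, Aa, aa) in two groups of 172 subjects each.
z, p = cochran_armitage(cases=[30, 60, 82], controls=[60, 70, 42])
print(f"z = {z:.2f}, p = {p:.2g}")
```

A table with identical genotype counts in both groups gives z = 0, which is a quick sanity check on the statistic.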
Abstract:
Background: A genetic network can be represented as a directed graph in which a node corresponds to a gene and a directed edge specifies the direction of influence of one gene on another. The reconstruction of such networks from transcript profiling data remains an important yet challenging endeavor. A transcript profile specifies the abundances of many genes in a biological sample of interest. Prevailing strategies for learning the structure of a genetic network from high-dimensional transcript profiling data assume sparsity and linearity. Many methods consider relatively small directed graphs, inferring graphs with up to a few hundred nodes. This work examines large undirected graph representations of genetic networks, graphs with many thousands of nodes where an undirected edge between two nodes does not indicate the direction of influence, and the problem of estimating the structure of such a sparse linear genetic network (SLGN) from transcript profiling data. Results: The structure learning task is cast as a sparse linear regression problem which is then posed as a LASSO (l1-constrained fitting) problem and solved finally by formulating a Linear Program (LP). A bound on the Generalization Error of this approach is given in terms of the Leave-One-Out Error. The accuracy and utility of LP-SLGNs are assessed quantitatively and qualitatively using simulated and real data. The Dialogue for Reverse Engineering Assessments and Methods (DREAM) initiative provides gold standard data sets and evaluation metrics that enable and facilitate the comparison of algorithms for deducing the structure of networks. The structures of LP-SLGNs estimated from the INSILICO1, INSILICO2 and INSILICO3 simulated DREAM2 data sets are comparable to those proposed by the first and/or second ranked teams in the DREAM2 competition. The structures of LP-SLGNs estimated from two published Saccharomyces cerevisiae cell cycle transcript profiling data sets capture known regulatory associations.
In each S. cerevisiae LP-SLGN, the number of nodes with a particular degree follows an approximate power law, suggesting that its degree distribution is similar to that observed in real-world networks. Inspection of these LP-SLGNs suggests biological hypotheses amenable to experimental verification. Conclusion: A statistically robust and computationally efficient LP-based method for estimating the topology of a large sparse undirected graph from high-dimensional data yields representations of genetic networks that are biologically plausible and useful abstractions of the structures of real genetic networks. Analysis of the statistical and topological properties of learned LP-SLGNs may have practical value; for example, genes with high random walk betweenness, a measure of the centrality of a node in a graph, are good candidates for intervention studies and hence integrated computational-experimental investigations designed to infer more realistic and sophisticated probabilistic directed graphical model representations of genetic networks. The LP-based solutions of the sparse linear regression problem described here may provide a method for learning the structure of transcription factor networks from transcript profiling and transcription factor binding motif data.
Abstract:
In this paper, expressions for convolution multiplication properties of DCT IV and DST IV are derived starting from equivalent DFT representations. Using these expressions, methods for implementing linear filtering through block convolution in the DCT IV and DST IV domain are proposed. Techniques developed for DCT IV and DST IV are further extended to MDCT and MDST, where the filter implementation is near exact for symmetric filters and approximate for non-symmetric filters. No additional overlapping is required for implementing the symmetric filtering in the MDCT domain, and hence the proposed algorithm is computationally competitive with DFT based systems. Moreover, the inherent 50% overlap between the adjacent frames used for the MDCT/MDST domain reduces the blocking artifacts due to block processing or quantization. The techniques are computationally efficient for symmetric filters and provide a new alternative to DFT based convolution.
Abstract:
Aim: There are limited studies documenting the frequency of and reasons for attendance at primary health care services by Australian children, particularly urban Aboriginal and Torres Strait Islander children. This study describes health service utilisation in this population in an urban setting. Methods: An ongoing prospective cohort study of Aboriginal and Torres Strait Islander children aged <5 years registered with an urban Aboriginal and Torres Strait Islander primary health care centre in Brisbane, Australia. Detailed demographic, clinical, health service utilisation and risk factor data are collected by Aboriginal researchers at enrolment and monthly for a period of 12 months on each child. The incidence of health service utilisation was calculated according to the Poisson distribution. Results: Between 14 February 2013 and 31 October 2014, 118 children were recruited, providing data for 535 child-months of observation. Ninety-one percent of children were Aboriginal, 4% Torres Strait Islander and 5% both Aboriginal and Torres Strait Islander. The incidence of presentations to see a doctor for any reason was 43.9 episodes/100 child-months (95% CI 38.4-49.9). The most common reasons for presentation were immunisations (23%), respiratory illnesses (19%) and the Australian Government funded Indigenous child health check (16%). The primary health services used for the majority of these visits were Aboriginal and Torres Strait Islander specific medical services (61%). Conclusions: Within a culturally specific service for urban Aboriginal and Torres Strait Islander people, there is a high frequency of childhood attendance at primary health care services. Well-health checks and respiratory illnesses were the most common reasons. The high proportion of visits for well child services suggests a potential for opportunistic health promotion, education and early interventions across a range of child health issues.
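As a rough illustration of how an incidence rate and its interval can be obtained from a Poisson count, the sketch below uses Byar's approximation. The abstract does not say which interval method the authors used, and the event count of 235 is an assumption back-calculated from the reported rate and the 535 child-months of observation; the function name `poisson_rate_ci` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def poisson_rate_ci(events, exposure, level=0.95):
    """Rate and Byar-approximation confidence limits for a Poisson count."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    lower = events * (1 - 1 / (9 * events) - z / (3 * sqrt(events))) ** 3
    upper_n = events + 1
    upper = upper_n * (1 - 1 / (9 * upper_n) + z / (3 * sqrt(upper_n))) ** 3
    return events / exposure, lower / exposure, upper / exposure


# ASSUMPTION: 235 episodes is back-calculated from 43.9 per 100 child-months
# over 535 child-months; the abstract does not state the raw count.
rate, lo, hi = poisson_rate_ci(events=235, exposure=535)
print(f"{100 * rate:.1f} per 100 child-months ({100 * lo:.1f} to {100 * hi:.1f})")
```

Under these assumptions the interval should land close to the one reported above, though small differences are expected if the authors used an exact method.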
Abstract:
Objective: To discuss generalized estimating equations as an extension of generalized linear models by commenting on the paper of Ziegler and Vens "Generalized Estimating Equations. Notes on the Choice of the Working Correlation Matrix". Methods: Inviting an international group of experts to comment on this paper. Results: Several perspectives have been taken by the discussants. Econometricians have established parallels to the generalized method of moments (GMM). Statisticians discussed model assumptions and the aspect of missing data. Applied statisticians commented on practical aspects in data analysis. Conclusions: In general, careful modeling of correlation is encouraged when considering estimation efficiency and other implications, and a comparison of choosing instruments in GMM and generalized estimating equations (GEE) would be worthwhile. Some theoretical drawbacks of GEE need to be further addressed and require careful analysis of data. This particularly applies to the situation when data are missing at random.
Abstract:
In analysis of longitudinal data, the variance matrix of the parameter estimates is usually estimated by the 'sandwich' method, in which the variance for each subject is estimated by its residual products. We propose smooth bootstrap methods by perturbing the estimating functions to obtain 'bootstrapped' realizations of the parameter estimates for statistical inference. Our extensive simulation studies indicate that the variance estimators by our proposed methods can not only correct the bias of the sandwich estimator but also improve the confidence interval coverage. We applied the proposed method to a data set from a clinical trial of antibiotics for leprosy.
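The 'sandwich' estimator mentioned above can be sketched for the simplest possible case: a one-parameter model y = beta * x fitted under an independence working correlation. The model, the data and the function name `sandwich_slope` are all hypothetical, and the smooth bootstrap proposed in the abstract is not shown.

```python
def sandwich_slope(subjects):
    """Fit y = beta * x by least squares; return (beta, sandwich variance).

    `subjects` is a list of (xs, ys) pairs, one pair of lists per subject.
    """
    sxx = sum(x * x for xs, _ in subjects for x in xs)
    sxy = sum(x * y for xs, ys in subjects for x, y in zip(xs, ys))
    beta = sxy / sxx
    # "Meat": squared per-subject sums of x * residual (the residual products).
    meat = 0.0
    for xs, ys in subjects:
        score = sum(x * (y - beta * x) for x, y in zip(xs, ys))
        meat += score * score
    # "Bread" is 1 / sxx on each side of the meat.
    return beta, meat / (sxx * sxx)


# Hypothetical data: two subjects, two repeated measurements each.
data = [([1.0, 2.0], [1.0, 2.0]), ([1.0, 2.0], [2.0, 3.0])]
beta, var = sandwich_slope(data)
print(beta, var)
```

With only a few subjects, each squared subject-level score is a noisy variance estimate, which is one way to see why the sandwich estimator can be biased in small samples.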
Abstract:
We propose an iterative estimating equations procedure for analysis of longitudinal data. We show that, under very mild conditions, the probability that the procedure converges at an exponential rate tends to one as the sample size increases to infinity. Furthermore, we show that the limiting estimator is consistent and asymptotically efficient, as expected. The method applies to semiparametric regression models with unspecified covariances among the observations. In the special case of linear models, the procedure reduces to iterative reweighted least squares. Finite sample performance of the procedure is studied by simulations, and compared with other methods. A numerical example from a medical study is considered to illustrate the application of the method.
Abstract:
The method of generalized estimating equations (GEEs) has been criticized recently for a failure to protect against misspecification of working correlation models, which in some cases leads to loss of efficiency or infeasibility of solutions. However, the feasibility and efficiency of GEE methods can be enhanced considerably by using flexible families of working correlation models. We propose two ways of constructing unbiased estimating equations from general correlation models for irregularly timed repeated measures to supplement and enhance GEE. The supplementary estimating equations are obtained by differentiation of the Cholesky decomposition of the working correlation, or as score equations for decoupled Gaussian pseudolikelihood. The estimating equations are solved with computational effort equivalent to that required for a first-order GEE. Full details and analytic expressions are developed for a generalized Markovian model that was evaluated through simulation. Large-sample "sandwich" standard errors for working correlation parameter estimates are derived and shown to have good performance. The proposed estimating functions are further illustrated in an analysis of repeated measures of pulmonary function in children.
Abstract:
Statistical methods are often used to analyse commercial catch and effort data to provide standardised fishing effort and/or a relative index of fish abundance for input into stock assessment models. Achieving reliable results has proved difficult in Australia's Northern Prawn Fishery (NPF), due to a combination of such factors as the biological characteristics of the animals, some aspects of the fleet dynamics, and the changes in fishing technology. For this set of data, we compared four modelling approaches (linear models, mixed models, generalised estimating equations, and generalised linear models) with respect to the outcomes of the standardised fishing effort or the relative index of abundance. We also varied the number and form of vessel covariates in the models. Within a subset of data from this fishery, modelling correlation structures did not alter the conclusions from simpler statistical models. The random-effects models also yielded similar results. This is because the estimators are all consistent even if the correlation structure is mis-specified, and the data set is very large. However, the standard errors from different models differed, suggesting that different methods have different statistical efficiency. We suggest that there is value in modelling the variance function and the correlation structure, to make valid and efficient statistical inferences and gain insight into the data. We found that fishing power was separable from the indices of prawn abundance only when we offset the impact of vessel characteristics at assumed values from external sources. This may be due to the large degree of confounding within the data, and the extreme temporal changes in certain aspects of individual vessels, the fleet and the fleet dynamics.
Abstract:
This dissertation is a theoretical study of finite-state based grammars used in natural language processing. The study is concerned with certain varieties of finite-state intersection grammars (FSIG) whose parsers define regular relations between surface strings and annotated surface strings. The study focuses on the following three aspects of FSIGs: (i) Computational complexity of grammars under limiting parameters. In the study, the computational complexity in practical natural language processing is approached through performance-motivated parameters on structural complexity. Each parameter splits some grammars in the Chomsky hierarchy into an infinite set of subset approximations. When the approximations are regular, they seem to fall into the logarithmic-time hierarchy and the dot-depth hierarchy of star-free regular languages. This theoretical result is important and possibly relevant to grammar induction. (ii) Linguistically applicable structural representations. Related to the linguistically applicable representations of syntactic entities, the study contains new bracketing schemes that cope with dependency links, left- and right-branching, crossing dependencies and spurious ambiguity. New grammar representations that resemble the Chomsky-Schützenberger representation of context-free languages are presented in the study, and they include, in particular, representations for mildly context-sensitive non-projective dependency grammars whose performance-motivated approximations are linear-time parseable. (iii) Compilation and simplification of linguistic constraints. Efficient compilation methods for certain regular operations such as generalized restriction are presented. These include an elegant algorithm that has already been adopted as the approach in a proprietary finite-state tool. In addition to the compilation methods, an approach to on-the-fly simplifications of finite-state representations for parse forests is sketched.
These findings are tightly coupled with each other under the theme of locality. I argue that the findings help us to develop better, linguistically oriented formalisms for finite-state parsing and to develop more efficient parsers for natural language processing. Keywords: syntactic parsing, finite-state automata, dependency grammar, first-order logic, linguistic performance, star-free regular approximations, mildly context-sensitive grammars
Abstract:
The paper presents two new algorithms for the direct parallel solution of systems of linear equations. The algorithms employ a novel recursive doubling technique to obtain solutions to an nth-order system in n steps with no more than 2n(n − 1) processors. Comparing their performance with the Gaussian elimination algorithm (GE), we show that they are almost 100% faster than the latter. This speedup is achieved by dispensing with all the computation involved in the back-substitution phase of GE. It is also shown that the new algorithms exhibit error characteristics which are superior to GE. An n(n + 1) systolic array structure is proposed for the implementation of the new algorithms. We show that complete solutions can be obtained, through these single-phase solution methods, in 5n − log2 n − 4 computational steps, without the need for intermediate I/O operations.
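To make the idea of a single-phase solve concrete, here is a minimal sequential sketch of Gauss-Jordan elimination, a textbook method that also needs no back-substitution phase. It is not the paper's recursive doubling algorithm and involves no parallelism; the function name `gauss_jordan_solve` and the example system are hypothetical.

```python
from fractions import Fraction


def gauss_jordan_solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination (no back-substitution)."""
    n = len(a)
    # Augmented matrix [a | b] in exact rational arithmetic.
    m = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        # Partial pivoting: pick the row with the largest entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0:
            raise ValueError("matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        m[col] = [v / m[col][col] for v in m[col]]
        # Eliminate the column from every other row, above and below.
        for r in range(n):
            if r != col:
                factor = m[r][col]
                m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[col])]
    return [row[n] for row in m]


# Hypothetical 3x3 system with solution x = (2, 3, -1).
x = gauss_jordan_solve([[2, 1, -1], [-3, -1, 2], [-2, 1, 2]], [8, -11, -3])
print(x)
```

Because every column is cleared both above and below the pivot, the solution can be read directly from the last column once the loop finishes.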
Abstract:
In this dissertation, I present an overall methodological framework for studying linguistic alternations, focusing specifically on lexical variation in denoting a single meaning, that is, synonymy. As the practical example, I employ the synonymous set of the four most common Finnish verbs denoting THINK, namely ajatella, miettiä, pohtia and harkita ‘think, reflect, ponder, consider’. As a continuation to previous work, I describe in considerable detail the extension of statistical methods from dichotomous linguistic settings (e.g., Gries 2003; Bresnan et al. 2007) to polytomous ones, that is, concerning more than two possible alternative outcomes. The applied statistical methods are arranged into a succession of stages with increasing complexity, proceeding from univariate via bivariate to multivariate techniques in the end. As the central multivariate method, I argue for the use of polytomous logistic regression and demonstrate its practical implementation to the studied phenomenon, thus extending the work by Bresnan et al. (2007), who applied simple (binary) logistic regression to a dichotomous structural alternation in English. The results of the various statistical analyses confirm that a wide range of contextual features across different categories are indeed associated with the use and selection of the selected think lexemes; however, a substantial part of these features are not exemplified in current Finnish lexicographical descriptions. The multivariate analysis results indicate that the semantic classifications of syntactic argument types are on the average the most distinctive feature category, followed by overall semantic characterizations of the verb chains, and then syntactic argument types alone, with morphological features pertaining to the verb chain and extra-linguistic features relegated to the last position. 
In terms of overall performance of the multivariate analysis and modeling, the prediction accuracy seems to reach a ceiling at a Recall rate of roughly two-thirds of the sentences in the research corpus. The analysis of these results suggests a limit to what can be explained and determined within the immediate sentential context and applying the conventional descriptive and analytical apparatus based on currently available linguistic theories and models. The results also support Bresnan’s (2007) and others’ (e.g., Bod et al. 2003) probabilistic view of the relationship between linguistic usage and the underlying linguistic system, in which only a minority of linguistic choices are categorical, given the known context – represented as a feature cluster – that can be analytically grasped and identified. Instead, most contexts exhibit degrees of variation as to their outcomes, resulting in proportionate choices over longer stretches of usage in texts or speech.
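The probabilistic view described above can be illustrated with the multinomial (softmax) form of polytomous logistic regression, in which each lexeme receives a probability rather than a categorical verdict. This is one common formulation and is only assumed here; the dissertation may use others, and the scores below are hypothetical, not fitted values.

```python
from math import exp


def softmax_probabilities(scores):
    """Convert per-outcome linear scores into probabilities that sum to 1."""
    top = max(scores.values())
    # Subtract the maximum before exponentiating for numerical stability.
    weights = {k: exp(v - top) for k, v in scores.items()}
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}


# Hypothetical linear scores for one sentence context; not fitted values.
scores = {"ajatella": 1.2, "miettiä": 0.4, "pohtia": 0.1, "harkita": -0.5}
probs = softmax_probabilities(scores)
best = max(probs, key=probs.get)
print(best, round(probs[best], 3))
```

Even the highest-scoring lexeme in this sketch falls well short of probability 1, which mirrors the claim that most contexts yield proportionate rather than categorical choices.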