940 results for Factor Analysis, Statistical


Relevance: 100.00%

Abstract:

A study of planktonic foraminiferal assemblages from 19 stations in the neritic and oceanic regions off the Coromandel Coast, Bay of Bengal, was carried out using the multivariate statistical method of factor analysis. On the basis of abundance, 17 foraminiferal species were clustered into 5 groups using row normalisation and varimax rotation for Q-mode factor analysis. The 19 stations were likewise grouped into 5 groups, of which only 2 were statistically significant, using column normalisation and varimax rotation for R-mode analysis. This assemblage-grouping method is suitable because groups of species/stations explain the maximum amount of variation in relation to the prevailing environmental conditions in the study area.
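
The computation itself is not shown in the abstract; as a minimal sketch (hypothetical abundance counts, numpy only, the standard varimax iteration rather than the authors' software), a row-normalised factor extraction with varimax rotation and grouping of species by their dominant loading could look like this:

```python
import numpy as np

def varimax(loadings, gamma=1.0, max_iter=100, tol=1e-6):
    """Standard varimax rotation of a loadings matrix (p variables x k factors)."""
    p, k = loadings.shape
    rotation = np.eye(k)
    var = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        tmp = rotated ** 3 - (gamma / p) * rotated @ np.diag(np.diag(rotated.T @ rotated))
        u, s, vt = np.linalg.svd(loadings.T @ tmp)
        rotation = u @ vt
        var_new = s.sum()
        if var > 0 and var_new / var < 1 + tol:
            break
        var = var_new
    return loadings @ rotation

rng = np.random.default_rng(0)
counts = rng.poisson(20, size=(19, 17)).astype(float)   # hypothetical 19 stations x 17 species

# Row-normalise so each station is described by relative abundances.
rows = counts / counts.sum(axis=1, keepdims=True)
centred = rows - rows.mean(axis=0)

# Principal-axis extraction of 5 factors, then varimax rotation of the species loadings.
u, s, vt = np.linalg.svd(centred, full_matrices=False)
loadings = vt[:5].T * s[:5]            # 17 species x 5 factors
rotated = varimax(loadings)

# Assign each species to the factor on which it loads most heavily.
groups = np.abs(rotated).argmax(axis=1)
print(groups)
```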

Relevance: 100.00%

Abstract:

Statistical shape analysis techniques commonly employed in the medical imaging community, such as active shape models or active appearance models, rely on principal component analysis (PCA) to decompose shape variability into a reduced set of interpretable components. In this paper we propose principal factor analysis (PFA) as an alternative and complementary tool to PCA, providing a decomposition into modes of variation that can be more easily interpreted, while still being an efficient linear technique that performs dimensionality reduction (as opposed to independent component analysis, ICA). The key difference between PFA and PCA is that PFA models the covariance between variables, rather than the total variance in the data. The added value of PFA is illustrated on 2D landmark data of corpora callosa outlines. Then, a study of the 3D shape variability of the human left femur is performed. Finally, we report results on vector-valued 3D deformation fields resulting from non-rigid registration of ventricles in MRI of the brain.
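
As a rough illustration of the comparison (not the authors' pipeline, and with scikit-learn's maximum-likelihood FactorAnalysis standing in for principal factor analysis), both decompositions can be fitted to a hypothetical matrix of flattened 2D landmark coordinates:

```python
import numpy as np
from sklearn.decomposition import PCA, FactorAnalysis

rng = np.random.default_rng(1)
n_shapes, n_landmarks = 120, 32
X = rng.normal(size=(n_shapes, 2 * n_landmarks))   # hypothetical aligned (x, y) landmark coordinates

k = 5
pca = PCA(n_components=k).fit(X)
fa = FactorAnalysis(n_components=k).fit(X)

# PCA modes: orthogonal directions of maximal total variance.
print("PCA explained variance ratio:", pca.explained_variance_ratio_)

# Factor-analysis modes: loadings modelling covariance between coordinates,
# with per-coordinate noise absorbed in noise_variance_.
print("FA loadings shape:", fa.components_.shape)
print("FA per-variable noise (first 5):", fa.noise_variance_[:5])
```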

Relevance: 100.00%

Abstract:

Tumor microenvironmental stresses, such as hypoxia and lactic acidosis, play important roles in tumor progression. Although gene signatures reflecting the influence of these stresses are a powerful approach to linking expression with phenotypes, they do not fully reflect the complexity of human cancers. Here, we describe the use of latent factor models to further dissect the stress gene signatures in a breast cancer expression dataset. The genes in these latent factors are coordinately expressed in tumors and depict distinct, interacting components of the underlying biological processes. The genes in several latent factors are highly enriched in particular chromosomal locations. When these factors are analyzed in independent datasets with gene expression and array CGH data, their expression values are highly correlated with copy number alterations (CNAs) of the corresponding BAC clones in both cell lines and tumors. Therefore, variation in the expression of these pathway-associated factors is at least partially caused by variation in gene dosage and CNAs among breast cancers. We also found that the expression of two latent factors without any chromosomal enrichment is highly associated with 12q CNA, likely an instance of "trans" variation in which a CNA leads to variation in gene expression outside of the CNA region. In addition, we found that factor 26 (1q CNA) is negatively correlated with HIF-1alpha protein and hypoxia pathways in breast tumors and cell lines. This agrees with, and for the first time links, the good prognosis known to be associated with both a low hypoxia signature and the presence of CNA in this region. Taken together, these results suggest that tumor segmental aneuploidy makes a significant contribution to variation in the lactic acidosis/hypoxia gene signatures in human cancers, and demonstrate that latent factor analysis is a powerful means to uncover such a linkage.
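
The factor-versus-copy-number correlation step can be illustrated with a hedged sketch: given hypothetical per-sample latent-factor scores and log2 copy-number ratios for BAC clones, each factor/clone pair could be screened as below (illustrative only; not the authors' code, data, or significance thresholds):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n_tumors, n_factors, n_clones = 80, 10, 50
factor_scores = rng.normal(size=(n_tumors, n_factors))   # hypothetical latent-factor scores
cna = rng.normal(size=(n_tumors, n_clones))              # hypothetical BAC-clone log2 copy-number ratios

# Screen every factor/clone pair; keep strong, nominally significant correlations.
# (In practice a multiple-testing correction would be applied.)
hits = []
for f in range(n_factors):
    for c in range(n_clones):
        r, p = stats.pearsonr(factor_scores[:, f], cna[:, c])
        if abs(r) > 0.3 and p < 0.01:
            hits.append((f, c, round(r, 2), p))

print(f"{len(hits)} factor-clone pairs pass the screen")
```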

Relevance: 100.00%

Abstract:

Various scientific studies have explored the causes of violent behaviour from different perspectives, with psychological tests in particular applied to the analysis of crime factors. Relationships between pairs of factors have also been studied extensively, including the link between age and crime. In reality, many factors interact to contribute to criminal behaviour, and as such there is a need for greater insight into its complex nature. In this article we analyse violent crime information systems containing data on psychological, environmental and genetic factors. Our approach combines elements of rough set theory with fuzzy logic and particle swarm optimisation to yield an algorithm and methodology that can effectively extract multi-knowledge from information systems. The experimental results show that our approach outperforms alternative genetic-algorithm and dynamic reduct-based techniques for reduct identification, and has the added advantage of identifying multiple reducts and hence multi-knowledge (rules). The identified rules are consistent with classical statistical analysis of violent crime data and also reveal new insights into the interaction between several factors. As such, the results are helpful in improving our understanding of the factors contributing to violent crime and in highlighting the existence of hidden and intangible relationships between crime factors.
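
The algorithm itself is not given in the abstract; as a loose sketch of the rough-set/particle-swarm idea (leaving out the fuzzy component, and with entirely hypothetical data and parameters), attribute subsets can be scored by the rough-set dependency degree and searched with a simple binary particle swarm to obtain reduct candidates:

```python
import numpy as np

def dependency(data, decisions, subset):
    """Rough-set dependency degree: fraction of objects whose equivalence class
    under the chosen attributes is consistent with the decision attribute."""
    if not subset:
        return 0.0
    keys = [tuple(row) for row in data[:, subset]]
    consistent = 0
    for key in set(keys):
        idx = [i for i, k in enumerate(keys) if k == key]
        if len(set(decisions[i] for i in idx)) == 1:
            consistent += len(idx)
    return consistent / len(decisions)

def fitness(bits, data, decisions, alpha=0.9):
    # Reward consistency, lightly penalise subset size (favours small reducts).
    subset = [i for i, b in enumerate(bits) if b]
    return alpha * dependency(data, decisions, subset) + (1 - alpha) * (1 - len(subset) / data.shape[1])

def binary_pso(data, decisions, n_particles=20, n_iter=50, seed=0):
    rng = np.random.default_rng(seed)
    n_attr = data.shape[1]
    pos = rng.integers(0, 2, size=(n_particles, n_attr))
    vel = rng.normal(0, 1, size=(n_particles, n_attr))
    pbest = pos.copy()
    pbest_fit = np.array([fitness(p, data, decisions) for p in pos])
    gbest = pbest[pbest_fit.argmax()].copy()
    for _ in range(n_iter):
        for i in range(n_particles):
            r1, r2 = rng.random(n_attr), rng.random(n_attr)
            vel[i] = np.clip(0.7 * vel[i] + 1.5 * r1 * (pbest[i] - pos[i]) + 1.5 * r2 * (gbest - pos[i]), -4, 4)
            pos[i] = (rng.random(n_attr) < 1 / (1 + np.exp(-vel[i]))).astype(int)
            f = fitness(pos[i], data, decisions)
            if f > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i].copy(), f
        gbest = pbest[pbest_fit.argmax()].copy()
    return gbest

# Hypothetical discretised crime-factor table: rows are cases, columns are factors.
rng = np.random.default_rng(3)
data = rng.integers(0, 3, size=(60, 8))
decisions = rng.integers(0, 2, size=60)
best = binary_pso(data, decisions)
print("candidate reduct:", [i for i, b in enumerate(best) if b])
```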

Relevance: 100.00%

Abstract:

Hydrogeological research usually includes statistical studies devised to elucidate the mean background state, characterise relationships among different hydrochemical parameters, and show the influence of human activities. These goals are achieved either by means of a statistical approach or by mixing models between end-members. Compositional data analysis has proved to be effective with the first approach, but there is no commonly accepted solution to the end-member problem in a compositional framework. We present here a possible solution based on factor analysis of compositions, illustrated with a case study. We find two factors on the compositional biplot by fitting two non-centered orthogonal axes to the most representative variables. Each of these axes defines a subcomposition, grouping those variables that lie nearest to it. With each subcomposition a log-contrast is computed and rewritten as an equilibrium equation. These two factors can be interpreted as the isometric log-ratio (ilr) coordinates of three hidden components, which can be plotted in a ternary diagram. These hidden components might be interpreted as end-members. We analysed 14 molarities at 31 sampling stations along the Llobregat River and its tributaries, with monthly measurements over two years. We obtained a biplot explaining 57% of the total variance, from which we extracted two factors: factor G, reflecting geological background enhanced by potash mining, and factor A, essentially controlled by urban and/or farming wastewater. Graphical representation of these two factors allows us to identify three extreme samples, corresponding to pristine waters, potash-mining influence and urban-sewage influence. To confirm this, we have available analyses of the diffuse and widespread point sources identified in the area: springs, potash-mining lixiviates, sewage, and fertilisers. Each of these sources shows a clear link with one of the extreme samples, except the fertilisers, owing to the heterogeneity of their composition. This approach is a useful tool for distinguishing and characterising end-members, an issue that is generally difficult to solve. It is worth noting that the end-member composition cannot be fully estimated but only characterised through log-ratio relationships among components. Moreover, the influence of each end-member in a given sample must be evaluated relative to the other samples. These limitations are intrinsic to the relative nature of compositional data.
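
To make the compositional machinery concrete, here is a minimal hedged sketch on hypothetical data (not the Llobregat measurements): the centred log-ratio transform, a singular-value biplot, and an ilr-type balance (log-contrast) between two hypothetical variable groups of the kind the two factors above are built from:

```python
import numpy as np

rng = np.random.default_rng(4)
X = rng.lognormal(mean=0.0, sigma=0.5, size=(31, 6))   # hypothetical molarities, 31 stations x 6 ions
X = X / X.sum(axis=1, keepdims=True)                   # close the data to a composition

# Centred log-ratio (clr) transform.
logX = np.log(X)
clr = logX - logX.mean(axis=1, keepdims=True)

# Covariance biplot: SVD of the column-centred clr data.
Z = clr - clr.mean(axis=0)
u, s, vt = np.linalg.svd(Z, full_matrices=False)
explained = (s ** 2) / (s ** 2).sum()
print("variance explained by the first two biplot axes:", explained[:2].sum())

# An ilr-type balance (log-contrast) between two subcompositions,
# e.g. a hypothetical "mining" group versus a "wastewater" group of variables.
group_a, group_b = [0, 1, 2], [3, 4]
r, t = len(group_a), len(group_b)
coef = np.sqrt(r * t / (r + t))
balance = coef * (logX[:, group_a].mean(axis=1) - logX[:, group_b].mean(axis=1))
print("balance for the first 5 stations:", np.round(balance[:5], 3))
```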

Relevance: 100.00%

Abstract:

Factor analysis, as a frequently used technique for multivariate data inspection, is also widely used for compositional data analysis. The usual way is to use a centred logratio (clr) transformation to obtain a random vector y of dimension D. The factor model is then

y = Λ f + e    (1)

with the factors f of dimension k < D, the error term e, and the loadings matrix Λ. Under the usual model assumptions (see, e.g., Basilevsky, 1994), the factor analysis model (1) can be written as

Cov(y) = Λ Λ^T + ψ    (2)

where ψ = Cov(e) has diagonal form. The diagonal elements of ψ as well as the loadings matrix Λ are estimated from an estimate of Cov(y). Let Y denote observed clr-transformed data, i.e. realisations of the random vector y. Outliers or deviations from the idealised model assumptions of factor analysis can severely affect the parameter estimation. As a way out, robust estimation of the covariance matrix of Y leads to robust estimates of Λ and ψ in (2); see Pison et al. (2003). Well-known robust covariance estimators with good statistical properties, like the MCD or the S-estimators (see, e.g., Maronna et al., 2006), rely on a full-rank data matrix Y, which is not the case for clr-transformed data (see, e.g., Aitchison, 1986). The isometric logratio (ilr) transformation (Egozcue et al., 2003) solves this singularity problem. The data matrix Y is transformed to a matrix Z by using an orthonormal basis of lower dimension. Using the ilr-transformed data, a robust covariance matrix C(Z) can be estimated. The result can be back-transformed to the clr space by C(Y) = V C(Z) V^T, where the matrix V with orthonormal columns comes from the relation between the clr and the ilr transformation. Now the parameters in model (2) can be estimated (Basilevsky, 1994), and the results have a direct interpretation since the links to the original variables are still preserved. The above procedure is applied to data from geochemistry. Our particular interest is in comparing the results with those of Reimann et al. (2002) for the Kola project data.
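
A hedged numerical sketch of the procedure on hypothetical compositions (with scikit-learn's MinCovDet used as one possible MCD implementation): ilr-transform the data with an orthonormal basis V, estimate a robust covariance C(Z), back-transform via C(Y) = V C(Z) V^T, and extract principal-factor-style loadings from the robust clr covariance:

```python
import numpy as np
from sklearn.covariance import MinCovDet

def ilr_basis(D):
    """Orthonormal (Helmert-type) basis V (D x D-1) with zero-sum columns,
    so that ilr(x) = clr(x) @ V and C(Y) = V C(Z) V^T."""
    V = np.zeros((D, D - 1))
    for j in range(1, D):
        V[:j, j - 1] = 1.0 / np.sqrt(j * (j + 1))
        V[j, j - 1] = -j / np.sqrt(j * (j + 1))
    return V

rng = np.random.default_rng(5)
X = rng.lognormal(sigma=0.4, size=(200, 8))
X = X / X.sum(axis=1, keepdims=True)           # hypothetical geochemical compositions

logX = np.log(X)
clr = logX - logX.mean(axis=1, keepdims=True)  # clr-transformed data Y (rank-deficient)
V = ilr_basis(X.shape[1])
Z = clr @ V                                    # full-rank ilr coordinates

# Robust covariance in ilr coordinates, back-transformed to the clr space.
C_Z = MinCovDet(random_state=0).fit(Z).covariance_
C_Y = V @ C_Z @ V.T

# Principal-factor-style extraction: leading eigenvectors of the robust clr covariance.
eigval, eigvec = np.linalg.eigh(C_Y)
order = np.argsort(eigval)[::-1]
k = 3
loadings = eigvec[:, order[:k]] * np.sqrt(np.maximum(eigval[order[:k]], 0))
print("robust loadings (clr variables x factors):\n", np.round(loadings, 3))
```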

Relevance: 100.00%

Abstract:


Purpose – The purpose of this paper is to investigate and uncover the key determinants that could explain partners' commitment to risk management in public-private partnership projects, so that partners' risk management commitment can be taken into consideration when formulating optimal risk allocation strategies.

Design/methodology/approach – Based on an extensive literature review and an examination of the public-private partnership (PPP) market, an industry-wide questionnaire survey was conducted to collect data for a confirmatory factor analysis. The necessary statistical tests were conducted to ensure the validity of the analysis results.

Findings – The factor analysis results show that the procedure of confirmatory factor analysis is statistically appropriate and satisfactory. As a result, partners' organizational commitment to risk management in public-private partnerships can be determined by a set of components, namely general attitude to risk, perceived ability to manage a risk, and perceived reward for bearing a risk.

Practical implications – It is recommended, based on the empirical results shown in this paper, that, in addition to partners' risk management capability, decision-makers, both from public and private sectors, should also seriously consider partners' risk management commitment. Both factors influence the formation of optimal risk allocation strategies, either by their individual or interacting effects. Future research may therefore explore how to form optimal risk allocation strategies by integrating organizational capability and commitment, the determinants and measurement of which have been established in this study.

Originality/value – This paper makes an original contribution to the general body of knowledge on risk allocation in large-scale infrastructure projects in Australia adopting the procurement method of public-private partnership. In particular, this paper has innovatively established a measurement model of organisational commitment to risk management, which is crucial to determining optimal risk allocation strategies and in turn achieving project success. The score coefficients of all obtained components can be used to construct components by linear combination so that commitment to risk management can be measured. Previous research has barely focused on this topic.
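
For readers wanting a concrete starting point, a three-component commitment measurement model of the kind described in the findings could be specified roughly as follows with the semopy structural-equation package; the indicator names and data are hypothetical and this is not the authors' model specification:

```python
import numpy as np
import pandas as pd
from semopy import Model

# Hypothetical survey data: three indicators per hypothesised component.
rng = np.random.default_rng(6)
df = pd.DataFrame(rng.normal(size=(150, 9)),
                  columns=["att1", "att2", "att3",   # general attitude to risk
                           "abl1", "abl2", "abl3",   # perceived ability to manage a risk
                           "rwd1", "rwd2", "rwd3"])  # perceived reward for bearing a risk

desc = """
attitude =~ att1 + att2 + att3
ability  =~ abl1 + abl2 + abl3
reward   =~ rwd1 + rwd2 + rwd3
"""

model = Model(desc)
model.fit(df)
print(model.inspect())   # factor loadings and variance estimates
```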


Relevance: 100.00%

Abstract:

Multimedia content understanding research requires a rigorous approach to deal with the complexity of the data. At the crux of this problem is how to handle multilevel data whose structure exists at multiple scales and across data sources. A common example is modeling tags jointly with images to improve retrieval, classification and tag recommendation. Associated contextual observations, such as metadata, are rich and can be exploited for content analysis. A major challenge is the need for a principled approach to systematically incorporate associated media with the primary data source of interest. Taking a factor modeling approach, we propose a framework that can discover low-dimensional structures for a primary data source together with other associated information. We cast this task as a subspace learning problem under the framework of Bayesian nonparametrics, so that the subspace dimensionality and the number of clusters are learnt automatically from the data instead of being set a priori. Using Beta processes as the building block, we construct random measures in a hierarchical structure to generate multiple data sources and capture their shared statistical structure at the same time. The model parameters are inferred efficiently using a novel combination of Gibbs and slice sampling. We demonstrate the applicability of the proposed model in three applications: image retrieval, automatic tag recommendation and image classification. Experiments using two real-world datasets show that our approach outperforms various state-of-the-art related methods.
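
The full hierarchical Beta-process model with Gibbs and slice sampling is beyond a short sketch; purely to illustrate the nonparametric idea that the number of latent features is inferred rather than fixed, here is a draw from the Indian buffet process, the combinatorial process obtained by marginalising a Beta-Bernoulli feature model (toy code, unrelated to the authors' implementation):

```python
import numpy as np

def indian_buffet_process(n_customers, alpha, seed=0):
    """Sample a binary feature-allocation matrix whose number of columns
    (latent features) is random rather than fixed in advance."""
    rng = np.random.default_rng(seed)
    dishes = []                        # per-feature count of customers that took it
    rows = []
    for n in range(1, n_customers + 1):
        row = []
        for k, count in enumerate(dishes):
            take = rng.random() < count / n     # reuse an existing feature
            row.append(int(take))
            dishes[k] += int(take)
        new = rng.poisson(alpha / n)            # brand-new features for this customer
        row.extend([1] * new)
        dishes.extend([1] * new)
        rows.append(row)
    K = len(dishes)
    Z = np.zeros((n_customers, K), dtype=int)
    for i, row in enumerate(rows):
        Z[i, :len(row)] = row
    return Z

Z = indian_buffet_process(n_customers=20, alpha=3.0)
print("sampled number of latent features:", Z.shape[1])
```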

Relevance: 100.00%

Abstract:

Exploratory factor analysis (hereafter, factor analysis) is a complex statistical method that is integral to many fields of research. Using factor analysis requires researchers to make several decisions, each of which affects the solutions generated. In this paper, we focus on five major decisions that are made in conducting factor analysis: (i) establishing how large the sample needs to be, (ii) choosing between factor analysis and principal components analysis, (iii) determining the number of factors to retain, (iv) selecting a method of data extraction, and (v) deciding upon the method of factor rotation. The purpose of this paper is threefold: (i) to review the literature with respect to these five decisions, (ii) to assess current practices in nursing research, and (iii) to offer recommendations for future use. The literature reviews illustrate that factor analysis remains a dynamic field of study, with recent research having practical implications for those who use this statistical method. The assessment was conducted on 54 factor analysis (and principal components analysis) solutions presented in the results sections of 28 papers published in the 2012 volumes of the 10 highest-ranked nursing journals, based on their 5-year impact factors. The main findings from the assessment were that researchers commonly used (a) participants-to-items ratios for determining sample sizes (used for 43% of solutions), (b) principal components analysis (61%) rather than factor analysis (39%), (c) the eigenvalues-greater-than-one rule and scree tests to decide upon the number of factors/components to retain (61% and 46%, respectively), (d) principal components analysis and unweighted least squares as methods of data extraction (61% and 19%, respectively), and (e) the Varimax method of rotation (44%). In general, well-established but outdated heuristics and practices informed decision making with respect to the performance of factor analysis in nursing studies. Based on the findings from factor analysis research, it seems likely that the use of such methods may have had a material, adverse effect on the solutions generated. We offer recommendations for future practice with respect to each of the five decisions discussed in this paper.
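
On the retention decision in particular, a commonly recommended alternative to the eigenvalues-greater-than-one rule is Horn's parallel analysis; a minimal sketch on hypothetical questionnaire data (numpy only, not tied to any of the reviewed studies) is:

```python
import numpy as np

def parallel_analysis(X, n_sims=200, percentile=95, seed=0):
    """Retain factors whose observed correlation-matrix eigenvalues exceed the
    chosen percentile of eigenvalues from same-sized random normal data."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    obs = np.sort(np.linalg.eigvalsh(np.corrcoef(X, rowvar=False)))[::-1]
    sims = np.empty((n_sims, p))
    for s in range(n_sims):
        R = np.corrcoef(rng.normal(size=(n, p)), rowvar=False)
        sims[s] = np.sort(np.linalg.eigvalsh(R))[::-1]
    threshold = np.percentile(sims, percentile, axis=0)
    return int(np.sum(obs > threshold)), obs, threshold

# Hypothetical questionnaire data: 300 respondents, 12 items, 2 underlying factors.
rng = np.random.default_rng(7)
F = rng.normal(size=(300, 2))
load = rng.normal(size=(2, 12))
X = F @ load + rng.normal(scale=1.0, size=(300, 12))

k, obs, thr = parallel_analysis(X)
print("factors suggested by parallel analysis:", k)
```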

Relevance: 100.00%

Abstract:

Cattle are a natural reservoir for Shiga-toxigenic Escherichia coli (STEC); however, no data are available on their prevalence and possible association with organic or conventional farming practices. We therefore studied the prevalence of STEC, and specifically of O157:H7, in Swiss dairy cattle by collecting faeces from approximately 500 cows from 60 farms with organic production (OP) and 60 farms with integrated (conventional) production (IP). IP farms were matched to OP farms and were comparable in terms of community, agricultural zone, and number of cows per farm. E. coli were grown overnight in an enrichment medium, followed by DNA isolation and PCR analysis using specific TaqMan assays. STEC were detected on all farms, and O157:H7 were present on 25% of OP farms and 17% of IP farms. STEC were detected in 58% and O157:H7 in 4.6% of individual faecal samples. Multivariate statistical analyses of over 250 parameters revealed several risk factors for the presence of STEC and O157:H7. These risk factors were mainly related to the potential for cross-contamination of feeds and cross-infection of cows, and to the age of the animals. In general, no significant differences between the two farm types were observed concerning the prevalence of, or risk of carrying, STEC or O157:H7. Because the incidence of human disease caused by STEC in Switzerland is low, the risk of human infection appears to be small despite the relatively high prevalence in cattle. Nevertheless, control and prevention practices are indicated to avoid contamination of animal products.
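
The abstract does not specify the multivariate model behind the risk-factor screen; one common approach would be a farm-level logistic regression, sketched below with statsmodels on entirely hypothetical predictors (feed cross-contamination score, herd age, production type) and a simulated outcome:

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(8)
n_farms = 120
df = pd.DataFrame({
    "organic": rng.integers(0, 2, n_farms),           # OP vs IP farm (hypothetical coding)
    "feed_cross_contam": rng.random(n_farms),         # hypothetical exposure score
    "mean_age_months": rng.normal(48, 10, n_farms),   # hypothetical herd age
})
# Simulated outcome: farm tests positive for O157:H7.
logit_true = -2 + 2.0 * df["feed_cross_contam"] - 0.02 * df["mean_age_months"]
df["o157_positive"] = (rng.random(n_farms) < 1 / (1 + np.exp(-logit_true))).astype(int)

# Multivariable logistic regression and odds ratios for the candidate risk factors.
X = sm.add_constant(df[["organic", "feed_cross_contam", "mean_age_months"]])
fit = sm.Logit(df["o157_positive"], X).fit(disp=0)
print(fit.summary())
print("odds ratios:\n", np.exp(fit.params))
```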

Relevance: 100.00%

Abstract:

We provide high-resolution sea surface temperature (SST) and paleoproductivity data focusing on Termination 1. We describe a new method for estimating SSTs based on multivariate statistical analyses performed on modern coccolithophore census data, and we present the first downcore reconstructions derived from coccolithophore assemblages at Ocean Drilling Program (ODP) Site 1233, located offshore Chile. We compare our coccolithophore SST record to alkenone-based SSTs as well as SST reconstructions based on dinoflagellates and radiolaria. All reconstructions generally show a remarkable concordance. As in the alkenone SST record, the Last Glacial Maximum (LGM, 19-23 kyr B.P.) is not clearly defined in our SST reconstruction. After the onset of deglaciation, three major warming steps are recorded: from 18.6 to 18 kyr B.P. (~2.6°C), from 15.7 to 15.3 kyr B.P. (~2.5°C), and from 13 to 11.4 kyr B.P. (~3.4°C). Consistent with the other records from Site 1233 and with Antarctic ice core records, we observe a clear Holocene Climatic Optimum (HCO) from ~8-12 kyr B.P. Combining the SST reconstruction with coccolith absolute abundances and accumulation rates, we show that colder temperatures during the LGM are linked to higher coccolithophore productivity offshore Chile, and warmer SSTs during the HCO to lower coccolithophore productivity, with indications of weak coastal upwelling. We interpret our data in terms of latitudinal displacements of the Southern Westerlies and the northern margin of the Antarctic Circumpolar Current system over the deglaciation and the Holocene.
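
The new census-based method is not spelled out in the abstract; purely as an illustration of how census counts can be turned into SST estimates, here is a sketch of the modern analog technique (squared chord distance to a hypothetical modern calibration set), a standard census-based transfer approach that is not necessarily the authors' method:

```python
import numpy as np

def squared_chord(a, b):
    return np.sum((np.sqrt(a) - np.sqrt(b)) ** 2, axis=-1)

def mat_sst(downcore, modern_census, modern_sst, k=5):
    """Modern analog technique: average the SSTs of the k modern samples whose
    relative-abundance spectra are closest (squared chord distance) to each
    downcore sample."""
    estimates = []
    for sample in downcore:
        d = squared_chord(modern_census, sample)
        nearest = np.argsort(d)[:k]
        estimates.append(modern_sst[nearest].mean())
    return np.array(estimates)

rng = np.random.default_rng(9)
n_modern, n_species, n_downcore = 150, 12, 40
modern_census = rng.dirichlet(np.ones(n_species), size=n_modern)   # hypothetical relative abundances
modern_sst = rng.uniform(8, 20, n_modern)                          # hypothetical calibration SSTs
downcore = rng.dirichlet(np.ones(n_species), size=n_downcore)

print(np.round(mat_sst(downcore, modern_census, modern_sst)[:5], 2))
```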