951 results for matrix-geometric analysis


Relevance:

30.00%

Abstract:

When continuous data are coded to categorical variables, two types of coding are possible: crisp coding in the form of indicator, or dummy, variables with values either 0 or 1; or fuzzy coding where each observation is transformed to a set of "degrees of membership" between 0 and 1, using so-called membership functions. It is well known that the correspondence analysis of crisp coded data, namely multiple correspondence analysis, yields principal inertias (eigenvalues) that considerably underestimate the quality of the solution in a low-dimensional space. Since the crisp data only code the categories to which each individual case belongs, an alternative measure of fit is simply to count how well these categories are predicted by the solution. Another approach is to consider multiple correspondence analysis equivalently as the analysis of the Burt matrix (i.e., the matrix of all two-way cross-tabulations of the categorical variables), and then perform a joint correspondence analysis to fit just the off-diagonal tables of the Burt matrix; the measure of fit is then computed as the quality of explaining these tables only. The correspondence analysis of fuzzy coded data, called "fuzzy multiple correspondence analysis", suffers from the same problem, albeit attenuated. Again, one can count how many correct predictions are made of the categories that have the highest degree of membership. But here one can also defuzzify the results of the analysis to obtain estimated values of the original data, and then calculate a measure of fit in the familiar percentage form, thanks to the resultant orthogonal decomposition of variance. Furthermore, if one thinks of fuzzy multiple correspondence analysis as explaining the two-way associations between variables, a fuzzy Burt matrix can be computed and the same strategy as in the crisp case can be applied to analyse the off-diagonal part of this matrix.
In this paper these alternative measures of fit are defined and applied to a data set of continuous meteorological variables, which are coded crisply and fuzzily into three categories. Measuring the fit is further discussed when the data set consists of a mixture of discrete and continuous variables.
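
The difference between crisp and fuzzy coding into three categories can be illustrated with a minimal Python sketch. The abstract does not say which membership functions are used; triangular functions with three hinge points (`low`, `mid`, `high`) are assumed here, and all function names are hypothetical.

```python
def fuzzy_code(x, low, mid, high):
    """Triangular fuzzy coding of x into three memberships that sum to 1.

    Assumption: triangular membership functions hinged at low < mid < high.
    Values outside [low, high] are clamped to the nearest hinge.
    """
    x = min(max(x, low), high)
    if x <= mid:
        m_low = (mid - x) / (mid - low)
        return (m_low, 1.0 - m_low, 0.0)
    m_high = (x - mid) / (high - mid)
    return (0.0, 1.0 - m_high, m_high)


def crisp_code(x, low, mid, high):
    """Crisp (0/1 indicator) coding: 1 for the category with highest membership."""
    memberships = fuzzy_code(x, low, mid, high)
    best = memberships.index(max(memberships))  # ties go to the lower category
    return tuple(1 if i == best else 0 for i in range(3))


def defuzzify(memberships, low, mid, high):
    """Recover an estimate of the original value from its memberships."""
    m_low, m_mid, m_high = memberships
    return m_low * low + m_mid * mid + m_high * high


if __name__ == "__main__":
    hinges = (10.0, 15.0, 25.0)  # hypothetical hinge points, e.g. a temperature scale
    coded = fuzzy_code(20.0, *hinges)
    print(coded, crisp_code(20.0, *hinges), defuzzify(coded, *hinges))
```

With triangular functions, defuzzifying the memberships of a value inside the hinge range returns that value exactly, which is the property that makes a percentage-of-variance measure of fit possible for fuzzy coded data; the crisp code keeps only the category.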

Relevance:

30.00%

Abstract:

Meta-analysis of genome-wide association studies (GWASs) has led to the discoveries of many common variants associated with complex human diseases. There is a growing recognition that identifying "causal" rare variants also requires large-scale meta-analysis. The fact that association tests with rare variants are performed at the gene level rather than at the variant level poses unprecedented challenges in the meta-analysis. First, different studies may adopt different gene-level tests, so the results are not compatible. Second, gene-level tests require multivariate statistics (i.e., components of the test statistic and their covariance matrix), which are difficult to obtain. To overcome these challenges, we propose to perform gene-level tests for rare variants by combining the results of single-variant analysis (i.e., p values of association tests and effect estimates) from participating studies. This simple strategy is possible because of an insight that multivariate statistics can be recovered from single-variant statistics, together with the correlation matrix of the single-variant test statistics, which can be estimated from one of the participating studies or from a publicly available database. We show both theoretically and numerically that the proposed meta-analysis approach provides accurate control of the type I error and is as powerful as joint analysis of individual participant data. This approach accommodates any disease phenotype and any study design and produces all commonly used gene-level tests. An application to the GWAS summary results of the Genetic Investigation of ANthropometric Traits (GIANT) consortium reveals rare and low-frequency variants associated with human height. The relevant software is freely available.
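
The idea of recovering multivariate statistics from single-variant results can be sketched in Python for one gene-level test. This is an illustration under stated assumptions, not the authors' software: it assumes the score statistic for variant j is approximated by beta_j / se_j^2 with variance 1 / se_j^2, that the supplied correlation matrix applies to the single-variant statistics, and that a simple weighted burden test is wanted. The function name and inputs are hypothetical.

```python
def burden_statistic(betas, ses, corr, weights):
    """Burden-type gene-level statistic from single-variant summary results.

    Assumptions (not from the source): score U_j ~ beta_j / se_j**2,
    Var(U_j) ~ 1 / se_j**2, Cov(U_j, U_k) = corr[j][k] * sqrt(V_jj * V_kk).
    Returns (w'U)^2 / (w'Vw), approximately chi-square with 1 df under the null.
    """
    n = len(betas)
    v_diag = [1.0 / (se * se) for se in ses]
    scores = [b * v for b, v in zip(betas, v_diag)]
    # Rebuild the covariance matrix of the score statistics.
    cov = [
        [corr[j][k] * (v_diag[j] * v_diag[k]) ** 0.5 for k in range(n)]
        for j in range(n)
    ]
    numerator = sum(w * u for w, u in zip(weights, scores)) ** 2
    denominator = sum(
        weights[j] * cov[j][k] * weights[k] for j in range(n) for k in range(n)
    )
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical two-variant gene with correlated test statistics.
    stat = burden_statistic(
        betas=[0.5, 0.2],
        ses=[0.25, 0.2],
        corr=[[1.0, 0.3], [0.3, 1.0]],
        weights=[1.0, 1.0],
    )
    print(round(stat, 4))
```

Because only effect estimates, standard errors and a correlation matrix enter the calculation, the same inputs could be reused for other gene-level tests that are functions of the score vector and its covariance matrix.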

Relevance:

30.00%

Abstract:

Doxorubicin is an antineoplastic agent active against sarcoma pulmonary metastasis, but its clinical use is hampered by its myelotoxicity and its cumulative cardiotoxicity when administered systemically. This limitation may be circumvented using the isolated lung perfusion (ILP) approach, wherein a therapeutic agent is infused locoregionally after vascular isolation of the lung. The influence of the mode of infusion (anterograde (AG): through the pulmonary artery (PA); retrograde (RG): through the pulmonary vein (PV)) on doxorubicin pharmacokinetics and lung distribution was unknown. Therefore, a simple, rapid and sensitive high-performance liquid chromatography method has been developed to quantify doxorubicin in four different biological matrices (infusion effluent, serum, tissues with low or high levels of doxorubicin). The related compound daunorubicin was used as internal standard (I.S.). Following a single-step protein precipitation of 500 µL samples with 250 µL acetone and 50 µL zinc sulfate 70% aqueous solution, the obtained supernatant was evaporated to dryness at 60 °C for exactly 45 min under a stream of nitrogen and the solid residue was solubilized in 200 µL of purified water. A 100 µL volume was subjected to HPLC analysis on a Nucleosil 100-5 µm C18 AB column equipped with a guard column (Nucleosil 100-5 µm C6H5 (phenyl) end-capped) using a gradient elution of acetonitrile and 1-heptanesulfonic acid 0.2% pH 4: 15/85 at 0 min → 50/50 at 20 min → 100/0 at 22 min → 15/85 at 24 min → 15/85 at 26 min, delivered at 1 mL/min. The analytes were detected by fluorescence detection with excitation and emission wavelengths set at 480 and 550 nm, respectively. The calibration curves were linear over the range of 2-1000 ng/mL for effluent and plasma matrices, and 0.1-750 µg/g for tissue matrices.
The method is precise, with inter-day and intra-day relative standard deviations between 0.5 and 6.7%, and accurate, with inter-day and intra-day deviations between -5.4 and +7.7%. The in vitro stability in all matrices and in processed samples has been studied at -80 °C for 1 month, and at 4 °C for 48 h, respectively. During initial studies, heparin used as anticoagulant was found to profoundly influence the measurements of doxorubicin in effluents collected from animals under ILP. Moreover, the strong matrix effect observed with tissue samples indicates that it is mandatory to prepare doxorubicin calibration standard samples in biological matrices that reflect as closely as possible the composition of the samples to be analyzed. This method was successfully applied in animal studies for the analysis of effluent, serum and tissue samples collected from pigs and rats undergoing ILP.

Relevance:

30.00%

Abstract:

The use of simple and multiple correspondence analysis is well-established in social science research for understanding relationships between two or more categorical variables. By contrast, canonical correspondence analysis, which is a correspondence analysis with linear restrictions on the solution, has become one of the most popular multivariate techniques in ecological research. Multivariate ecological data typically consist of frequencies of observed species across a set of sampling locations, as well as a set of observed environmental variables at the same locations. In this context the principal dimensions of the biological variables are sought in a space that is constrained to be related to the environmental variables. This restricted form of correspondence analysis has many uses in social science research as well, as is demonstrated in this paper. We first illustrate the result that canonical correspondence analysis of an indicator matrix, restricted to be related to an external categorical variable, reduces to a simple correspondence analysis of a set of concatenated (or stacked) tables. Then we show how canonical correspondence analysis can be used to focus on, or partial out, a particular set of response categories in sample survey data. For example, the method can be used to partial out the influence of missing responses, which usually dominate the results of a multiple correspondence analysis.

Relevance:

30.00%

Abstract:

Standard methods for the analysis of linear latent variable models often rely on the assumption that the vector of observed variables is normally distributed. This normality assumption (NA) plays a crucial role in assessing optimality of estimates, in computing standard errors, and in designing an asymptotic chi-square goodness-of-fit test. The asymptotic validity of NA inferences when the data deviate from normality has been called asymptotic robustness. In the present paper we extend previous work on asymptotic robustness to a general context of multi-sample analysis of linear latent variable models, with a latent component of the model allowed to be fixed across (hypothetical) sample replications, and with the asymptotic covariance matrix of the sample moments not necessarily finite. We will show that, under certain conditions, the matrix $\Gamma$ of asymptotic variances of the analyzed sample moments can be substituted by a matrix $\Omega$ that is a function only of the cross-product moments of the observed variables. The main advantage of this is that inferences based on $\Omega$ are readily available in standard software for covariance structure analysis, and do not require computing sample fourth-order moments. An illustration with simulated data in the context of regression with errors in variables will be presented.

Relevance:

30.00%

Abstract:

γ-Hydroxybutyric acid (GHB) is an endogenous short-chain fatty acid popular as a recreational drug due to sedative and euphoric effects, but also often implicated in drug-facilitated sexual assaults owing to disinhibition and amnesic properties. Whilst discrimination between endogenous and exogenous GHB, as required in intoxication cases, may be achieved by the determination of the carbon isotope content, such information has not yet been exploited to answer source inference questions of forensic investigation and intelligence interest. However, potential isotopic fractionation effects occurring through the whole metabolism of GHB may be a major concern in this regard. Thus, urine specimens from six healthy male volunteers who ingested prescription GHB sodium salt, marketed as Xyrem®, were analysed by means of gas chromatography/combustion/isotope ratio mass spectrometry to assess this particular topic. A very narrow range of δ13C values, spreading from -24.81‰ to -25.06‰, was observed, whilst the mean δ13C value of Xyrem® corresponded to -24.99‰. Since urine samples and prescription drug could not be distinguished by means of statistical analysis, carbon isotopic effects and subsequent influence on δ13C values through GHB metabolism as a whole could be ruled out. Thus, a link between GHB as a raw matrix and GHB found in a biological fluid may be established, bringing relevant information regarding source inference evaluation. Therefore, this study supports a diversified scope of exploitation for stable isotopes characterized in biological matrices, from investigations on intoxication cases to drug intelligence programmes.

Relevance:

30.00%

Abstract:

Structural equation models are widely used in economic, social and behavioral studies to analyze linear interrelationships among variables, some of which may be unobservable or subject to measurement error. Alternative estimation methods that exploit different distributional assumptions are now available. The present paper deals with issues of asymptotic statistical inferences, such as the evaluation of standard errors of estimates and chi-square goodness-of-fit statistics, in the general context of mean and covariance structures. The emphasis is on drawing correct statistical inferences regardless of the distribution of the data and the method of estimation employed. A (distribution-free) consistent estimate of $\Gamma$, the matrix of asymptotic variances of the vector of sample second-order moments, will be used to compute robust standard errors and a robust chi-square goodness-of-fit statistic. Simple modifications of the usual estimate of $\Gamma$ will also permit correct inferences in the case of multi-stage complex samples. We will also discuss the conditions under which, regardless of the distribution of the data, one can rely on the usual (non-robust) inferential statistics. Finally, a multivariate regression model with errors-in-variables will be used to illustrate, by means of simulated data, various theoretical aspects of the paper.

Relevance:

30.00%

Abstract:

In moment structure analysis with nonnormal data, asymptotically valid inferences require the computation of a consistent (under general distributional assumptions) estimate of the matrix $\Gamma$ of asymptotic variances of sample second-order moments. Such a consistent estimate involves the fourth-order sample moments of the data. In practice, the use of fourth-order moments leads to computational burden and lack of robustness against small samples. In this paper we show that, under certain assumptions, correct asymptotic inferences can be attained when $\Gamma$ is replaced by a matrix $\Omega$ that involves only the second-order moments of the data. The present paper extends to the context of multi-sample analysis of second-order moment structures results derived in the context of (simple-sample) covariance structure analysis (Satorra and Bentler, 1990). The results apply to a variety of estimation methods and general types of statistics. An example involving a test of equality of means under covariance restrictions illustrates theoretical aspects of the paper.

Relevance:

30.00%

Abstract:

Signal search analysis is a general method to discover and characterize sequence motifs that are positionally correlated with a functional site (e.g. a transcription or translation start site). The method has played an instrumental role in the analysis of eukaryotic promoter elements. The signal search analysis server provides access to four different computer programs as well as to a large number of precompiled functional site collections. The programs offered allow: (i) the identification of non-random sequence regions under evolutionary constraint; (ii) the detection of consensus sequence-based motifs that are over- or under-represented at a particular distance from a functional site; (iii) the analysis of the positional distribution of a consensus sequence- or weight matrix-based sequence motif around a functional site; and (iv) the optimization of a weight matrix description of a locally over-represented sequence motif. These programs can be accessed at: http://www.isrec.isb-sib.ch/ssa/.

Relevance:

30.00%

Abstract:

Power transformations of positive data tables, prior to applying the correspondence analysis algorithm, are shown to open up a family of methods with direct connections to the analysis of log-ratios. Two variations of this idea are illustrated. The first approach is simply to power the original data and perform a correspondence analysis; this method is shown to converge to unweighted log-ratio analysis as the power parameter tends to zero. The second approach is to apply the power transformation to the contingency ratios, that is, the values in the table relative to expected values based on the marginals; this method converges to weighted log-ratio analysis, or the spectral map. Two applications are described: first, a matrix of population genetic data which is inherently two-dimensional, and second, a larger cross-tabulation with higher dimensionality, from a linguistic analysis of several books.
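
The abstract does not spell out why a vanishing power leads to logarithms. The standard limit behind this kind of convergence, written here with $\alpha$ as an assumed symbol for the power parameter and $x > 0$ for a table entry, is:

$$
x^{\alpha} = e^{\alpha \log x} = 1 + \alpha \log x + O(\alpha^{2})
\quad\Longrightarrow\quad
\lim_{\alpha \to 0} \frac{x^{\alpha} - 1}{\alpha} = \log x .
$$

For small $\alpha$ the powered data therefore behave, after shifting and rescaling, like the logarithms of the original values.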

Relevance:

30.00%

Abstract:

Closely related species may be very difficult to distinguish morphologically, yet sometimes morphology is the only reasonable possibility for taxonomic classification. Here we present learning-vector-quantization artificial neural networks as a powerful tool to classify specimens on the basis of geometric morphometric shape measurements. As an example, we trained a neural network to distinguish between field and root voles from Procrustes transformed landmark coordinates on the dorsal side of the skull, which is so similar in these two species that the human eye cannot make this distinction. Properly trained neural networks misclassified only 3% of specimens. Therefore, we conclude that the capacity of learning vector quantization neural networks to analyse spatial coordinates is a powerful tool among the range of pattern recognition procedures that is available to employ the information content of geometric morphometrics.
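
For readers unfamiliar with learning vector quantization, the following Python sketch shows one training step of the basic LVQ1 rule and nearest-prototype classification. It is a generic illustration, not the network used in the study: the abstract does not state which LVQ variant, learning rate or number of prototypes was used, and the two-dimensional coordinates below are made up.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two coordinate vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def nearest_prototype(prototypes, sample):
    """Index of the prototype closest to the sample."""
    return min(range(len(prototypes)),
               key=lambda i: squared_distance(prototypes[i], sample))


def lvq1_update(prototypes, labels, sample, sample_label, learning_rate):
    """One LVQ1 step: move the winning prototype toward the sample if the
    labels agree, away from it otherwise. Returns a new prototype list."""
    winner = nearest_prototype(prototypes, sample)
    sign = 1.0 if labels[winner] == sample_label else -1.0
    updated = [list(p) for p in prototypes]
    updated[winner] = [w + sign * learning_rate * (x - w)
                       for w, x in zip(prototypes[winner], sample)]
    return updated


def classify(prototypes, labels, sample):
    """Assign the label of the nearest prototype."""
    return labels[nearest_prototype(prototypes, sample)]


if __name__ == "__main__":
    # Hypothetical prototypes in a 2-D shape space, one per species.
    prototypes = [[0.0, 0.0], [1.0, 1.0]]
    labels = ["field vole", "root vole"]
    prototypes = lvq1_update(prototypes, labels, [0.2, 0.0], "field vole", 0.5)
    print(prototypes, classify(prototypes, labels, [0.9, 0.8]))
```

In a morphometric setting each sample would be the vector of Procrustes-aligned landmark coordinates of one skull rather than a two-dimensional point.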

Relevance:

30.00%

Abstract:

The application of correspondence analysis to square asymmetric tables is often unsuccessful because of the strong role played by the diagonal entries of the matrix, obscuring the data off the diagonal. A simple modification of the centering of the matrix, coupled with the corresponding change in row and column masses and row and column metrics, allows the table to be decomposed into symmetric and skew-symmetric components, which can then be analyzed separately. The symmetric and skew-symmetric analyses can be performed using a simple correspondence analysis program if the data are set up in a special block format.
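
The additive split that underlies this approach can be shown in a few lines of Python. The sketch covers only the decomposition of a square table N into a symmetric part (N + Nᵀ)/2 and a skew-symmetric part (N − Nᵀ)/2; the modified centering, masses and metrics described in the abstract are not reproduced, and the example table is made up.

```python
def transpose(matrix):
    """Transpose of a square matrix given as a list of rows."""
    return [list(column) for column in zip(*matrix)]


def symmetric_skew_parts(matrix):
    """Split a square matrix N into S = (N + N^T)/2 and K = (N - N^T)/2."""
    t = transpose(matrix)
    size = len(matrix)
    sym = [[(matrix[i][j] + t[i][j]) / 2 for j in range(size)] for i in range(size)]
    skew = [[(matrix[i][j] - t[i][j]) / 2 for j in range(size)] for i in range(size)]
    return sym, skew


if __name__ == "__main__":
    # Hypothetical 2 x 2 flow table (rows: origin, columns: destination).
    table = [[10, 4], [2, 7]]
    sym, skew = symmetric_skew_parts(table)
    print(sym)   # symmetric part, carries the diagonal
    print(skew)  # skew-symmetric part, zero on the diagonal
```

The diagonal ends up entirely in the symmetric part, so the skew-symmetric part isolates the asymmetries in the off-diagonal cells.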

Relevance:

30.00%

Abstract:

This paper investigates what has caused output and inflation volatility to fall in the US, using a small-scale structural model estimated with Bayesian techniques and rolling samples. There are instabilities in the posterior of the parameters describing the private sector, the policy rule and the standard deviation of the shocks. Results are robust to the specification of the policy rule. Changes in the parameters describing the private sector are the largest, but those of the policy rule and the covariance matrix of the shocks explain the changes most.

Relevance:

30.00%

Abstract:

We consider the joint visualization of two matrices which have common rows and columns, for example multivariate data observed at two time points or split according to a dichotomous variable. Methods of interest include principal components analysis for interval-scaled data, or correspondence analysis for frequency data or ratio-scaled variables on commensurate scales. A simple result in matrix algebra shows that by setting up the matrices in a particular block format, matrix sum and difference components can be visualized. The case when we have more than two matrices is also discussed and the methodology is applied to data from the International Social Survey Program.
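
The abstract does not state which block format is meant. One arrangement with the described property, assumed here for illustration, is [[A, B], [B, A]]: applied to a stacked vector (x, x) it acts as A + B, and applied to (x, −x) it acts as A − B. The Python sketch below checks this numerically on made-up matrices.

```python
def block_matrix(a, b):
    """Arrange two equally sized matrices as [[A, B], [B, A]] (assumed layout)."""
    top = [row_a + row_b for row_a, row_b in zip(a, b)]
    bottom = [row_b + row_a for row_a, row_b in zip(a, b)]
    return top + bottom


def matvec(matrix, vector):
    """Matrix-vector product for lists of rows."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def combine(a, b, sign):
    """Elementwise A + sign * B."""
    return [[x + sign * y for x, y in zip(row_a, row_b)]
            for row_a, row_b in zip(a, b)]


if __name__ == "__main__":
    a = [[1, 2], [3, 4]]   # hypothetical data at time point 1
    b = [[0, 1], [1, 0]]   # hypothetical data at time point 2
    x = [1, 2]
    blocked = block_matrix(a, b)
    # Sum component: the block matrix acts on (x, x) like A + B on x.
    print(matvec(blocked, x + x), matvec(combine(a, b, 1), x))
    # Difference component: it acts on (x, -x) like A - B on x.
    print(matvec(blocked, x + [-v for v in x]), matvec(combine(a, b, -1), x))
```

Under this layout the singular vectors of the block matrix separate into those of the sum and those of the difference, which is why a single decomposition can display both components.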

Relevance:

30.00%

Abstract:

The generalization of simple (two-variable) correspondence analysis to more than two categorical variables, commonly referred to as multiple correspondence analysis, is neither obvious nor well-defined. We present two alternative ways of generalizing correspondence analysis, one based on the quantification of the variables and intercorrelation relationships, and the other based on the geometric ideas of simple correspondence analysis. We propose a version of multiple correspondence analysis, with adjusted principal inertias, as the method of choice for the geometric definition, since it contains simple correspondence analysis as an exact special case, which is not the situation of the standard generalizations. We also clarify the issue of supplementary point representation and the properties of joint correspondence analysis, a method that visualizes all two-way relationships between the variables. The methodology is illustrated using data on attitudes to science from the International Social Survey Program on Environment in 1993.