967 results for Data matrix
Abstract:
One of the tantalising remaining problems in compositional data analysis lies in how to deal with data sets in which there are components which are essential zeros. By an essential zero we mean a component which is truly zero, not something recorded as zero simply because the experimental design or the measuring instrument has not been sufficiently sensitive to detect a trace of the part. Such essential zeros occur in many compositional situations, such as household budget patterns, time budgets, palaeontological zonation studies, ecological abundance studies. Devices such as nonzero replacement and amalgamation are almost invariably ad hoc and unsuccessful in such situations. From consideration of such examples it seems sensible to build up a model in two stages, the first determining where the zeros will occur and the second how the unit available is distributed among the non-zero parts. In this paper we suggest two such models, an independent binomial conditional logistic normal model and a hierarchical dependent binomial conditional logistic normal model. The compositional data in such modelling consist of an incidence matrix and a conditional compositional matrix. Interesting statistical problems arise, such as the question of estimability of parameters, the nature of the computational process for the estimation of both the incidence and compositional parameters caused by the complexity of the subcompositional structure, the formation of meaningful hypotheses, and the devising of suitable testing methodology within a lattice of such essential zero-compositional hypotheses. The methodology is illustrated by application to both simulated and real compositional data.
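To make the two-stage structure concrete, here is a minimal Python sketch (not the authors' estimation procedure) that simulates data of the kind described: an independent binomial incidence model decides which parts are present, and an additive logistic normal draw distributes the unit among the non-zero parts, yielding an incidence matrix and a conditional compositional matrix. The presence probabilities `p`, the latent mean `mu` and the covariance `Sigma` are illustrative placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_zero_composition(n, p, mu, Sigma):
    """Simulate n compositions with essential zeros.

    Stage 1: each of the D parts is present with probability p[d]
             (independent binomial incidence model).
    Stage 2: the unit is shared among the present parts via an
             additive logistic normal draw restricted to those parts
             (a simplification: the latent normal is indexed by the
             present parts, with the last present part as reference).
    Returns the incidence matrix Z and the composition matrix X.
    """
    D = len(p)
    Z = rng.binomial(1, p, size=(n, D))          # incidence matrix
    X = np.zeros((n, D))                          # conditional compositions
    for i in range(n):
        parts = np.flatnonzero(Z[i])
        if parts.size == 0:                       # guarantee at least one part
            parts = np.array([rng.integers(D)])
            Z[i, parts] = 1
        if parts.size == 1:
            X[i, parts] = 1.0
            continue
        idx = parts[:-1]
        z = rng.multivariate_normal(mu[idx], Sigma[np.ix_(idx, idx)])
        e = np.exp(z)
        X[i, parts[:-1]] = e / (1.0 + e.sum())
        X[i, parts[-1]] = 1.0 / (1.0 + e.sum())
    return Z, X

D = 4
p = np.array([0.9, 0.7, 0.5, 0.8])               # presence probabilities (illustrative)
mu = np.zeros(D)
Sigma = 0.5 * np.eye(D)
Z, X = simulate_zero_composition(5, p, mu, Sigma)
print(Z)
print(X.round(3), X.sum(axis=1))                 # non-zero rows each sum to 1
```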
Abstract:
Objectives: Polychlorinated biphenyls (PCBs) are considered probable human carcinogens by the International Agency for Research on Cancer and one congener, PCB126, has been rated as a known human carcinogen. A period-specific job exposure matrix (JEM) was developed for former PCB-exposed capacitor manufacturing workers (n=12,605) (1938-1977). Methods: A detailed exposure assessment for this plant was based on a number of exposure determinants (proximity, degree of contact with PCBs, temperature, ventilation, process control, job mobility). The intensity and frequency of PCB exposures by job for both inhalation and dermal exposures, and additional chemical exposures were reviewed. The JEM was developed in nine steps: (1) all unique jobs (n=1,684) were assessed using (2) defined PCB exposure determinants; (3) the exposure determinants were used to develop exposure profiles; (4) similar exposure profiles were combined into categories having similar PCB exposures; (5) qualitative intensity (high-medium-low-baseline) and frequency (continuous-intermittent) ratings were developed, and (6) used to qualitatively rate inhalation and dermal exposure separately for each category; (7) quantitative intensity ratings based on available air concentrations were developed for inhalation and dermal exposures based on equal importance of both routes of exposure; (8) adjustments were made for overall exposure, and (9) for each category the product of intensity and frequency was calculated, and exposure in the earlier era was weighted. Results: The result was a period-specific JEM modified for two eras of stable PCB exposure conditions. Conclusions: These exposure estimates, derived from a systematic and rigorous use of the exposure determinant data, lead to cumulative PCB exposure-response relationships in the epidemiological cancer mortality and incidence studies of this cohort. [Authors]
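Step (9), the product of intensity and frequency with a weighting of the earlier era, can be illustrated with a minimal Python sketch. The numeric ratings, the era weight and the job-history format below are hypothetical placeholders, not values from the published JEM.

```python
# Illustrative only: numeric ratings and the era weight are placeholders,
# not values taken from the published JEM.
INTENSITY = {"baseline": 0.1, "low": 1.0, "medium": 3.0, "high": 10.0}
FREQUENCY = {"intermittent": 0.3, "continuous": 1.0}
EARLY_ERA_WEIGHT = 2.0          # hypothetical up-weighting of the earlier era

def category_score(intensity, frequency, early_era):
    """Step 9: product of intensity and frequency, weighted for the early era."""
    score = INTENSITY[intensity] * FREQUENCY[frequency]
    return score * EARLY_ERA_WEIGHT if early_era else score

def cumulative_exposure(job_history):
    """Sum score * years over a worker's job history.

    job_history: list of (intensity, frequency, early_era, years) tuples.
    """
    return sum(category_score(i, f, e) * years for i, f, e, years in job_history)

history = [("high", "continuous", True, 5), ("low", "intermittent", False, 12)]
print(cumulative_exposure(history))   # 2*10*1*5 + 1*0.3*12 = 103.6
```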
Abstract:
A traditional photonic-force microscope (PFM) results in huge sets of data, which require tedious numerical analysis. In this paper, we propose instead an analog signal processor to attain real-time capabilities while retaining the richness of the traditional PFM data. Our system is devoted to intracellular measurements and is fully interactive through the use of a haptic joystick. Using our specialized analog hardware along with a dedicated algorithm, we can extract the full 3D stiffness matrix of the optical trap in real time, including the off-diagonal cross-terms. Our system is also capable of simultaneously recording data for subsequent offline analysis. This allows us to check that a good correlation exists between the classical analysis of stiffness and our real-time measurements. We monitor the PFM beads using an optical microscope. The force-feedback mechanism of the haptic joystick helps us in interactively guiding the bead inside living cells and collecting information from its (possibly anisotropic) environment. The instantaneous stiffness measurements are also displayed in real time on a graphical user interface. The whole system has been built and is operational; here we present early results that confirm the consistency of the real-time measurements with offline computations.
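For context, the classical offline analysis of trap stiffness (the kind of computation the real-time measurements are checked against) is commonly based on the equipartition relation: for a bead in an approximately harmonic trap, the full 3x3 stiffness matrix, including off-diagonal cross-terms, follows from the covariance of its thermal position fluctuations as K = kB*T*C^-1. The sketch below is a generic illustration of that relation on simulated positions, not the analog processor or algorithm of the paper.

```python
import numpy as np

kB = 1.380649e-23           # Boltzmann constant, J/K
T = 300.0                   # absolute temperature, K

def stiffness_matrix(positions, temperature=T):
    """Estimate the 3x3 trap stiffness matrix from bead positions (N x 3, metres)
    via the equipartition relation K = kB*T * C^{-1}, where C is the covariance
    of the thermal position fluctuations. Off-diagonal terms of K capture the
    cross-coupling between axes."""
    C = np.cov(positions, rowvar=False)          # 3x3 position covariance
    return kB * temperature * np.linalg.inv(C)

# Toy example: simulated positions of a trapped bead (~10 nm fluctuations)
rng = np.random.default_rng(1)
pos = rng.multivariate_normal(np.zeros(3), 1e-16 * np.eye(3), size=10000)
print(stiffness_matrix(pos))                     # ~ kB*T/1e-16 on the diagonal, in N/m
```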
Abstract:
When continuous data are coded to categorical variables, two types of coding are possible: crisp coding in the form of indicator, or dummy, variables with values either 0 or 1; or fuzzy coding where each observation is transformed to a set of "degrees of membership" between 0 and 1, using so-called membership functions. It is well known that the correspondence analysis of crisp coded data, namely multiple correspondence analysis, yields principal inertias (eigenvalues) that considerably underestimate the quality of the solution in a low-dimensional space. Since the crisp data only code the categories to which each individual case belongs, an alternative measure of fit is simply to count how well these categories are predicted by the solution. Another approach is to consider multiple correspondence analysis equivalently as the analysis of the Burt matrix (i.e., the matrix of all two-way cross-tabulations of the categorical variables), and then perform a joint correspondence analysis to fit just the off-diagonal tables of the Burt matrix - the measure of fit is then computed as the quality of explaining these tables only. The correspondence analysis of fuzzy coded data, called "fuzzy multiple correspondence analysis", suffers from the same problem, albeit attenuated. Again, one can count how many correct predictions are made of the categories which have highest degree of membership. But here one can also defuzzify the results of the analysis to obtain estimated values of the original data, and then calculate a measure of fit in the familiar percentage form, thanks to the resultant orthogonal decomposition of variance. Furthermore, if one thinks of fuzzy multiple correspondence analysis as explaining the two-way associations between variables, a fuzzy Burt matrix can be computed and the same strategy as in the crisp case can be applied to analyse the off-diagonal part of this matrix. In this paper these alternative measures of fit are defined and applied to a data set of continuous meteorological variables, which are coded crisply and fuzzily into three categories. Measuring the fit is further discussed when the data set consists of a mixture of discrete and continuous variables.
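A common way to fuzzy-code a continuous variable into three categories is with triangular ("hat") membership functions hinged, for example, at the minimum, median and maximum of the variable; the sketch below uses that convention as an assumption, since the abstract does not specify the hinge points.

```python
import numpy as np

def fuzzy_code(x, hinges=None):
    """Fuzzy-code a continuous variable into three categories using triangular
    membership functions. The three memberships of each observation are
    non-negative and sum to 1. Hinge points default to (min, median, max),
    an illustrative choice, not necessarily the one used in the paper."""
    x = np.asarray(x, dtype=float)
    lo, mid, hi = hinges if hinges is not None else (x.min(), np.median(x), x.max())
    low = np.clip((mid - x) / (mid - lo), 0, 1)      # 1 at lo, 0 at mid
    high = np.clip((x - mid) / (hi - mid), 0, 1)     # 0 at mid, 1 at hi
    middle = 1.0 - low - high                        # tent peaking at mid
    return np.column_stack([low, middle, high])

x = np.array([2.0, 5.0, 9.0, 13.0])
F = fuzzy_code(x, hinges=(2.0, 8.0, 14.0))
print(F)               # each row sums to 1; crisp coding would round these to 0/1
```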
Abstract:
Asymptotic chi-squared test statistics for testing the equality of moment vectors are developed. The test statistics proposed are generalized Wald test statistics that specialize for different settings by inserting an appropriate asymptotic variance matrix of sample moments. Scaled test statistics are also considered for dealing with situations of non-iid sampling. The specialization will be carried out for testing the equality of multinomial populations, and the equality of variance and correlation matrices for both normal and non-normal data. When testing the equality of correlation matrices, a scaled version of the normal theory chi-squared statistic is proven to be an asymptotically exact chi-squared statistic in the case of elliptical data.
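In generic form (notation illustrative; the paper's estimators and scaling factors may differ), a generalized Wald statistic for $H_0:\ \mu_1=\mu_2$ compares the sample moment vectors through an estimate $\hat V$ of the asymptotic covariance matrix of $\sqrt{n}(\hat m_1-\hat m_2)$:

$$ W_n \,=\, n\,(\hat m_1-\hat m_2)^{\top}\,\hat V^{-1}\,(\hat m_1-\hat m_2) \;\xrightarrow{\ d\ }\; \chi^2_q \quad\text{under } H_0, $$

where $q$ is the dimension of the moment vector. Roughly speaking, the scaled statistics mentioned in the abstract correct $W_n$ by an estimated factor when $\hat V$ is computed under simplifying assumptions (e.g. iid or normal-theory) that do not hold.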
Abstract:
Graphical displays which show inter-sample distances are important for the interpretation and presentation of multivariate data. Except when the displays are two-dimensional, however, they are often difficult to visualize as a whole. A device, based on multidimensional unfolding, is described for presenting some intrinsically high-dimensional displays in fewer, usually two, dimensions. This goal is achieved by representing each sample by a pair of points, say $R_i$ and $r_i$, so that a theoretical distance between the $i$-th and $j$-th samples is represented twice, once by the distance between $R_i$ and $r_j$ and once by the distance between $R_j$ and $r_i$. Self-distances between $R_i$ and $r_i$ need not be zero. The mathematical conditions for unfolding to exhibit symmetry are established. Algorithms for finding approximate fits, not constrained to be symmetric, are discussed and some examples are given.
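A minimal sketch of the unfolding idea (a generic least-squares fit, not the authors' algorithm): place two points per sample, $R_i$ and $r_i$, and minimize the squared discrepancy between each theoretical distance $\delta_{ij}$ and both $d(R_i, r_j)$ and $d(R_j, r_i)$, leaving the self-distances unconstrained.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.spatial.distance import cdist

def unfold(delta, dim=2, seed=0):
    """Fit an unfolding representation of a symmetric distance matrix delta:
    each sample i gets two points, R[i] and r[i], and delta[i, j] is matched
    twice, by d(R[i], r[j]) and by d(R[j], r[i]). Generic least-squares fit
    with a general-purpose optimizer, not the paper's specific algorithm."""
    n = delta.shape[0]
    rng = np.random.default_rng(seed)

    def stress(v):
        R, r = v.reshape(2, n, dim)
        d = cdist(R, r)                          # d[i, j] = ||R_i - r_j||
        off = ~np.eye(n, dtype=bool)             # self-distances unconstrained
        return np.sum((d - delta)[off] ** 2)     # covers both (i, j) and (j, i)

    res = minimize(stress, rng.normal(size=2 * n * dim), method="L-BFGS-B")
    R, r = res.x.reshape(2, n, dim)
    return R, r, res.fun

# Toy example: distances between 5 random points in 4-D, unfolded into 2-D
X = np.random.default_rng(1).normal(size=(5, 4))
delta = cdist(X, X)
R, r, stress_val = unfold(delta)
print(round(stress_val, 4))
```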
Abstract:
Aims: A rapid and simple HPLC-MS method was developed for the simultaneous determination of antidementia drugs, including donepezil, galantamine, rivastigmine and its major metabolite NAP 226-90, and memantine, for Therapeutic Drug Monitoring (TDM). In the elderly population treated with antidementia drugs, the presence of several comorbidities, drug interactions resulting from polypharmacy, and variations in drug metabolism and elimination are possible factors leading to the observed high interindividual variability in plasma levels. Although evidence for the benefit of TDM for antidementia drugs still remains to be demonstrated, an individually adapted dosage through TDM might contribute to minimizing the risk of adverse reactions and to increasing the probability of an efficient therapeutic response. Methods: A solid-phase extraction procedure with a mixed-mode cation exchange sorbent was used to isolate the drugs from 0.5 mL of plasma. The compounds were analyzed on a reverse-phase column with a gradient elution consisting of an ammonium acetate buffer at pH 9.3 and acetonitrile, and detected by mass spectrometry in the single ion monitoring mode. Isotope-labeled internal standards were used for quantification where possible. The validated method was used to measure the plasma levels of antidementia drugs in 300 patients treated with these drugs. Results: The method was validated according to international standards of validation, including the assessment of the trueness (-8 to 11%), the imprecision (repeatability: 1-5%, intermediate imprecision: 2-9%), selectivity, and matrix effects variability (less than 6%). Furthermore, short- and long-term stability of the analytes in plasma was ascertained. The method proved to be robust in the calibrated ranges of 1-300 ng/mL for rivastigmine and memantine and 2-300 ng/mL for donepezil, galantamine and NAP 226-90. We recently published a full description of the method (1). We found a high interindividual variability in plasma levels of these drugs in a study population of 300 patients. The plasma level measurements, with some preliminary clinical and pharmacogenetic results, will be presented. Conclusion: A simple LC-MS method was developed for plasma level determination of antidementia drugs, which was successfully used in a clinical study with 300 patients.
Abstract:
The central message of this paper is that nobody should be using the sample covariance matrix for the purpose of portfolio optimization. It contains estimation error of the kind most likely to perturb a mean-variance optimizer. In its place, we suggest using the matrix obtained from the sample covariance matrix through a transformation called shrinkage. This tends to pull the most extreme coefficients towards more central values, thereby systematically reducing estimation error where it matters most. Statistically, the challenge is to know the optimal shrinkage intensity, and we give the formula for that. Without changing any other step in the portfolio optimization process, we show on actual stock market data that shrinkage reduces tracking error relative to a benchmark index, and substantially increases the realized information ratio of the active portfolio manager.
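The shrinkage described here is a convex combination of the sample covariance matrix S and a structured target F, Sigma_shrunk = delta*F + (1 - delta)*S, with the intensity delta estimated from the data. As a hedged illustration (not the paper's own estimator; the shrinkage target may differ), the sketch below uses scikit-learn's LedoitWolf estimator, which shrinks toward a scaled identity target, and shows how shrinkage improves the conditioning of the estimate on simulated returns.

```python
import numpy as np
from sklearn.covariance import LedoitWolf

# Simulated daily returns for 50 assets over 120 days, a setting where the
# raw sample covariance matrix is typically poorly conditioned.
rng = np.random.default_rng(0)
returns = rng.normal(scale=0.01, size=(120, 50))

sample_cov = np.cov(returns, rowvar=False)       # raw sample covariance

lw = LedoitWolf().fit(returns)                   # shrinkage toward a scaled identity target
shrunk_cov = lw.covariance_

print("estimated shrinkage intensity:", round(lw.shrinkage_, 3))
print("condition number, sample :", round(np.linalg.cond(sample_cov), 1))
print("condition number, shrunk :", round(np.linalg.cond(shrunk_cov), 1))
```

The better-conditioned shrunk matrix can then be plugged into the usual mean-variance optimizer without changing any other step of the process.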
Abstract:
A biplot, which is the multivariate generalization of the two-variable scatterplot, can be used to visualize the results of many multivariate techniques, especially those that are based on the singular value decomposition. We consider data sets consisting of continuous-scale measurements, their fuzzy coding and the biplots that visualize them, using a fuzzy version of multiple correspondence analysis. Of special interest is the way quality of fit of the biplot is measured, since it is well-known that regular (i.e., crisp) multiple correspondence analysis seriously under-estimates this measure. We show how the results of fuzzy multiple correspondence analysis can be defuzzified to obtain estimated values of the original data, and prove that this implies an orthogonal decomposition of variance. This permits a measure of fit to be calculated in the familiar form of a percentage of explained variance, which is directly comparable to the corresponding fit measure used in principal component analysis of the original data. The approach is motivated initially by its application to a simulated data set, showing how the fuzzy approach can lead to diagnosing nonlinear relationships, and finally it is applied to a real set of meteorological data.
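The defuzzification step can be illustrated as the membership-weighted average of the hinge values used in the fuzzy coding, which maps (reconstructed) memberships back to estimated values of the original continuous variable; it is this inverse transform that allows a fit measure in percentage-of-variance form. The hinge values and reconstructed memberships below are illustrative, and the convention is assumed rather than taken from the paper.

```python
import numpy as np

def defuzzify(memberships, hinges):
    """Map three-category memberships back to an estimated value of the
    original continuous variable as the membership-weighted average of the
    hinge points (the inverse of triangular fuzzy coding). Illustrative
    convention; the paper's exact reconstruction may differ in detail."""
    return np.asarray(memberships) @ np.asarray(hinges, dtype=float)

hinges = (2.0, 8.0, 14.0)                    # low / middle / high hinge values
F_hat = np.array([[1.0, 0.0, 0.0],           # reconstructed memberships, e.g. from
                  [0.5, 0.5, 0.0],           # a low-dimensional fuzzy MCA solution
                  [0.0, 0.833, 0.167],
                  [0.0, 0.167, 0.833]])
x_hat = defuzzify(F_hat, hinges)
print(x_hat)                                 # estimated original values
# A fit measure can then be reported as the percentage of the variance of the
# original variable explained by x_hat, as in principal component analysis.
```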
Abstract:
Several lines of evidence have suggested that T cell activation could be impaired in the tumor environment, a condition referred to as tumor-induced immunosuppression. We have previously shown that tenascin-C, an extracellular matrix protein highly expressed in the tumor stroma, inhibits T lymphocyte activation in vitro, raising the possibility that this molecule might contribute to tumor-induced immunosuppression in vivo. However, the region of the protein mediating this effect has remained elusive. Here we report the identification of the minimal region of tenascin-C that can inhibit T cell activation. Recombinant fragments corresponding to defined regions of the molecule were tested for their ability to inhibit in vitro activation of human peripheral blood T cells induced by anti-CD3 mAbs in combination with fibronectin or IL-2. A recombinant protein encompassing the alternatively spliced fibronectin type III domains of tenascin-C (TnFnIII A-D) vigorously inhibited both early and late lymphocyte activation events including activation-induced TCR/CD8 down-modulation, cytokine production, and DNA synthesis. In agreement with this, full length recombinant tenascin-C containing the alternatively spliced region suppressed T cell activation, whereas tenascin-C lacking this region did not. Using a series of smaller fragments and deletion mutants derived from this region, we have identified the TnFnIII A1A2 domain as the minimal region suppressing T cell activation. Single TnFnIII A1 or A2 domains were no longer inhibitory, while maximal inhibition required the presence of the TnFnIII A3 domain. Altogether, these data demonstrate that the TnFnIII A1A2 domain mediates the ability of tenascin-C to inhibit in vitro T cell activation and provide insights into the immunosuppressive activity of tenascin-C in vivo.
Abstract:
Multiple lines of evidence show that matrix metalloproteinases (MMPs) are involved in peripheral nervous system degenerative and regenerative processes. MMP-9 was suggested in particular to play a role in the peripheral nerve after injury or during Wallerian degeneration. Interestingly, our previous analysis of Lpin1 mutant mice (which present morphological signs of active demyelination and acute inflammatory cell migration, similar to processes present in the PNS undergoing Wallerian degeneration) revealed an accumulation of MMP-9 in the endoneurium of affected animals. We therefore generated a mouse line lacking both the Lpin1 and the MMP-9 genes in order to determine whether MMP-9 plays a role in either inhibition or potentiation of the demyelinating phenotype present in Lpin1 knockout mice. The inactivation of MMP-9 alone did not lead to defects in PNS structure or function. Interestingly, we observed that the double mutant animals showed reduced nerve conduction velocity and lower myelin protein mRNA expression, and had more histological abnormalities as compared to the Lpin1 single mutants. In addition, based on immunohistochemical analysis and macrophage marker mRNA expression, we found a lower macrophage content in the sciatic nerve of the double mutant animals. Together, our data indicate that MMP-9 plays a role in macrophage recruitment during postinjury PNS regeneration processes and suggest that slower macrophage infiltration delays regenerative processes in the PNS.
Abstract:
Distal myopathies represent a heterogeneous group of inherited skeletal muscle disorders. One type of adult-onset, progressive autosomal-dominant distal myopathy, frequently associated with dysphagia and dysphonia (vocal cord and pharyngeal weakness with distal myopathy [VCPDM]), has been mapped to chromosome 5q31 in a North American pedigree. Here, we report the identification of a second large VCPDM family of Bulgarian descent and fine mapping of the critical interval. Sequencing of positional candidate genes revealed precisely the same nonconservative S85C missense mutation affecting an interspecies conserved residue in the MATR3 gene in both families. MATR3 is expressed in skeletal muscle and encodes matrin 3, a component of the nuclear matrix, which is a proteinaceous network that extends throughout the nucleus. Different disease-related haplotype signatures in the two families provided evidence that two independent mutational events at the same position in MATR3 cause VCPDM. Our data establish proof of principle that the nuclear matrix is crucial for normal skeletal muscle structure and function and put VCPDM on the growing list of monogenic disorders associated with the nuclear proteome.
Abstract:
By means of confocal laser scanning microscopy and indirect fluorescence experiments we have examined the behavior of heat-shock protein 70 (HSP70) within the nucleus as well as of a nuclear matrix protein (M(r) = 125 kDa) during a prolonged heat-shock response (up to 24 h at 42 degrees C) in HeLa cells. In control cells HSP70 was mainly located in the cytoplasm. The protein translocated within the nucleus upon cell exposure to hyperthermia. The fluorescent pattern revealed by monoclonal antibody to HSP70 exhibited several changes during the 24-h-long incubation. The nuclear matrix protein showed changes in its location that were evident as early as 1 h after initiation of heat shock. After 7 h of treatment, the protein regained its original distribution. However, in the late stages of the hyperthermic treatment (17-24 h) the fluorescent pattern due to the 125-kDa protein changed again and its original distribution was never recovered. These results show that HSP70 changes its localization within the nucleus conceivably because it is involved in solubilizing aggregated polypeptides present in different nuclear regions. Our data also strengthen the contention that proteins of the insoluble nucleoskeleton are involved in nuclear structure changes that occur during heat-shock response.
Abstract:
BACKGROUND: Finding genes that are differentially expressed between conditions is an integral part of understanding the molecular basis of phenotypic variation. In the past decades, DNA microarrays have been used extensively to quantify the abundance of mRNA corresponding to different genes, and more recently high-throughput sequencing of cDNA (RNA-seq) has emerged as a powerful competitor. As the cost of sequencing decreases, it is conceivable that the use of RNA-seq for differential expression analysis will increase rapidly. To exploit the possibilities and address the challenges posed by this relatively new type of data, a number of software packages have been developed especially for differential expression analysis of RNA-seq data. RESULTS: We conducted an extensive comparison of eleven methods for differential expression analysis of RNA-seq data. All methods are freely available within the R framework and take as input a matrix of counts, i.e. the number of reads mapping to each genomic feature of interest in each of a number of samples. We evaluate the methods based on both simulated data and real RNA-seq data. CONCLUSIONS: Very small sample sizes, which are still common in RNA-seq experiments, impose problems for all evaluated methods and any results obtained under such conditions should be interpreted with caution. For larger sample sizes, the methods combining a variance-stabilizing transformation with the 'limma' method for differential expression analysis perform well under many different conditions, as does the nonparametric SAMseq method.
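The count-matrix input format shared by the compared packages can be illustrated with a toy example. The per-gene test below (an ordinary t-test on log-transformed counts) is only a stand-in to show the data layout; it is not one of the eleven evaluated methods, which model the counts directly (e.g. with negative binomial or limma-style linear models after a variance-stabilizing transformation).

```python
import numpy as np
from scipy import stats

# Toy count matrix: rows = genomic features (genes), columns = samples.
# Three samples per condition; the values are illustrative read counts.
counts = np.array([
    [1500, 1320, 1710,  310,  295,  340],   # gene with lower counts in condition B
    [  20,   25,   18,   22,   19,   24],   # gene with no real difference
    [ 100,  140,  120,  480,  520,  455],   # gene with higher counts in condition B
])
group = np.array(["A", "A", "A", "B", "B", "B"])

# Stand-in test: per-gene t-test on log2(count + 1).
logc = np.log2(counts + 1)
pvals = [stats.ttest_ind(row[group == "A"], row[group == "B"]).pvalue for row in logc]
print(np.round(pvals, 4))
```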
Abstract:
This research provides a description of the process followed in order to assemble a "Social Accounting Matrix" for Spain corresponding to the year 2000 (SAMSP00). As argued in the paper, this process attempts to reconcile ESA95 conventions with the requirements of applied general equilibrium modelling. In particular, problems related to the level of aggregation of net taxation data, and to the valuation system used for expressing the monetary value of input-output transactions, have received special attention. Since the adoption of ESA95 conventions, input-output transactions have been preferentially valued at basic prices, which imposes additional difficulties on modellers interested in computing applied general equilibrium models. This paper addresses these difficulties by developing a procedure that allows SAM-builders to change the valuation system of input-output transactions conveniently. In addition, this procedure produces new data related to net taxation information.