967 results for Data matrix
Abstract:
In this paper we use Monte Carlo simulation to analyse how incorrect assumptions about the true structure of the random-effects covariance matrix and the true correlation pattern of the residuals affect the performance of an estimation method for nonlinear mixed models. The procedure under study is the well-known linearization method due to Lindstrom and Bates (1990), implemented in the nlme library of S-Plus and R. Its performance is studied in terms of bias, mean square error (MSE), and true coverage of the associated asymptotic confidence intervals. Setting aside other criteria, such as the convenience of avoiding overparameterised models, it seems worse to erroneously assume some structure than to assume no structure when some structure would be adequate.
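The three performance criteria named in this abstract (bias, MSE, and true coverage of a nominal confidence interval) can be illustrated with a minimal Monte Carlo sketch. This is not the Lindstrom and Bates procedure or the nlme library: the estimator here is simply a sample mean of normal data, and the function name `simulate` and all parameter values are assumptions chosen for illustration.

```python
import random
import statistics


def simulate(true_mean=1.0, sigma=2.0, n=30, reps=2000, seed=0):
    """Estimate bias, MSE and CI coverage of the sample mean by simulation.

    Hypothetical toy setting: i.i.d. normal data, nominal 95% interval
    built from the normal quantile 1.96 and the sample standard error.
    """
    rng = random.Random(seed)
    z = 1.96
    estimates = []
    covered = 0
    for _ in range(reps):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        est = statistics.fmean(sample)
        se = statistics.stdev(sample) / n ** 0.5
        estimates.append(est)
        # Count how often the interval actually contains the true value.
        if est - z * se <= true_mean <= est + z * se:
            covered += 1
    bias = statistics.fmean(estimates) - true_mean
    mse = statistics.fmean([(e - true_mean) ** 2 for e in estimates])
    return bias, mse, covered / reps


bias, mse, coverage = simulate()
print(f"bias={bias:.4f}  mse={mse:.4f}  coverage={coverage:.3f}")
```

In a study like the one described, the same loop would wrap the model fit under each assumed covariance structure, and the three summaries would be compared across structures.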
Abstract:
The graphical representation of spatial soil properties in a digital environment is complex because it requires converting data collected in discrete form onto a continuous surface. The objective of this study was to apply three-dimensional interpolation and visualization techniques to soil texture and fertility properties and to establish relationships with pedogenetic factors and processes in a slope area. The GRASS Geographic Information System was used to generate three-dimensional models and ParaView software to visualize soil volumes. Samples of the A, AB, BA, and B horizons were collected in a regular 122-point grid in an area of 13 ha, in Pinhais, PR, in southern Brazil. Geoprocessing and graphic computing techniques were effective in identifying and delimiting soil volumes of distinct ranges of fertility properties confined within the soil matrix. Both three-dimensional interpolation and the visualization tool facilitated interpretation in a continuous space (volumes) of the cause-effect relationships between soil texture and fertility properties and pedological factors and processes, such as higher clay contents following the drainage lines of the area. The flattest part with more weathered soils (Oxisols) had the highest pH values and lower Al3+ concentrations. These techniques of data interpolation and visualization have great potential for use in diverse areas of soil science, such as identification of soil volumes that occur side by side but exhibit different physical, chemical, and mineralogical conditions for plant root growth, and monitoring of plumes of organic and inorganic pollutants in soils and sediments, among other applications. The methodological details for interpolation and a three-dimensional view of soil data are presented here.
Abstract:
For the last 2 decades, supertree reconstruction has been an active field of research and has seen the development of a large number of major algorithms. Because of the growing popularity of the supertree methods, it has become necessary to evaluate the performance of these algorithms to determine which are the best options (especially with regard to the supermatrix approach that is widely used). In this study, seven of the most commonly used supertree methods are investigated by using a large empirical data set (in terms of number of taxa and molecular markers) from the worldwide flowering plant family Sapindaceae. Supertree methods were evaluated using several criteria: similarity of the supertrees with the input trees, similarity between the supertrees and the total evidence tree, level of resolution of the supertree and computational time required by the algorithm. Additional analyses were also conducted on a reduced data set to test if the performance levels were affected by the heuristic searches rather than the algorithms themselves. Based on our results, two main groups of supertree methods were identified: on one hand, the matrix representation with parsimony (MRP), MinFlip, and MinCut methods performed well according to our criteria, whereas the average consensus, split fit, and most similar supertree methods showed a poorer performance or at least did not behave the same way as the total evidence tree. Results for the super distance matrix, that is, the most recent approach tested here, were promising with at least one derived method performing as well as MRP, MinFlip, and MinCut. The output of each method was only slightly improved when applied to the reduced data set, suggesting a correct behavior of the heuristic searches and a relatively low sensitivity of the algorithms to data set sizes and missing data. 
Results also showed that the MRP analyses could reach a high level of quality even when using a simple heuristic search strategy, with the exception of MRP with Purvis coding scheme and reversible parsimony. The future of supertrees lies in the implementation of a standardized heuristic search for all methods and the increase in computing power to handle large data sets. The latter would prove to be particularly useful for promising approaches such as the maximum quartet fit method that yet requires substantial computing power.
Abstract:
This research describes the process followed to assemble a "Social Accounting Matrix" for Spain corresponding to the year 2000 (SAMSP00). As argued in the paper, this process attempts to reconcile ESA95 conventions with the requirements of applied general equilibrium modelling. In particular, problems related to the level of aggregation of net taxation data, and to the valuation system used for expressing the monetary value of input-output transactions, have received special attention. Since the adoption of ESA95 conventions, input-output transactions have preferably been valued at basic prices, which imposes additional difficulties on modellers interested in computing applied general equilibrium models. This paper addresses these difficulties by developing a procedure that allows SAM-builders to change the valuation system of input-output transactions conveniently. In addition, this procedure produces new data related to net taxation information.
Abstract:
BACKGROUND: The visceral (VAT) and subcutaneous (SCAT) adipose tissues play different roles in physiology and obesity. The molecular mechanisms underlying their expansion in obesity and following body weight reduction are poorly defined. METHODOLOGY: C57Bl/6 mice fed a high fat diet (HFD) for 6 months developed low, medium, or high body weight as compared to normal chow fed mice. Mice from each group were then treated with the cannabinoid receptor 1 antagonist rimonabant or vehicle for 24 days to normalize their body weight. Transcriptomic data for visceral and subcutaneous adipose tissues from each group of mice were obtained and analyzed to identify: i) genes regulated by HFD irrespective of body weight, ii) genes whose expression correlated with body weight, iii) the biological processes activated in each tissue using gene set enrichment analysis (GSEA), iv) the transcriptional programs affected by rimonabant. PRINCIPAL FINDINGS: In VAT, "metabolic" genes encoding enzymes for lipid and steroid biosynthesis and glucose catabolism were down-regulated irrespective of body weight whereas "structure" genes controlling cell architecture and tissue remodeling had expression levels correlated with body weight. In SCAT, the identified "metabolic" and "structure" genes were mostly different from those identified in VAT and were regulated irrespective of body weight. GSEA indicated active adipogenesis in both tissues but a more prominent involvement of tissue stroma in VAT than in SCAT. Rimonabant treatment normalized most gene expression but further reduced oxidative phosphorylation gene expression in SCAT but not in VAT. CONCLUSION: VAT and SCAT show strikingly different gene expression programs in response to high fat diet and rimonabant treatment. Our results may lead to identification of therapeutic targets acting on specific fat depots to control obesity.
Abstract:
Panel data can be arranged into a matrix in two ways, called 'long' and 'wide' formats (LF and WF). The two formats suggest two alternative model approaches for analyzing panel data: (i) univariate regression with varying intercept; and (ii) multivariate regression with latent variables (a particular case of structural equation model, SEM). The present paper compares the two approaches, showing in which circumstances they yield equivalent (in some cases, even numerically equal) results. We show that the univariate approach gives results equivalent to the multivariate approach when restrictions of time invariance (in the paper, the TI assumption) are imposed on the parameters of the multivariate model. It is shown that the restrictions implicit in the univariate approach can be assessed by chi-square difference testing of two nested multivariate models. In addition, common tests encountered in the econometric analysis of panel data, such as the Hausman test, are shown to have an equivalent representation as chi-square difference tests. Commonalities and differences between the univariate and multivariate approaches are illustrated using an empirical panel data set of firms' profitability as well as simulated panel data.
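The two matrix arrangements mentioned in this abstract can be sketched as follows. The firm names, years, and profitability values are invented for illustration, and `long_to_wide` is a hypothetical helper, not something taken from the paper. In long format each row is one (unit, period) observation; in wide format each row is one unit with one column per period.

```python
from collections import defaultdict


def long_to_wide(rows):
    """Reshape long-format rows (unit, period, value) into a wide table.

    Returns (table, periods): one row per unit, one column per period,
    with None where a unit has no observation for a period.
    """
    by_unit = defaultdict(dict)
    for unit, period, value in rows:
        by_unit[unit][period] = value
    periods = sorted({period for _, period, _ in rows})
    table = [
        [unit] + [by_unit[unit].get(p) for p in periods]
        for unit in sorted(by_unit)
    ]
    return table, periods


# Long format: one row per firm-year (made-up numbers).
long_rows = [
    ("firm_a", 2001, 0.12),
    ("firm_a", 2002, 0.15),
    ("firm_b", 2001, 0.08),
    ("firm_b", 2002, 0.05),
]

wide_table, periods = long_to_wide(long_rows)
print(["firm"] + periods)
for row in wide_table:
    print(row)
```

The long layout is the natural input for a univariate regression with unit-specific intercepts, while the wide layout treats each period as a separate variable, which is the layout a multivariate (SEM-style) model works with.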
Abstract:
Gene transfer in eukaryotic cells and organisms suffers from epigenetic effects that result in low or unstable transgene expression and high clonal variability. Use of epigenetic regulators such as matrix attachment regions (MARs) is a promising approach to alleviate such unwanted effects. Dissection of a known MAR allowed the identification of sequence motifs that mediate elevated transgene expression. Bioinformatics analysis implied that these motifs adopt a curved DNA structure that positions nucleosomes and binds specific transcription factors. From these observations, we computed putative MARs from the human genome. Cloning of several predicted MARs indicated that they are much more potent than the previously known element, boosting the expression of recombinant proteins from cultured cells as well as mediating high and sustained expression in mice. Thus we computationally identified potent epigenetic regulators, opening new strategies toward high and stable transgene expression for research, therapeutic production or gene-based therapies.
Abstract:
The study focuses on international diversification from the perspective of a Finnish investor. A second aim of the study is to determine whether new covariance matrix estimators improve the optimization process of the minimum-variance portfolio. In addition to the ordinary sample covariance matrix, two shrinkage estimators and a flexible multivariate GARCH(1,1) model are used in the optimization. The data consist of Dow Jones sector indices and the OMX-H portfolio index. The international diversification strategy is implemented using a sector approach, and the portfolio is optimized using twelve components. The data cover the years 1996-2005, that is, 120 monthly observations. The performance of the constructed portfolios is measured with the Sharpe index. According to the results, there is no statistically significant difference between the risk-adjusted returns of internationally diversified investments and those of the domestic portfolio. Nor does the use of the new covariance matrix estimators yield statistically significant added value compared with portfolio optimization based on the sample covariance matrix.
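The quantities this abstract refers to can be written out in their standard textbook forms. The abstract does not give the thesis's exact formulation (for example, whether short sales are restricted or which shrinkage target is used), so the symbols below are assumptions: $\Sigma$ is a covariance matrix estimate for the $N$ assets, $S$ the sample covariance matrix, $F$ a structured shrinkage target, $\delta$ the shrinkage intensity, $\bar{r}_p$ and $\hat{\sigma}_p$ the mean and standard deviation of portfolio returns, and $r_f$ the risk-free rate.

```latex
\begin{align}
  % Fully invested minimum-variance weights (no short-sale constraint assumed)
  \mathbf{w}_{\min} &= \frac{\Sigma^{-1}\mathbf{1}}{\mathbf{1}^{\top}\Sigma^{-1}\mathbf{1}} \\
  % Linear shrinkage of the sample covariance matrix toward a target
  \hat{\Sigma}_{\text{shrink}} &= \delta F + (1-\delta)\,S, \qquad 0 \le \delta \le 1 \\
  % Sharpe index used to compare risk-adjusted performance
  \mathit{SR} &= \frac{\bar{r}_p - r_f}{\hat{\sigma}_p}
\end{align}
```

Replacing $\Sigma$ in the first line by $S$, by $\hat{\Sigma}_{\text{shrink}}$, or by a GARCH-based forecast gives the competing portfolios whose Sharpe indices are then compared.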
Abstract:
Cerebral, ocular, dental, auricular, skeletal anomalies (CODAS) syndrome (MIM 600373) was first described and named by Shehib et al. in 1991 in a single patient. The anomalies referred to in the acronym are as follows: cerebral (developmental delay), ocular (cataracts), dental (aberrant cusp morphology and delayed eruption), auricular (malformations of the external ear), and skeletal (spondyloepiphyseal dysplasia). This distinctive constellation of anatomical findings should allow easy recognition, but despite this only four apparently sporadic patients have been reported in the last 20 years, indicating that the full phenotype is indeed very rare, with perhaps milder or atypical presentations that are allelic but without sufficient phenotypic resemblance to permit clinical diagnosis. We performed exome sequencing in three patients (an isolated case and a brother and sister sib pair) with classical features of CODAS. Sanger sequencing was used to confirm results as well as for mutation discovery in a further four unrelated patients ascertained via their skeletal features. Compound heterozygous or homozygous mutations in LONP1 were found in all (8 separate mutations; 6 missense, 1 nonsense, 1 small in-frame deletion), thus establishing the genetic basis of CODAS and the pattern of inheritance (autosomal recessive). LONP1 encodes an enzyme of bacterial ancestry that participates in protein turnover within the mitochondrial matrix. The mutations cluster at the ATP-binding and proteolytic domains of the enzyme. Biallelic inheritance and clustering of mutations confirm dysfunction of LONP1 activity as the molecular basis of CODAS, but the pathogenesis remains to be explored.
Abstract:
Matrix metalloproteinases (MMPs) are major executors of extracellular matrix remodeling and, consequently, play key roles in the response of cells to their microenvironment. The experimentally accessible stem cell population and the robust regenerative capabilities of planarians offer an ideal model to study how modulation of the proteolytic system in the extracellular environment affects cell behavior in vivo. Genome-wide identification of Schmidtea mediterranea MMPs reveals that planarians possess four mmp-like genes. Two of them (mmp1 and mmp2) are strongly expressed in a subset of secretory cells and encode putative matrilysins. The other genes (mt-mmpA and mt-mmpB) are widely expressed in postmitotic cells and appear structurally related to membrane-type MMPs. These genes are conserved in the planarian Dugesia japonica. Here we explore the role of the planarian mmp genes by RNA interference (RNAi) during tissue homeostasis and regeneration. Our analyses identify essential functions for two of them. Following inhibition of mmp1, planarians display dramatic disruption of tissue architecture and a significant decrease in cell death. These results suggest that mmp1 controls tissue turnover, modulating survival of postmitotic cells. Unexpectedly, the ability to regenerate is unaffected by mmp1(RNAi). Silencing of mt-mmpA alters tissue integrity and delays blastema growth, without affecting proliferation of stem cells. Our data support the possibility that the activity of this protease modulates cell migration and regulates anoikis, with a consequent pivotal role in tissue homeostasis and regeneration. Our data provide evidence of the involvement of specific MMPs in tissue homeostasis and regeneration and demonstrate that the behavior of planarian stem cells is critically dependent on the microenvironment surrounding these cells. Studying MMP function in the planarian model provides evidence on how individual proteases work in vivo in adult tissues. 
These results have high potential to generate significant information for the development of regenerative and anticancer therapies.
Abstract:
The objective of the thesis was to explore the nature and characteristics of customer-related internal communication in a global industrial matrix organization during a specific customer relationship, and how it could be improved. The theoretical part of the study reviews the concepts of intra-organizational information and knowledge sharing. It also reviews the influence of internal communication on customer relationships, the problems involved, and the suggestions for improving internal communication found in the literature. The empirical part of the study used content analysis and social network analysis as research methods. The data were collected through interviews and a questionnaire. Internal communication was observed first in general within the organization, from the point of view of a certain business, and second during a specific customer relationship, at the personal and departmental levels. The results of the study describe the nature and characteristics of internal communication in the organization and give 13 suggestions for improving it. Although the study was carried out in one specific organization, it also offers insights that other organizations and managers can use to improve their internal communication.
Abstract:
A main assumption in the theory of rough sets applied to information tables is that elements exhibiting the same information are indiscernible (similar) and form blocks that can be understood as elementary granules of knowledge about the universe. We propose a variant of this concept that defines a measure of similarity between the elements of the universe, so that two objects can be considered indiscernible even though they do not share all attribute values, because the knowledge is partial or uncertain. The set of similarities defines the matrix of a fuzzy relation that satisfies reflexivity and symmetry but not transitivity, so a partition of the universe is not obtained. This problem can be solved by calculating the transitive closure of the relation, which ensures a partition for each level in the unit interval [0,1]. This procedure generalizes the theory of rough sets according to the minimum level of similarity accepted. This new point of view increases the rough character of the data because it enlarges the set of indiscernible objects. Finally, we apply our results to a non-real (illustrative) application in order to highlight the differences and improvements between this methodology and the classical one.
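The step from a reflexive, symmetric similarity matrix to a partition at a chosen level can be sketched as below. The abstract does not say which composition is used for the closure; this sketch assumes the common max-min composition, and the similarity values and function names are hypothetical.

```python
def max_min_compose(r, s):
    """Max-min composition of two square fuzzy relation matrices."""
    n = len(r)
    return [
        [max(min(r[i][k], s[k][j]) for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def transitive_closure(relation):
    """Repeat R <- max(R, R o R) until nothing changes (max-min closure)."""
    closure = [row[:] for row in relation]
    while True:
        composed = max_min_compose(closure, closure)
        updated = [
            [max(a, b) for a, b in zip(row_old, row_new)]
            for row_old, row_new in zip(closure, composed)
        ]
        if updated == closure:
            return closure
        closure = updated


def alpha_partition(closure, alpha):
    """Blocks of objects whose closed similarity is at least alpha."""
    blocks, assigned = [], set()
    for i in range(len(closure)):
        if i in assigned:
            continue
        block = [j for j in range(len(closure)) if closure[i][j] >= alpha]
        assigned.update(block)
        blocks.append(block)
    return blocks


# Reflexive and symmetric, but not transitive: min(0.8, 0.6) > 0.3.
similarity = [
    [1.0, 0.8, 0.3],
    [0.8, 1.0, 0.6],
    [0.3, 0.6, 1.0],
]

closure = transitive_closure(similarity)
print(alpha_partition(closure, 0.7))  # [[0, 1], [2]]
print(alpha_partition(closure, 0.5))  # [[0, 1, 2]]
```

Lowering the accepted similarity level merges blocks, which is the sense in which the set of indiscernible objects grows.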
Abstract:
Recent years have produced great advances in the instrumentation technology. The amount of available data has been increasing due to the simplicity, speed and accuracy of current spectroscopic instruments. Most of these data are, however, meaningless without a proper analysis. This has been one of the reasons for the overgrowing success of multivariate handling of such data. Industrial data is commonly not designed data; in other words, there is no exact experimental design, but rather the data have been collected as a routine procedure during an industrial process. This makes certain demands on the multivariate modeling, as the selection of samples and variables can have an enormous effect. Common approaches in the modeling of industrial data are PCA (principal component analysis) and PLS (projection to latent structures or partial least squares) but there are also other methods that should be considered. The more advanced methods include multi block modeling and nonlinear modeling. In this thesis it is shown that the results of data analysis vary according to the modeling approach used, thus making the selection of the modeling approach dependent on the purpose of the model. If the model is intended to provide accurate predictions, the approach should be different than in the case where the purpose of modeling is mostly to obtain information about the variables and the process. For industrial applicability it is essential that the methods are robust and sufficiently simple to apply. In this way the methods and the results can be compared and an approach selected that is suitable for the intended purpose. Differences in data analysis methods are compared with data from different fields of industry in this thesis. In the first two papers, the multi block method is considered for data originating from the oil and fertilizer industries. The results are compared to those from PLS and priority PLS. 
The third paper considers applicability of multivariate models to process control for a reactive crystallization process. In the fourth paper, nonlinear modeling is examined with a data set from the oil industry. The response has a nonlinear relation to the descriptor matrix, and the results are compared between linear modeling, polynomial PLS and nonlinear modeling using nonlinear score vectors.
Abstract:
Visual data mining (VDM) tools employ information visualization techniques in order to represent large amounts of high-dimensional data graphically and to involve the user in exploring data at different levels of detail. The users are looking for outliers, patterns and models – in the form of clusters, classes, trends, and relationships – in different categories of data, i.e., financial, business information, etc. The focus of this thesis is the evaluation of multidimensional visualization techniques, especially from the business user’s perspective. We address three research problems. The first problem is the evaluation of projection-based visualizations with respect to their effectiveness in preserving the original distances between data points and the clustering structure of the data. In this respect, we propose the use of existing clustering validity measures. We illustrate their usefulness in evaluating five visualization techniques: Principal Components Analysis (PCA), Sammon’s Mapping, Self-Organizing Map (SOM), Radial Coordinate Visualization and Star Coordinates. The second problem is concerned with evaluating different visualization techniques as to their effectiveness in visual data mining of business data. For this purpose, we propose an inquiry evaluation technique and conduct the evaluation of nine visualization techniques. The visualizations under evaluation are Multiple Line Graphs, Permutation Matrix, Survey Plot, Scatter Plot Matrix, Parallel Coordinates, Treemap, PCA, Sammon’s Mapping and the SOM. The third problem is the evaluation of quality of use of VDM tools. We provide a conceptual framework for evaluating the quality of use of VDM tools and apply it to the evaluation of the SOM. In the evaluation, we use an inquiry technique for which we developed a questionnaire based on the proposed framework. The contributions of the thesis consist of three new evaluation techniques and the results obtained by applying these evaluation techniques. 
The thesis provides a systematic approach to evaluation of various visualization techniques. In this respect, first, we performed and described the evaluations in a systematic way, highlighting the evaluation activities, and their inputs and outputs. Secondly, we integrated the evaluation studies in the broad framework of usability evaluation. The results of the evaluations are intended to help developers and researchers of visualization systems to select appropriate visualization techniques in specific situations. The results of the evaluations also contribute to the understanding of the strengths and limitations of the visualization techniques evaluated and further to the improvement of these techniques.
Abstract:
In the current study, we analysed the spatial distribution of soybean production in Paraná State. Data for seven crop years, from 2003-04 to 2009-10, obtained from the Paraná Department of Agriculture and Supply (SEAB), were used to develop a Boxmap for each crop year, showing soybean production throughout this time interval. Moran's index was used to measure spatial autocorrelation among municipalities at an aggregate level, while the LISA index was used to measure local correlation. For each index, different contiguity matrices and orders were used, and the significance level was studied. As a result, we showed a spatial relationship among cities regarding production, which allowed high- and low-production clusters to be indicated. Finally, the main soybean-producing cities were identified, which may provide supply chain members with information to strengthen crop production in Paraná.
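Global Moran's index for a given contiguity matrix can be computed directly from its standard definition, I = (n / S0) · Σ_i Σ_j w_ij (x_i − x̄)(x_j − x̄) / Σ_i (x_i − x̄)², where S0 is the sum of all weights. The sketch below uses four hypothetical regions arranged in a row with binary first-order contiguity; the values are invented, not SEAB data, and the significance testing mentioned in the abstract is not covered.

```python
def morans_i(values, weights):
    """Global Moran's I for values x and a spatial weights matrix w."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)  # total weight
    cross = sum(
        weights[i][j] * dev[i] * dev[j]
        for i in range(n)
        for j in range(n)
    )
    variance_sum = sum(d * d for d in dev)
    return (n / s0) * (cross / variance_sum)


# Four regions in a row: each region neighbours the next one only.
contiguity = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

smooth_trend = [1.0, 2.0, 3.0, 4.0]   # neighbours are similar
checkerboard = [1.0, 4.0, 1.0, 4.0]   # neighbours are dissimilar

print(morans_i(smooth_trend, contiguity))   # positive autocorrelation
print(morans_i(checkerboard, contiguity))   # negative autocorrelation
```

Changing the contiguity matrix (for example, to second-order neighbours) changes the index, which is why the study compares several matrices and orders.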