18 results for Automatic Analysis of Multivariate Categorical Data Sets

in Biblioteca Digital da Produção Intelectual da Universidade de São Paulo


Relevance:

100.00% 100.00%

Publisher:

Abstract:

Dimensionality reduction is employed in visual data analysis as a way of obtaining reduced spaces for high-dimensional data or of mapping data directly into 2D or 3D spaces. Although techniques have evolved to improve data segregation in reduced or visual spaces, they have limited capabilities for adjusting the results according to the user's knowledge. In this paper, we propose a novel approach to handling both dimensionality reduction and visualization of high-dimensional data, taking into account the user's input. It employs Partial Least Squares (PLS), a statistical tool that retrieves latent spaces focusing on the discriminability of the data. The method employs a training set for building a highly precise model that can then be applied to a much larger data set very effectively. The reduced data set can be exhibited using various existing visualization techniques. The training data is important for encoding the user's knowledge into the loop. However, this work also devises a strategy for calculating PLS reduced spaces when no training data is available. The approach produces increasingly precise visual mappings as the user feeds back his or her knowledge and is capable of working with small and unbalanced training sets.
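The latent-space idea in this abstract can be illustrated with a minimal sketch of single-response PLS (PLS1) computed by the NIPALS procedure. This is not the authors' implementation: the function name `pls1_scores`, the coding of class labels as -1/+1, and the toy data in the usage note are all assumptions made for illustration.

```python
from math import sqrt


def pls1_scores(X, y, n_components=2):
    """Return PLS1 score vectors (one list per component) via NIPALS.

    X is a list of samples (rows), y a list of numeric responses, here
    assumed to be class labels coded as -1 / +1 (an illustrative choice).
    """
    n, p = len(X), len(X[0])
    # Center each feature column and the response.
    col_means = [sum(row[j] for row in X) / n for j in range(p)]
    Xc = [[row[j] - col_means[j] for j in range(p)] for row in X]
    y_mean = sum(y) / n
    yc = [v - y_mean for v in y]

    scores = []
    for _ in range(n_components):
        # Weight vector: direction of maximal covariance between X and y.
        w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
        norm = sqrt(sum(v * v for v in w))
        if norm == 0.0:
            break  # nothing left in X that covaries with y
        w = [v / norm for v in w]
        # Scores: projection of every sample onto the weight vector.
        t = [sum(Xc[i][j] * w[j] for j in range(p)) for i in range(n)]
        tt = sum(v * v for v in t)
        # Loadings, then deflate X so the next component finds new structure.
        load = [sum(Xc[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        Xc = [[Xc[i][j] - t[i] * load[j] for j in range(p)] for i in range(n)]
        scores.append(t)
    return scores
```

With a small labeled training set, `t1, t2 = pls1_scores(X, y)` gives two coordinates per sample that could be passed to a scatter plot; samples from the two classes tend to fall on opposite sides of the first axis because the weights are chosen to covary with the labels.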

Relevance:

100.00% 100.00%

Publisher:

Abstract:

In the analysis of instrumented indentation data, it is common practice to incorporate the combined moduli of the indenter (E_i) and the specimen (E) in the so-called reduced modulus (E_r) to account for indenter deformation. Although indenter systems with rigid or elastic tips are considered equivalent if E_r is the same, the validity of this practice has been questioned over the years. The present work uses systematic finite element simulations to examine the role of the elastic deformation of the indenter tip in instrumented indentation measurements and the validity of the concept of the reduced modulus in conical and pyramidal (Berkovich) indentations. It is found that the apical angle increases as a result of the indenter deformation, which influences the analysis of the results. Based upon the inaccuracies introduced by the reduced modulus approximation in the analysis of the unloading segment of instrumented indentation applied load (P)-penetration depth (delta) curves, a detailed examination is then conducted on the role of indenter deformation upon the dimensionless functions describing the loading stages of such curves. Consequences of the present results in the extraction of the uniaxial stress-strain characteristics of the indented material through such dimensional analyses are finally illustrated. It is found that large overestimations in the assessment of the strain hardening behavior result from neglecting tip compliance. Guidelines are given in the paper to reduce such overestimations.
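For readers unfamiliar with the reduced modulus, the sketch below uses the standard contact-mechanics definition 1/E_r = (1 - nu^2)/E + (1 - nu_i^2)/E_i, which the abstract does not spell out. The function names and the numerical values in the usage note (a steel-like specimen and diamond-like tip constants) are illustrative assumptions, not data from the paper.

```python
def reduced_modulus(E, nu, E_i, nu_i):
    """Reduced modulus from specimen (E, nu) and indenter (E_i, nu_i) constants.

    Uses 1/E_r = (1 - nu^2)/E + (1 - nu_i^2)/E_i. A rigid indenter can be
    modeled by passing E_i = float("inf").
    """
    return 1.0 / ((1.0 - nu**2) / E + (1.0 - nu_i**2) / E_i)


def specimen_modulus(E_r, nu, E_i, nu_i):
    """Invert the same relation to recover the specimen modulus E from E_r."""
    return (1.0 - nu**2) / (1.0 / E_r - (1.0 - nu_i**2) / E_i)
```

For example, `reduced_modulus(200.0, 0.3, 1141.0, 0.07)` (moduli in GPa, assumed values) is lower than the rigid-tip result `reduced_modulus(200.0, 0.3, float("inf"), 0.07)`, which shows how tip compliance enters the quantity whose validity the abstract examines.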

Relevance:

100.00% 100.00%

Publisher:

Abstract:

It is thought that speciation in phytophagous insects is often due to colonization of novel host plants, because radiations of plant and insect lineages are typically asynchronous. Recent phylogenetic comparisons have supported this model of diversification for both insect herbivores and specialized pollinators. An exceptional case where contemporaneous plant-insect diversification might be expected is the obligate mutualism between fig trees (Ficus species, Moraceae) and their pollinating wasps (Agaonidae, Hymenoptera). The ubiquity and ecological significance of this mutualism in tropical and subtropical ecosystems has long intrigued biologists, but the systematic challenge posed by >750 interacting species pairs has hindered progress toward understanding its evolutionary history. In particular, taxon sampling and analytical tools have been insufficient for large-scale cophylogenetic analyses. Here, we sampled nearly 200 interacting pairs of fig and wasp species from across the globe. Two supermatrices were assembled: on average, wasps had sequences from 77% of 6 genes (5.6 kb), figs had sequences from 60% of 5 genes (5.5 kb), and overall 850 new DNA sequences were generated for this study. We also developed a new analytical tool, Jane 2, for event-based phylogenetic reconciliation analysis of very large data sets. Separate Bayesian phylogenetic analyses for figs and fig wasps under relaxed molecular clock assumptions indicate Cretaceous diversification of crown groups and contemporaneous divergence for nearly half of all fig and pollinator lineages. Event-based cophylogenetic analyses further support the codiversification hypothesis. Biogeographic analyses indicate that the present-day distribution of fig and pollinator lineages is consistent with a Eurasian origin and subsequent dispersal, rather than with Gondwanan vicariance. Overall, our findings indicate that the fig-pollinator mutualism represents an extreme case among plant-insect interactions of coordinated dispersal and long-term codiversification.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

A computational pipeline combining texture analysis and pattern classification algorithms was developed for investigating associations between high-resolution MRI features and histological data. This methodology was tested in the study of dentate gyrus images of sclerotic hippocampi resected from refractory epilepsy patients. Images were acquired using a simple surface coil in a 3.0T MRI scanner. All specimens were subsequently submitted to histological semiquantitative evaluation. The computational pipeline was applied for classifying pixels according to: a) dentate gyrus histological parameters and b) patients' febrile or afebrile initial precipitating insult history. The pipeline results for febrile and afebrile patients achieved 70% classification accuracy, with 78% sensitivity and 80% specificity [area under the receiver operating characteristic (ROC) curve: 0.89]. The analysis of the histological data alone was not sufficient to achieve significant power to separate febrile and afebrile groups. Interestingly enough, the results from our approach did not show significant correlation with histological parameters (which per se were not enough to classify patient groups). These results showed the potential of combining computational texture analysis with classification methods for detecting subtle MRI signal differences, a method sufficient to provide good clinical classification. This pipeline could also be applied in a wide range of other areas of medical imaging. Magn Reson Med, 2012. (c) 2012 Wiley Periodicals, Inc.
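The accuracy, sensitivity and specificity figures quoted above are all derived from a binary confusion matrix. The sketch below shows the usual definitions; the function name, the choice of which group counts as the positive class, and the counts in the usage note are hypothetical and are not taken from the study.

```python
def classification_metrics(tp, fn, tn, fp):
    """Compute standard binary classification metrics from confusion counts.

    tp / fn: positive cases classified correctly / incorrectly.
    tn / fp: negative cases classified correctly / incorrectly.
    Which clinical group is "positive" is an assumption left to the caller.
    """
    return {
        "sensitivity": tp / (tp + fn),            # true positive rate
        "specificity": tn / (tn + fp),            # true negative rate
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }
```

As a usage example with made-up counts, `classification_metrics(tp=8, fn=2, tn=9, fp=1)` returns a sensitivity of 0.8, a specificity of 0.9 and an accuracy of 0.85.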

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Background: This paper addresses the prediction of the free energy of binding of a drug candidate with enzyme InhA associated with Mycobacterium tuberculosis. This problem is found within rational drug design, where interactions between drug candidates and target proteins are verified through molecular docking simulations. In this application, it is important not only to correctly predict the free energy of binding, but also to provide a comprehensible model that could be validated by a domain specialist. Decision-tree induction algorithms have been successfully used in drug-design related applications, especially considering that decision trees are simple to understand, interpret, and validate. There are several decision-tree induction algorithms available for general use, but each one has a bias that makes it more suitable for a particular data distribution. In this article, we propose and investigate the automatic design of decision-tree induction algorithms tailored to particular drug-enzyme binding data sets. We investigate the performance of our new method for evaluating binding conformations of different drug candidates to InhA, and we analyze our findings with respect to decision tree accuracy, comprehensibility, and biological relevance. Results: The empirical analysis indicates that our method is capable of automatically generating decision-tree induction algorithms that significantly outperform the traditional C4.5 algorithm with respect to both accuracy and comprehensibility. In addition, we provide the biological interpretation of the rules generated by our approach, reinforcing the importance of comprehensible predictive models in this particular bioinformatics application. Conclusions: We conclude that automatically designing a decision-tree algorithm tailored to molecular docking data is a promising alternative for the prediction of the free energy from the binding of a drug candidate with a flexible receptor.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Data visualization techniques are powerful in the handling and analysis of multivariate systems. One such technique, known as parallel coordinates, was used to support the diagnosis of an event, detected by a neural network-based monitoring system, in a boiler at a Brazilian Kraft pulp mill. Its appeal lies in the ability to visualize several variables simultaneously. The diagnostic procedure was carried out step by step, going through exploratory, explanatory, confirmatory, and communicative goals. This tool made the boiler dynamics easier to visualize than the commonly used univariate trend plots. In addition, it facilitated analysis of other aspects, namely relationships among process variables, distinct modes of operation and discrepant data. The whole analysis revealed, firstly, that the period involving the detected event was associated with a transition between two distinct normal modes of operation and, secondly, the presence of unusual changes in process variables at this time.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Concentrations of 39 organic compounds were determined in three fractions (head, heart and tail) obtained from the pot still distillation of fermented sugarcane juice. The results were evaluated using analysis of variance (ANOVA), Tukey's test, principal component analysis (PCA), hierarchical cluster analysis (HCA) and linear discriminant analysis (LDA). According to PCA and HCA, the experimental data lead to the formation of three clusters. The head fractions give rise to a more defined group. The heart and tail fractions showed some overlap consistent with their acid composition. The predictive abilities in calibration and validation of the model generated by LDA for classifying the three fractions were 90.5% and 100%, respectively. This model recognized as heart twelve of the thirteen commercial cachaças (92.3%) with good sensory characteristics, thus showing potential for guiding the process of cuts.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

A portable energy-dispersive X-ray fluorescence system was used to determine the elemental composition of 68 pottery fragments from Sambaqui do Bacanga, an archeological site in Sao Luis, Maranhao, Brazil. This site was occupied from 6600 BP until 900 BP. By determining the elemental chemical composition of those fragments, it was possible to verify the existence of engobe in 43 pottery fragments. Using two-dimensional graphs and hierarchical cluster analysis performed on fragments from the surface and 113-cm levels, and from the 10 to 20, 132 and 144-cm levels, it was possible to sort these fragments into five distinct groups, according to their stratigraphies. The results of data grouping (two-dimensional graphs) are in agreement with hierarchical cluster analysis by Ward's method. Copyright (C) 2011 John Wiley & Sons, Ltd.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Background: A popular model for gene regulatory networks is the Boolean network model. In this paper, we propose an algorithm to perform an analysis of gene regulatory interactions using the Boolean network model and time-series data. More precisely, the Boolean network is restricted in the sense that only a subset of all possible Boolean functions is considered. We explore some mathematical properties of the restricted Boolean networks in order to avoid the full search approach. The problem is modeled as a Constraint Satisfaction Problem (CSP) and CSP techniques are used to solve it. Results: We applied the proposed algorithm to two data sets. First, we used an artificial data set obtained from a model for the budding yeast cell cycle. The second data set is derived from experiments performed using HeLa cells. The results show that some interactions can be fully or, at least, partially determined under the Boolean model considered. Conclusions: The proposed algorithm can be used as a first step for detection of gene/protein interactions. It is able to infer gene relationships from time-series data of gene expression, and this inference process can be aided by available a priori knowledge.
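To make the inference task concrete, the sketch below shows the brute-force consistency check that such an algorithm aims to avoid: for one target gene, it enumerates small sets of candidate regulators and keeps those under which the target's next value is a well-defined function of the regulators' current values across the whole time series. This is a generic illustration, not the restricted-network CSP formulation of the paper; the function name and the toy series in the usage note are assumptions.

```python
from itertools import combinations


def consistent_regulator_sets(series, target, max_inputs=2):
    """List regulator index tuples consistent with a Boolean time series.

    series: list of states, each a tuple of 0/1 values (one per gene),
            ordered in time with synchronous updates assumed.
    target: index of the gene whose regulators are sought.
    """
    n_genes = len(series[0])
    found = []
    for k in range(1, max_inputs + 1):
        for regs in combinations(range(n_genes), k):
            table = {}  # observed truth table: regulator values -> next value
            ok = True
            for now, nxt in zip(series, series[1:]):
                key = tuple(now[g] for g in regs)
                out = nxt[target]
                # A conflict means the same inputs produced different outputs.
                if table.setdefault(key, out) != out:
                    ok = False
                    break
            if ok:
                found.append(regs)
    return found
```

For a toy series such as `[(1, 1, 0), (1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 0, 1)]`, calling `consistent_regulator_sets(series, target=2, max_inputs=1)` keeps only gene 0 as a single-input explanation for gene 2. The number of candidate sets grows combinatorially with the number of genes and inputs, which is the kind of full search the abstract says it avoids.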

Relevance:

100.00% 100.00%

Publisher:

Abstract:

In this work, different methods to estimate the value of thin film residual stresses using instrumented indentation data were analyzed. This study considered procedures proposed in the literature, as well as a modification of one of these methods and a new approach based on the effect of residual stress on the value of hardness calculated via the Oliver and Pharr method. The analysis of these methods was centered on an axisymmetric two-dimensional finite element model, which was developed to simulate instrumented indentation testing of thin ceramic films deposited onto hard steel substrates. Simulations were conducted varying the level of film residual stress, film strain hardening exponent, film yield strength, and film Poisson's ratio. Different ratios of maximum penetration depth h(max) over film thickness t were also considered, including h/t = 0.04, for which the contribution of the substrate to the mechanical response of the system is not significant. Residual stresses were then calculated following the procedures mentioned above and compared with the values used as input in the numerical simulations. In general, results indicate that the deviation of each method from the input values depends on the conditions studied. The method by Suresh and Giannakopoulos consistently overestimated the values when stresses were compressive. The method provided by Wang et al. showed less dependence on h/t than the others.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

We present an analysis of observations made with the Arcminute Microkelvin Imager (AMI) and the Canada-France-Hawaii Telescope (CFHT) of six galaxy clusters in a redshift range of 0.16-0.41. The cluster gas is modelled using the Sunyaev-Zeldovich (SZ) data provided by AMI, while the total mass is modelled using the lensing data from the CFHT. In this paper, we (i) find very good agreement between SZ measurements (assuming large-scale virialization and a gas-fraction prior) and lensing measurements of the total cluster masses out to r200; (ii) perform the first multiple-component weak-lensing analysis of A115; (iii) confirm the unusual separation between the gas and mass components in A1914 and (iv) jointly analyse the SZ and lensing data for the relaxed cluster A611, confirming our use of a simulation-derived mass-temperature relation for parametrizing measurements of the SZ effect.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Objective: To review the presentation of hyperinsulinemic hypoglycemia of infancy (HHI), its treatment and histology in Brazilian pediatric endocrinology sections. Materials and method: The protocol analyzed data on birth, laboratory results, treatment, surgery, and pancreas histology. Results: Twenty-five cases of HHI from six centers were analyzed: 15 male, 3/25 born by vaginal delivery. The average age at diagnosis was 10.3 days. Glucose and insulin levels in the critical sample showed an average of 24.7 mg/dL and 26.3 UI/dL. The intravenous glucose infusion rate was greater than 10 mg/kg/min in all cases (M: 19.1). Diazoxide was used in 15/25 of the cases, octreotide in 10, glucocorticoid in 8, growth hormone in 3, nifedipine in 2 and glucagon in 1. Ten of the cases underwent pancreatectomy and histology results showed the diffuse form of the disease. Conclusion: This is the first critical review of a Brazilian sample with congenital HHI. Arq Bras Endocrinol Metab. 2012; 56(9): 666-71

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Introduction. The number of patients with terminal heart failure has increased more than the number of available organs, leading to a high mortality rate on the waiting list. Use of marginal and expanded-criteria donors has increased due to the heart shortage. Objective. We analyzed all heart transplantations (HTx) in Sao Paulo state over 8 years for donor profile and recipient risk factors. Method. This multi-institutional review collected HTx data from all institutions in the state of Sao Paulo, Brazil. From 2002 to 2008 (6 years), only 512 (28.8%) of 1777 available heart donors were accepted for transplantation. All medical records were analyzed retrospectively; none of the used donors was excluded, even those considered to be nonstandard. Results. The hospital mortality rate was 27.9% (n = 143) and the average follow-up time was 29.4 +/- 28.4 months. The survival rate was 55.5% (n = 285) at 6 years after HTx. Univariate analysis examined the impact of the following factors on survival: age (P = .0004), arterial hypertension (P = .4620), norepinephrine (P = .0450), cardiac arrest (P = .8500), diabetes mellitus (P = .5120), infection (P = .1470), CKMB (creatine kinase MB) (P = .8694), creatinine (P = .7225), and Na+ (P = .3273). On multivariate analysis, only age showed significance; logistic regression showed a significant cut-off at 40 years: organs from donors older than 40 years showed lower late survival rates (P = .0032). Conclusions. Donor age older than 40 years represents an important risk factor for survival after HTx. Neither donor gender nor norepinephrine use negatively affected early survival.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Surveys were conducted in Brazil, Benin and Tanzania to collect predatory mites as candidates for control of the coconut mite Aceria guerreronis Keifer, a serious pest of coconut fruits. At all locations surveyed, one of the most dominant predators on infested coconut fruits was identified as Neoseiulus baraki Athias-Henriot, based on morphological similarity with regard to taxonomically relevant characters. However, scrutiny of our own and published descriptions suggests that consistent morphological differences may exist between the Benin population and those from the other geographic origins. In this study, we combined three methods to assess whether these populations belong to one species or a few distinct, yet closely related species. First, multivariate analysis of 32 morphological characters showed that the Benin population differed from the other three populations. Second, DNA sequence analysis based on the mitochondrial cytochrome oxidase subunit I (COI) showed the same difference between these populations. Third, cross-breeding between populations was unsuccessful in all combinations. These data provide evidence for the existence of cryptic species. Subsequent morphological research showed that the Benin population can be distinguished from the others by a new character (not included in the multivariate analysis), viz. the number of teeth on the fixed digit of the female chelicera.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Quality of fresh-cut carambola (Averrhoa carambola L) is related to many chemical and biochemical variables, especially those involved with softening and browning, both influenced by storage temperature. To study these effects, a multivariate analysis was used to evaluate slices packaged in vacuum-sealed polyolefin bags, and stored at 2.5 °C, 5 °C and 10 °C, for up to 16 d. The quality of slices at each temperature was correlated with the duration of storage, O2 and CO2 concentration in the package, physicochemical constituents, and activity of enzymes involved in softening (PG) and browning (PPO) metabolism. Three quality groups were identified by hierarchical cluster analysis, and the classification of the components within each of these groups was obtained from a principal component analysis (PCA). The characterization of samples by PCA clearly distinguished acceptable and non-acceptable slices. According to PCA, acceptable slices presented higher ascorbic acid content, greater hue angles (h°) and final lightness (L-5) in the first principal component (PC1). On the other hand, non-acceptable slices presented higher total pectin content and PPO activity in PC1. Non-acceptable slices also presented higher soluble pectin content, increased pectin solubilisation and higher CO2 concentration in the second principal component (PC2), whereas acceptable slices showed lower total sugar content. The hierarchical cluster and PCA analyses were useful for discriminating the quality of slices stored at different temperatures. (C) 2011 Elsevier B.V. All rights reserved.