882 results for Methods: data analysis


Relevance: 100.00%

Abstract:

Papayas have a very short green life as a result of their rapid pulp softening as well as their susceptibility to physical injury and mold growth. The ripening-related changes take place very quickly, and there is a continued interest in the reduction of postharvest losses. Proteins have a central role in biological processes, and differential proteomics enables the discrimination of proteins affected during papaya ripening. A comparative analysis of the proteomes of climacteric and pre-climacteric papayas was performed using 2DE-DIGE. Thirty-seven proteins corresponding to spots with significant differences in abundance during ripening were submitted to MS analysis, and 27 proteins were identified and classified into six main categories related to the metabolic changes occurring during ripening. Proteins from the cell wall (alpha-galactosidase and invertase), ethylene biosynthesis (methionine synthase), climacteric respiratory burst, stress response, synthesis of carotenoid precursors (hydroxymethylbutenyl 4-diphosphate synthase, GcpE), and chromoplast differentiation (fibrillin) were identified. There was some correspondence between the identified proteins and the data from previous transcript profiling of papaya fruit, but new, accumulated proteins were identified, which reinforces the importance of differential proteomics as a tool to investigate ripening and provides potentially useful information for maintaining fruit quality and minimizing postharvest losses. (C) 2011 Elsevier B.V. All rights reserved.

Relevance: 100.00%

Abstract:

Dimensionality reduction is employed in visual data analysis as a way of obtaining reduced spaces for high-dimensional data or of mapping data directly into 2D or 3D spaces. Although techniques have evolved to improve data segregation in reduced or visual spaces, they have limited capabilities for adjusting the results according to the user's knowledge. In this paper, we propose a novel approach to handling both dimensionality reduction and visualization of high-dimensional data, taking into account the user's input. It employs Partial Least Squares (PLS), a statistical tool that retrieves latent spaces focusing on the discriminability of the data. The method employs a training set to build a highly precise model that can then be applied to a much larger data set very effectively. The reduced data set can be exhibited using various existing visualization techniques. The training data are important for encoding the user's knowledge into the loop. However, this work also devises a strategy for calculating PLS reduced spaces when no training data are available. The approach produces increasingly precise visual mappings as the user feeds back his or her knowledge and is capable of working with small and unbalanced training sets.
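As a rough illustration of how PLS favors directions that discriminate between labeled groups, the sketch below computes only the first PLS weight vector (proportional to X^T y on centered data) and projects the samples onto it. The toy data, the function name `first_pls_component`, and the restriction to a single component are assumptions made for illustration; this is not the method proposed in the paper.

```python
import math

def first_pls_component(X, y):
    """Return (weights, scores) of the first PLS component for labels y."""
    n, p = len(X), len(X[0])
    # Center each feature and the response.
    col_means = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - col_means[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    # The first weight vector is proportional to X^T y (covariance with the labels).
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = math.sqrt(sum(v * v for v in w))
    w = [v / norm for v in w]
    # Scores: projection of each centered sample onto w.
    t = [sum(Xc[i][j] * w[j] for j in range(p)) for i in range(n)]
    return w, t

# Hypothetical training set: feature 0 separates the classes, feature 1 does not.
X = [[1.0, 5.0], [1.2, 3.0], [0.9, 4.0], [3.0, 4.5], [3.1, 3.5], [2.9, 4.0]]
y = [0, 0, 0, 1, 1, 1]
weights, scores = first_pls_component(X, y)
```

In this toy case the weight on the discriminating feature dominates, so the one-dimensional scores already separate the two labeled groups.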

Relevance: 100.00%

Abstract:

A common interest in gene expression data analysis is to identify, from a large pool of candidate genes, the genes that present significant changes in expression levels between a treatment and a control biological condition. Usually, this is done using a statistic and a cutoff value that together separate differentially expressed genes from nondifferentially expressed ones. In this paper, we propose a Bayesian approach to identify differentially expressed genes by sequentially calculating credibility intervals from predictive densities, which are constructed using the sampled mean treatment effect from all genes under study, excluding the treatment effect of genes previously identified with statistical evidence for difference. We compare our Bayesian approach with the standard ones based on the t-test and modified t-tests via a simulation study, using small sample sizes, which are common in gene expression data analysis. The results provide evidence that the proposed approach performs better than the standard ones, especially for cases with mean differences and increases in treatment variance in relation to control variance. We also apply the methodologies to a well-known publicly available data set on the Escherichia coli bacterium.
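For reference, the "statistic plus cutoff" baseline mentioned above can be sketched as follows. This shows only the standard comparison method (here Welch's t statistic), not the Bayesian procedure proposed in the paper; the gene names, expression values, and cutoff are hypothetical.

```python
import math
from statistics import mean, variance

def welch_t(treatment, control):
    """Welch's t statistic for two independent samples with unequal variances."""
    se = math.sqrt(variance(treatment) / len(treatment)
                   + variance(control) / len(control))
    return (mean(treatment) - mean(control)) / se

# Hypothetical expression levels: three replicates per condition for two genes.
genes = {
    "gene_a": ([8.1, 8.4, 8.2], [5.0, 5.3, 5.1]),  # clear shift
    "gene_b": ([5.2, 4.9, 5.1], [5.0, 5.3, 5.1]),  # no real shift
}
CUTOFF = 4.0  # arbitrary illustrative threshold, not taken from the paper
flagged = [g for g, (trt, ctl) in genes.items() if abs(welch_t(trt, ctl)) > CUTOFF]
```

With only three replicates per condition the variance estimates are very noisy, which is the small-sample setting the abstract refers to.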

Relevance: 100.00%

Abstract:

Objective: To evaluate suicide rates and trends in São Paulo by sex, age strata, and methods. Methods: Data were collected from the State registry from 1996 to 2009. Population was estimated using the National Census. We used joinpoint regression analysis to explore temporal trends. We also evaluated marital status, ethnicity, birthplace and methods of suicide. Results: In the period analyzed, 6,002 suicides were recorded, with a rate of 4.6 per 100,000 (7.5 in men and 2.0 in women); the male-to-female ratio was around 3.7. Trends for men presented a significant decline of 5.3% per year from 1996 to 2002, and a significant increase of 2.5% from 2002 onwards. Women did not present significant changes. For men, the elderly (> 65 years) had a significant reduction of 2.3% per year, while younger men (25-44 years) presented a significant increase of 8.6% from 2004 onwards. Women did not present significant trend changes according to age. The leading suicide methods were hanging and poisoning for men and women, respectively. Other analyses showed an increased suicide risk ratio for singles and foreigners. Conclusions: Specific epidemiological trends for suicide in the city of São Paulo that warrant further investigation were identified. High-risk groups - such as immigrants - could benefit from targeted strategies of suicide prevention.
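The "percent per year" figures reported by joinpoint regression are annual percent changes (APC) derived from a log-linear fit within each segment. The sketch below estimates the APC for a single segment only; it does not search for the joinpoints themselves, and the rates are synthetic values constructed to fall by 5.3% per year, not the registry data.

```python
import math

def annual_percent_change(years, rates):
    """APC from an ordinary least-squares fit of ln(rate) on year (one segment)."""
    n = len(years)
    x_mean = sum(years) / n
    logs = [math.log(r) for r in rates]
    y_mean = sum(logs) / n
    slope = (sum((x - x_mean) * (y - y_mean) for x, y in zip(years, logs))
             / sum((x - x_mean) ** 2 for x in years))
    # A slope b on the log scale corresponds to a change of (e^b - 1) * 100 % per year.
    return (math.exp(slope) - 1.0) * 100.0

# Synthetic rates that decline by exactly 5.3% per year (illustrative only).
years = list(range(1996, 2003))
rates = [9.0 * (1 - 0.053) ** (y - 1996) for y in years]
apc = annual_percent_change(years, rates)
```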

Relevance: 100.00%

Abstract:

A new method for analysis of scattering data from lamellar bilayer systems is presented. The method employs a form-free description of the cross-section structure of the bilayer and the fit is performed directly to the scattering data, introducing also a structure factor when required. The cross-section structure (electron density profile in the case of X-ray scattering) is described by a set of Gaussian functions and the technique is termed Gaussian deconvolution. The coefficients of the Gaussians are optimized using a constrained least-squares routine that induces smoothness of the electron density profile. The optimization is coupled with the point-of-inflection method for determining the optimal weight of the smoothness. With the new approach, it is possible to optimize simultaneously the form factor, structure factor and several other parameters in the model. The applicability of this method is demonstrated by using it in a study of a multilamellar system composed of lecithin bilayers, where the form factor and structure factor are obtained simultaneously, and the obtained results provided new insight into this very well known system.
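One way to write down the ingredients described above is sketched below. The common width, the squared-difference smoothness term, and the symbols are assumptions chosen for illustration; the abstract does not state the exact functional forms used.

```latex
% Electron density profile as a sum of N Gaussians (assumed common width \sigma)
\rho(z) = \sum_{k=1}^{N} a_k \exp\!\left[-\frac{(z - z_k)^2}{2\sigma^2}\right]

% Constrained least squares: misfit to the M measured intensities plus a
% smoothness penalty S weighted by \alpha (assumed form of S)
\min_{a_1,\dots,a_N} \; \sum_{i=1}^{M} \frac{\left[I_{\mathrm{obs}}(q_i) - I_{\mathrm{model}}(q_i; a)\right]^2}{\sigma_i^2}
  \; + \; \alpha \, S(a),
\qquad S(a) = \sum_{k=1}^{N-1} (a_{k+1} - a_k)^2
```

In this notation, the point-of-inflection method mentioned in the abstract would be the step that selects the weight \alpha.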

Relevance: 100.00%

Abstract:

In this article, we propose a new Bayesian flexible cure rate survival model, which generalises the stochastic model of Klebanov et al. [Klebanov LB, Rachev ST and Yakovlev AY. A stochastic-model of radiation carcinogenesis - latent time distributions and their properties. Math Biosci 1993; 113: 51-75], and has much in common with the destructive model formulated by Rodrigues et al. [Rodrigues J, de Castro M, Balakrishnan N and Cancho VG. Destructive weighted Poisson cure rate models. Technical Report, Universidade Federal de Sao Carlos, Sao Carlos-SP. Brazil, 2009 (accepted in Lifetime Data Analysis)]. In our approach, the accumulated number of lesions or altered cells follows a compound weighted Poisson distribution. This model is more flexible than the promotion time cure model in terms of dispersion. Moreover, it possesses an interesting and realistic interpretation of the biological mechanism of the occurrence of the event of interest as it includes a destructive process of tumour cells after an initial treatment or the capacity of an individual exposed to irradiation to repair altered cells that results in cancer induction. In other words, what is recorded is only the damaged portion of the original number of altered cells not eliminated by the treatment or repaired by the repair system of an individual. Markov Chain Monte Carlo (MCMC) methods are then used to develop Bayesian inference for the proposed model. Also, some discussions on the model selection and an illustration with a cutaneous melanoma data set analysed by Rodrigues et al. [Rodrigues J, de Castro M, Balakrishnan N and Cancho VG. Destructive weighted Poisson cure rate models. Technical Report, Universidade Federal de Sao Carlos, Sao Carlos-SP. Brazil, 2009 (accepted in Lifetime Data Analysis)] are presented.
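For orientation, the promotion time cure model that the abstract uses as a point of comparison is usually written as below, with N latent altered cells following a Poisson distribution and F the distribution function of the time for one such cell to produce the event. The notation is an assumption; the destructive compound weighted Poisson model proposed in the article generalises this form and is not reproduced here.

```latex
% N \sim \mathrm{Poisson}(\theta): number of latent competing causes
% F(t): distribution function of the promotion time of each cause
S_{\mathrm{pop}}(t) = \Pr(T > t) = \exp\{-\theta F(t)\},
\qquad
p_0 = \lim_{t \to \infty} S_{\mathrm{pop}}(t) = e^{-\theta}
```

Here p_0 is the cured fraction, i.e. the probability that no latent cause is present.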

Relevance: 100.00%

Abstract:

Background: Transcript enumeration methods such as SAGE, MPSS, and sequencing-by-synthesis EST "digital northern" are important high-throughput techniques for digital gene expression measurement. Like other counting or voting processes, these measurements constitute compositional data exhibiting properties particular to the simplex space, where the summation of the components is constrained. These properties are not present in regular Euclidean spaces, on which hybridization-based microarray data are often modeled. Therefore, pattern recognition methods commonly used for microarray data analysis may be non-informative for the data generated by transcript enumeration techniques, since they ignore certain fundamental properties of this space. Results: Here we present a software tool, Simcluster, designed to perform clustering analysis for data on the simplex space. We present Simcluster as a stand-alone command-line C package and as a user-friendly on-line tool. Both versions are available at: http://xerad.systemsbiology.net/simcluster. Conclusion: Simcluster is designed in accordance with a well-established mathematical framework for compositional data analysis, which provides principled procedures for dealing with the simplex space, and is thus applicable in a number of contexts, including enumeration-based gene expression data.
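To make the simplex constraint concrete, the sketch below uses the centered log-ratio (clr) transform and the Aitchison distance from the general compositional data analysis framework. It is not taken from Simcluster, whose internals the abstract does not describe; the counts are hypothetical and must be strictly positive for the logarithm to be defined.

```python
import math

def clr(counts):
    """Centered log-ratio transform of a strictly positive composition."""
    logs = [math.log(c) for c in counts]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]

def aitchison_distance(a, b):
    """Euclidean distance between clr-transformed compositions."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(clr(a), clr(b))))

# Hypothetical tag counts for three genes in libraries of different depth.
library_small = [10, 30, 60]
library_large = [100, 300, 600]  # same proportions, ten times the depth
library_other = [60, 30, 10]     # different proportions

same = aitchison_distance(library_small, library_large)
different = aitchison_distance(library_small, library_other)
```

Because only ratios between components matter, the two libraries with identical proportions are at distance zero regardless of sequencing depth, whereas a plain Euclidean distance on raw counts would place them far apart.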

Relevance: 100.00%

Abstract:

Background: In a classical study, Durkheim noted a direct relation between suicide rates and wealth in 19th-century France. Since that time, several studies have verified this relationship. It is known that suicide rates are associated with income, although the direction of this association varies worldwide. Brazil presents a heterogeneous distribution of income and suicide across its territory; however, evaluation of an association between these variables has shown mixed results. We aimed to evaluate the relationship between suicide rates and income in Brazil, the State of São Paulo (SP), and the City of SP, considering geographical area and temporal trends. Methods: Data were extracted from the National and State official statistics departments. Three socioeconomic areas were considered according to income, from the wealthiest (area 1) to the poorest (area 3). We also considered three regions: country-wide (27 Brazilian States and 558 Brazilian micro-regions), state-wide (645 counties of SP State), and city-wide (96 districts of SP city). Relative risks (RR) were calculated among areas 1, 2, and 3 for all regions, in a cross-sectional approach. Then, we used Joinpoint analysis to explore the temporal trends of suicide rates and SaTScan to investigate geographical clusters of high/low suicide rates across the territory. Results: Suicide rates in Brazil, the State of SP, and the city of SP were 6.2, 6.6, and 5.4 per 100,000, respectively. Taking suicide rates of the poorest area (3) as reference, the RR for the wealthiest area was 1.64, 0.88, and 1.65 for Brazil, the State of SP, and the city of SP, respectively (p for trend <0.05 for all analyses). Spatial clusters of high suicide rates were identified in the southern region of Brazil (RR = 2.37), the western region of the state of SP (RR = 1.32), and the central region of the city of SP (RR = 1.65).
A direct association between income and suicide was found for Brazil (OR = 2.59) and the city of SP (OR = 1.07), and an inverse association for the state of SP (OR = 0.49). Conclusions: Temporospatial analyses revealed higher suicide rates in wealthier areas in Brazil and the city of SP and in poorer areas in the State of SP. We further discuss the role of socioeconomic characteristics in explaining these discrepancies and the importance of our findings for public health policies. Similar studies in other Brazilian States and developing countries are warranted.
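A relative risk of the kind reported above is a ratio of two rates with the poorest area as the reference. The sketch below uses hypothetical death counts and populations chosen so that the result equals 1.64; the study's underlying counts are not given in the abstract.

```python
def rate_per_100k(deaths, population):
    """Crude rate per 100,000 inhabitants."""
    return deaths / population * 100_000

def relative_risk(rate_exposed, rate_reference):
    """Ratio of the rate in the group of interest to the rate in the reference group."""
    return rate_exposed / rate_reference

# Hypothetical counts, chosen only so that the ratio comes out at 1.64.
wealthiest = rate_per_100k(deaths=82, population=1_000_000)  # area 1
poorest = rate_per_100k(deaths=50, population=1_000_000)     # area 3 (reference)
rr = relative_risk(wealthiest, poorest)
```

An RR above 1 means a higher rate in the wealthiest area than in the reference area, and an RR below 1 (as for the State of SP) means the reverse.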

Relevance: 100.00%

Abstract:

Background: Aortic aneurysm and dissection are important causes of death in older people. Ruptured aneurysms show catastrophic fatality rates approaching 80%. Few population-based mortality studies have been published in the world and none in Brazil. The objective of the present study was to use multiple-cause-of-death methodology in the analysis of mortality trends related to aortic aneurysm and dissection in the state of Sao Paulo, between 1985 and 2009. Methods: We analyzed mortality data from the Sao Paulo State Data Analysis System, selecting all death certificates on which aortic aneurysm and dissection were listed as a cause of death. The variables sex, age, season of the year, and underlying, associated or total mentions of causes of death were studied using standardized mortality rates, proportions and historical trends. Statistical analyses were performed by chi-square goodness-of-fit and Kruskal-Wallis H tests, and variance analysis. The joinpoint regression model was used to evaluate changes in age-standardized rate trends. A p value less than 0.05 was regarded as significant. Results: Over a 25-year period, there were 42,615 deaths related to aortic aneurysm and dissection, of which 36,088 (84.7%) were identified as the underlying cause and 6,527 (15.3%) as an associated cause of death. Dissection and ruptured aneurysms were considered the underlying cause of death in 93% of the deaths. For the entire period, a significant increasing trend in age-standardized death rates was observed in men and women, while certain non-significant decreases occurred from 1996/2004 until 2009. Abdominal aortic aneurysms and aortic dissections prevailed among men, and aortic dissections and aortic aneurysms of unspecified site among women. In 1985 and 2009, the male-to-female death rate ratios were 2.86 and 2.19, respectively, corresponding to a 23.4% decrease in the ratio.
For aortic dissection, ruptured and non-ruptured aneurysms, the overall mean ages at death were, respectively, 63.2, 68.4 and 71.6 years; while, as the underlying cause, the main associated causes of death were as follows: hemorrhages (in 43.8%/40.5%/13.9%); hypertensive diseases (in 49.2%/22.43%/24.5%) and atherosclerosis (in 14.8%/25.5%/15.3%); and, as associated causes, their principal overall underlying causes of death were diseases of the circulatory (55.7%) and respiratory (13.8%) systems and neoplasms (7.8%). A significant seasonal variation, with the highest frequency in winter, occurred in deaths identified as underlying cause for aortic dissection, ruptured and non-ruptured aneurysms. Conclusions: This study introduces the methodology of multiple causes of death to enhance epidemiologic knowledge of aortic aneurysm and dissection in São Paulo, Brazil. The results presented shed light on the importance of mortality statistics and the need for epidemiologic studies to understand unique trends in our own population.
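The proportions and the change in the male-to-female ratio quoted in this abstract can be reproduced from the reported figures. The short check below uses only numbers stated in the abstract; the variable names are illustrative.

```python
# Deaths mentioning aortic aneurysm and dissection, as reported in the abstract.
total, underlying, associated = 42_615, 36_088, 6_527

share_underlying = underlying / total * 100   # percent listed as underlying cause
share_associated = associated / total * 100   # percent listed as associated cause

# Male-to-female death rate ratios in 1985 and 2009.
ratio_1985, ratio_2009 = 2.86, 2.19
relative_decrease = (ratio_1985 - ratio_2009) / ratio_1985 * 100
```

Rounded to one decimal place, these give 84.7%, 15.3% and 23.4%, matching the values in the text.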

Relevance: 100.00%

Abstract:

This thesis is based on five papers addressing variance reduction in different ways. The papers have in common that they all present new numerical methods. Paper I investigates quantitative structure-retention relationships from an image processing perspective, using an artificial neural network to preprocess three-dimensional structural descriptions of the studied steroid molecules. Paper II presents a new method for computing free energies. Free energy is the quantity that determines chemical equilibria and partition coefficients. The proposed method may be used for estimating, e.g., chromatographic retention without performing experiments. Two papers (III and IV) deal with correcting deviations from bilinearity by so-called peak alignment. Bilinearity is a theoretical assumption about the distribution of instrumental data that is often violated by measured data. Deviations from bilinearity lead to increased variance, both in the data and in inferences from the data, unless invariance to the deviations is built into the model, e.g., by the use of the method proposed in paper III and extended in paper IV. Paper V addresses a generic problem in classification, namely how to measure the goodness of different data representations, so that the best classifier may be constructed. Variance reduction is one of the pillars on which analytical chemistry rests. This thesis considers two aspects of variance reduction: before and after experiments are performed. Before experimenting, theoretical predictions of experimental outcomes may be used to direct which experiments to perform, and how to perform them (papers I and II). After experiments are performed, the variance of inferences from the measured data is affected by the method of data analysis (papers III-V).

Relevance: 100.00%

Abstract:

In the past decade, the advent of efficient genome sequencing tools and high-throughput experimental biotechnology has led to enormous progress in the life sciences. Among the most important innovations is microarray technology. It allows the expression of thousands of genes to be quantified simultaneously by measuring the hybridization from a tissue of interest to probes on a small glass or plastic slide. The characteristics of these data include a fair amount of random noise, a predictor dimension in the thousands, and a sample size in the dozens. One of the most exciting areas to which microarray technology has been applied is the challenge of deciphering complex diseases such as cancer. In these studies, samples are taken from two or more groups of individuals with heterogeneous phenotypes, pathologies, or clinical outcomes. These samples are hybridized to microarrays in an effort to find a small number of genes that are strongly correlated with the group of individuals. Even though methods to analyse the data are now well developed and close to reaching a standard organization (through the efforts of proposed international projects such as the Microarray Gene Expression Data (MGED) Society [1]), it is not infrequent to come across a clinician's question for which no compelling statistical method is available. The contribution of this dissertation to deciphering disease lies in the development of new approaches aimed at handling open problems posed by clinicians in specific experimental designs. In Chapter 1, starting from a necessary biological introduction, we review microarray technologies and all the important steps involved in an experiment, from the production of the array to quality control, ending with the preprocessing steps that are used in the data analysis in the rest of the dissertation.
In Chapter 2, a critical review of standard analysis methods is provided, stressing most of the problems that [...]. In Chapter 3, a method is introduced to address the issue of unbalanced design in microarray experiments. In microarray experiments, experimental design is a crucial starting point for obtaining reasonable results. In a two-class problem, an equal or similar number of samples should be collected for the two classes. However, in some cases, e.g. rare pathologies, the approach to be taken is less evident. We propose to address this issue by applying a modified version of SAM [2]. MultiSAM consists of a reiterated application of a SAM analysis, comparing the less populated class (LPC) with 1,000 random samplings of the same size from the more populated class (MPC). A list of the differentially expressed genes is generated for each SAM application. After 1,000 reiterations, each probe is given a "score" ranging from 0 to 1,000 based on its recurrence in the 1,000 lists as differentially expressed. The performance of MultiSAM was compared to that of SAM and LIMMA [3] over two data sets simulated via beta and exponential distributions. The results of all three algorithms over low-noise data sets seem acceptable. However, on a real unbalanced two-channel data set regarding Chronic Lymphocytic Leukemia, LIMMA finds no significant probe, SAM finds 23 significantly changed probes but cannot separate the two classes, while MultiSAM finds 122 probes with score >300 and separates the data into two clusters by hierarchical clustering. We also report extra-assay validation in terms of differentially expressed genes. Although standard algorithms perform well over low-noise simulated data sets, MultiSAM seems to be the only one able to reveal subtle differences in gene expression profiles on real unbalanced data. In Chapter 4, a method to address the evaluation of similarities in a three-class problem by means of the Relevance Vector Machine [4] is described.
In fact, when microarray data are viewed in a prognostic and diagnostic clinical framework, differences are not the only thing that can play a crucial role. In some cases, similarities can give useful and sometimes even more important information. The goal, given three classes, could be to establish, with a certain level of confidence, whether the third one is similar to the first or the second one. In this work we show that the Relevance Vector Machine (RVM) [2] could be a possible solution to the limitations of standard supervised classification. In fact, RVM offers many advantages compared, for example, with its well-known precursor (Support Vector Machine - SVM [3]). Among these advantages, the estimate of the posterior probability of class membership represents a key feature for addressing the similarity issue. This is a highly important, but often overlooked, option of any practical pattern recognition system. We focused on a tumor-grade three-class problem, with 67 samples of grade 1 (G1), 54 samples of grade 3 (G3) and 100 samples of grade 2 (G2). The goal is to find a model able to separate G1 from G3, and then to evaluate the third class, G2, as a test set to obtain the probability for samples of G2 to be members of class G1 or class G3. The analysis showed that breast cancer samples of grade 2 have a molecular profile more similar to breast cancer samples of grade 1. According to the literature, this result had been conjectured, but no measure of significance had been given before.
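The resampling-and-scoring loop described for MultiSAM in Chapter 3 can be sketched as follows. The inner test here is a simple standardized mean difference standing in for SAM, which is not reimplemented; the function names, threshold, and synthetic data are all assumptions made for illustration.

```python
import random
from statistics import mean, stdev

def differential_probes(small, large, threshold):
    """Stand-in for one SAM run: probes whose standardized mean difference exceeds threshold."""
    hits = []
    for probe in range(len(small[0])):
        a = [s[probe] for s in small]
        b = [s[probe] for s in large]
        pooled = (stdev(a) + stdev(b)) / 2 or 1e-9  # guard against zero spread
        if abs(mean(a) - mean(b)) / pooled > threshold:
            hits.append(probe)
    return hits

def multi_resample_scores(lpc, mpc, iterations=1000, threshold=2.0, seed=0):
    """Count how often each probe is called differential across random MPC subsamples."""
    rng = random.Random(seed)
    scores = [0] * len(lpc[0])
    for _ in range(iterations):
        # Draw an MPC subsample of the same size as the less populated class.
        subsample = rng.sample(mpc, len(lpc))
        for probe in differential_probes(lpc, subsample, threshold):
            scores[probe] += 1
    return scores

# Synthetic data: 5 LPC samples, 40 MPC samples, 3 probes; only probe 0 is shifted.
data_rng = random.Random(1)
mpc = [[data_rng.gauss(0, 1) for _ in range(3)] for _ in range(40)]
lpc = [[data_rng.gauss(8, 1), data_rng.gauss(0, 1), data_rng.gauss(0, 1)]
       for _ in range(5)]
scores = multi_resample_scores(lpc, mpc, iterations=200)
```

Each probe's score is the number of subsampled comparisons in which it was called differential, so a cutoff on the score (the abstract uses >300 out of 1,000) selects probes that are detected consistently rather than in a single unbalanced comparison.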

Relevance: 100.00%

Abstract:

The present PhD thesis focused on the development and application of a chemical methodology (Py-GC-MS) and of a data-processing method based on multivariate data analysis (chemometrics). The chromatographic and mass spectrometric data obtained with this technique are particularly suitable for interpretation by chemometric methods such as PCA (Principal Component Analysis) for data exploration and SIMCA (Soft Independent Models of Class Analogy) for classification. As a first approach, some issues related to the field of cultural heritage were discussed, with particular attention to the differentiation of binders used in painting. A marker of egg tempera, esterified phosphoric acid, a pyrolysis product of lecithin, was determined using HMDS (hexamethyldisilazane) rather than TMAH (tetramethylammonium hydroxide) as the derivatizing reagent. The validity of analytical pyrolysis as a tool to characterize and classify different types of bacteria was verified. The FAMEs chromatographic profiles represent an important tool for bacterial identification. Because of the complexity of the chromatograms, it was possible to characterize the bacteria only according to their genus, while differentiation at the species level was achieved by means of chemometric analysis. To perform this study, the normalized peak areas relevant to fatty acids were taken into account. Chemometric methods were applied to experimental datasets. The results obtained demonstrate the effectiveness of analytical pyrolysis and chemometric analysis for the rapid characterization of bacterial species. Application to samples of bacterial (Pseudomonas mendocina), fungal (Pleurotus ostreatus) and mixed biofilms was also performed. A comparison of the chromatographic profiles established the possibility to:
• Differentiate the bacterial and fungal biofilms according to the FAMEs profile.
• Characterize the fungal biofilm by means of the typical pattern of pyrolytic fragments derived from saccharides present in the cell wall.
• Identify the markers of bacterial and fungal biofilm in the same mixed-biofilm sample.

Relevance: 100.00%

Abstract:

In this work, new tools in atmospheric pollutant sampling and analysis were applied in order to go deeper into source apportionment studies. The project was developed mainly through the study of atmospheric emission sources in a suburban area influenced by a municipal solid waste incinerator (MSWI), a medium-sized coastal tourist town and a motorway. Two main research lines were followed. Regarding the first line, the potential of PM samplers coupled with a wind select sensor was assessed. Results showed that they may be a valid support in source apportionment studies. However, meteorological and territorial conditions could strongly affect the results. Moreover, new markers were investigated, focusing particularly on biomass burning processes. OC proved to be a good indicator of biomass combustion, as did all the organic compounds determined. Among metals, lead and aluminium are well related to biomass combustion. Surprisingly, PM was not enriched in potassium during the bonfire event. The second research line consists in the application of Positive Matrix Factorization (PMF), a new statistical tool in data analysis. This new technique was applied to datasets with different time resolutions. PMF application to atmospheric deposition fluxes identified six main sources affecting the area. The incinerator’s relative contribution seemed to be negligible. PMF analysis was then applied to PM2.5 collected with samplers coupled with a wind select sensor. The higher number of environmental indicators determined made it possible to obtain more detailed results on the sources affecting the area. Vehicular traffic proved to be the source of greatest concern for the study area. In this case too, the incinerator’s relative contribution seemed to be negligible. Finally, the application of PMF analysis to hourly aerosol data demonstrated that the higher the temporal resolution of the data, the closer the source profiles were to the real ones.
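For readers unfamiliar with PMF, the model is commonly stated as the factorization below, where the measured concentration of species j in sample i is split into contributions from p sources. The symbols follow the usual PMF notation and are not taken from this thesis.

```latex
% x_{ij}: concentration of species j in sample i
% g_{ik}: contribution of source k to sample i;  f_{kj}: profile of source k
x_{ij} = \sum_{k=1}^{p} g_{ik} f_{kj} + e_{ij},
\qquad g_{ik} \ge 0, \quad f_{kj} \ge 0

% Objective: residuals weighted by the measurement uncertainties \sigma_{ij}
Q = \sum_{i=1}^{n} \sum_{j=1}^{m} \left( \frac{e_{ij}}{\sigma_{ij}} \right)^{2}
```

The non-negativity constraints are what allow the factors to be read as physical source profiles and contributions.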

Relevance: 100.00%

Abstract:

The candidate tackled an important issue in contemporary management: the role of CSR and Sustainability. The research proposal focused on longitudinal and inductive research, aimed at specifying the evolution of CSR and contributing to new institutional theory, in particular the institutional work framework, and to the relation between institutions and discourse analysis. The documentary analysis covers the entire evolution of CSR, focusing also on a number of important networks and associations. Some of the methodologies used in the thesis were adopted as a consequence of data analysis, in a truly inductive research process. The thesis is composed of two sections. The first section mainly describes the research process and the results of the analyses. The candidate employed several research methods: a longitudinal content analysis of documents, a vocabulary study with statistical techniques such as cluster analysis and factor analysis, and a rhetorical analysis of justifications. The second section relates the results of the analyses to theoretical frameworks and contributions. The candidate engaged with several frameworks: Actor-Network Theory, Institutional Work and Boundary Work, and Institutional Logic. The chapters focus on different issues: a historical reconstruction of CSR; a reflection on the symbolic adoption of recurrent labels; two case studies of Italian networks, in order to compare institutional and boundary work; a theoretical model of institutional change based on contradiction and institutional complexity; and the application of the model to CSR and Sustainability, proposing Sustainability as a possible institutional logic.