947 resultados para Dynamic data analysis


Relevância:

90.00% 90.00%

Publicador:

Resumo:

A new method for analysis of scattering data from lamellar bilayer systems is presented. The method employs a form-free description of the cross-section structure of the bilayer and the fit is performed directly to the scattering data, introducing also a structure factor when required. The cross-section structure (electron density profile in the case of X-ray scattering) is described by a set of Gaussian functions and the technique is termed Gaussian deconvolution. The coefficients of the Gaussians are optimized using a constrained least-squares routine that induces smoothness of the electron density profile. The optimization is coupled with the point-of-inflection method for determining the optimal weight of the smoothness. With the new approach, it is possible to optimize simultaneously the form factor, structure factor and several other parameters in the model. The applicability of this method is demonstrated by using it in a study of a multilamellar system composed of lecithin bilayers, where the form factor and structure factor are obtained simultaneously, and the obtained results provided new insight into this very well known system.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

In this article, we propose a new Bayesian flexible cure rate survival model, which generalises the stochastic model of Klebanov et al. [Klebanov LB, Rachev ST and Yakovlev AY. A stochastic-model of radiation carcinogenesis - latent time distributions and their properties. Math Biosci 1993; 113: 51-75], and has much in common with the destructive model formulated by Rodrigues et al. [Rodrigues J, de Castro M, Balakrishnan N and Cancho VG. Destructive weighted Poisson cure rate models. Technical Report, Universidade Federal de Sao Carlos, Sao Carlos-SP. Brazil, 2009 (accepted in Lifetime Data Analysis)]. In our approach, the accumulated number of lesions or altered cells follows a compound weighted Poisson distribution. This model is more flexible than the promotion time cure model in terms of dispersion. Moreover, it possesses an interesting and realistic interpretation of the biological mechanism of the occurrence of the event of interest as it includes a destructive process of tumour cells after an initial treatment or the capacity of an individual exposed to irradiation to repair altered cells that results in cancer induction. In other words, what is recorded is only the damaged portion of the original number of altered cells not eliminated by the treatment or repaired by the repair system of an individual. Markov Chain Monte Carlo (MCMC) methods are then used to develop Bayesian inference for the proposed model. Also, some discussions on the model selection and an illustration with a cutaneous melanoma data set analysed by Rodrigues et al. [Rodrigues J, de Castro M, Balakrishnan N and Cancho VG. Destructive weighted Poisson cure rate models. Technical Report, Universidade Federal de Sao Carlos, Sao Carlos-SP. Brazil, 2009 (accepted in Lifetime Data Analysis)] are presented.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Abstract Background Transcript enumeration methods such as SAGE, MPSS, and sequencing-by-synthesis EST "digital northern", are important high-throughput techniques for digital gene expression measurement. As other counting or voting processes, these measurements constitute compositional data exhibiting properties particular to the simplex space where the summation of the components is constrained. These properties are not present on regular Euclidean spaces, on which hybridization-based microarray data is often modeled. Therefore, pattern recognition methods commonly used for microarray data analysis may be non-informative for the data generated by transcript enumeration techniques since they ignore certain fundamental properties of this space. Results Here we present a software tool, Simcluster, designed to perform clustering analysis for data on the simplex space. We present Simcluster as a stand-alone command-line C package and as a user-friendly on-line tool. Both versions are available at: http://xerad.systemsbiology.net/simcluster. Conclusion Simcluster is designed in accordance with a well-established mathematical framework for compositional data analysis, which provides principled procedures for dealing with the simplex space, and is thus applicable in a number of contexts, including enumeration-based gene expression data.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Abstract Background Prostate cancer is a leading cause of death in the male population, therefore, a comprehensive study about the genes and the molecular networks involved in the tumoral prostate process becomes necessary. In order to understand the biological process behind potential biomarkers, we have analyzed a set of 57 cDNA microarrays containing ~25,000 genes. Results Principal Component Analysis (PCA) combined with the Maximum-entropy Linear Discriminant Analysis (MLDA) were applied in order to identify genes with the most discriminative information between normal and tumoral prostatic tissues. Data analysis was carried out using three different approaches, namely: (i) differences in gene expression levels between normal and tumoral conditions from an univariate point of view; (ii) in a multivariate fashion using MLDA; and (iii) with a dependence network approach. Our results show that malignant transformation in the prostatic tissue is more related to functional connectivity changes in their dependence networks than to differential gene expression. The MYLK, KLK2, KLK3, HAN11, LTF, CSRP1 and TGM4 genes presented significant changes in their functional connectivity between normal and tumoral conditions and were also classified as the top seven most informative genes for the prostate cancer genesis process by our discriminant analysis. Moreover, among the identified genes we found classically known biomarkers and genes which are closely related to tumoral prostate, such as KLK3 and KLK2 and several other potential ones. Conclusion We have demonstrated that changes in functional connectivity may be implicit in the biological process which renders some genes more informative to discriminate between normal and tumoral conditions. Using the proposed method, namely, MLDA, in order to analyze the multivariate characteristic of genes, it was possible to capture the changes in dependence networks which are related to cell transformation.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Background: Aortic aneurysm and dissection are important causes of death in older people. Ruptured aneurysms show catastrophic fatality rates reaching near 80%. Few population-based mortality studies have been published in the world and none in Brazil. The objective of the present study was to use multiple-cause-of-death methodology in the analysis of mortality trends related to aortic aneurysm and dissection in the state of Sao Paulo, between 1985 and 2009. Methods: We analyzed mortality data from the Sao Paulo State Data Analysis System, selecting all death certificates on which aortic aneurysm and dissection were listed as a cause-of-death. The variables sex, age, season of the year, and underlying, associated or total mentions of causes of death were studied using standardized mortality rates, proportions and historical trends. Statistical analyses were performed by chi-square goodness-of-fit and H Kruskal-Wallis tests, and variance analysis. The joinpoint regression model was used to evaluate changes in age-standardized rates trends. A p value less than 0.05 was regarded as significant. Results: Over a 25-year period, there were 42,615 deaths related to aortic aneurysm and dissection, of which 36,088 (84.7%) were identified as underlying cause and 6,527 (15.3%) as an associated cause-of-death. Dissection and ruptured aneurysms were considered as an underlying cause of death in 93% of the deaths. For the entire period, a significant increased trend of age-standardized death rates was observed in men and women, while certain non-significant decreases occurred from 1996/2004 until 2009. Abdominal aortic aneurysms and aortic dissections prevailed among men and aortic dissections and aortic aneurysms of unspecified site among women. In 1985 and 2009 death rates ratios of men to women were respectively 2.86 and 2.19, corresponding to a difference decrease between rates of 23.4%. For aortic dissection, ruptured and non-ruptured aneurysms, the overall mean ages at death were, respectively, 63.2, 68.4 and 71.6 years; while, as the underlying cause, the main associated causes of death were as follows: hemorrhages (in 43.8%/40.5%/13.9%); hypertensive diseases (in 49.2%/22.43%/24.5%) and atherosclerosis (in 14.8%/25.5%/15.3%); and, as associated causes, their principal overall underlying causes of death were diseases of the circulatory (55.7%), and respiratory (13.8%) systems and neoplasms (7.8%). A significant seasonal variation, with highest frequency in winter, occurred in deaths identified as underlying cause for aortic dissection, ruptured and non-ruptured aneurysms. Conclusions: This study introduces the methodology of multiple-causes-of-death to enhance epidemiologic knowledge of aortic aneurysm and dissection in São Paulo, Brazil. The results presented confer light to the importance of mortality statistics and the need for epidemiologic studies to understand unique trends in our own population.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Background: A common approach for time series gene expression data analysis includes the clustering of genes with similar expression patterns throughout time. Clustered gene expression profiles point to the joint contribution of groups of genes to a particular cellular process. However, since genes belong to intricate networks, other features, besides comparable expression patterns, should provide additional information for the identification of functionally similar genes. Results: In this study we perform gene clustering through the identification of Granger causality between and within sets of time series gene expression data. Granger causality is based on the idea that the cause of an event cannot come after its consequence. Conclusions: This kind of analysis can be used as a complementary approach for functional clustering, wherein genes would be clustered not solely based on their expression similarity but on their topological proximity built according to the intensity of Granger causality among them.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

In this thesis some multivariate spectroscopic methods for the analysis of solutions are proposed. Spectroscopy and multivariate data analysis form a powerful combination for obtaining both quantitative and qualitative information and it is shown how spectroscopic techniques in combination with chemometric data evaluation can be used to obtain rapid, simple and efficient analytical methods. These spectroscopic methods consisting of spectroscopic analysis, a high level of automation and chemometric data evaluation can lead to analytical methods with a high analytical capacity, and for these methods, the term high-capacity analysis (HCA) is suggested. It is further shown how chemometric evaluation of the multivariate data in chromatographic analyses decreases the need for baseline separation. The thesis is based on six papers and the chemometric tools used are experimental design, principal component analysis (PCA), soft independent modelling of class analogy (SIMCA), partial least squares regression (PLS) and parallel factor analysis (PARAFAC). The analytical techniques utilised are scanning ultraviolet-visible (UV-Vis) spectroscopy, diode array detection (DAD) used in non-column chromatographic diode array UV spectroscopy, high-performance liquid chromatography with diode array detection (HPLC-DAD) and fluorescence spectroscopy. The methods proposed are exemplified in the analysis of pharmaceutical solutions and serum proteins. In Paper I a method is proposed for the determination of the content and identity of the active compound in pharmaceutical solutions by means of UV-Vis spectroscopy, orthogonal signal correction and multivariate calibration with PLS and SIMCA classification. Paper II proposes a new method for the rapid determination of pharmaceutical solutions by the use of non-column chromatographic diode array UV spectroscopy, i.e. a conventional HPLC-DAD system without any chromatographic column connected. In Paper III an investigation is made of the ability of a control sample, of known content and identity to diagnose and correct errors in multivariate predictions something that together with use of multivariate residuals can make it possible to use the same calibration model over time. In Paper IV a method is proposed for simultaneous determination of serum proteins with fluorescence spectroscopy and multivariate calibration. Paper V proposes a method for the determination of chromatographic peak purity by means of PCA of HPLC-DAD data. In Paper VI PARAFAC is applied for the decomposition of DAD data of some partially separated peaks into the pure chromatographic, spectral and concentration profiles.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

In the past decade, the advent of efficient genome sequencing tools and high-throughput experimental biotechnology has lead to enormous progress in the life science. Among the most important innovations is the microarray tecnology. It allows to quantify the expression for thousands of genes simultaneously by measurin the hybridization from a tissue of interest to probes on a small glass or plastic slide. The characteristics of these data include a fair amount of random noise, a predictor dimension in the thousand, and a sample noise in the dozens. One of the most exciting areas to which microarray technology has been applied is the challenge of deciphering complex disease such as cancer. In these studies, samples are taken from two or more groups of individuals with heterogeneous phenotypes, pathologies, or clinical outcomes. these samples are hybridized to microarrays in an effort to find a small number of genes which are strongly correlated with the group of individuals. Eventhough today methods to analyse the data are welle developed and close to reach a standard organization (through the effort of preposed International project like Microarray Gene Expression Data -MGED- Society [1]) it is not unfrequant to stumble in a clinician's question that do not have a compelling statistical method that could permit to answer it.The contribution of this dissertation in deciphering disease regards the development of new approaches aiming at handle open problems posed by clinicians in handle specific experimental designs. In Chapter 1 starting from a biological necessary introduction, we revise the microarray tecnologies and all the important steps that involve an experiment from the production of the array, to the quality controls ending with preprocessing steps that will be used into the data analysis in the rest of the dissertation. While in Chapter 2 a critical review of standard analysis methods are provided stressing most of problems that In Chapter 3 is introduced a method to adress the issue of unbalanced design of miacroarray experiments. In microarray experiments, experimental design is a crucial starting-point for obtaining reasonable results. In a two-class problem, an equal or similar number of samples it should be collected between the two classes. However in some cases, e.g. rare pathologies, the approach to be taken is less evident. We propose to address this issue by applying a modified version of SAM [2]. MultiSAM consists in a reiterated application of a SAM analysis, comparing the less populated class (LPC) with 1,000 random samplings of the same size from the more populated class (MPC) A list of the differentially expressed genes is generated for each SAM application. After 1,000 reiterations, each single probe given a "score" ranging from 0 to 1,000 based on its recurrence in the 1,000 lists as differentially expressed. The performance of MultiSAM was compared to the performance of SAM and LIMMA [3] over two simulated data sets via beta and exponential distribution. The results of all three algorithms over low- noise data sets seems acceptable However, on a real unbalanced two-channel data set reagardin Chronic Lymphocitic Leukemia, LIMMA finds no significant probe, SAM finds 23 significantly changed probes but cannot separate the two classes, while MultiSAM finds 122 probes with score >300 and separates the data into two clusters by hierarchical clustering. We also report extra-assay validation in terms of differentially expressed genes Although standard algorithms perform well over low-noise simulated data sets, multi-SAM seems to be the only one able to reveal subtle differences in gene expression profiles on real unbalanced data. In Chapter 4 a method to adress similarities evaluation in a three-class prblem by means of Relevance Vector Machine [4] is described. In fact, looking at microarray data in a prognostic and diagnostic clinical framework, not only differences could have a crucial role. In some cases similarities can give useful and, sometimes even more, important information. The goal, given three classes, could be to establish, with a certain level of confidence, if the third one is similar to the first or the second one. In this work we show that Relevance Vector Machine (RVM) [2] could be a possible solutions to the limitation of standard supervised classification. In fact, RVM offers many advantages compared, for example, with his well-known precursor (Support Vector Machine - SVM [3]). Among these advantages, the estimate of posterior probability of class membership represents a key feature to address the similarity issue. This is a highly important, but often overlooked, option of any practical pattern recognition system. We focused on Tumor-Grade-three-class problem, so we have 67 samples of grade I (G1), 54 samples of grade 3 (G3) and 100 samples of grade 2 (G2). The goal is to find a model able to separate G1 from G3, then evaluate the third class G2 as test-set to obtain the probability for samples of G2 to be member of class G1 or class G3. The analysis showed that breast cancer samples of grade II have a molecular profile more similar to breast cancer samples of grade I. Looking at the literature this result have been guessed, but no measure of significance was gived before.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Although in Europe and in the USA many studies focus on organic, little is known on the topic in China. This research provides an insight on Shanghai consumers’ perception of organic, aiming at understanding and representing in graphic form the network of mental associations that stems from the organic concept. To acquire, process and aggregate the individual networks it was used the “Brand concept mapping” methodology (Roedder et al., 2006), while the data analysis was carried out also using analytic procedures. The results achieved suggest that organic food is perceived as healthy, safe and costly. Although these attributes are pretty much consistent with the European perception, some relevant differences emerged. First, organic is not necessarily synonymous with natural product in China, also due to a poor translation of the term in the Chinese language that conveys the idea of a manufactured product. Secondly, the organic label has to deal with the competition with the green food label in terms of image and positioning on the market, since they are easily associated and often confused. “Environmental protection” also emerged as relevant association, while the ethical and social values were not mentioned. In conclusion, health care and security concerns are the factors that influence most the food consumption in China (many people are so concerned about food safety that they found it difficult to shop), and the associations “Safe”, “Pure and natural”, “without chemicals” and “healthy” have been identified as the best candidates for leveraging a sound image of organic food .

Relevância:

90.00% 90.00%

Publicador:

Resumo:

The present PhD thesis was focused on the development and application of chemical methodology (Py-GC-MS) and data-processing method by multivariate data analysis (chemometrics). The chromatographic and mass spectrometric data obtained with this technique are particularly suitable to be interpreted by chemometric methods such as PCA (Principal Component Analysis) as regards data exploration and SIMCA (Soft Independent Models of Class Analogy) for the classification. As a first approach, some issues related to the field of cultural heritage were discussed with a particular attention to the differentiation of binders used in pictorial field. A marker of egg tempera the phosphoric acid esterified, a pyrolysis product of lecithin, was determined using HMDS (hexamethyldisilazane) rather than the TMAH (tetramethylammonium hydroxide) as a derivatizing reagent. The validity of analytical pyrolysis as tool to characterize and classify different types of bacteria was verified. The FAMEs chromatographic profiles represent an important tool for the bacterial identification. Because of the complexity of the chromatograms, it was possible to characterize the bacteria only according to their genus, while the differentiation at the species level has been achieved by means of chemometric analysis. To perform this study, normalized areas peaks relevant to fatty acids were taken into account. Chemometric methods were applied to experimental datasets. The obtained results demonstrate the effectiveness of analytical pyrolysis and chemometric analysis for the rapid characterization of bacterial species. Application to a samples of bacterial (Pseudomonas Mendocina), fungal (Pleorotus ostreatus) and mixed- biofilms was also performed. A comparison with the chromatographic profiles established the possibility to: • Differentiate the bacterial and fungal biofilms according to the (FAMEs) profile. • Characterize the fungal biofilm by means the typical pattern of pyrolytic fragments derived from saccharides present in the cell wall. • Individuate the markers of bacterial and fungal biofilm in the same mixed-biofilm sample.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Il problema dell'antibiotico-resistenza è un problema di sanità pubblica per affrontare il quale è necessario un sistema di sorveglianza basato sulla raccolta e l'analisi dei dati epidemiologici di laboratorio. Il progetto di dottorato è consistito nello sviluppo di una applicazione web per la gestione di tali dati di antibiotico sensibilità di isolati clinici utilizzabile a livello di ospedale. Si è creata una piattaforma web associata a un database relazionale per avere un’applicazione dinamica che potesse essere aggiornata facilmente inserendo nuovi dati senza dover manualmente modificare le pagine HTML che compongono l’applicazione stessa. E’ stato utilizzato il database open-source MySQL in quanto presenta numerosi vantaggi: estremamente stabile, elevate prestazioni, supportato da una grande comunità online ed inoltre gratuito. Il contenuto dinamico dell’applicazione web deve essere generato da un linguaggio di programmazione tipo “scripting” che automatizzi operazioni di inserimento, modifica, cancellazione, visualizzazione di larghe quantità di dati. E’ stato scelto il PHP, linguaggio open-source sviluppato appositamente per la realizzazione di pagine web dinamiche, perfettamente utilizzabile con il database MySQL. E’ stata definita l’architettura del database creando le tabelle contenenti i dati e le relazioni tra di esse: le anagrafiche, i dati relativi ai campioni, microrganismi isolati e agli antibiogrammi con le categorie interpretative relative al dato antibiotico. Definite tabelle e relazioni del database è stato scritto il codice associato alle funzioni principali: inserimento manuale di antibiogrammi, importazione di antibiogrammi multipli provenienti da file esportati da strumenti automatizzati, modifica/eliminazione degli antibiogrammi precedenti inseriti nel sistema, analisi dei dati presenti nel database con tendenze e andamenti relativi alla prevalenza di specie microbiche e alla chemioresistenza degli stessi, corredate da grafici. Lo sviluppo ha incluso continui test delle funzioni via via implementate usando reali dati clinici e sono stati introdotti appositi controlli e l’introduzione di una semplice e pulita veste grafica.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

The candidate tackled an important issue in contemporary management: the role of CSR and Sustainability. The research proposal focused on a longitudinal and inductive research, directed to specify the evolution of CSR and contribute to the new institutional theory, in particular institutional work framework, and to the relation between institutions and discourse analysis. The documental analysis covers all the evolution of CSR, focusing also on a number of important networks and associations. Some of the methodologies employed in the thesis have been employed as a consequence of data analysis, in a truly inductive research process. The thesis is composed by two section. The first section mainly describes the research process and the analyses results. The candidates employed several research methods: a longitudinal content analysis of documents, a vocabulary research with statistical metrics as cluster analysis and factor analysis, a rhetorical analysis of justifications. The second section puts in relation the analysis results with theoretical frameworks and contributions. The candidate confronted with several frameworks: Actor-Network-Theory, Institutional work and Boundary Work, Institutional Logic. Chapters are focused on different issues: a historical reconstruction of CSR; a reflection about symbolic adoption of recurrent labels; two case studies of Italian networks, in order to confront institutional and boundary works; a theoretical model of institutional change based on contradiction and institutional complexity; the application of the model to CSR and Sustainability, proposing Sustainability as a possible institutional logic.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

This thesis is developed in the contest of Ritmare project WP1, which main objective is the development of a sustainable fishery through the identification of populations boundaries in commercially important species in Italian Seas. Three main objectives are discussed in order to help reach the main purpose of identification of stock boundaries in Parapenaeus longirostris: 1 -Development of a representative sampling design for Italian seas; 2 -Evaluation of 2b-RAD protocol; 3 -Investigation of populations through biological data analysis. First of all we defined and accomplished a sampling design which properly represents all Italian seas. Then we used information and data about nursery areas distribution, abundance of populations and importance of P. longirostris in local fishery, to develop an experimental design that prioritize the most important areas to maximize the results with actual project funds. We introduced for the first time the use of 2b-RAD on this species, a genotyping method based on sequencing the uniform fragments produced by type IIB restriction endonucleases. Thanks to this method we were able to move from genetics to the more complex genomics. In order to proceed with 2b-RAD we performed several tests to identify the best DNA extraction kit and protocol and finally we were able to extract 192 high quality DNA extracts ready to be processed. We tested 2b-RAD with five samples and after high-throughput sequencing of libraries we used the software “Stacks” to analyze the sequences. We obtained positive results identifying a great number of SNP markers among the five samples. To guarantee a multidisciplinary approach we used the biological data associated to the collected samples to investigate differences between geographical samples. Such approach assures continuity with other project, for instance STOCKMED, which utilize a combination of molecular and biological analysis as well.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

The purpose of my PhD thesis has been to face the issue of retrieving a three dimensional attenuation model in volcanic areas. To this purpose, I first elaborated a robust strategy for the analysis of seismic data. This was done by performing several synthetic tests to assess the applicability of spectral ratio method to our purposes. The results of the tests allowed us to conclude that: 1) spectral ratio method gives reliable differential attenuation (dt*) measurements in smooth velocity models; 2) short signal time window has to be chosen to perform spectral analysis; 3) the frequency range over which to compute spectral ratios greatly affects dt* measurements. Furthermore, a refined approach for the application of spectral ratio method has been developed and tested. Through this procedure, the effects caused by heterogeneities of propagation medium on the seismic signals may be removed. The tested data analysis technique was applied to the real active seismic SERAPIS database. It provided a dataset of dt* measurements which was used to obtain a three dimensional attenuation model of the shallowest part of Campi Flegrei caldera. Then, a linearized, iterative, damped attenuation tomography technique has been tested and applied to the selected dataset. The tomography, with a resolution of 0.5 km in the horizontal directions and 0.25 km in the vertical direction, allowed to image important features in the off-shore part of Campi Flegrei caldera. High QP bodies are immersed in a high attenuation body (Qp=30). The latter is well correlated with low Vp and high Vp/Vs values and it is interpreted as a saturated marine and volcanic sediments layer. High Qp anomalies, instead, are interpreted as the effects either of cooled lava bodies or of a CO2 reservoir. A pseudo-circular high Qp anomaly was detected and interpreted as the buried rim of NYT caldera.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

The main purposes of this essay were to investigate in detail the burning rate anomaly phenomenon, also known as "Hump Effect", in solid rocket motors casted in mandrel and the mechanisms at the base of it, as well as the developing of a numeric code, in Matlab environment, in order to obtain a forecasting tool to generate concentration and orientation maps of the particles within the grain. The importance of these analysis is due to the fact that the forecasts of ballistics curves in new motors have to be improved in order to reduce the amount of experimental tests needed for the characterization of their ballistic behavior. This graduate work is divided into two parts. The first one is about bidimensional and tridimensional simulations on z9 motor casting process. The simulations have been carried out respectively with Fluent and Flow 3D. The second one is about the analysis of fluid dynamic data and the developing of numeric codes which give information about the concentration and orientation of particles within the grain based on fluid strain rate information which are extrapolated from CFD software.