941 results for Multivariate data analysis


Relevance:

90.00%

Publisher:

Abstract:

The beta-Birnbaum-Saunders (Cordeiro and Lemonte, 2011) and Birnbaum-Saunders (Birnbaum and Saunders, 1969a) distributions have been used quite effectively to model failure times for materials subject to fatigue and lifetime data. We define the log-beta-Birnbaum-Saunders distribution as the distribution of the logarithm of a beta-Birnbaum-Saunders random variable. Explicit expressions for its generating function and moments are derived. We propose a new log-beta-Birnbaum-Saunders regression model that can be applied to censored data and be used more effectively in survival analysis. We obtain the maximum likelihood estimates of the model parameters for censored data and investigate influence diagnostics. The new location-scale regression model is modified for the possibility that long-term survivors may be present in the data. Its usefulness is illustrated by means of two real data sets. (C) 2011 Elsevier B.V. All rights reserved.

Relevance:

90.00%

Publisher:

Abstract:

Objective: To build a life table and determine the factors related to the time of treatment of undernourished children at a nutrition rehabilitation centre (CREN), Sao Paulo, Brazil. Design: Nutritional status was assessed from weight-for-age, height-for-age and BMI-for-age Z-scores, while neuropsychomotor development was classified according to the milestones of childhood development. Life tables, Kaplan-Meier survival curves and Cox multiple regression models were employed in data analysis. Setting: CREN (Centre of Nutritional Recovery and Education), Sao Paulo, Brazil. Subjects: Undernourished children (n = 228) from the southern slums of Sao Paulo who had received treatment at CREN under a day-hospital regime between the years 1994 and 2009. Results: The Kaplan-Meier curves of survival analysis showed statistically significant differences in the periods of treatment at CREN between children presenting different degrees of neuropsychomotor development (log-rank = 6.621; P = 0.037). Estimates based on the multivariate Cox model revealed that children aged >= 24 months at the time of admission exhibited a lower probability of nutritional rehabilitation (hazard ratio (HR) = 0.49; P = 0.046) at the end of the period compared with infants aged up to 12 months. Children presenting slow development were better rehabilitated in comparison with those exhibiting adequate evolution (HR = 4.48; P = 0.023). No significant effects of sex, degree of undernutrition or birth weight on the probability of nutritional rehabilitation were found. Conclusions: Age and neuropsychomotor developmental status at the time of admission to CREN are critical factors in determining the duration of treatment.
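The Kaplan-Meier estimate named in the abstract above can be illustrated with a minimal Python sketch. The function name `kaplan_meier` and the toy data are hypothetical and are not taken from the CREN study. The "event" is assumed here to be nutritional rehabilitation, so S(t) reads as the estimated proportion of children still in treatment at time t.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    times  : follow-up durations
    events : 1 if the event was observed, 0 if the observation was censored
    """
    event_counts = Counter(t for t, e in zip(times, events) if e)
    exit_counts = Counter(times)  # everyone who leaves the risk set at time t
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(exit_counts):
        d = event_counts.get(t, 0)
        if d:
            # Product-limit step: multiply by the fraction without the event.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= exit_counts[t]
    return curve


# Hypothetical toy data: months in treatment and whether recovery was observed.
months = [3, 5, 5, 8, 12, 12, 15]
recovered = [1, 1, 0, 1, 1, 0, 0]
for t, s in kaplan_meier(months, recovered):
    print(f"t={t:>2}  S(t)={s:.3f}")
```

Censored children (those whose recovery was not observed) still count in the risk set up to their last follow-up time, which is what separates this estimate from a simple proportion.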

Relevance:

90.00%

Publisher:

Abstract:

In this paper, we propose a random intercept Poisson model in which the random effect is assumed to follow a generalized log-gamma (GLG) distribution. This random effect accommodates (or captures) the overdispersion in the counts and induces within-cluster correlation. We derive the first two moments for the marginal distribution as well as the intraclass correlation. Even though numerical integration methods are, in general, required for deriving the marginal models, we obtain the multivariate negative binomial model from a particular parameter setting of the hierarchical model. An iterative process is derived for obtaining the maximum likelihood estimates for the parameters in the multivariate negative binomial model. Residual analysis is proposed and two applications with real data are given for illustration. (C) 2011 Elsevier B.V. All rights reserved.
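As a rough illustration of how a shared random effect produces both overdispersion and within-cluster correlation, the sketch below simulates the classical Poisson-gamma mixture. This is an assumption made for illustration: the abstract does not state which GLG parameter setting yields the negative binomial case, and the names `simulate_clusters`, `mu` and `phi` are hypothetical. With a gamma effect of mean 1 and variance 1/phi, the marginal mean is mu, the marginal variance is mu + mu^2/phi, and two counts in the same cluster have correlation (mu^2/phi) / (mu + mu^2/phi).

```python
import math
import random
from statistics import mean, variance


def poisson_draw(rng, lam):
    """Knuth's multiplication method; adequate for the small rates used here."""
    threshold = math.exp(-lam)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


def simulate_clusters(mu, phi, n_clusters, seed=0):
    """Clusters of two counts sharing one multiplicative gamma random effect."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n_clusters):
        b = rng.gammavariate(phi, 1.0 / phi)  # E[b] = 1, Var[b] = 1/phi
        pairs.append((poisson_draw(rng, mu * b), poisson_draw(rng, mu * b)))
    return pairs


def pearson(xs, ys):
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)
    return cov / math.sqrt(variance(xs) * variance(ys))


mu, phi = 4.0, 2.0
pairs = simulate_clusters(mu, phi, n_clusters=20000)
first = [a for a, _ in pairs]
second = [b for _, b in pairs]
print("mean       ", round(mean(first), 2), "theory", mu)
print("variance   ", round(variance(first), 2), "theory", mu + mu**2 / phi)
print("correlation", round(pearson(first, second), 2),
      "theory", round((mu**2 / phi) / (mu + mu**2 / phi), 2))
```

A plain Poisson model would force the variance to equal the mean, so the gap between the two simulated moments is the overdispersion that the random effect is meant to capture.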

Relevance:

90.00%

Publisher:

Abstract:

A common interest in gene expression data analysis is to identify, from a large pool of candidate genes, the genes that present significant changes in expression levels between a treatment and a control biological condition. This is usually done with a test statistic and a cutoff value that separates differentially from nondifferentially expressed genes. In this paper, we propose a Bayesian approach to identify differentially expressed genes by sequentially calculating credibility intervals from predictive densities, which are constructed using the sampled mean treatment effect from all genes under study, excluding the treatment effect of genes previously identified with statistical evidence for difference. We compare our Bayesian approach with the standard ones based on the use of the t-test and modified t-tests via a simulation study, using small sample sizes, which are common in gene expression data analysis. The results provide evidence that the proposed approach performs better than the standard ones, especially for cases with mean differences and increases in treatment variance in relation to control variance. We also apply the methodologies to a well-known publicly available data set on the Escherichia coli bacterium.

Relevance:

90.00%

Publisher:

Abstract:

A new method for analysis of scattering data from lamellar bilayer systems is presented. The method employs a form-free description of the cross-section structure of the bilayer and the fit is performed directly to the scattering data, introducing also a structure factor when required. The cross-section structure (electron density profile in the case of X-ray scattering) is described by a set of Gaussian functions and the technique is termed Gaussian deconvolution. The coefficients of the Gaussians are optimized using a constrained least-squares routine that induces smoothness of the electron density profile. The optimization is coupled with the point-of-inflection method for determining the optimal weight of the smoothness. With the new approach, it is possible to optimize simultaneously the form factor, structure factor and several other parameters in the model. The applicability of this method is demonstrated by using it in a study of a multilamellar system composed of lecithin bilayers, where the form factor and structure factor are obtained simultaneously, and the obtained results provided new insight into this very well known system.

Relevance:

90.00%

Publisher:

Abstract:

In this article, we propose a new Bayesian flexible cure rate survival model, which generalises the stochastic model of Klebanov et al. [Klebanov LB, Rachev ST and Yakovlev AY. A stochastic-model of radiation carcinogenesis - latent time distributions and their properties. Math Biosci 1993; 113: 51-75], and has much in common with the destructive model formulated by Rodrigues et al. [Rodrigues J, de Castro M, Balakrishnan N and Cancho VG. Destructive weighted Poisson cure rate models. Technical Report, Universidade Federal de Sao Carlos, Sao Carlos-SP. Brazil, 2009 (accepted in Lifetime Data Analysis)]. In our approach, the accumulated number of lesions or altered cells follows a compound weighted Poisson distribution. This model is more flexible than the promotion time cure model in terms of dispersion. Moreover, it possesses an interesting and realistic interpretation of the biological mechanism of the occurrence of the event of interest as it includes a destructive process of tumour cells after an initial treatment or the capacity of an individual exposed to irradiation to repair altered cells that results in cancer induction. In other words, what is recorded is only the damaged portion of the original number of altered cells not eliminated by the treatment or repaired by the repair system of an individual. Markov Chain Monte Carlo (MCMC) methods are then used to develop Bayesian inference for the proposed model. Also, some discussions on the model selection and an illustration with a cutaneous melanoma data set analysed by Rodrigues et al. [Rodrigues J, de Castro M, Balakrishnan N and Cancho VG. Destructive weighted Poisson cure rate models. Technical Report, Universidade Federal de Sao Carlos, Sao Carlos-SP. Brazil, 2009 (accepted in Lifetime Data Analysis)] are presented.

Relevance:

90.00%

Publisher:

Abstract:

Background: Transcript enumeration methods such as SAGE, MPSS, and sequencing-by-synthesis EST "digital northern" are important high-throughput techniques for digital gene expression measurement. Like other counting or voting processes, these measurements constitute compositional data exhibiting properties particular to the simplex space, where the summation of the components is constrained. These properties are not present in regular Euclidean spaces, on which hybridization-based microarray data is often modeled. Therefore, pattern recognition methods commonly used for microarray data analysis may be non-informative for the data generated by transcript enumeration techniques, since they ignore certain fundamental properties of this space. Results: Here we present a software tool, Simcluster, designed to perform clustering analysis for data on the simplex space. We present Simcluster as a stand-alone command-line C package and as a user-friendly on-line tool. Both versions are available at: http://xerad.systemsbiology.net/simcluster. Conclusion: Simcluster is designed in accordance with a well-established mathematical framework for compositional data analysis, which provides principled procedures for dealing with the simplex space, and is thus applicable in a number of contexts, including enumeration-based gene expression data.
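To make the simplex constraint concrete, the sketch below computes the Aitchison distance through the centred log-ratio (clr) transform, one standard tool of compositional data analysis. It is not Simcluster's code, and whether Simcluster uses exactly this distance is an assumption; the function names and tag counts are hypothetical. Zero counts would need a replacement strategy (for example a pseudo-count) before taking logarithms, which is left out here.

```python
import math


def closure(counts):
    """Rescale raw tag counts so that the components sum to 1."""
    total = sum(counts)
    return [c / total for c in counts]


def clr(composition):
    """Centred log-ratio transform: log of each part minus the mean log."""
    logs = [math.log(p) for p in composition]
    center = sum(logs) / len(logs)
    return [v - center for v in logs]


def aitchison_distance(x, y):
    """Euclidean distance between the clr-transformed compositions."""
    return math.dist(clr(closure(x)), clr(closure(y)))


# Hypothetical tag counts for three transcripts in three libraries.
lib_a = [10, 20, 70]
lib_b = [100, 200, 700]  # same proportions as lib_a, ten times the depth
lib_c = [70, 20, 10]

print(aitchison_distance(lib_a, lib_b))  # sequencing depth does not matter
print(aitchison_distance(lib_a, lib_c))
print(math.dist(lib_a, lib_b))           # raw Euclidean distance is depth-driven
```

Libraries `lib_a` and `lib_b` describe the same expression profile at different depths, so a distance defined on the simplex treats them as identical, whereas the raw Euclidean distance between the counts is large.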

Relevance:

90.00%

Publisher:

Abstract:

Background: Dizziness is a common complaint among older adults and has been linked to a wide range of health conditions and psychological and social characteristics in this population. However, a profile of dizziness is still uncertain, which hampers clinical decision-making. We therefore sought to explore the relationship between dizziness and a comprehensive range of demographic data, diseases, health and geriatric conditions, and geriatric syndromes in a representative sample of community-dwelling older people. Methods: This is a cross-sectional, population-based study derived from FIBRA (Network for the Study of Frailty in Brazilian Elderly Adults), with 391 elderly adults, both men and women, aged 65 years and older. Elderly participants living at home in an urban area were enrolled through a process of random cluster sampling of census regions. The outcome variable was the self-report of dizziness in the last year. Several feelings of dizziness were investigated, including vertigo, spinning, light or heavy headedness, floating, fuzziness, giddiness and instability. A multivariate logistic regression analysis was conducted to estimate the adjusted odds ratios and build the probability model for dizziness. Results: The complaint of dizziness was reported by 45% of elderly adults, of whom 71.6% were women (p = 0.004). The multivariate regression analysis revealed that dizziness is associated with depressive symptoms (OR = 2.08; 95% CI 1.29–3.35), perceived fatigue (OR = 1.93; 95% CI 1.21–3.10), recurring falls (OR = 2.01; 95% CI 1.11–3.62) and excessive drowsiness (OR = 1.91; 95% CI 1.11–3.29). The discrimination of the final model was AUC = 0.673 (95% CI 0.619–0.727) (p < 0.001). Conclusions: The prevalence of dizziness in community-dwelling elderly adults is substantial. It is associated with other common geriatric conditions usually neglected in elderly adults, such as fatigue and drowsiness, supporting its possible multifactorial manifestation.
Our findings demonstrate the need to expand the design in future studies, aiming to estimate risk and identify possible causal relations.

Relevance:

90.00%

Publisher:

Abstract:

Background: Aortic aneurysm and dissection are important causes of death in older people. Ruptured aneurysms show catastrophic fatality rates approaching 80%. Few population-based mortality studies have been published in the world and none in Brazil. The objective of the present study was to use multiple-cause-of-death methodology in the analysis of mortality trends related to aortic aneurysm and dissection in the state of Sao Paulo between 1985 and 2009. Methods: We analyzed mortality data from the Sao Paulo State Data Analysis System, selecting all death certificates on which aortic aneurysm and dissection were listed as a cause of death. The variables sex, age, season of the year, and underlying, associated or total mentions of causes of death were studied using standardized mortality rates, proportions and historical trends. Statistical analyses were performed using chi-square goodness-of-fit and Kruskal-Wallis H tests, and analysis of variance. The joinpoint regression model was used to evaluate changes in age-standardized rate trends. A p value less than 0.05 was regarded as significant. Results: Over a 25-year period, there were 42,615 deaths related to aortic aneurysm and dissection, of which 36,088 (84.7%) were identified as the underlying cause and 6,527 (15.3%) as an associated cause of death. Dissection and ruptured aneurysms were considered the underlying cause of death in 93% of the deaths. For the entire period, a significant increasing trend in age-standardized death rates was observed in men and women, while certain non-significant decreases occurred from 1996/2004 until 2009. Abdominal aortic aneurysms and aortic dissections prevailed among men, and aortic dissections and aortic aneurysms of unspecified site among women. In 1985 and 2009, the death rate ratios of men to women were 2.86 and 2.19, respectively, corresponding to a 23.4% decrease in the ratio.
For aortic dissection, ruptured and non-ruptured aneurysms, the overall mean ages at death were, respectively, 63.2, 68.4 and 71.6 years. When these conditions were the underlying cause, the main associated causes of death were hemorrhages (in 43.8%/40.5%/13.9%), hypertensive diseases (in 49.2%/22.43%/24.5%) and atherosclerosis (in 14.8%/25.5%/15.3%); when they were associated causes, the principal overall underlying causes of death were diseases of the circulatory (55.7%) and respiratory (13.8%) systems and neoplasms (7.8%). A significant seasonal variation, with the highest frequency in winter, occurred in deaths identified as underlying cause for aortic dissection, ruptured and non-ruptured aneurysms. Conclusions: This study introduces the methodology of multiple causes of death to enhance epidemiologic knowledge of aortic aneurysm and dissection in São Paulo, Brazil. The results presented shed light on the importance of mortality statistics and the need for epidemiologic studies to understand unique trends in our own population.

Relevance:

90.00%

Publisher:

Abstract:

Background: A common approach for time series gene expression data analysis includes the clustering of genes with similar expression patterns throughout time. Clustered gene expression profiles point to the joint contribution of groups of genes to a particular cellular process. However, since genes belong to intricate networks, other features, besides comparable expression patterns, should provide additional information for the identification of functionally similar genes. Results: In this study we perform gene clustering through the identification of Granger causality between and within sets of time series gene expression data. Granger causality is based on the idea that the cause of an event cannot come after its consequence. Conclusions: This kind of analysis can be used as a complementary approach for functional clustering, wherein genes would be clustered not solely based on their expression similarity but on their topological proximity built according to the intensity of Granger causality among them.
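The idea of Granger causality can be sketched for a single pair of series: x is said to Granger-cause y if past values of x improve the prediction of y beyond what past values of y already provide. The code below is a minimal pairwise test with an F statistic. It is not the method of the paper, which works between and within sets of series; the function names and the simulated data are hypothetical.

```python
import random


def solve(a, b):
    """Solve the small linear system a @ x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def rss(design, target):
    """Residual sum of squares of the least-squares fit target ~ design."""
    k = len(design[0])
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(k)]
           for i in range(k)]
    xty = [sum(row[i] * t for row, t in zip(design, target)) for i in range(k)]
    beta = solve(xtx, xty)
    return sum((t - sum(b * v for b, v in zip(beta, row))) ** 2
               for row, t in zip(design, target))


def granger_f(x, y, lag=1):
    """F statistic for 'x Granger-causes y' using `lag` past values of each."""
    steps = range(lag, len(y))
    target = y[lag:]
    # Restricted model: intercept plus the past of y only.
    own = [[1.0] + [y[t - k] for k in range(1, lag + 1)] for t in steps]
    # Full model: the same regressors plus the past of x.
    full = [row + [x[t - k] for k in range(1, lag + 1)]
            for row, t in zip(own, steps)]
    rss_restricted, rss_full = rss(own, target), rss(full, target)
    dof = len(target) - (2 * lag + 1)
    return ((rss_restricted - rss_full) / lag) / (rss_full / dof)


# Simulated example: y follows x with a delay of one time step.
rng = random.Random(0)
x = [rng.gauss(0, 1) for _ in range(200)]
y = [0.0] + [0.8 * x[t - 1] + rng.gauss(0, 0.5) for t in range(1, 200)]
print("x -> y:", granger_f(x, y))
print("y -> x:", granger_f(y, x))
```

A large statistic in one direction only is what makes the relation directional, which is why such values can define edges in a gene network rather than a symmetric similarity.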

Relevance:

90.00%

Publisher:

Abstract:

In the past decade, the advent of efficient genome sequencing tools and high-throughput experimental biotechnology has led to enormous progress in the life sciences. Among the most important innovations is microarray technology. It makes it possible to quantify the expression of thousands of genes simultaneously by measuring the hybridization from a tissue of interest to probes on a small glass or plastic slide. The characteristics of these data include a fair amount of random noise, a predictor dimension in the thousands, and a sample size in the dozens. One of the most exciting areas to which microarray technology has been applied is the challenge of deciphering complex diseases such as cancer. In these studies, samples are taken from two or more groups of individuals with heterogeneous phenotypes, pathologies, or clinical outcomes. These samples are hybridized to microarrays in an effort to find a small number of genes that are strongly correlated with the group of individuals. Even though methods to analyse the data are now well developed and close to reaching a standard organization (through the efforts of international projects such as the Microarray Gene Expression Data (MGED) Society [1]), it is not infrequent to come across a clinician's question for which no compelling statistical method is available. The contribution of this dissertation to deciphering disease is the development of new approaches for handling open problems posed by clinicians in specific experimental designs. Chapter 1, starting from a necessary biological introduction, reviews microarray technologies and all the important steps of an experiment, from the production of the array to quality control, ending with the preprocessing steps used in the data analysis in the rest of the dissertation.
Chapter 2 provides a critical review of standard analysis methods, stressing most of the problems that In Chapter 3, a method is introduced to address the issue of unbalanced design in microarray experiments. In microarray experiments, experimental design is a crucial starting point for obtaining reasonable results. In a two-class problem, an equal or similar number of samples should be collected for the two classes. However, in some cases, e.g. rare pathologies, the approach to be taken is less evident. We propose to address this issue by applying a modified version of SAM [2]. MultiSAM consists of a reiterated application of a SAM analysis, comparing the less populated class (LPC) with 1,000 random samplings of the same size from the more populated class (MPC). A list of the differentially expressed genes is generated for each SAM application. After 1,000 reiterations, each single probe is given a "score" ranging from 0 to 1,000 based on its recurrence in the 1,000 lists as differentially expressed. The performance of MultiSAM was compared to the performance of SAM and LIMMA [3] over two simulated data sets generated from beta and exponential distributions. The results of all three algorithms over low-noise data sets seem acceptable. However, on a real unbalanced two-channel data set regarding Chronic Lymphocytic Leukemia, LIMMA finds no significant probe, SAM finds 23 significantly changed probes but cannot separate the two classes, while MultiSAM finds 122 probes with score >300 and separates the data into two clusters by hierarchical clustering. We also report extra-assay validation in terms of differentially expressed genes. Although standard algorithms perform well over low-noise simulated data sets, MultiSAM seems to be the only one able to reveal subtle differences in gene expression profiles on real unbalanced data. In Chapter 4, a method to address similarity evaluation in a three-class problem by means of the Relevance Vector Machine [4] is described.
In fact, when microarray data are viewed in a prognostic and diagnostic clinical framework, differences are not the only thing that can play a crucial role. In some cases similarities can give useful, and sometimes even more important, information. The goal, given three classes, could be to establish, with a certain level of confidence, whether the third one is similar to the first or to the second. In this work we show that the Relevance Vector Machine (RVM) [2] could be a possible solution to the limitations of standard supervised classification. In fact, RVM offers many advantages compared, for example, with its well-known precursor, the Support Vector Machine (SVM) [3]. Among these advantages, the estimate of the posterior probability of class membership represents a key feature for addressing the similarity issue. This is a highly important, but often overlooked, option of any practical pattern recognition system. We focused on a three-class tumor grade problem, with 67 samples of grade 1 (G1), 54 samples of grade 3 (G3) and 100 samples of grade 2 (G2). The goal is to find a model able to separate G1 from G3, and then to evaluate the third class, G2, as a test set to obtain the probability that each G2 sample is a member of class G1 or class G3. The analysis showed that breast cancer samples of grade 2 have a molecular profile more similar to that of breast cancer samples of grade 1. This result had been conjectured in the literature, but no measure of significance had been given before.
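The MultiSAM resampling scheme described in this abstract can be sketched as follows. SAM itself is not implemented here: a Welch t statistic with a fixed cutoff is used as a stand-in for the per-iteration significance call, which is an assumption made only to keep the example self-contained. The function names, the cutoff of 3.0 and the toy data are likewise hypothetical.

```python
import random
from statistics import mean, variance


def welch_t(a, b):
    """Welch t statistic; a stand-in for SAM's statistic, not SAM itself."""
    return (mean(a) - mean(b)) / (
        (variance(a) / len(a) + variance(b) / len(b)) ** 0.5)


def multisam_scores(lpc, mpc, n_iter=1000, threshold=3.0, seed=0):
    """Count how often each probe is called differentially expressed.

    lpc, mpc : dicts mapping probe id -> expression values, one per sample,
               for the less and the more populated class respectively.
    Returns probe id -> score in the range 0..n_iter.
    """
    rng = random.Random(seed)
    probes = list(lpc)
    n_small = len(lpc[probes[0]])
    n_large = len(mpc[probes[0]])
    scores = dict.fromkeys(probes, 0)
    for _ in range(n_iter):
        # Draw an MPC subset of the same size as the LPC, shared by all probes.
        chosen = rng.sample(range(n_large), n_small)
        for probe in probes:
            subset = [mpc[probe][i] for i in chosen]
            if abs(welch_t(lpc[probe], subset)) > threshold:
                scores[probe] += 1
    return scores


# Hypothetical toy data: 6 samples in the LPC, 40 in the MPC, two probes.
rng = random.Random(1)
lpc = {"shifted": [rng.gauss(4, 1) for _ in range(6)],
       "null": [rng.gauss(0, 1) for _ in range(6)]}
mpc = {"shifted": [rng.gauss(0, 1) for _ in range(40)],
       "null": [rng.gauss(0, 1) for _ in range(40)]}
scores = multisam_scores(lpc, mpc, n_iter=200)
print(scores)
```

Because every iteration compares classes of equal size, the recurrence score reflects how stable a probe's call is across balanced subsamples. The abstract then applies a score cutoff (>300 out of 1,000) to select probes.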

Relevance:

90.00%

Publisher:

Abstract:

Although in Europe and in the USA many studies focus on organic food, little is known on the topic in China. This research provides an insight into Shanghai consumers’ perception of organic, aiming at understanding and representing in graphic form the network of mental associations that stems from the organic concept. To acquire, process and aggregate the individual networks, the “Brand concept mapping” methodology (Roedder et al., 2006) was used, while the data analysis was also carried out using analytic procedures. The results suggest that organic food is perceived as healthy, safe and costly. Although these attributes are largely consistent with the European perception, some relevant differences emerged. First, organic is not necessarily synonymous with natural product in China, partly because of a poor translation of the term into Chinese that conveys the idea of a manufactured product. Secondly, the organic label has to compete with the green food label in terms of image and positioning on the market, since the two are easily associated and often confused. “Environmental protection” also emerged as a relevant association, while ethical and social values were not mentioned. In conclusion, health care and security concerns are the factors that most influence food consumption in China (many people are so concerned about food safety that they find it difficult to shop), and the associations “Safe”, “Pure and natural”, “without chemicals” and “healthy” have been identified as the best candidates for leveraging a sound image of organic food.

Relevance:

90.00%

Publisher:

Abstract:

PROBLEM: In the last few years farm tourism, or agritourism as it is also called, has enjoyed increasing success because of its generally acknowledged role as a promoter of economic and social development of rural areas. As a consequence, a plethora of studies have been dedicated to this tourist sector, focusing on a variety of issues. Nevertheless, despite the difficulties many farmers face in orienting their business towards potential customers, the contribution of the marketing literature has been moderate. PURPOSE: This dissertation builds upon studies which advocate the necessity for farm tourism to innovate according to the increasingly demanding needs of customers. Hence, the purpose of this dissertation is to critically evaluate the level of professionalism reached in the farm tourism market from a marketing perspective. METHODOLOGY: This dissertation takes a cross-country perspective, studying the marketing of farm tourism in Germany and Italy. The marketing channels of this tourist sector are examined from both the supply and the demand side by means of five exploratory studies. The data were collected between 2006 and 2009 in various ways (online survey, catalogues of industry associations, face-to-face interviews, etc.) according to the research purpose of each study. The data have been analyzed using multivariate statistical analysis. FINDINGS: A comprehensive literature review provides the state of the art of the main differences and similarities of farm tourism in the two countries of study. The main findings contained in the empirical chapters provide insights on many aspects of agritourism, including how the expectations of farm operators and customers differ, which development scenarios of farm tourism are more likely to meet individuals’ needs, how new technologies can impact the demand for farm tourism, etc.
ORIGINALITY/VALUE: The value of this study is in the investigation of the process by which farmers’ participation in the development of this sector intersects with consumer consumption patterns. Focusing on this process should allow farm operators and others, including related businesses, to allocate resources more efficiently.

Relevance:

90.00%

Publisher:

Abstract:

The aim of this work is to contribute to the development of new multifunctional nanocarriers for improved encapsulation and delivery of anticancer and antiviral drugs. The work focused on water-soluble and biocompatible oligosaccharides, the cyclodextrins (CyDs), and a new family of nanostructured, biodegradable carrier materials made of porous metal-organic frameworks (nanoMOFs). The drugs of choice were the anticancer drug doxorubicin (DOX), azidothymidine (AZT) and its phosphate derivatives, and artemisinin (ART). DOX possesses a pharmacological drawback due to its self-aggregation tendency in water. The noncovalent binding of DOX to a series of CyD derivatives, such as γ-CyD, an epichlorohydrin-crosslinked β-CyD polymer (pβ-CyD) and a citric acid-crosslinked γ-CyD polymer (pγ-CyD), was studied by UV-visible absorption, circular dichroism and fluorescence. Multivariate global analysis of multiwavelength data from spectroscopic titrations allowed identification and characterization of the stable complexes. pγ-CyD proved to be the best carrier, showing both high association constants and the ability to monomerize DOX. AZT is an important antiretroviral drug. The active form is AZT-triphosphate (AZT-TP), formed in metabolic paths of low efficiency. Direct administration of AZT-TP is limited by its poor stability in biological media, so the development of suitable carriers is highly important. In this context we studied the binding of some phosphorylated derivatives to nanoMOFs by spectroscopic methods. The results obtained with iron(III)-trimesate nanoMOFs showed that the binding of these drugs mainly occurs by strong iono-covalent bonds to iron(III) centers. On the basis of these and other results obtained in partner laboratories, it was possible to propose this highly versatile and “green” carrier system for delivery of phosphorylated nucleoside analogues. The interaction of DOX with nanoMOFs was also studied.
Finally, the binding of the antimalarial drug artemisinin (ART) to two cyclodextrin-based carriers, pβ-CyD and a light-responsive bis(β-CyD) host, was also studied.

Relevance:

90.00%

Publisher:

Abstract:

The candidate tackled an important issue in contemporary management: the role of CSR and Sustainability. The research proposal focused on longitudinal and inductive research, aimed at specifying the evolution of CSR and contributing to new institutional theory, in particular the institutional work framework, and to the relation between institutions and discourse analysis. The documentary analysis covers the entire evolution of CSR, focusing also on a number of important networks and associations. Some of the methodologies in the thesis were adopted as a consequence of the data analysis, in a truly inductive research process. The thesis is composed of two sections. The first section mainly describes the research process and the results of the analyses. The candidate employed several research methods: a longitudinal content analysis of documents, a vocabulary study with statistical techniques such as cluster analysis and factor analysis, and a rhetorical analysis of justifications. The second section relates the results of the analyses to theoretical frameworks and contributions. The candidate engaged with several frameworks: Actor-Network-Theory, Institutional Work and Boundary Work, and Institutional Logic. The chapters focus on different issues: a historical reconstruction of CSR; a reflection on the symbolic adoption of recurrent labels; two case studies of Italian networks, in order to compare institutional and boundary work; a theoretical model of institutional change based on contradiction and institutional complexity; and the application of the model to CSR and Sustainability, proposing Sustainability as a possible institutional logic.