808 results for multivariate hidden Markov model
Abstract:
Background Several researchers seek methods for selecting homogeneous groups of animals in experimental studies, because homogeneity is an indispensable prerequisite for the randomization of treatments. The lack of robust methods that comply with statistical and biological principles leads researchers to use empirical or subjective methods, which influences their results. Objective To develop a multivariate statistical model for selecting a homogeneous group of animals for experimental research and to build a computational package that implements it. Methods Echocardiographic data from 115 male Wistar rats with supravalvular aortic stenosis (AoS) were used as an example for model development. Initially, the data were standardized, becoming dimensionless. Then, the variance matrix of the set was submitted to principal component analysis (PCA), with the aim of reducing the parametric space while retaining the relevant variability. That technique established a new Cartesian system into which the animals were allocated, and finally a confidence region (ellipsoid) was built for the profile of the animals’ homogeneous responses. The animals located inside the ellipsoid were considered to belong to the homogeneous batch; those outside the ellipsoid were considered spurious. Results The PCA established eight descriptive axes that accounted for 88.71% of the cumulative variance of the data set. The allocation of the animals in the new system and the construction of the confidence region revealed six spurious animals, compared with a homogeneous batch of 109 animals. Conclusion The biometric criterion presented proved effective because it considers the animal as a whole, jointly analyzing all measured parameters, and has a small discard rate.
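The procedure described in the abstract above (standardize, run PCA, allocate the animals in the new axes, build a confidence ellipsoid) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' package: the function names, the 85% variance target, the 95% confidence level and the Wilson-Hilferty approximation to the chi-square quantile are all choices made here, and the abstract does not say how the ellipsoid was actually constructed.

```python
import numpy as np
from statistics import NormalDist

def chi2_quantile(p, df):
    """Wilson-Hilferty approximation to the chi-square quantile (assumption)."""
    z = NormalDist().inv_cdf(p)
    c = 2.0 / (9.0 * df)
    return df * (1.0 - c + z * c ** 0.5) ** 3

def select_homogeneous(data, var_target=0.85, level=0.95):
    """Flag rows inside a confidence ellipsoid built on the leading PCs.

    Returns (inside, k): a boolean mask over rows and the number of
    principal components retained. Hypothetical helper for illustration.
    """
    X = np.asarray(data, dtype=float)
    # Standardize each variable so the data become dimensionless.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # PCA via eigen-decomposition of the covariance of the standardized data.
    eigvals, eigvecs = np.linalg.eigh(np.cov(Z, rowvar=False))
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Keep the fewest components whose cumulative variance reaches the target.
    cum = np.cumsum(eigvals) / eigvals.sum()
    k = min(int(np.searchsorted(cum, var_target)) + 1, len(eigvals))
    scores = Z @ eigvecs[:, :k]
    # Squared distance from the centroid, each axis scaled by its variance.
    d2 = np.sum(scores ** 2 / eigvals[:k], axis=1)
    return d2 <= chi2_quantile(level, k), k
```

Rows for which the mask is False would correspond to the "spurious" animals of the abstract; with real data, the variance target and confidence level would need to be chosen to match the study design.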
Abstract:
Infectious diseases can bring about population declines and local host extinctions, contributing significantly to the global biodiversity crisis. Nonetheless, studies measuring population-level effects of pathogens in wild host populations are rare, and taxonomically biased toward avian hosts and macroparasitic infections. We investigated the effects of bovine tuberculosis (bTB), caused by the bacterial pathogen Mycobacterium bovis, on African buffalo (Syncerus caffer) at Hluhluwe-iMfolozi Park, South Africa. We tested 1180 buffalo for bTB infection between May 2000 and November 2001. Most infections were mild, confirming the chronic nature of the disease in buffalo. However, our data indicate that bTB affects both adult survival and fecundity. Using an age-structured population model, we demonstrate that the pathogen can reduce population growth rate drastically; yet its effects appear difficult to detect at the population level: bTB causes no conspicuous mass mortalities or fast population declines, nor does it alter host-population age structure significantly. Our models suggest that this syndrome—low detectability coupled with severe impacts on population growth rate and, therefore, resilience—may be characteristic of chronic diseases in large mammals.
Abstract:
In this paper we propose a hybrid hazard regression model with threshold stress which includes the proportional hazards and the accelerated failure time models as particular cases. To express the behavior of lifetimes, the generalized-gamma distribution is assumed and an inverse power law model with a threshold stress is considered. For parameter estimation we develop a sampling-based posterior inference procedure based on Markov chain Monte Carlo techniques. We assume proper but vague priors for the parameters of interest. A simulation study investigates the frequentist properties of the proposed estimators obtained under the assumption of vague priors. Further, model selection criteria are discussed. The methodology is illustrated on simulated and real lifetime data sets.
Abstract:
The purpose of this paper is to develop a Bayesian analysis for the right-censored survival data when immune or cured individuals may be present in the population from which the data is taken. In our approach the number of competing causes of the event of interest follows the Conway-Maxwell-Poisson distribution which generalizes the Poisson distribution. Markov chain Monte Carlo (MCMC) methods are used to develop a Bayesian procedure for the proposed model. Also, some discussions on the model selection and an illustration with a real data set are considered.
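For reference, the Conway-Maxwell-Poisson (COM-Poisson) distribution assigns P(M = m) = λ^m / ((m!)^ν Z(λ, ν)), with Z(λ, ν) = Σ_j λ^j / (j!)^ν, and reduces to the Poisson distribution when ν = 1. In a competing-causes cure model, the cured fraction is the probability of zero causes, P(M = 0) = 1/Z(λ, ν). The sketch below evaluates these quantities numerically. The function names and the truncation of the infinite series at 200 terms are assumptions made for illustration and do not come from the paper; the truncation is only adequate for moderate λ and for ν not close to zero.

```python
import math

def com_poisson_log_z(lam, nu, terms=200):
    """Log of the normalizing constant Z(lam, nu), series truncated (assumption)."""
    logs = [j * math.log(lam) - nu * math.lgamma(j + 1) for j in range(terms)]
    top = max(logs)
    # Log-sum-exp keeps the sum stable when individual terms are large.
    return top + math.log(sum(math.exp(v - top) for v in logs))

def com_poisson_pmf(m, lam, nu, terms=200):
    """P(M = m) for the COM-Poisson distribution with rate lam and dispersion nu."""
    log_p = m * math.log(lam) - nu * math.lgamma(m + 1)
    return math.exp(log_p - com_poisson_log_z(lam, nu, terms))

def cure_fraction(lam, nu, terms=200):
    """Probability of zero competing causes, i.e. the cured proportion."""
    return com_poisson_pmf(0, lam, nu, terms)
```

With ν = 1 the cured fraction equals exp(−λ), the value given by the usual Poisson (promotion time) cure model.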
Abstract:
The present paper has two goals. First to present a natural example of a new class of random fields which are the variable neighborhood random fields. The example we consider is a partially observed nearest neighbor binary Markov random field. The second goal is to establish sufficient conditions ensuring that the variable neighborhoods are almost surely finite. We discuss the relationship between the almost sure finiteness of the interaction neighborhoods and the presence/absence of phase transition of the underlying Markov random field. In the case where the underlying random field has no phase transition we show that the finiteness of neighborhoods depends on a specific relation between the noise level and the minimum values of the one-point specification of the Markov random field. The case in which there is phase transition is addressed in the frame of the ferromagnetic Ising model. We prove that the existence of infinite interaction neighborhoods depends on the phase.
Abstract:
A data set from a commercial Nellore beef cattle selection program was used to compare breeding models that did or did not include marker effects when estimating breeding values, in a situation where only a reduced number of animals have phenotypic, genotypic and pedigree information available. The complete data set of this herd comprised 83,404 animals measured for weaning weight (WW), post-weaning gain (PWG), scrotal circumference (SC) and muscle score (MS), corresponding to 116,652 animals in the relationship matrix. Single-trait analyses were performed with the MTDFREML software to estimate fixed and random effects solutions using the complete data. The estimated additive effects were taken as the reference breeding values for those animals. The individual observed phenotype of each trait was adjusted for the fixed and random effects solutions, except for direct additive effects. The adjusted phenotype, composed of the additive and residual parts of the observed phenotype, was used as the dependent variable for model comparison. Among all measured animals of this herd, only 3160 were genotyped for 106 SNP markers. Three models were compared in terms of changes in animal rank, global fit and predictive ability. Model 1 included only polygenic effects, model 2 included only marker effects and model 3 included both polygenic and marker effects. Bayesian inference via Markov chain Monte Carlo methods, performed with the TM software, was used to analyze the data for model comparison. Two different priors were adopted for marker effects in models 2 and 3: a uniform distribution (U) and a normal distribution (N). Higher rank correlation coefficients were observed for models 3_U and 3_N, indicating greater similarity between the animal ranks from these models and the rank based on the reference breeding values. Model 3_N presented a better global fit, as demonstrated by its low DIC.
The best models in terms of predictive ability were models 1 and 3_N. Differences due to the prior assumed for marker effects in models 2 and 3 could be attributed to the better ability of the normal prior to handle collinear effects. Models 2_U and 2_N presented the worst performance, indicating that this small set of markers should not be used to genetically evaluate animals with no data, since its predictive ability is restricted. In conclusion, model 3_N presented a slight superiority when a reduced number of animals have phenotypic, genotypic and pedigree information. This could be attributed to the variation retained by marker and polygenic effects assumed together and to the normal prior assumed for marker effects, which deals better with the collinearity between markers. (C) 2012 Elsevier B.V. All rights reserved.
Abstract:
Concentrations of 39 organic compounds were determined in three fractions (head, heart and tail) obtained from the pot still distillation of fermented sugarcane juice. The results were evaluated using analysis of variance (ANOVA), Tukey's test, principal component analysis (PCA), hierarchical cluster analysis (HCA) and linear discriminant analysis (LDA). According to PCA and HCA, the experimental data form three clusters. The head fractions form the most clearly defined group. The heart and tail fractions showed some overlap, consistent with their acid composition. The predictive ability of the LDA model for classifying the three fractions was 90.5% in calibration and 100% in validation. This model recognized as heart twelve of the thirteen commercial cachaças (92.3%) with good sensory characteristics, thus showing potential for guiding the process of cuts.
Abstract:
IDENTIFICATION OF ETHANOLIC WOOD EXTRACTS USING ELECTRONIC ABSORPTION SPECTRUM AND MULTIVARIATE ANALYSIS. The application of multivariate analysis to spectrophotometric (UV) data was explored for distinguishing extracts of woods commonly used in the manufacture of casks for aging cachaças (oak, cabretiva-parda, jatoba, amendoim and canela-sassafras). Absorbances close to 280 nm were more strongly correlated with oak and jatoba woods, whereas absorbances near 230 nm were more strongly correlated with canela-sassafras and cabretiva-parda. The spectrophotometric model was compared with a model based on chromatographic (HPLC-DAD) data. The spectrophotometric model explained more of the variance in the data (PC1 + PC2 = 91%), showing potential as a routine method for checking aged spirits.
Abstract:
This study performed an exploratory analysis of the anthropometric and morphological muscle variables related to one-repetition maximum (1RM) performance. In addition, the capacity of these variables to predict force production was analyzed. Fifty active males underwent the experimental procedures: vastus lateralis muscle biopsy, quadriceps magnetic resonance imaging, body mass assessment and a 1RM test in the leg-press exercise. K-means cluster analysis was performed after obtaining the body mass, the sum of the left and right quadriceps muscle cross-sectional areas (ΣCSA), the percentage of type II fibers and the 1RM performance. The number of clusters was defined a priori, and the clusters were labeled as the high strength performance (HSP1RM) group and the low strength performance (LSP1RM) group. Stepwise multiple regressions were performed with body mass, ΣCSA, percentage of type II fibers and cluster as predictor variables and 1RM performance as the response variable. The cluster means ± SD for 1RM, body mass, ΣCSA and type II muscle fiber percentage were, respectively, 292.8 ± 52.1 kg, 84.7 ± 17.9 kg, 19249.7 ± 1645.5 mm² and 50.8 ± 7.2% for the HSP1RM group and 254.0 ± 51.1 kg, 69.2 ± 8.1 kg, 15483.1 ± 1104.8 mm² and 51.7 ± 6.2% for the LSP1RM group. The most important variable in the division into clusters was the ΣCSA. In addition, the ΣCSA and the type II muscle fiber percentage explained the variance in 1RM performance for all participants (adjusted R² = 0.35, p = 0.0001) and for the LSP1RM group (adjusted R² = 0.25, p = 0.002). For the HSP1RM group, only the ΣCSA entered the model, and it showed the highest capacity to explain the variance in 1RM performance (adjusted R² = 0.38, p = 0.01). In conclusion, muscle CSA was the most relevant variable for predicting force production in individuals with no strength training background.
Abstract:
This study analyzes an accident in which two maintenance workers suffered severe burns while replacing a circuit breaker panel in a steel mill, following the Model of Analysis and Prevention of Accidents (MAPA), developed with the objective of enlarging the perimeter of interventions and contributing to the deconstruction of blame attribution practices. The study was based on materials produced by a health service team in an in-depth analysis of the accident. The analysis shows that decisions related to system modernization were taken without considering their implications for maintenance scheduling, creating conflicts of priorities and interests between production and safety; it also reveals that the lack of a systemic perspective in safety management was its principal failure. Explaining the accident as merely a failure to follow idealized formal safety rules feeds practices of blame attribution supported by alibi norms and inhibits possible prevention. In contrast, accident analyses undertaken in worker health surveillance services show potential to reveal the origins of these events, incubated in the history of the system and ignored in practices guided by the traditional paradigm.
Abstract:
We study a probabilistic model of interacting spins indexed by elements of a finite subset of the d-dimensional integer lattice, d ≥ 1. Conditions of time reversibility are examined. It is shown that the model equilibrium distribution converges to a limit distribution as the indexing set expands to the whole lattice. The occupied site percolation problem is solved for the limit distribution. Two models with similar dynamics are also discussed.
Abstract:
In this article, we propose a new Bayesian flexible cure rate survival model, which generalises the stochastic model of Klebanov et al. [Klebanov LB, Rachev ST and Yakovlev AY. A stochastic-model of radiation carcinogenesis - latent time distributions and their properties. Math Biosci 1993; 113: 51-75], and has much in common with the destructive model formulated by Rodrigues et al. [Rodrigues J, de Castro M, Balakrishnan N and Cancho VG. Destructive weighted Poisson cure rate models. Technical Report, Universidade Federal de Sao Carlos, Sao Carlos-SP. Brazil, 2009 (accepted in Lifetime Data Analysis)]. In our approach, the accumulated number of lesions or altered cells follows a compound weighted Poisson distribution. This model is more flexible than the promotion time cure model in terms of dispersion. Moreover, it possesses an interesting and realistic interpretation of the biological mechanism of the occurrence of the event of interest as it includes a destructive process of tumour cells after an initial treatment or the capacity of an individual exposed to irradiation to repair altered cells that results in cancer induction. In other words, what is recorded is only the damaged portion of the original number of altered cells not eliminated by the treatment or repaired by the repair system of an individual. Markov Chain Monte Carlo (MCMC) methods are then used to develop Bayesian inference for the proposed model. Also, some discussions on the model selection and an illustration with a cutaneous melanoma data set analysed by Rodrigues et al. [Rodrigues J, de Castro M, Balakrishnan N and Cancho VG. Destructive weighted Poisson cure rate models. Technical Report, Universidade Federal de Sao Carlos, Sao Carlos-SP. Brazil, 2009 (accepted in Lifetime Data Analysis)] are presented.
Abstract:
Background Using univariate and multivariate variance components linkage analysis methods, we studied possible genotype × age interaction in cardiovascular phenotypes related to the aging process from the Framingham Heart Study. Results We found evidence for genotype × age interaction for fasting glucose and systolic blood pressure. Conclusions There is polygenic genotype × age interaction for fasting glucose and systolic blood pressure, and quantitative trait locus × age interaction for a linkage signal for systolic blood pressure phenotypes located on chromosome 17 at 67 cM.
Abstract:
Background Where malaria endemicity is low, control programmes need increasingly sensitive tools for monitoring malaria transmission intensity (MTI) and for better defining health priorities. A cross-sectional survey was conducted in a low-endemicity area of the Peruvian north-western coast to assess the MTI using both molecular and serological tools. Methods Epidemiological, parasitological and serological data were collected from 2,667 individuals in three settlements of Bellavista district in May 2010. Parasite infection was detected using microscopy and polymerase chain reaction (PCR). Antibodies to Plasmodium vivax merozoite surface protein-1₁₉ (PvMSP1₁₉) and to Plasmodium falciparum glutamate-rich protein (PfGLURP) were detected by ELISA. Risk factors for exposure to malaria (seropositivity) were assessed by multivariate survey logistic regression models. Age-specific antibody prevalence for both P. falciparum and P. vivax was analysed using a previously published catalytic conversion model based on maximum likelihood for generating seroconversion rates (SCR). Results The overall parasite prevalence by microscopy and PCR was extremely low: 0.3 and 0.9%, respectively, for P. vivax, and 0 and 0.04%, respectively, for P. falciparum, while seroprevalence was much higher, 13.6% for P. vivax and 9.8% for P. falciparum. Settlement, age and occupation as a moto-taxi driver during the previous year were significantly associated with P. falciparum exposure, while age and distance to the water drain were associated with P. vivax exposure. Likelihood ratio tests supported age seroprevalence curves with two SCR for both P. vivax and P. falciparum, indicating significant changes in the MTI over time. The SCR for PfGLURP was 19-fold lower after 2002 than before (λ1 = 0.022 versus λ2 = 0.431), and the SCR for PvMSP1₁₉ was four-fold higher after 2006 than before (λ1 = 0.024 versus λ2 = 0.006).
Conclusion Combining molecular and serological tools considerably enhanced the capacity to detect current and past exposure to malaria infections and related risk factors in this very low endemicity area. This allowed for an improved characterization of the current human reservoir of infections, largely hidden and heterogeneous, and provided insights into recent changes in species-specific MTIs. This approach will be of key importance for evaluating and monitoring future malaria elimination strategies.
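Catalytic conversion models of this kind are commonly written in the reversible form P(a) = λ/(λ + ρ) · (1 − exp(−(λ + ρ)a)), where P(a) is the expected seroprevalence at age a, λ is the seroconversion rate (SCR) and ρ is the seroreversion rate. The sketch below evaluates that curve and recomputes the fold changes quoted in the abstract. The function name and the value of ρ are hypothetical (the abstract does not report a seroreversion rate), and the abstract does not state which variant of the model was fitted.

```python
import math

def seroprevalence(age, scr, srr):
    """Expected seroprevalence at a given age under a reversible catalytic model."""
    total = scr + srr
    return scr / total * (1.0 - math.exp(-total * age))

# SCR estimates quoted in the abstract.
pf_fold = 0.431 / 0.022   # P. falciparum: ratio of the higher to the lower SCR
pv_fold = 0.024 / 0.006   # P. vivax: ratio of the higher to the lower SCR

# Hypothetical seroreversion rate, chosen only to draw an example curve.
SRR = 0.01
curve = [(age, seroprevalence(age, 0.022, SRR)) for age in (1, 5, 10, 20, 40)]
```

The two ratios come out at roughly 19.6 and 4, in line with the "19-fold" and "four-fold" changes reported above. A two-rate fit as described in the abstract would additionally need a change point in time, which this sketch does not model.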
Abstract:
In this thesis some multivariate spectroscopic methods for the analysis of solutions are proposed. Spectroscopy and multivariate data analysis form a powerful combination for obtaining both quantitative and qualitative information and it is shown how spectroscopic techniques in combination with chemometric data evaluation can be used to obtain rapid, simple and efficient analytical methods. These spectroscopic methods consisting of spectroscopic analysis, a high level of automation and chemometric data evaluation can lead to analytical methods with a high analytical capacity, and for these methods, the term high-capacity analysis (HCA) is suggested. It is further shown how chemometric evaluation of the multivariate data in chromatographic analyses decreases the need for baseline separation. The thesis is based on six papers and the chemometric tools used are experimental design, principal component analysis (PCA), soft independent modelling of class analogy (SIMCA), partial least squares regression (PLS) and parallel factor analysis (PARAFAC). The analytical techniques utilised are scanning ultraviolet-visible (UV-Vis) spectroscopy, diode array detection (DAD) used in non-column chromatographic diode array UV spectroscopy, high-performance liquid chromatography with diode array detection (HPLC-DAD) and fluorescence spectroscopy. The methods proposed are exemplified in the analysis of pharmaceutical solutions and serum proteins. In Paper I a method is proposed for the determination of the content and identity of the active compound in pharmaceutical solutions by means of UV-Vis spectroscopy, orthogonal signal correction and multivariate calibration with PLS and SIMCA classification. Paper II proposes a new method for the rapid determination of pharmaceutical solutions by the use of non-column chromatographic diode array UV spectroscopy, i.e. a conventional HPLC-DAD system without any chromatographic column connected. 
Paper III investigates the ability of a control sample of known content and identity to diagnose and correct errors in multivariate predictions, something that, together with the use of multivariate residuals, can make it possible to use the same calibration model over time. In Paper IV a method is proposed for the simultaneous determination of serum proteins with fluorescence spectroscopy and multivariate calibration. Paper V proposes a method for the determination of chromatographic peak purity by means of PCA of HPLC-DAD data. In Paper VI PARAFAC is applied to decompose the DAD data of some partially separated peaks into the pure chromatographic, spectral and concentration profiles.