283 results for Probabilistic latent semantic analysis (PLSA)

in the Biblioteca Digital da Produção Intelectual da Universidade de São Paulo (BDPI/USP)


Relevance: 100.00%

Abstract:

Owing to the widespread, multipurpose use of document images and the large number of document image repositories now available, robust information retrieval mechanisms and systems are increasingly in demand. This paper presents an approach to support the automatic generation of relationships among document images by exploiting Latent Semantic Indexing (LSI) and Optical Character Recognition (OCR). We developed the LinkDI (Linking of Document Images) service, which extracts and indexes the content of document images, computes its latent semantics, and defines relationships among images as hyperlinks. LinkDI was tested on document image repositories, and its performance was evaluated by comparing the quality of the relationships created among textual documents and among their respective document images. Using the same document images, we ran further experiments to compare the performance of LinkDI with and without the LSI technique. Experimental results showed that LSI can mitigate the effects of common OCR misrecognition, which reinforces the feasibility of using LinkDI to relate highly degraded OCR output.

Relevance: 100.00%

Abstract:

Background: Current diagnostic criteria cannot capture the full range of the bipolar spectrum. This study aims to clarify the natural co-segregation of manic-depressive symptoms occurring in the general population. Methods: Using data from the Sao Paulo Catchment Area Study, latent class analysis (LCA) was applied to eleven manic and fourteen depressive symptoms assessed through CIDI 1.1 in 1464 subjects from a community-based study in Sao Paulo, Brazil. All manic symptoms were assessed, regardless of the presence of euphoria or irritability, and demographics, service use, suicidality, and CIDI/DSM-IIIR mood disorders were used to externally validate the classes. Results: The four classes obtained were labeled Euthymics (EU; 49.1%), Mild Affectives (MA; 31.1%), Bipolars (BIP; 10.7%), and Depressives (DEP; 9%). The BIP and DEP classes represented the bipolar and depressive spectra, respectively. Compared with the DEP class, BIP exhibited more atypical depressive characteristics (hypersomnia and increase in appetite and/or weight gain), a higher risk of suicide, and greater use of services. Depressives had rates of atypical symptoms and suicidality comparable to those of oligosymptomatic MA class subjects. Limitations: The use of lay interviewers and of DSM-IIIR diagnostic criteria, which are more restrictive than the currently used DSM-IV TR. Conclusions: The findings of a high prevalence of the bipolar spectrum, and of atypical symptoms and suicidality as indicators of bipolarity, are of great clinical importance because of different treatment needs and higher severity. Lifetime sub-affective and syndromic manic symptoms are clinically significant, arguing for the need to revise DSM bipolar spectrum categories. (C) 2009 Elsevier B.V. All rights reserved.

Relevance: 100.00%

Abstract:

Human herpesvirus 8 (HHV-8), also known as Kaposi's sarcoma-associated herpesvirus (KSHV), is the etiologic agent of all forms of Kaposi's sarcoma, primary effusion lymphoma and the plasmablastic cell variant of multicentric Castleman disease. In endemic areas of sub-Saharan Africa, blood transfusions have been associated with a substantial risk of HHV-8 transmission. By contrast, several studies among healthy blood donors from North America have failed to detect HHV-8 DNA in samples from seropositive individuals. In this study, using a real-time PCR assay, we investigated the presence of HHV-8 DNA in whole-blood samples of 803 blood donors from three Brazilian states (Sao Paulo, Amazon, Bahia) who had tested positive for HHV-8 antibodies in a previous multicenter study. HHV-8 DNA was not detected in any sample. Our findings do not support the introduction of routine HHV-8 screening among healthy blood donors in Brazil.

Relevance: 50.00%

Abstract:

Alternative splicing of gene transcripts greatly expands the functional capacity of the genome, and certain splice isoforms may indicate specific disease states such as cancer. Splice junction microarrays interrogate thousands of splice junctions, but data analysis is difficult and error prone because of the increased complexity compared with differential gene expression analysis. We present Rank Change Detection (RCD) as a method to identify differential splicing events based upon a straightforward probabilistic model comparing the over- or underrepresentation of two or more competing isoforms. RCD has advantages over commonly used methods because it is robust to false positive errors due to nonlinear trends in microarray measurements. Further, RCD does not depend on prior knowledge of splice isoforms, yet it takes advantage of the inherent structure of mutually exclusive junctions, and it is conceptually generalizable to other types of splicing arrays or RNA-Seq. RCD specifically identifies the biologically important cases in which a splice junction becomes more or less prevalent compared with other mutually exclusive junctions. The example data come from different cell lines of glioblastoma tumors assayed with Agilent microarrays.

Relevance: 40.00%

Abstract:

Background: The post-genomic era has brought new challenges regarding the understanding of the organization and function of the human genome. Many of these challenges are centered on the meaning of differential gene regulation under distinct biological conditions and can be addressed by analyzing the Multiple Differential Expression (MDE) of genes associated with normal and abnormal biological processes. Currently, MDE analyses are limited to the usual methods of differential expression, which were initially designed for paired analysis. Results: We propose a web platform named ProbFAST for MDE analysis, which uses Bayesian inference to identify key genes that are intuitively prioritized by means of probabilities. A simulation study revealed that our method performs better than other approaches, and when it was applied to public expression data, we demonstrated its flexibility in obtaining relevant genes biologically associated with normal and abnormal biological processes. Conclusions: ProbFAST is a freely accessible web-based application that enables MDE analysis on a global scale. It offers an efficient methodological approach for MDE analysis of a set of genes that are turned on and off, related to functional information, during the evolution of a tumor or tissue differentiation. The ProbFAST server can be accessed at http://gdm.fmrp.usp.br/probfast.

Relevance: 30.00%

Abstract:

Microphysical and thermodynamical features of two tropical systems, namely Hurricane Ivan and Typhoon Conson, and one sub-tropical system, Catarina, have been analyzed based on measurements from the spaceborne PR radar on board the TRMM satellite. The procedure to classify the reflectivity profiles followed the methodologies of Heymsfield et al. (2000) and Steiner et al. (1995). The water and ice content have been calculated using a relationship obtained with data from the surface SPOL radar and PR in Rondonia State in Brazil. The diabatic heating rate due to latent heat release has been estimated using the methodology developed by Tao et al. (1990). A more detailed analysis has been performed for Hurricane Catarina, the first of its kind in the South Atlantic. High mean water content values were found in Conson and Ivan at low levels and close to their centers. Results indicate that Hurricane Catarina was shallower than the other two systems, with less water, and that its water was concentrated closer to its center. The mean ice content in Catarina was about 0.05 g kg⁻¹, while in Conson it was 0.06 g kg⁻¹ and in Ivan 0.08 g kg⁻¹. Conson and Ivan had water content up to 0.3 g kg⁻¹ above the 0 °C layer, while Catarina had less than 0.15 g kg⁻¹. The latent heat released by Catarina was very similar to that of the other two systems, except in the regions closer to the center.

Relevance: 30.00%

Abstract:

Stavskaya's model is a one-dimensional probabilistic cellular automaton (PCA) introduced at the end of the 1960s as an example of a model displaying a nonequilibrium phase transition. Although its absorbing-state phase transition is well understood nowadays, the model never received a full numerical treatment to investigate its critical behavior. In this Brief Report we characterize the critical behavior of Stavskaya's PCA by means of Monte Carlo simulations and finite-size scaling analysis. The critical exponents of the model are calculated and indicate that its phase transition belongs to the directed percolation universality class of critical behavior, as would be expected on the basis of the directed percolation conjecture. We also explicitly establish the relationship of the model with the Domany-Kinzel PCA on its directed site percolation line, a connection that seems to have gone unnoticed in the literature so far.

Relevance: 30.00%

Abstract:

Thanks to recent advances in molecular biology, allied to an ever-increasing amount of experimental data, the functional state of thousands of genes can now be extracted simultaneously by using methods such as cDNA microarrays and RNA-Seq. Particularly important related investigations are the modeling and identification of gene regulatory networks from expression data sets. Such knowledge is fundamental for many applications, such as disease treatment, therapeutic intervention strategies and drug design, as well as for planning new high-throughput experiments. Methods have been developed for gene network modeling and identification from expression profiles. However, an important open problem is how to validate such approaches and their results. This work presents an objective approach for the validation of gene network modeling and identification, which comprises the following three main aspects: (1) Artificial Gene Network (AGN) model generation through theoretical models of complex networks, which is used to simulate temporal expression data; (2) a computational method for gene network identification from the simulated data, which is founded on a feature selection approach where a target gene is fixed and the expression profile is observed for all other genes in order to identify a relevant subset of predictors; and (3) validation of the identified AGN-based network through comparison with the original network. The proposed framework allows several types of AGNs to be generated and used in order to simulate temporal expression data. The results of the network identification method can then be compared to the original network in order to estimate its properties and accuracy. Some of the most important theoretical models of complex networks have been assessed: the uniformly-random Erdos-Renyi (ER), the small-world Watts-Strogatz (WS), the scale-free Barabasi-Albert (BA), and geographical networks (GG).
The experimental results indicate that the inference method was sensitive to variation in the average degree k, with its network recovery rate decreasing as k increased. The signal size was important for the inference method to achieve better accuracy in the network identification rate, and it presented very good results with small expression profiles. However, the adopted inference method was not sensitive enough to recognize distinct structures of interaction among genes, behaving similarly when applied to different network topologies. In summary, the proposed framework, though simple, was adequate for the validation of the inferred networks by identifying some properties of the evaluated method, and it can be extended to other inference methods.
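
The generate-then-compare loop described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`erdos_renyi_edges`, `edge_recovery`), the use of directed edges, the mapping from average degree k to edge probability, and the choice of precision and recall as the comparison metrics are all assumptions made for illustration.

```python
import random

def erdos_renyi_edges(n, p, rng):
    """Directed Erdos-Renyi graph (assumed model): every ordered pair
    (u, v) with u != v becomes an edge independently with probability p."""
    return {(u, v) for u in range(n) for v in range(n)
            if u != v and rng.random() < p}

def probability_for_average_degree(n, k):
    """Edge probability giving an expected out-degree of k (assumption)."""
    return k / (n - 1)

def edge_recovery(true_edges, inferred_edges):
    """Compare an inferred network with the original one.

    Returns (precision, recall) over edges; these metric names are
    hypothetical stand-ins for the abstract's 'network recovery rate'."""
    hits = len(true_edges & inferred_edges)
    precision = hits / len(inferred_edges) if inferred_edges else 0.0
    recall = hits / len(true_edges) if true_edges else 0.0
    return precision, recall

if __name__ == "__main__":
    rng = random.Random(42)
    original = erdos_renyi_edges(20, probability_for_average_degree(20, 3), rng)
    # Stand-in for an inference method: keep a random 80% of the true edges.
    inferred = {e for e in original if rng.random() < 0.8}
    print(edge_recovery(original, inferred))
```

Any inference method that returns a set of edges could be substituted for the random stand-in in the usage section and scored against the original network in the same way.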

Relevance: 30.00%

Abstract:

A novel flow-based strategy for implementing simultaneous determinations of different chemical species reacting with the same reagent(s) at different rates is proposed and applied to the spectrophotometric catalytic determination of iron and vanadium in Fe-V alloys. The method relies on the influence of Fe(II) and V(IV) on the rate of iodide oxidation by Cr(VI) under acidic conditions; the Jones reducing agent is then needed. Three different plugs of the sample are sequentially inserted into an acidic KI reagent carrier stream, and a confluent Cr(VI) solution is added downstream. Overlap between the inserted plugs leads to a complex sample zone with several regions of maximal and minimal absorbance values. Measurements performed on these regions reveal the different degrees of reaction development and tend to be more precise. Data are treated by multivariate calibration involving the PLS algorithm. The proposed system is very simple and rugged. Two latent variables carried ca. 95% of the analytical information, and the results are in agreement with ICP-OES. (C) 2010 Elsevier B.V. All rights reserved.

Relevance: 30.00%

Abstract:

Fatigue and crack propagation are phenomena affected by high uncertainties, where deterministic methods fail to accurately predict structural life. The present work aims at coupling reliability analysis with the boundary element method. The latter has been recognized as an accurate and efficient numerical technique for dealing with mixed-mode propagation, which is very interesting for reliability analysis. The coupled procedure allows us to consider uncertainties during the crack growth process. In addition, it computes the probability of fatigue failure for complex structural geometry and loading. Two coupling procedures are considered: direct coupling of reliability and mechanical solvers, and indirect coupling by the response surface method. Numerical applications show the performance of the proposed models in lifetime assessment under uncertainties, where the direct method has shown faster convergence than the response surface method. (C) 2010 Elsevier Ltd. All rights reserved.

Relevance: 30.00%

Abstract:

In this paper, we introduce a Bayesian analysis for multivariate survival data in the presence of a covariate vector and censored observations. Different "frailties" or latent variables are considered to capture the correlation among the survival times for the same individual. We assume Weibull or generalized Gamma distributions considering right-censored lifetime data. We develop the Bayesian analysis using Markov Chain Monte Carlo (MCMC) methods.

Relevance: 30.00%

Abstract:

The multivariate skew-t distribution (J Multivar Anal 79:93-113, 2001; J R Stat Soc, Ser B 65:367-389, 2003; Statistics 37:359-363, 2003) includes the Student t, skew-Cauchy and Cauchy distributions as special cases and the normal and skew-normal ones as limiting cases. In this paper, we explore the use of Markov Chain Monte Carlo (MCMC) methods to develop a Bayesian analysis of repeated measures, pretest/post-test data, under a multivariate null intercept measurement error model (J Biopharm Stat 13(4):763-771, 2003) where the random errors and the unobserved value of the covariate (latent variable) follow a Student t and a skew-t distribution, respectively. The results and methods are numerically illustrated with an example in the field of dentistry.

Relevance: 30.00%

Abstract:

The skew-normal distribution is a class of distributions that includes the normal distributions as a special case. In this paper, we explore the use of Markov Chain Monte Carlo (MCMC) methods to develop a Bayesian analysis in a multivariate, null intercept, measurement error model [R. Aoki, H. Bolfarine, J.A. Achcar, and D. Leao Pinto Jr, Bayesian analysis of a multivariate null intercept error-in-variables regression model, J. Biopharm. Stat. 13(4) (2003b), pp. 763-771] where the unobserved value of the covariate (latent variable) follows a skew-normal distribution. The results and methods are applied to a real dental clinical trial presented in [A. Hadgu and G. Koch, Application of generalized estimating equations to a dental randomized clinical trial, J. Biopharm. Stat. 9 (1999), pp. 161-178].

Relevance: 30.00%

Abstract:

We investigate the critical behaviour of a probabilistic mixture of cellular automata (CA) rules 182 and 200 (in Wolfram's enumeration scheme) by mean-field analysis and Monte Carlo simulations. We found that as we switch off one CA and switch on the other by the variation of the single parameter of the model, the probabilistic CA (PCA) goes through an extinction-survival-type phase transition, and the numerical data indicate that it belongs to the directed percolation universality class of critical behaviour. The PCA displays a characteristic stationary density profile and a slow, diffusive dynamics close to the pure CA 200 point that we discuss briefly. Remarks on an interesting related stochastic lattice gas are addressed in the conclusions.
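
A probabilistic mixture of two elementary CA rules can be simulated in a few lines. The sketch below is only an illustration under stated assumptions: each site independently applies rule 200 with probability `p` and rule 182 otherwise, the lattice is a ring, and the names `rule_output`, `pca_step` and `density_after` are hypothetical. The abstract does not say which rule the paper's parameter switches on, so the direction of `p` here is a guess.

```python
import random

def rule_output(rule, left, center, right):
    """Output bit of an elementary CA rule in Wolfram's enumeration:
    the neighbourhood (left, center, right) indexes a bit of the rule number."""
    index = 4 * left + 2 * center + right
    return (rule >> index) & 1

def pca_step(cells, p, rng):
    """One synchronous update on a ring.

    Assumed parametrisation: each site uses rule 200 with probability p
    and rule 182 with probability 1 - p."""
    n = len(cells)
    new_cells = []
    for i in range(n):
        rule = 200 if rng.random() < p else 182
        # cells[i - 1] wraps around to the last site when i == 0.
        new_cells.append(rule_output(rule, cells[i - 1], cells[i], cells[(i + 1) % n]))
    return new_cells

def density_after(n, p, steps, seed=0):
    """Density of active sites after `steps` updates from a random start."""
    rng = random.Random(seed)
    cells = [rng.randint(0, 1) for _ in range(n)]
    for _ in range(steps):
        cells = pca_step(cells, p, rng)
    return sum(cells) / n

if __name__ == "__main__":
    for p in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(p, density_after(n=200, p=p, steps=500))
```

Both rules map the empty neighbourhood 000 to 0, so the all-zero configuration is absorbing in this sketch for every value of `p`, which is the kind of state an extinction-survival transition requires.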

Relevance: 30.00%

Abstract:

Predictors of random effects are usually based on the popular mixed effects (ME) model developed under the assumption that the sample is obtained from a conceptual infinite population; such predictors are employed even when the actual population is finite. Two alternatives that incorporate the finite nature of the population are obtained from the superpopulation model proposed by Scott and Smith (1969. Estimation in multi-stage surveys. J. Amer. Statist. Assoc. 64, 830-840) or from the finite population mixed model recently proposed by Stanek and Singer (2004. Predicting random effects from finite population clustered samples with response error. J. Amer. Statist. Assoc. 99, 1119-1130). Predictors derived under the latter model, with the additional assumptions that all variance components are known and that within-cluster variances are equal, have smaller mean squared error (MSE) than the competitors based on either the ME or Scott and Smith's models. As population variances are rarely known, we propose method-of-moments estimators to obtain empirical predictors and conduct a simulation study to evaluate their performance. The results suggest that the finite population mixed model empirical predictor is more stable than its competitors since, in terms of MSE, it is either the best or the second best, and when it is second best, its performance lies within acceptable limits. When both cluster and unit intra-class correlation coefficients are very high (e.g., 0.95 or more), the performance of the empirical predictors derived under the three models is similar. (C) 2007 Elsevier B.V. All rights reserved.