76 results for MCMC algorithms
Abstract:
Background: Genome-wide association studies (GWAS) are becoming the approach of choice to identify genetic determinants of complex phenotypes and common diseases. The astonishing amount of generated data and the use of distinct genotyping platforms with variable genomic coverage are still analytical challenges. Imputation algorithms combine information from directly genotyped markers with the haplotypic structure of the population of interest to infer poorly genotyped or missing markers, and are considered a near-zero-cost approach to allow the comparison and combination of data generated in different studies. Several reports have stated that imputed markers have an overall acceptable accuracy, but no published report has performed a pairwise comparison of imputed and empirical association statistics for a complete set of GWAS markers. Results: In this report we identified a total of 73 imputed markers that yielded a nominally statistically significant association at P < 10⁻⁵ for type 2 diabetes mellitus and compared them with results obtained based on empirical allelic frequencies. Interestingly, despite their overall high correlation, association statistics based on imputed frequencies were discordant in 35 of the 73 (47%) associated markers, considerably inflating the type I error rate of imputed markers. We comprehensively tested several quality thresholds, the haplotypic structure underlying imputed markers, and the use of flanking markers as predictors of inaccurate association statistics derived from imputed markers. Conclusions: Our results suggest that association statistics from imputed markers showing a specific minor allele frequency (MAF) range, located in weak linkage disequilibrium blocks, or strongly deviating from local patterns of association are prone to inflated false-positive association signals. The present study highlights the potential of imputation procedures and proposes simple procedures for selecting the best imputed markers for follow-up genotyping studies.
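The comparison of empirical and imputed association statistics described above can be illustrated with a minimal sketch. It assumes an allelic chi-square test on a 2x2 table of allele counts; the abstract does not state which test the authors used, and the function names, the example counts, and the reuse of the 10⁻⁵ threshold as a discordance criterion are all illustrative assumptions.

```python
import math


def allelic_chi_square(case_minor, case_major, control_minor, control_major):
    """Return (chi2, p_value) for a 2x2 allele-count table (1 degree of freedom)."""
    a, b, c, d = case_minor, case_major, control_minor, control_major
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        raise ValueError("every row and column of the table must be non-empty")
    chi2 = n * (a * d - b * c) ** 2 / denominator
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


def is_discordant(p_empirical, p_imputed, threshold=1e-5):
    """True when exactly one of the two p-values passes the threshold."""
    return (p_empirical < threshold) != (p_imputed < threshold)


# Hypothetical allele counts (not taken from the study).
_, p_emp = allelic_chi_square(60, 40, 40, 60)
_, p_imp = allelic_chi_square(75, 25, 40, 60)
print(p_emp, p_imp, is_discordant(p_emp, p_imp))
```

A marker flagged by `is_discordant` would be one whose imputed frequencies suggest an association that the directly genotyped frequencies do not support, or the reverse.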
Abstract:
Background: Hepatitis C virus (HCV) is an important human pathogen affecting around 3% of the human population. In Brazil, it is estimated that there are approximately 2 to 3 million chronic HCV carriers. There are few reports of HCV prevalence in Rondonia State (RO), but it was estimated at 9.7% from 1999 to 2005. The aim of this study was to characterize HCV genotypes in 58 chronically HCV-infected patients from Porto Velho, Rondonia (RO), Brazil. Methods: A 380 bp fragment of the NS5B region was amplified by nested PCR for genotyping analysis. Viral sequences were characterized by phylogenetic analysis using reference sequences obtained from GenBank (n = 173). Sequences were aligned using Muscle software and edited in the SE-AL software. Phylogenetic analyses were conducted using Bayesian Markov chain Monte Carlo (MCMC) simulation to obtain the MCC tree using BEAST v. 1.5.3. Results: Of 58 anti-HCV positive samples, 22 were positive for the NS5B fragment and successfully sequenced. Genotype 1b was the most prevalent in this population (50%), followed by 1a (27.2%), 2b (13.6%) and 3a (9.0%). Conclusions: This study is the first report of HCV genotypes from Rondonia State, and subtype 1b was found to be the most prevalent. This subtype is mostly found among people who have a previous history of blood transfusion, but more detailed studies with a larger number of patients are necessary to understand the HCV dynamics in the population of Rondonia State, Brazil.
Abstract:
Background: Hepatitis B virus (HBV) can be classified into nine genotypes (A-I) defined by sequence divergence of more than 8% based on the complete genome. This study aims to identify the genotypic distribution of HBV in 40 HBsAg-positive patients from Rondonia, Brazil. A fragment of 1306 bp partially comprising surface and polymerase overlapping genes was amplified by PCR. Amplified DNA was purified and sequenced on an ABI PRISM® 377 Automatic Sequencer (Applied Biosystems, Foster City, CA, USA). The obtained sequences were aligned with reference sequences obtained from GenBank using Clustal X software and then edited with Se-Al software. Phylogenetic analyses were conducted by the Markov chain Monte Carlo (MCMC) approach using BEAST v.1.5.3. Results: The subgenotype distribution was A1 (37.1%), D3 (22.8%), F2a (20.0%), D4 (17.1%) and D2 (2.8%). Conclusions: These results for the first HBV genotypic characterization in Rondonia state are consistent with other studies in Brazil, showing the presence of several HBV genotypes, which reflects the mixed origin of the population, involving descendants of Native Americans, Europeans, and Africans.
Abstract:
Background: The Brazilian population is mainly descended from European colonizers, Africans and Native Americans. Some Afro-descendants have lived in small isolated communities since the slavery period. The epidemiological status of HBV infection in Quilombo communities in northeastern Brazil remains unknown. The aim of this study was to characterize the HBV genotypes circulating in an isolated Quilombo community in Maranhao State, Brazil. Methods: Seventy-two samples from the Frechal Quilombo community in Maranhao were collected. All serum samples were screened by enzyme-linked immunosorbent assays for the presence of hepatitis B surface antigen (HBsAg). HBsAg-positive samples were submitted to DNA extraction, and a fragment of 1306 bp partially comprising the HBsAg and polymerase coding regions (S/POL) was amplified by nested PCR and its nucleotide sequence was determined. Viral isolates were genotyped by phylogenetic analysis using reference sequences from each genotype obtained from GenBank (n = 320). Sequences were aligned using Muscle software and edited in the SE-AL software. Bayesian phylogenetic analyses were conducted using the Markov chain Monte Carlo (MCMC) method to obtain the MCC tree using BEAST v.1.5.3. Results: Of the 72 individuals, 9 (12.5%) were HBsAg-positive and 4 of them were successfully sequenced for the 1306 bp fragment. All these samples were genotype A1 and grouped together with other sequences reported from Brazil. Conclusions: The present study represents the first report on HBV genotype characterization in this community in Maranhao State, Brazil, where a high HBsAg frequency was found. In this study, we reported a high frequency of HBV infection and the exclusive presence of subgenotype A1 in an Afro-descendant community in Maranhao State, Brazil.
Abstract:
Background: GB virus C (GBV-C) is an enveloped positive-sense ssRNA virus belonging to the Flaviviridae family. Studies on the genetic variability of GBV-C reveal the existence of six genotypes: genotype 1 predominates in West Africa, genotype 2 in Europe and America, genotype 3 in Asia, genotype 4 in Southwest Asia, genotype 5 in South Africa and genotype 6 in Indonesia. The aim of this study was to determine the frequency and genotypic distribution of GBV-C in the Colombian population. Methods: Two groups were analyzed: i) 408 Colombian blood donors infected with HCV (n = 250) and HBV (n = 158) from Bogota and ii) 99 indigenous people with HBV infection from Leticia, Amazonas. A fragment of 344 bp from the 5' untranslated region (5' UTR) was amplified by nested RT-PCR. Viral sequences were genotyped by phylogenetic analysis using reference sequences from each genotype obtained from GenBank (n = 160). Bayesian phylogenetic analyses were conducted using the Markov chain Monte Carlo (MCMC) approach to obtain the MCC tree using BEAST v. 1.5.3. Results: Among blood donors, of 158 HBsAg-positive samples, 5.06% (n = 8) were positive for GBV-C, and of 250 anti-HCV positive samples, 3.2% (n = 8) were positive for GBV-C. Also, 7.7% (n = 7) GBV-C positive samples were found among indigenous people from Leticia. A phylogenetic analysis revealed the presence of the following GBV-C genotypes among blood donors: 2a (41.6%), 1 (33.3%), 3 (16.6%) and 2b (8.3%). All genotype 1 sequences were found in co-infection with HBV, and 4/5 genotype 2a sequences were found in co-infection with HCV. All sequences from indigenous people from Leticia were classified as genotype 3. The presence of GBV-C infection was not correlated with sex (p = 0.43), age (p = 0.38) or origin (p = 0.17). Conclusions: A high frequency of GBV-C genotypes 1 and 2 was found in blood donors. The presence of genotype 3 in indigenous populations was previously reported from the Santa Marta region in Colombia and in native people from Venezuela and Bolivia. This fact may be related to the ancient migrations of Asian peoples to South America.
Abstract:
This paper presents a new statistical algorithm to estimate rainfall over the Amazon Basin region using the Tropical Rainfall Measuring Mission (TRMM) Microwave Imager (TMI). The algorithm relies on empirical relationships derived for different raining-type systems between coincident measurements of surface rainfall rate and 85-GHz polarization-corrected brightness temperature as observed by the precipitation radar (PR) and TMI on board the TRMM satellite. The scheme includes rain/no-rain area delineation (screening) and system-type classification routines for rain retrieval. The algorithm is validated against independent measurements of the TRMM-PR and S-band dual-polarization Doppler radar (S-Pol) surface rainfall data for two different periods. Moreover, the performance of this rainfall estimation technique is evaluated against well-known methods, namely, the TRMM-2A12 [the Goddard profiling algorithm (GPROF)], the Goddard scattering algorithm (GSCAT), and the National Environmental Satellite, Data, and Information Service (NESDIS) algorithms. The proposed algorithm shows a normalized bias of approximately 23% for both PR and S-Pol ground truth datasets and a mean error of 0.244 mm h⁻¹ (PR) and -0.157 mm h⁻¹ (S-Pol). For rain volume estimates using PR as reference, a correlation coefficient of 0.939 and a normalized bias of 0.039 were found. With respect to rainfall distributions and rain area comparisons, the results showed that the formulation proposed is efficient and compatible with the physics and dynamics of the observed systems over the area of interest. The performance of the other algorithms showed that GSCAT presented low normalized bias for rain areas and rain volume [0.346 (PR) and 0.361 (S-Pol)], and GPROF showed rainfall distribution similar to that of the PR and S-Pol but with a bimodal distribution. Last, the five algorithms were evaluated during the TRMM-Large-Scale Biosphere-Atmosphere Experiment in Amazonia (LBA) 1999 field campaign to verify the precipitation characteristics observed during the easterly and westerly Amazon wind flow regimes. The proposed algorithm presented a cumulative rainfall distribution similar to the observations during the easterly regime, but it underestimated rainfall during the westerly period for rates above 5 mm h⁻¹. NESDIS(1) overestimated for both wind regimes but presented the best westerly representation. NESDIS(2), GSCAT, and GPROF underestimated in both regimes, but GPROF was closer to the observations during the easterly flow.
Abstract:
Context. Classical Be stars are rapid rotators of spectral type late O to early A and luminosity class V-III, which exhibit Balmer emission lines and often a near-infrared excess originating in an equatorially concentrated circumstellar envelope, both produced by sporadic mass ejection episodes. The causes of the abnormal mass loss (the so-called Be phenomenon) are as yet unknown. Aims. For the first time, we can now study in detail Be stars outside the Earth's atmosphere with sufficient temporal resolution. We investigate the variability of the Be star CoRoT-ID 102761769 observed with the CoRoT satellite in the exoplanet field during the initial run. Methods. One low-resolution spectrum of the star was obtained with the INT telescope at the Observatorio del Roque de los Muchachos. A time series analysis was performed on the CoRoT light curve using both the CLEANEST and singular spectrum analysis algorithms. To identify the pulsation modes of the observed frequencies, we computed a set of models representative of CoRoT-ID 102761769 by varying its main physical parameters inside the uncertainties discussed. Results. We found two close frequencies related to the star. They are 2.465 c d⁻¹ (28.5 μHz) and 2.441 c d⁻¹ (28.2 μHz). The precision to which those frequencies were found is 0.018 c d⁻¹ (0.2 μHz). The projected stellar rotation was estimated to be 120 km s⁻¹ from the Fourier transform of spectral lines. If CoRoT-ID 102761769 is a typical Galactic Be star, it rotates near the critical velocity. The critical rotation frequency of a typical B5-6 star is about 3.5 c d⁻¹ (40.5 μHz), which implies that the above frequencies are really caused by stellar pulsations rather than the star's rotation.
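The frequencies above are quoted both in cycles per day and in microhertz. A minimal sketch of that unit conversion follows; the function name is an illustrative assumption, and the conversion reproduces the quoted microhertz values to within about 0.1 μHz.

```python
SECONDS_PER_DAY = 86_400.0


def cycles_per_day_to_microhertz(freq_cd: float) -> float:
    """Convert a frequency from cycles per day to microhertz."""
    return freq_cd / SECONDS_PER_DAY * 1e6


# Frequencies quoted in the abstract, in cycles per day.
for label, freq in [
    ("first frequency", 2.465),
    ("second frequency", 2.441),
    ("frequency precision", 0.018),
    ("critical rotation (typical B5-6 star)", 3.5),
]:
    print(f"{label}: {freq} c/d = {cycles_per_day_to_microhertz(freq):.2f} uHz")
```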
Abstract:
Context. CoRoT is a pioneering space mission devoted to the analysis of stellar variability and the photometric detection of extrasolar planets. Aims. We present the list of planetary transit candidates detected in the first field observed by CoRoT, IRa01, the initial run toward the Galactic anticenter, which lasted for 60 days. Methods. We analysed 3898 sources in the coloured bands and 5974 in the monochromatic band. Instrumental noise and stellar variability were taken into account using detrending tools before applying various transit search algorithms. Results. Fifty sources were classified as planetary transit candidates and the most reliable 40 detections were declared targets for follow-up ground-based observations. Two of these targets have so far been confirmed as planets, CoRoT-1b and CoRoT-4b, for which a complete characterization and specific studies were performed.
Abstract:
Aims. In this work, we describe the pipeline for the fast supervised classification of light curves observed by the CoRoT exoplanet CCDs. We present the classification results obtained for the first four measured fields, which represent a one-year in-orbit operation. Methods. The basis of the adopted supervised classification methodology has been described in detail in a previous paper, as is its application to the OGLE database. Here, we present the modifications of the algorithms and of the training set to optimize the performance when applied to the CoRoT data. Results. Classification results are presented for the observed fields IRa01, SRc01, LRc01, and LRa01 of the CoRoT mission. Statistics on the number of variables and the number of objects per class are given and typical light curves of high-probability candidates are shown. We also report on new stellar variability types discovered in the CoRoT data. The full classification results are publicly available.
Abstract:
The VISTA near-infrared survey of the Magellanic System (VMC) will provide deep YJKs photometry reaching stars in the oldest turn-off point throughout the Magellanic Clouds (MCs). As part of the preparation for the survey, we aim to assess the accuracy in the star formation history (SFH) that can be expected from VMC data, in particular for the Large Magellanic Cloud (LMC). To this aim, we first simulate VMC images containing not only the LMC stellar populations but also the foreground Milky Way (MW) stars and background galaxies. The simulations cover the whole range of density of LMC field stars. We then perform aperture photometry over these simulated images, assess the expected levels of photometric errors and incompleteness, and apply the classical technique of SFH recovery based on the reconstruction of colour-magnitude diagrams (CMD) via the minimisation of a chi-squared-like statistic. We verify that the foreground MW stars are accurately recovered by the minimisation algorithms, whereas the background galaxies can be largely eliminated from the CMD analysis due to their particular colours and morphologies. We then evaluate the expected errors in the recovered star formation rate as a function of stellar age, SFR(t), starting from models with a known age-metallicity relation (AMR). It turns out that, for a given sky area, the random errors for ages older than ~0.4 Gyr seem to be independent of the crowding. This can be explained by a counterbalancing effect between the loss of stars from a decrease in the completeness and the gain of stars from an increase in the stellar density. For a spatial resolution of ~0.1 deg², the random errors in SFR(t) will be below 20% for this wide range of ages. On the other hand, due to the lower stellar statistics for stars younger than ~0.4 Gyr, the outer LMC regions will require larger areas to achieve the same level of accuracy in the SFR(t). If we consider the AMR as unknown, the SFH-recovery algorithm is able to accurately recover the input AMR, at the price of an increase of random errors in the SFR(t) by a factor of about 2.5. Experiments of SFH recovery performed for varying distance modulus and reddening indicate that these parameters can be determined with (relative) accuracies of Δ(m-M)₀ ~ 0.02 mag and ΔE(B-V) ~ 0.01 mag, for each individual field over the LMC. The propagation of these errors in the SFR(t) implies systematic errors below 30%. This level of accuracy in the SFR(t) can reveal significant imprints in the dynamical evolution of this unique and nearby stellar system, as well as possible signatures of the past interaction between the MCs and the MW.
Abstract:
We study the star/galaxy classification efficiency of 13 different decision tree algorithms applied to photometric objects in the Sloan Digital Sky Survey Data Release Seven (SDSS-DR7). Each algorithm is defined by a set of parameters which, when varied, produce different final classification trees. We extensively explore the parameter space of each algorithm, using the set of 884,126 SDSS objects with spectroscopic data as the training set. The efficiency of star-galaxy separation is measured using the completeness function. We find that the Functional Tree algorithm (FT) yields the best results as measured by the mean completeness in two magnitude intervals: 14 ≤ r ≤ 21 (85.2%) and r ≥ 19 (82.1%). We compare the performance of the tree generated with the optimal FT configuration to the classifications provided by the SDSS parametric classifier, 2DPHOT, and Ball et al. We find that our FT classifier is comparable or better in completeness over the full magnitude range 15 ≤ r ≤ 21, with much lower contamination than all but the Ball et al. classifier. At the faintest magnitudes (r > 19), our classifier is the only one that maintains high completeness (> 80%) while simultaneously achieving low contamination (~2.5%). We also examine the SDSS parametric classifier (psfMag - modelMag) to see if the dividing line between stars and galaxies can be adjusted to improve the classifier. We find that currently stars in close pairs are often misclassified as galaxies, and suggest a new cut to improve the classifier. Finally, we apply our FT classifier to separate stars from galaxies in the full set of 69,545,326 SDSS photometric objects in the magnitude range 14 ≤ r ≤ 21.
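The completeness and contamination figures above can be made concrete with a minimal sketch. It assumes the commonly used definitions (completeness: fraction of true galaxies recovered as galaxies; contamination: fraction of objects labelled as galaxies that are actually stars); the abstract does not spell out the exact formulas, and the function name and toy labels below are hypothetical.

```python
from typing import Sequence, Tuple


def completeness_and_contamination(
    true_labels: Sequence[str],
    predicted_labels: Sequence[str],
    target: str = "galaxy",
) -> Tuple[float, float]:
    """Return (completeness, contamination) for the `target` class."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label sequences must have the same length")
    n_true = sum(1 for t in true_labels if t == target)
    n_pred = sum(1 for p in predicted_labels if p == target)
    n_correct = sum(
        1 for t, p in zip(true_labels, predicted_labels)
        if t == target and p == target
    )
    completeness = n_correct / n_true if n_true else float("nan")
    contamination = (n_pred - n_correct) / n_pred if n_pred else float("nan")
    return completeness, contamination


# Toy labels, not SDSS data: 4 true galaxies followed by 4 true stars.
truth = ["galaxy"] * 4 + ["star"] * 4
predicted = ["galaxy", "galaxy", "galaxy", "star",
             "galaxy", "star", "star", "star"]
print(completeness_and_contamination(truth, predicted))
```

In practice these quantities would be evaluated in magnitude bins, since the abstract reports them as a function of r.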
Abstract:
The attenuation of ω mesons in cold nuclear matter has been investigated via the time-dependent multiple-scattering Monte Carlo multicollisional (MCMC) intranuclear cascade model. The inelastic ω width deduced from CBELSA/TAPS Collaboration data of meson transparency in complex nuclei (Γ* ≃ 30 MeV/c²) is approximately 5 times lower than the value obtained with recent theoretical models and consistent with an in-medium total ωN cross section within 25-30 mb for an average meson momentum of 1.1 GeV/c. The momentum-dependent transparency ratios suggest an elastic/total cross-section ratio around 40%. For the case of CLAS Collaboration data a much higher width is deduced (Γ* ≳ 120 MeV/c²), with the MCMC model providing a consistent interpretation of the data, assuming a much higher meson absorption (σ*_ωN ≳ 100 mb) for p_ω ~ 1.7 GeV/c.
Abstract:
The mechanism of incoherent π⁰ and η photoproduction from complex nuclei is investigated from 4 to 12 GeV with an extended version of the multicollisional Monte Carlo (MCMC) intranuclear cascade model. The calculations take into account the elementary photoproduction amplitudes via a Regge model and the nuclear effects of photon shadowing, Pauli blocking, and meson-nucleus final-state interactions. The results for π⁰ photoproduction reproduced for the first time the magnitude and energy dependence of the measured ratios σ(γA)/σ(γN) for several nuclei (Be, C, Al, Cu, Ag, and Pb) from a Cornell experiment. The results for η photoproduction fitted the inelastic background in Cornell's yields remarkably well; this background is clearly not isotropic, as previously considered in Cornell's analysis. With this constraint for the background, the η → γγ decay width was extracted using the Primakoff method, combining Be and Cu data [Γ(η → γγ) = 0.476(62) keV] and using Be data only [Γ(η → γγ) = 0.512(90) keV], where the errors are only statistical. These results are in sharp contrast (~50-60%) with the value reported by the Cornell group [Γ(η → γγ) = 0.324(46) keV] and in line with the Particle Data Group average of 0.510(26) keV.
Abstract:
We study the spin-1/2 Ising model on a Bethe lattice in the mean-field limit, with the interaction constants following one of two deterministic aperiodic sequences, the Fibonacci or period-doubling one. New algorithms of sequence generation were implemented, which were fundamental in obtaining long sequences and, therefore, precise results. We calculate the exact critical temperature for both sequences, as well as the critical exponents β, γ, and δ. For the Fibonacci sequence, the exponents are classical, while for the period-doubling one they depend on the ratio between the two exchange constants. The usual relations between critical exponents are satisfied, within error bars, for the period-doubling sequence. Therefore, we show that mean-field-like procedures may lead to nonclassical critical exponents.
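For readers unfamiliar with the two aperiodic sequences named above, a minimal sketch of the textbook substitution rules follows (Fibonacci: A → AB, B → A; period-doubling: A → AB, B → AA). This is not the new generation algorithm mentioned in the abstract, which is not described there; the function name and the letters A and B, standing for the two exchange constants, are illustrative assumptions.

```python
from typing import Dict

# Standard two-letter substitution rules.
FIBONACCI_RULE: Dict[str, str] = {"A": "AB", "B": "A"}
PERIOD_DOUBLING_RULE: Dict[str, str] = {"A": "AB", "B": "AA"}


def substitution_sequence(rule: Dict[str, str], generations: int, seed: str = "A") -> str:
    """Apply the substitution rule `generations` times, starting from `seed`."""
    word = seed
    for _ in range(generations):
        # Replace every letter simultaneously according to the rule.
        word = "".join(rule[letter] for letter in word)
    return word


print(substitution_sequence(FIBONACCI_RULE, 4))
print(substitution_sequence(PERIOD_DOUBLING_RULE, 3))
```

Naive string rewriting like this grows quickly in memory (Fibonacci lengths follow the Fibonacci numbers, period-doubling lengths double each generation), which is presumably why dedicated algorithms are needed for very long sequences.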
Abstract:
Multispectral widefield optical imaging has the potential to improve early detection of oral cancer. The appropriate selection of illumination and collection conditions is required to maximize diagnostic ability. The goals of this study were to (i) evaluate image contrast between oral cancer/precancer and non-neoplastic mucosa for a variety of imaging modalities and illumination/collection conditions, and (ii) use classification algorithms to evaluate and compare the diagnostic utility of these modalities to discriminate cancers and precancers from normal tissue. Narrowband reflectance, autofluorescence, and polarized reflectance images were obtained from 61 patients and 11 normal volunteers. Image contrast was compared to identify modalities and conditions yielding greatest contrast. Image features were extracted and used to train and evaluate classification algorithms to discriminate tissue as non-neoplastic, dysplastic, or cancer; results were compared to histologic diagnosis. Autofluorescence imaging at 405-nm excitation provided the greatest image contrast, and the ratio of red-to-green fluorescence intensity computed from these images provided the best classification of dysplasia/cancer versus non-neoplastic tissue. A sensitivity of 100% and a specificity of 85% were achieved in the validation set. Multispectral widefield images can accurately distinguish neoplastic and non-neoplastic tissue; however, the ability to separate precancerous lesions from cancers with this technique was limited. © 2010 Society of Photo-Optical Instrumentation Engineers. [DOI: 10.1117/1.3516593]
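The sensitivity and specificity quoted above follow from the counts of a binary confusion matrix. A minimal sketch, with a hypothetical function name and hypothetical counts chosen only to reproduce 100% and 85% (the abstract does not give the validation-set counts):

```python
from typing import Tuple


def sensitivity_specificity(tp: int, fn: int, tn: int, fp: int) -> Tuple[float, float]:
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    sensitivity = tp / (tp + fn)  # fraction of neoplastic sites correctly flagged
    specificity = tn / (tn + fp)  # fraction of non-neoplastic sites correctly cleared
    return sensitivity, specificity


# Hypothetical counts, not taken from the study.
sens, spec = sensitivity_specificity(tp=10, fn=0, tn=17, fp=3)
print(f"sensitivity = {sens:.0%}, specificity = {spec:.0%}")
```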