5 resultados para agglomerative clustering
em Repositório da Produção Científica e Intelectual da Unicamp
Resumo:
Different types of water bodies, including lakes, streams, and coastal marine waters, are often susceptible to fecal contamination from a range of point and nonpoint sources, and have been evaluated using fecal indicator microorganisms. The most commonly used fecal indicator is Escherichia coli, but traditional cultivation methods do not allow discrimination of the source of pollution. The use of triplex PCR offers an approach that is fast and inexpensive, and here enabled the identification of phylogroups. The phylogenetic distribution of E. coli subgroups isolated from water samples revealed higher frequencies of subgroups A1 and B23 in rivers impacted by human pollution sources, while subgroups D1 and D2 were associated with pristine sites, and subgroup B1 with domesticated animal sources, suggesting their use as a first screening for pollution source identification. A simple classification is also proposed based on phylogenetic subgroup distribution using the w-clique metric, enabling differentiation of polluted and unpolluted sites.
Resumo:
Garlic is a spice and a medicinal plant; hence, there is an increasing interest in 'developing' new varieties with different culinary properties or with high content of nutraceutical compounds. Phenotypic traits and dominant molecular markers are predominantly used to evaluate the genetic diversity of garlic clones. However, 24 SSR markers (codominant) specific for garlic are available in the literature, fostering germplasm researches. In this study, we genotyped 130 garlic accessions from Brazil and abroad using 17 polymorphic SSR markers to assess the genetic diversity and structure. This is the first attempt to evaluate a large set of accessions maintained by Brazilian institutions. A high level of redundancy was detected in the collection (50 % of the accessions represented eight haplotypes). However, non-redundant accessions presented high genetic diversity. We detected on average five alleles per locus, Shannon index of 1.2, HO of 0.5, and HE of 0.6. A core collection was set with 17 accessions, covering 100 % of the alleles with minimum redundancy. Overall FST and D values indicate a strong genetic structure within accessions. Two major groups identified by both model-based (Bayesian approach) and hierarchical clustering (UPGMA dendrogram) techniques were coherent with the classification of accessions according to maturity time (growth cycle): early-late and midseason accessions. Assessing genetic diversity and structure of garlic collections is the first step towards an efficient management and conservation of accessions in genebanks, as well as to advance future genetic studies and improvement of garlic worldwide.
Resumo:
Monte Carlo track structures (MCTS) simulations have been recognized as useful tools for radiobiological modeling. However, the authors noticed several issues regarding the consistency of reported data. Therefore, in this work, they analyze the impact of various user defined parameters on simulated direct DNA damage yields. In addition, they draw attention to discrepancies in published literature in DNA strand break (SB) yields and selected methodologies. The MCTS code Geant4-DNA was used to compare radial dose profiles in a nanometer-scale region of interest (ROI) for photon sources of varying sizes and energies. Then, electron tracks of 0.28 keV-220 keV were superimposed on a geometric DNA model composed of 2.7 × 10(6) nucleosomes, and SBs were simulated according to four definitions based on energy deposits or energy transfers in DNA strand targets compared to a threshold energy ETH. The SB frequencies and complexities in nucleosomes as a function of incident electron energies were obtained. SBs were classified into higher order clusters such as single and double strand breaks (SSBs and DSBs) based on inter-SB distances and on the number of affected strands. Comparisons of different nonuniform dose distributions lacking charged particle equilibrium may lead to erroneous conclusions regarding the effect of energy on relative biological effectiveness. The energy transfer-based SB definitions give similar SB yields as the one based on energy deposit when ETH ≈ 10.79 eV, but deviate significantly for higher ETH values. Between 30 and 40 nucleosomes/Gy show at least one SB in the ROI. The number of nucleosomes that present a complex damage pattern of more than 2 SBs and the degree of complexity of the damage in these nucleosomes diminish as the incident electron energy increases. DNA damage classification into SSB and DSB is highly dependent on the definitions of these higher order structures and their implementations. The authors' show that, for the four studied models, different yields are expected by up to 54% for SSBs and by up to 32% for DSBs, as a function of the incident electrons energy and of the models being compared. MCTS simulations allow to compare direct DNA damage types and complexities induced by ionizing radiation. However, simulation results depend to a large degree on user-defined parameters, definitions, and algorithms such as: DNA model, dose distribution, SB definition, and the DNA damage clustering algorithm. These interdependencies should be well controlled during the simulations and explicitly reported when comparing results to experiments or calculations.
Resumo:
Often in biomedical research, we deal with continuous (clustered) proportion responses ranging between zero and one quantifying the disease status of the cluster units. Interestingly, the study population might also consist of relatively disease-free as well as highly diseased subjects, contributing to proportion values in the interval [0, 1]. Regression on a variety of parametric densities with support lying in (0, 1), such as beta regression, can assess important covariate effects. However, they are deemed inappropriate due to the presence of zeros and/or ones. To evade this, we introduce a class of general proportion density, and further augment the probabilities of zero and one to this general proportion density, controlling for the clustering. Our approach is Bayesian and presents a computationally convenient framework amenable to available freeware. Bayesian case-deletion influence diagnostics based on q-divergence measures are automatic from the Markov chain Monte Carlo output. The methodology is illustrated using both simulation studies and application to a real dataset from a clinical periodontology study.
Resumo:
Seasonally dry tropical plant formations (SDTF) are likely to exhibit phylogenetic clustering owing to niche conservatism driven by a strong environmental filter (water stress), but heterogeneous edaphic environments and life histories may result in heterogeneity in degree of phylogenetic clustering. We investigated phylogenetic patterns across ecological gradients related to water availability (edaphic environment and climate) in the Caatinga, a SDTF in Brazil. Caatinga is characterized by semiarid climate and three distinct edaphic environments - sedimentary, crystalline, and inselberg -representing a decreasing gradient in soil water availability. We used two measures of phylogenetic diversity: Net Relatedness Index based on the entire phylogeny among species present in a site, reflecting long-term diversification; and Nearest Taxon Index based on the tips of the phylogeny, reflecting more recent diversification. We also evaluated woody species in contrast to herbaceous species. The main climatic variable influencing phylogenetic pattern was precipitation in the driest quarter, particularly for herbaceous species, suggesting that environmental filtering related to minimal periods of precipitation is an important driver of Caatinga biodiversity, as one might expect for a SDTF. Woody species tended to show phylogenetic clustering whereas herbaceous species tended towards phylogenetic overdispersion. We also found phylogenetic clustering in two edaphic environments (sedimentary and crystalline) in contrast to phylogenetic overdispersion in the third (inselberg). We conclude that while niche conservatism is evident in phylogenetic clustering in the Caatinga, this is not a universal pattern likely due to heterogeneity in the degree of realized environmental filtering across edaphic environments. Thus, SDTF, in spite of a strong shared environmental filter, are potentially heterogeneous in phylogenetic structuring. Our results support the need for scientifically informed conservation strategies in the Caatinga and other SDTF regions that have not previously been prioritized for conservation in order to take into account this heterogeneity.