873 results for agglomerative clustering


Relevance: 20.00%

Abstract:

The ATLAS experiment, like the other experiments operating at the Large Hadron Collider, produces petabytes of data every year, which must then be stored and processed. The experiments have also committed to making these data accessible worldwide. To meet these needs, the Worldwide LHC Computing Grid (WLCG) was designed; it combines the computing power and storage capacity of more than 170 sites around the world. At most WLCG sites, storage management technologies have been developed that also handle user requests and data transfers. These systems record their activity in log files, which are rich in information that helps operators pinpoint a problem when the system malfunctions. In anticipation of a larger data flow in the coming years, work is under way to make these sites even more reliable, and one possible way to do so is to develop a system that can analyze log files autonomously and detect the anomalies that precede a malfunction. Building such a system first requires identifying the most suitable method for log file analysis. This thesis studies an approach to the problem that uses artificial intelligence to analyze log files, specifically one based on the K-means clustering algorithm.

Relevance: 20.00%

Abstract:

A substantial upgrade of the LHC is expected in the coming years, which is planned to increase the integrated luminosity by a factor of 10 over the current value. This parameter is proportional to the number of collisions per unit time. As a result, the computing resources needed at every level of reconstruction will grow considerably. The CMS collaboration has therefore been exploring for several years the possibilities offered by heterogeneous computing, that is, the practice of distributing computation between CPUs and other dedicated accelerators such as graphics cards (GPUs). One difficulty of this approach is the need to write, validate, and maintain different code for each device on which it will run. This thesis presents the possibility of using SYCL to port event reconstruction code so that it runs efficiently on different devices without substantial changes. SYCL is an abstraction layer for heterogeneous computing that conforms to the ISO C++ standard. The study focuses on porting CLUE, a clustering algorithm for calorimetric energy deposits, using oneAPI, the SYCL implementation supported by Intel. The algorithm was first ported in its standalone version, mainly to gain familiarity with SYCL and to make performance comparisons with existing versions convenient. In this case, performance is very similar to that of native CUDA code on the same hardware. To validate the physics, the algorithm was integrated into a reduced version of the framework used by CMS for reconstruction. The physics results are identical to those of the other implementations, while in terms of computational performance SYCL in some cases produces faster code than other abstraction layers adopted by CMS, making it an interesting option for the future of heterogeneous computing in high-energy physics.

Relevance: 10.00%

Abstract:

Garlic is a spice and a medicinal plant; hence, there is increasing interest in developing new varieties with different culinary properties or with a high content of nutraceutical compounds. Phenotypic traits and dominant molecular markers are predominantly used to evaluate the genetic diversity of garlic clones. However, 24 codominant SSR markers specific for garlic are available in the literature, fostering germplasm research. In this study, we genotyped 130 garlic accessions from Brazil and abroad using 17 polymorphic SSR markers to assess genetic diversity and structure. This is the first attempt to evaluate a large set of accessions maintained by Brazilian institutions. A high level of redundancy was detected in the collection (50% of the accessions represented eight haplotypes). However, non-redundant accessions presented high genetic diversity. We detected on average five alleles per locus, a Shannon index of 1.2, HO of 0.5, and HE of 0.6. A core collection was set with 17 accessions, covering 100% of the alleles with minimum redundancy. Overall FST and D values indicate a strong genetic structure within accessions. Two major groups identified by both model-based (Bayesian approach) and hierarchical clustering (UPGMA dendrogram) techniques were coherent with the classification of accessions according to maturity time (growth cycle): early-late and midseason accessions. Assessing the genetic diversity and structure of garlic collections is the first step towards efficient management and conservation of accessions in genebanks, as well as towards advancing future genetic studies and the improvement of garlic worldwide.
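
The UPGMA dendrogram mentioned in this abstract is an average-linkage agglomerative clustering. The following is a minimal sketch of that general technique, not the authors' pipeline: the function name `upgma`, the accession labels, and the distance values are all hypothetical and chosen only for illustration.

```python
from itertools import combinations

def upgma(labels, dist):
    """Average-linkage (UPGMA) agglomerative clustering.

    labels: item names; dist: dict frozenset({a, b}) -> pairwise distance.
    Returns (tree, merges): a nested-tuple dendrogram and the merge log.
    """
    # Each active cluster is (subtree, list of member labels).
    clusters = [(label, [label]) for label in labels]
    merges = []

    def average(a, b):
        # UPGMA distance: mean of all pairwise distances between members.
        return sum(dist[frozenset((x, y))] for x in a for y in b) / (len(a) * len(b))

    while len(clusters) > 1:
        # Find the closest pair of active clusters.
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda p: average(clusters[p[0]][1], clusters[p[1]][1]))
        d = average(clusters[i][1], clusters[j][1])
        merged = ((clusters[i][0], clusters[j][0]), clusters[i][1] + clusters[j][1])
        merges.append((merged[1], d))
        # Drop the two merged clusters and append the new one.
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters[0][0], merges

# Hypothetical distances between four made-up accessions (not study data).
pairs = {("A", "B"): 0.1, ("C", "D"): 0.2, ("A", "C"): 0.8,
         ("A", "D"): 0.9, ("B", "C"): 0.7, ("B", "D"): 0.8}
dist = {frozenset(k): v for k, v in pairs.items()}
tree, merges = upgma(["A", "B", "C", "D"], dist)
print(tree)  # (('A', 'B'), ('C', 'D'))
```

In this toy input the two tight pairs merge first and are joined last, which mirrors how a dendrogram can expose two major groups.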

Relevance: 10.00%

Abstract:

Monte Carlo track structure (MCTS) simulations have been recognized as useful tools for radiobiological modeling. However, the authors noticed several issues regarding the consistency of reported data. Therefore, in this work, they analyze the impact of various user-defined parameters on simulated direct DNA damage yields. In addition, they draw attention to discrepancies in the published literature in DNA strand break (SB) yields and selected methodologies. The MCTS code Geant4-DNA was used to compare radial dose profiles in a nanometer-scale region of interest (ROI) for photon sources of varying sizes and energies. Then, electron tracks of 0.28 keV to 220 keV were superimposed on a geometric DNA model composed of 2.7 × 10^6 nucleosomes, and SBs were simulated according to four definitions based on energy deposits or energy transfers in DNA strand targets compared to a threshold energy ETH. The SB frequencies and complexities in nucleosomes as a function of incident electron energy were obtained. SBs were classified into higher order clusters such as single and double strand breaks (SSBs and DSBs) based on inter-SB distances and on the number of affected strands. Comparisons of different nonuniform dose distributions lacking charged particle equilibrium may lead to erroneous conclusions regarding the effect of energy on relative biological effectiveness. The energy transfer-based SB definitions give SB yields similar to those of the energy deposit-based definition when ETH ≈ 10.79 eV, but deviate significantly for higher ETH values. Between 30 and 40 nucleosomes/Gy show at least one SB in the ROI. The number of nucleosomes that present a complex damage pattern of more than 2 SBs and the degree of complexity of the damage in these nucleosomes diminish as the incident electron energy increases. DNA damage classification into SSB and DSB is highly dependent on the definitions of these higher order structures and their implementations. 
The authors show that, for the four models studied, yields are expected to differ by up to 54% for SSBs and by up to 32% for DSBs, depending on the incident electron energy and on the models being compared. MCTS simulations make it possible to compare the types and complexities of direct DNA damage induced by ionizing radiation. However, simulation results depend to a large degree on user-defined parameters, definitions, and algorithms such as the DNA model, dose distribution, SB definition, and DNA damage clustering algorithm. These interdependencies should be well controlled during the simulations and explicitly reported when comparing results to experiments or calculations.
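
To illustrate why SSB and DSB yields depend on the clustering definition, here is a minimal sketch of one possible rule: strand breaks closer than a maximum gap are grouped together, and a group touching both strands is counted as a DSB. The function `classify_breaks`, the `max_gap` parameter with its default of 10 base pairs, and the break positions are assumptions for illustration; none of the four definitions compared in the abstract is reproduced here.

```python
def classify_breaks(breaks, max_gap=10):
    """Group strand breaks by position and label each group SSB or DSB.

    breaks: iterable of (position_bp, strand) with strand in {1, 2}.
    max_gap: hypothetical maximum distance (bp) between neighbouring
             breaks for them to belong to the same group.
    """
    groups = []
    for pos, strand in sorted(breaks):
        # Extend the current group if this break is close enough to the
        # previous one, otherwise start a new group.
        if groups and pos - groups[-1][-1][0] <= max_gap:
            groups[-1].append((pos, strand))
        else:
            groups.append([(pos, strand)])
    # A group with breaks on both strands counts as a DSB in this sketch.
    return [("DSB" if len({s for _, s in g}) == 2 else "SSB", g) for g in groups]

# Made-up break positions, not simulation output.
example = [(100, 1), (105, 2), (300, 1), (304, 1), (900, 2)]
loose = [kind for kind, _ in classify_breaks(example, max_gap=10)]
strict = [kind for kind, _ in classify_breaks(example, max_gap=3)]
print(loose)   # ['DSB', 'SSB', 'SSB']
print(strict)  # ['SSB', 'SSB', 'SSB', 'SSB', 'SSB']
```

The same five breaks give one DSB under the looser gap and none under the stricter one, which is the kind of definition dependence the abstract warns about.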

Relevance: 10.00%

Abstract:

Often in biomedical research, we deal with continuous (clustered) proportion responses ranging between zero and one quantifying the disease status of the cluster units. Interestingly, the study population might also consist of relatively disease-free as well as highly diseased subjects, contributing to proportion values in the interval [0, 1]. Regression on a variety of parametric densities with support lying in (0, 1), such as beta regression, can assess important covariate effects. However, they are deemed inappropriate due to the presence of zeros and/or ones. To evade this, we introduce a class of general proportion density, and further augment the probabilities of zero and one to this general proportion density, controlling for the clustering. Our approach is Bayesian and presents a computationally convenient framework amenable to available freeware. Bayesian case-deletion influence diagnostics based on q-divergence measures are automatic from the Markov chain Monte Carlo output. The methodology is illustrated using both simulation studies and application to a real dataset from a clinical periodontology study.

Relevance: 10.00%

Abstract:

Seasonally dry tropical plant formations (SDTF) are likely to exhibit phylogenetic clustering owing to niche conservatism driven by a strong environmental filter (water stress), but heterogeneous edaphic environments and life histories may result in heterogeneity in the degree of phylogenetic clustering. We investigated phylogenetic patterns across ecological gradients related to water availability (edaphic environment and climate) in the Caatinga, an SDTF in Brazil. The Caatinga is characterized by a semiarid climate and three distinct edaphic environments (sedimentary, crystalline, and inselberg), representing a decreasing gradient in soil water availability. We used two measures of phylogenetic diversity: the Net Relatedness Index, based on the entire phylogeny among species present in a site, reflecting long-term diversification; and the Nearest Taxon Index, based on the tips of the phylogeny, reflecting more recent diversification. We also evaluated woody species in contrast to herbaceous species. The main climatic variable influencing phylogenetic pattern was precipitation in the driest quarter, particularly for herbaceous species, suggesting that environmental filtering related to minimal periods of precipitation is an important driver of Caatinga biodiversity, as one might expect for an SDTF. Woody species tended to show phylogenetic clustering whereas herbaceous species tended towards phylogenetic overdispersion. We also found phylogenetic clustering in two edaphic environments (sedimentary and crystalline) in contrast to phylogenetic overdispersion in the third (inselberg). We conclude that while niche conservatism is evident in phylogenetic clustering in the Caatinga, this is not a universal pattern, likely owing to heterogeneity in the degree of realized environmental filtering across edaphic environments. Thus, SDTF, in spite of a strong shared environmental filter, are potentially heterogeneous in phylogenetic structuring. 
Our results support the need for scientifically informed conservation strategies in the Caatinga and other SDTF regions that have not previously been prioritized for conservation in order to take into account this heterogeneity.
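
For readers unfamiliar with the two indices named in this abstract, they are commonly defined as standardized effect sizes against a null distribution of randomized communities. The abstract does not state which null model was used, so the subscript "null" below is a generic placeholder; MPD denotes mean pairwise phylogenetic distance and MNTD mean nearest taxon distance.

```latex
\[
\mathrm{NRI} = -\,\frac{\mathrm{MPD}_{\mathrm{obs}} - \overline{\mathrm{MPD}}_{\mathrm{null}}}{\sigma\left(\mathrm{MPD}_{\mathrm{null}}\right)},
\qquad
\mathrm{NTI} = -\,\frac{\mathrm{MNTD}_{\mathrm{obs}} - \overline{\mathrm{MNTD}}_{\mathrm{null}}}{\sigma\left(\mathrm{MNTD}_{\mathrm{null}}\right)}
\]
```

Under this sign convention, positive values indicate phylogenetic clustering and negative values indicate overdispersion.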

Relevance: 10.00%

Abstract:

This work is an inventory of the herpetofauna of Parque Estadual Carlos Botelho (PECB), located in the Serra de Paranapiacaba region, State of São Paulo. Data were obtained through collections at six areas within the PECB over 76 days distributed across one year, and through consultation of scientific collections for secondary data. Results are presented on the biology and occurrence of the species in the PECB and in Brazil, along with photographs of the species that make up the PECB herpetofauna. The herpetofauna of the PECB can be considered one of the most diverse in São Paulo, with 65 confirmed amphibian species and 59 reptile species recorded in this work. Of the 65 amphibian species, 84% (55 spp.) are endemic to the forest formations of the Atlantic Forest. Owing to the relief of the PECB, different altitudinal patterns were found for amphibians: 46% of the species were recorded only at altitudes above 500 m, 9% are exclusive to regions below 400 m, and 45% occur in all sampled areas of the Park. The 59 reptile species of the PECB comprise 10 lizard species, 48 snake species, and one chelonian. Among the snakes collected in the PECB, the jararaca Bothrops jararaca was the most frequent, with 26.9% (N = 14) of the total recorded. Species that are difficult to sample, such as Echinanthera cephalostriata (13.5%; N = 7) and Taeniophallus affinis (7.7%; N = 4), were also numerous in the PECB. Among the lizards, Enyalius iheringii was the most abundant species, with 50% (N = 16) of the records. A cluster analysis of 25 Brazilian amphibian assemblages, including the PECB, resulted in four main clusters. The anuran fauna of the PECB is most closely related to the assemblages of Parque Estadual Jacupiranga (0.68) and Parque Estadual Intervales (0.66).
These parks are geographically close to one another and form one of the largest preserved fragments of Atlantic Forest in Brazil. This work is the first to present the list of reptiles of the PECB, and it also adds to knowledge of the PECB amphibian fauna.

Relevance: 10.00%

Abstract:

The aim of this work was to study the causes of variation in the prices of Nelore cattle from selection herds sold at auction, in order to assess the influence of genetic evaluations and conformation judging on those prices. To that end, the sale prices of 426 Nelore cattle were recorded at 12 auctions held in various Brazilian locations (Center-West, North, and Southeast regions) between 2002 and 2005. The mean price was R$ 3,325.49, with a minimum of R$ 1,400.00 and a maximum of R$ 10,500.00. These data were entered together with other information presented in the auction catalogs. The recorded information included the sex of each animal, the name of the auction, and the expected progeny differences (EPDs; DEPs in Portuguese) reported in the catalogs. In addition to the influence of catalog information, the influence of information on the sires of the animals sold was also evaluated, covering their EPDs published in a sire summary for the breed and the scores of their progeny in judging events. The statistical methods applied were analysis of variance and cluster analysis (K-means method). The results showed that animals with genetic superiority in traits related to weight performance, considering direct and maternal effects, fetched higher prices at auction. In contrast, the sires' scores in judging events had no significant influence on the mean sale prices of their progeny at auction.

Relevance: 10.00%

Abstract:

The objective was to determine whether genetic variation exists among guinea grass (capim-colonião) cultivars in the effect of maturity on chemical composition and digestibility, and to classify the genotypes by productive traits and nutritional quality. A randomized block design with split plots and three replicates was used, with cutting dates as plots and genotypes as subplots. Dry matter (DM) yield differed among genotypes only at 90 days of growth, but the percentages of leaves, stems, and dead material varied at both 60 and 90 days of growth. Unlike that of the leaves, the chemical composition and digestibility of the stem varied widely among genotypes. The stem had higher concentrations of neutral detergent fiber (NDF), acid detergent fiber (ADF), and lignin, and lower crude protein (CP) values than the leaves. It also showed higher DM digestibility at 60 days of growth and higher NDF digestibility at 30 and 60 days of growth. In the clustering of the cultivars, genotypes PM39 and PM47 were identified as the most promising for the breeding program because they combine high yield with high nutritional quality. Maturity affects leaf digestibility little compared with stem digestibility. As the stem share of total dry mass increases, this component becomes the factor limiting forage quality. Breeding programs should therefore consider, in addition to the leaf:stem ratio, the in vitro NDF digestibility of the stem when selecting genotypes.

Relevance: 10.00%

Abstract:

Macro- and microarrays are well-established technologies to determine gene functions through repeated measurements of transcript abundance. We constructed a chicken skeletal muscle-associated array based on a muscle-specific EST database, which was used to generate a tissue expression dataset of approximately 4500 chicken genes across 5 adult tissues (skeletal muscle, heart, liver, brain, and skin). Only a small number of ESTs were sufficiently well characterized by BLAST searches to determine their probable cellular functions. Evidence of a particular tissue-characteristic expression can be considered an indication that the transcript is likely to be functionally significant. The skeletal muscle macroarray platform was first used to search for evidence of tissue-specific expression, focusing on the biological function of genes/transcripts, since gene expression profiles generated across tissues were found to be reliable and consistent. Hierarchical clustering analysis revealed consistent clustering among genes assigned to 'developmental growth', such as the ontology genes and germ layers. Accuracy of the expression data was supported by comparing information from known transcripts and the tissue from which each transcript was derived with macroarray data. Hybridization assays resulted in consistent tissue expression profiles, which will be useful to dissect tissue-regulatory networks and to predict functions of novel genes identified after extensive sequencing of the genomes of model organisms. Screening our skeletal-muscle platform using 5 chicken adult tissues allowed us to identify 43 'tissue-specific' transcripts and 112 co-expressed uncharacterized transcripts with 62 putative motifs. This platform also represents an important tool for functional investigation of novel genes, for determining expression patterns according to developmental stage, for evaluating differences in muscular growth potential between chicken lines, and for identifying tissue-specific genes.

Relevance: 10.00%

Abstract:

Xylella fastidiosa is a Gram-negative plant pathogen causing many economically important diseases, and analyses of completely sequenced X. fastidiosa genome strains allowed the identification of many prophage-like elements and possibly phage remnants, accounting for up to 15% of the genome composition. To better evaluate the recent evolution of the X. fastidiosa chromosome backbone among distinct pathovars, the number and location of prophage-like regions on two finished genomes (9a5c and Temecula1) and in two candidate molecules (Ann1 and Dixon) were assessed. Based on comparative best bidirectional hit analyses, the majority (51%) of the predicted genes in the X. fastidiosa prophage-like regions are related to structural phage genes belonging to the Siphoviridae family. Electron micrographs reveal the existence of putative viral particles with morphology similar to that of lambda phages in the bacterial cell in planta. Moreover, analysis of microarray data indicates that the 9a5c strain cultivated under stress conditions presents enhanced expression of phage anti-repressor genes, suggesting switches from the lysogenic to the lytic cycle of phages under stress-induced situations. Furthermore, virulence-associated proteins and toxins are found within these prophage-like elements, suggesting an important role in host adaptation. Finally, clustering analyses of phage integrase genes based on multiple alignment patterns reveal that they group into five lineages, all possessing a tyrosine recombinase catalytic domain and phylogenetically close to other integrases found in phages that are genetic mosaics and able to perform generalized and specialized transduction. Integration sites and tRNA association are also evidenced. In summary, we present comparative and experimental evidence supporting the association and contribution of phage activity to the differentiation of Xylella genomes.

Relevance: 10.00%

Abstract:

Background: Population antimicrobial use may influence resistance emergence. Resistance is an ecological phenomenon due to potential transmissibility. We investigated spatial and temporal patterns of ciprofloxacin (CIP) population consumption related to E. coli resistance emergence and dissemination in a major Brazilian city. A total of 4,372 urinary tract infection E. coli cases, with 723 CIP resistant, were identified in 2002 from two outpatient centres. Cases were address geocoded in a digital map. Raw CIP consumption data were transformed into usage density in DDDs by determining the influence zones of CIP selling points. A stochastic model coupled with a Geographical Information System was applied for relating resistance and usage density and for detecting city areas of high/low resistance risk. Results: E. coli CIP resistant cluster emergence was detected and significantly related to usage density at a level of 5 to 9 CIP DDDs. There were clustered hot-spots and a significant global spatial variation in the residual resistance risk after allowing for usage density. Conclusions: The usage density of 5-9 CIP DDDs per 1,000 inhabitants within the same influence zone was the resistance triggering level. This level led to E. coli resistance clustering, proving that individual resistance emergence and dissemination was affected by antimicrobial population consumption.

Relevance: 10.00%

Abstract:

The present study determined the distribution pattern of the hermit crab Loxopagurus loxochelis by a comparison of catch, depth and environmental factors at two separate bays (Caraguatatuba and Ubatuba) of Sao Paulo State, Brazil. The influence of these parameters on the distribution of males, non-ovigerous females and ovigerous females was also evaluated. Crabs were collected monthly over a period of one year (from July 2002 to June 2003) at seven depths, from 5 to 35 m. Abiotic factors were monitored as follows: surface and bottom salinity (psu), surface and bottom temperature (°C), organic matter content (%) and sediment composition (%). In total, 366 hermit crabs were sampled in Caraguatatuba and 126 in Ubatuba. The highest frequency of occurrence was verified at 20 m during winter (July) in Caraguatatuba and at 25 m during summer (January) in Ubatuba. The highest occurrences were recorded in regions with bottom salinities ranging from 34 to 36 psu, bottom temperatures from 18 to 24 °C, low percentages of organic matter, gravel and mud, and a large proportion of sand in the substrate. There was no significant correlation between the total frequency of organisms and the environmental factors analyzed in both regions. This evidence suggests that other variables, such as biotic interactions, can influence the pattern of distribution of L. loxochelis in the analyzed region, which is considered the northern limit of the distribution of this species.

Relevance: 10.00%

Abstract:

Dengue virus (DENV) serotypes 1, 2, and 3 have been causing yearly outbreaks in Brazil. In this study, we report the reintroduction of DENV2 on the coast of Sao Paulo State. Partial envelope viral genes were sequenced from eighteen patients with dengue fever during the 2010 epidemic. Phylogenetic analysis showed that this strain belongs to the American/Asian genotype and was closely related to the virus that circulated in Rio de Janeiro in 2007 and 2008. The phylogeny also showed no clustering by clinical presentation, suggesting that disease severity could not be explained by distinct variants or genotypes. The time of the most recent common ancestor of the American/Asian genotype and of the Sao Paulo and Rio de Janeiro (SP/RJ) monophyletic cluster was estimated to be around 40 and 10 years, respectively. Since this virus was first identified in Brazil in 2007, we suggest that it was already circulating in the country before causing the first documented outbreak. This is the first description of the 2010 outbreak in the State of Sao Paulo, Brazil, and should contribute to efforts to control and monitor the spread of DENVs in endemic areas.

Relevance: 10.00%

Abstract:

Background: Since establishing universal free access to antiretroviral therapy in 1996, the Brazilian Health System has increased the number of centers providing HIV/AIDS outpatient care from 33 to 540. There had been no formal monitoring of the quality of these services until a survey of 336 AIDS health centers across 7 Brazilian states was undertaken in 2002. Managers of the services were asked to assess their clinics according to parameters of service inputs and service delivery processes. This report analyzes the survey results and identifies predictors of the overall quality of service delivery. Methods: The survey involved completion of a multiple-choice questionnaire comprising 107 parameters of service inputs and processes of delivering care, with responses assessed according to their likely impact on service quality using a 3-point scale. K-means clustering was used to group these services according to their scored responses. Logistic regression analysis was performed to identify predictors of high service quality. Results: The questionnaire was completed by 95.8% (322) of the managers of the sites surveyed. Most sites scored about 50% of the benchmark expectation. K-means clustering analysis identified four quality levels within which services could be grouped: 76 services (24%) were classed as level 1 (best), 53 (16%) as level 2 (medium), 113 (35%) as level 3 (poor), and 80 (25%) as level 4 (very poor). Parameters of service delivery processes were more important than those relating to service inputs for determining the quality classification. Predictors of quality services included larger care sites, specialization for HIV/AIDS, and location within large municipalities. Conclusion: The survey demonstrated highly variable levels of HIV/AIDS service quality across the sites. Many sites were found to have deficiencies in service delivery processes that could benefit from quality improvement initiatives. 
These findings could have implications for how HIV/AIDS services are planned in Brazil to achieve quality standards, such as where service sites should be located and their size and staffing requirements. A set of service delivery indicators has been identified that could be used for routine monitoring of HIV/AIDS service delivery in Brazil (and potentially in other similar settings).
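
Several abstracts in this listing group observations with K-means. As a rough illustration of how scored survey responses could be partitioned into four levels, the sketch below implements Lloyd's algorithm in plain Python. The function `kmeans`, the two-dimensional scores (input score, process score), and the choice of initial centroids are all hypothetical; the survey used 107 parameters, and its actual data and software are not given in the abstract.

```python
def kmeans(points, centroids, max_iter=100):
    """Lloyd's algorithm on tuples of equal length.

    points: list of numeric tuples; centroids: initial centroid tuples.
    Returns (centroids, groups), where groups[i] holds the points
    assigned to centroids[i].
    """
    groups = [[] for _ in centroids]
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid
        # (squared Euclidean distance).
        groups = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: sum((a - b) ** 2
                                            for a, b in zip(p, centroids[i])))
            groups[nearest].append(p)
        # Update step: move each centroid to the mean of its group,
        # keeping the old centroid if the group is empty.
        updated = [tuple(sum(col) / len(g) for col in zip(*g)) if g else c
                   for g, c in zip(groups, centroids)]
        if updated == centroids:
            break  # converged: assignments can no longer change
        centroids = updated
    return centroids, groups

# Made-up (input score, process score) pairs as fractions of a benchmark.
scores = [(0.9, 0.8), (0.8, 0.9), (0.6, 0.6), (0.5, 0.6),
          (0.4, 0.3), (0.3, 0.4), (0.1, 0.2), (0.2, 0.1)]
# Hypothetical starting centroids: one point from each apparent level.
start = [scores[0], scores[2], scores[4], scores[6]]
centers, levels = kmeans(scores, start)
print([len(g) for g in levels])  # [2, 2, 2, 2]
```

K-means results depend on the starting centroids and on the number of clusters chosen, so in practice several initializations are usually compared.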