27 resultados para Genetic clustering analysis
em Consorci de Serveis Universitaris de Catalunya (CSUC), Spain
Resumo:
Immobile location-allocation (LA) problems is a type of LA problem that consists in determining the service each facility should offer in order to optimize some criterion (like the global demand), given the positions of the facilities and the customers. Due to the complexity of the problem, i.e. it is a combinatorial problem (where is the number of possible services and the number of facilities) with a non-convex search space with several sub-optimums, traditional methods cannot be applied directly to optimize this problem. Thus we proposed the use of clustering analysis to convert the initial problem into several smaller sub-problems. By this way, we presented and analyzed the suitability of some clustering methods to partition the commented LA problem. Then we explored the use of some metaheuristic techniques such as genetic algorithms, simulated annealing or cuckoo search in order to solve the sub-problems after the clustering analysis
A new approach to segmentation based on fusing circumscribed contours, region growing and clustering
Resumo:
One of the major problems in machine vision is the segmentation of images of natural scenes. This paper presents a new proposal for the image segmentation problem which has been based on the integration of edge and region information. The main contours of the scene are detected and used to guide the posterior region growing process. The algorithm places a number of seeds at both sides of a contour allowing stating a set of concurrent growing processes. A previous analysis of the seeds permits to adjust the homogeneity criterion to the regions's characteristics. A new homogeneity criterion based on clustering analysis and convex hull construction is proposed
Resumo:
We present in this paper the results of the application of several visual methods on a group of locations, dated between VI and I centuries BC, of the ager Tarraconensis (Tarragona, Spain) a Hinterland of the roman colony of Tarraco. The difficulty in interpreting the diverse results in a combined way has been resolved by means of the use of statistical methods, such as Principal Components Analysis (PCA) and K-means clustering analysis. These methods have allowed us to carry out site classifications in function of the landscape's visual structure that contains them and of the visual relationships that could be given among them.
Resumo:
Image segmentation of natural scenes constitutes a major problem in machine vision. This paper presents a new proposal for the image segmentation problem which has been based on the integration of edge and region information. This approach begins by detecting the main contours of the scene which are later used to guide a concurrent set of growing processes. A previous analysis of the seed pixels permits adjustment of the homogeneity criterion to the region's characteristics during the growing process. Since the high variability of regions representing outdoor scenes makes the classical homogeneity criteria useless, a new homogeneity criterion based on clustering analysis and convex hull construction is proposed. Experimental results have proven the reliability of the proposed approach
Resumo:
HEMOLIA (a project under European community’s 7th framework programme) is a new generation Anti-Money Laundering (AML) intelligent multi-agent alert and investigation system which in addition to the traditional financial data makes extensive use of modern society’s huge telecom data source, thereby opening up a new dimension of capabilities to all Money Laundering fighters (FIUs, LEAs) and Financial Institutes (Banks, Insurance Companies, etc.). This Master-Thesis project is done at AIA, one of the partners for the HEMOLIA project in Barcelona. The objective of this thesis is to find the clusters in a network drawn by using the financial data. An extensive literature survey has been carried out and several standard algorithms related to networks have been studied and implemented. The clustering problem is a NP-hard problem and several algorithms like K-Means and Hierarchical clustering are being implemented for studying several problems relating to sociology, evolution, anthropology etc. However, these algorithms have certain drawbacks which make them very difficult to implement. The thesis suggests (a) a possible improvement to the K-Means algorithm, (b) a novel approach to the clustering problem using the Genetic Algorithms and (c) a new algorithm for finding the cluster of a node using the Genetic Algorithm.
Resumo:
We study the properties of the well known Replicator Dynamics when applied to a finitely repeated version of the Prisoners' Dilemma game. We characterize the behavior of such dynamics under strongly simplifying assumptions (i.e. only 3 strategies are available) and show that the basin of attraction of defection shrinks as the number of repetitions increases. After discussing the difficulties involved in trying to relax the 'strongly simplifying assumptions' above, we approach the same model by means of simulations based on genetic algorithms. The resulting simulations describe a behavior of the system very close to the one predicted by the replicator dynamics without imposing any of the assumptions of the analytical model. Our main conclusion is that analytical and computational models are good complements for research in social sciences. Indeed, while on the one hand computational models are extremely useful to extend the scope of the analysis to complex scenar
Resumo:
Creative industries tend to concentrate mainly around large- and medium-sized cities, forming creative local production systems. The text analyses the forces behind clustering of creative industries to provide the first empirical explanation of the determinants of creative employment clustering following a multidisciplinary approach based on cultural and creative economics, evolutionary geography and urban economics. A comparative analysis has been performed for Italy and Spain. The results show different patterns of creative employment clustering in both countries. The small role of historical and cultural endowments, the size of the place, the average size of creative industries, the productive diversity and the concentration of human capital and creative class have been found as common factors of clustering in both countries.
Resumo:
The genetic diversity of three temperate fruit tree phytoplasmas ‘Candidatus Phytoplasma prunorum’, ‘Ca. P. mali’ and ‘Ca. P. pyri’ has been established by multilocus sequence analysis. Among the four genetic loci used, the genes imp and aceF distinguished 30 and 24 genotypes, respectively, and showed the highest variability. Percentage of substitution for imp ranged from 50 to 68% according to species. Percentage of substitution varied between 9 and 12% for aceF, whereas it was between 5 and 6% for pnp and secY. In the case of ‘Ca P. prunorum’ the three most prevalent aceF genotypes were detected in both plants and insect vectors, confirming that the prevalent isolates are propagated by insects. The four isolates known to be hypo-virulent had the same aceF sequence, indicating a possible monophyletic origin. Haplotype network reconstructed by eBURST revealed that among the 34 haplotypes of ‘Ca. P. prunorum’, the four hypo-virulent isolates also grouped together in the same clade. Genotyping of some Spanish and Azerbaijanese ‘Ca. P. pyri’ isolates showed that they shared some alleles with ‘Ca. P. prunorum’, supporting for the first time to our knowledge, the existence of inter-species recombination between these two species.
Resumo:
Report for the scientific sojourn at the University of Reading, United Kingdom, from January until May 2008. The main objectives have been firstly to infer population structure and parameters in demographic models using a total of 13 microsatellite loci for genotyping approximately 30 individuals per population in 10 Palinurus elephas populations both from Mediterranean and Atlantic waters. Secondly, developing statistical methods to identify discrepant loci, possibly under selection and implement those methods using the R software environment. It is important to consider that the calculation of the probability distribution of the demographic and mutational parameters for a full genetic data set is numerically difficult for complex demographic history (Stephens 2003). The Approximate Bayesian Computation (ABC), based on summary statistics to infer posterior distributions of variable parameters without explicit likelihood calculations, can surmount this difficulty. This would allow to gather information on different demographic prior values (i.e. effective population sizes, migration rate, microsatellite mutation rate, mutational processes) and assay the sensitivity of inferences to demographic priors by assuming different priors.
Resumo:
In an earlier investigation (Burger et al., 2000) five sediment cores near the RodriguesTriple Junction in the Indian Ocean were studied applying classical statistical methods(fuzzy c-means clustering, linear mixing model, principal component analysis) for theextraction of endmembers and evaluating the spatial and temporal variation ofgeochemical signals. Three main factors of sedimentation were expected by the marinegeologists: a volcano-genetic, a hydro-hydrothermal and an ultra-basic factor. Thedisplay of fuzzy membership values and/or factor scores versus depth providedconsistent results for two factors only; the ultra-basic component could not beidentified. The reason for this may be that only traditional statistical methods wereapplied, i.e. the untransformed components were used and the cosine-theta coefficient assimilarity measure.During the last decade considerable progress in compositional data analysis was madeand many case studies were published using new tools for exploratory analysis of thesedata. Therefore it makes sense to check if the application of suitable data transformations,reduction of the D-part simplex to two or three factors and visualinterpretation of the factor scores would lead to a revision of earlier results and toanswers to open questions . In this paper we follow the lines of a paper of R. Tolosana-Delgado et al. (2005) starting with a problem-oriented interpretation of the biplotscattergram, extracting compositional factors, ilr-transformation of the components andvisualization of the factor scores in a spatial context: The compositional factors will beplotted versus depth (time) of the core samples in order to facilitate the identification ofthe expected sources of the sedimentary process.Kew words: compositional data analysis, biplot, deep sea sediments
Resumo:
Studies of large sets of SNP data have proven to be a powerful tool in the analysis of the genetic structure of human populations. In this work, we analyze genotyping data for 2,841 SNPs in 12 Sub-Saharan African populations, including a previously unsampled region of south-eastern Africa (Mozambique). We show that robust results in a world-wide perspective can be obtained when analyzing only 1,000 SNPs. Our main results both confirm the results of previous studies, and show new and interesting features in Sub-Saharan African genetic complexity. There is a strong differentiation of Nilo-Saharans, much beyond what would be expected by geography. Hunter-gatherer populations (Khoisan and Pygmies) show a clear distinctiveness with very intrinsic Pygmy (and not only Khoisan) genetic features. Populations of the West Africa present an unexpected similarity among them, possibly the result of a population expansion. Finally, we find a strong differentiation of the south-eastern Bantu population from Mozambique, which suggests an assimilation of a pre-Bantu substrate by Bantu speakers in the region.
Resumo:
Background: Germline genetic variation is associated with the differential expression of many human genes. The phenotypic effects of this type of variation may be important when considering susceptibility to common genetic diseases. Three regions at 8q24 have recently been identified to independently confer risk of prostate cancer. Variation at 8q24 has also recently been associated with risk of breast and colorectal cancer. However, none of the risk variants map at or relatively close to known genes, with c-MYC mapping a few hundred kilobases distally. Results: This study identifies cis-regulators of germline c-MYC expression in immortalized lymphocytes of HapMap individuals. Quantitative analysis of c-MYC expression in normal prostate tissues suggests an association between overexpression and variants in Region 1 of prostate cancer risk. Somatic c-MYC overexpression correlates with prostate cancer progression and more aggressive tumor forms, which was also a pathological variable associated with Region 1. Expression profiling analysis and modeling of transcriptional regulatory networks predicts a functional association between MYC and the prostate tumor suppressor KLF6. Analysis of MYC/Myc-driven cell transformation and tumorigenesis substantiates a model in which MYC overexpression promotes transformation by down-regulating KLF6. In this model, a feedback loop through E-cadherin down-regulation causes further transactivation of c-MYC.Conclusion: This study proposes that variation at putative 8q24 cis-regulator(s) of transcription can significantly alter germline c-MYC expression levels and, thus, contribute to prostate cancer susceptibility by down-regulating the prostate tumor suppressor KLF6 gene.
Resumo:
Next-generation sequencing techniques such as exome sequencing can successfully detect all genetic variants in a human exome and it has been useful together with the implementation of variant filters to identify causing-disease mutations. Two filters aremainly used for the mutations identification: low allele frequency and the computational annotation of the genetic variant. Bioinformatic tools to predict the effect of a givenvariant may have errors due to the existing bias in databases and sometimes show a limited coincidence among them. Advances in functional and comparative genomics are needed in order to properly annotate these variants.The goal of this study is to: first, functionally annotate Common Variable Immunodeficiency disease (CVID) variants with the available bioinformatic methods in order to assess the reliability of these strategies. Sencondly, as the development of new methods to reduce the number of candidate genetic variants is an active and necessary field of research, we are exploring the utility of gene function information at organism level as a filter for rare disease genes identification. Recently, it has been proposed that only 10-15% of human genes are essential and therefore we would expect that severe rare diseases are mostly caused by mutations on them. Our goal is to determine whether or not these rare and severe diseases are caused by deleterious mutations in these essential genes. If this hypothesis were true, taking into account essential genes as a filter would be an interesting parameter to identify causingdisease mutations.
Resumo:
An interesting case for undergraduate students of general Genetics is to consider that different genes can produce the same or similar phenotypes. We present here an experiment to discover that the same phenotype could be produced by different genes, and then, to carry out the genetic analysis of these genes. For this laboratory study we have used the following Drosophila melanogaster strains: white (white eyes) and scarlet-brown (white eyes).
Resumo:
An interesting case for undergraduate students of general Genetics is to consider that different genes can produce the same or similar phenotypes. We present here an experiment to discover that the same phenotype could be produced by different genes, and then, to carry out the genetic analysis of these genes. For this laboratory study we have used the following Drosophila melanogaster strains: white (white eyes) and scarlet-brown (white eyes).