849 results for constrained clustering
Abstract:
The importance of competition between similar species in driving community assembly is much debated. Recently, phylogenetic patterns in species composition have been investigated to help resolve this question: phylogenetic clustering is taken to imply environmental filtering, and phylogenetic overdispersion to indicate limiting similarity between species. We used experimental plant communities with random species compositions and initially even abundance distributions to examine the development of phylogenetic pattern in species abundance distributions. Where composition was held constant by weeding, abundance distributions became overdispersed through time, but only in communities that contained distantly related clades, some with several species (i.e., a mix of closely and distantly related species). Phylogenetic pattern in composition therefore constrained the development of overdispersed abundance distributions, and this might indicate limiting similarity between close relatives and facilitation/complementarity between distant relatives. Comparing the phylogenetic patterns in these communities with those expected from the monoculture abundances of the constituent species revealed that interspecific competition caused the phylogenetic patterns. Opening experimental communities to colonization by all species in the species pool led to convergence in phylogenetic diversity. At convergence, communities were composed of several distantly related but species-rich clades and had overdispersed abundance distributions. This suggests that limiting similarity processes determine which species dominate a community but not which species occur in a community. Crucially, as our study was carried out in experimental communities, we could rule out local evolutionary or dispersal explanations for the patterns and identify ecological processes as the driving force, underlining the advantages of studying these processes in experimental communities. 
Our results show that phylogenetic relations between species provide a good guide to understanding community structure and add a new perspective to the evidence that niche complementarity is critical in driving community assembly.
Abstract:
To cluster textual sequence types (discourse types/modes) in French texts, the K-means algorithm with high-dimensional embeddings and a fuzzy clustering algorithm were applied to clauses whose POS (part-of-speech) n-gram profiles had previously been extracted. Uni-, bi- and trigrams were used on four 19th-century French short stories by Maupassant. For the high-dimensional embeddings, power transformations of the chi-squared distances between clauses were explored. Preliminary results show that high-dimensional embeddings improve the quality of clustering, in contrast to bi- and trigrams, whose performance is disappointing, possibly because of feature-space sparsity.
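The abstract above does not give formulas, so the following is only a minimal sketch of what a chi-squared distance between clause n-gram profiles and a power transformation of that distance could look like. It assumes the correspondence-analysis form of the chi-squared distance (row profiles weighted by inverse column masses); the function names, the toy count table and the exponent 0.5 are hypothetical and not taken from the study.

```python
from math import sqrt


def chi2_distances(counts):
    """Pairwise chi-squared distances between the rows of a count table.

    Each row is one clause, each column one POS n-gram (hypothetical layout).
    Assumes every row has at least one non-zero count.
    """
    total = sum(sum(row) for row in counts)
    n_cols = len(counts[0])
    # Column masses: share of all counts that falls in each column.
    col_mass = [sum(row[k] for row in counts) / total for k in range(n_cols)]
    # Row profiles: each row rescaled to sum to 1.
    profiles = [[x / sum(row) for x in row] for row in counts]
    n = len(profiles)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d2 = sum(
                (a - b) ** 2 / m
                for a, b, m in zip(profiles[i], profiles[j], col_mass)
                if m > 0
            )
            dist[i][j] = dist[j][i] = sqrt(d2)
    return dist


def power_transform(dist, alpha):
    """Raise every distance to the power alpha (alpha is an assumed tuning knob)."""
    return [[d ** alpha for d in row] for row in dist]


# Toy table: three clauses, three n-gram types (made-up counts).
clauses = [
    [2, 1, 0],
    [4, 2, 0],
    [0, 1, 3],
]
D = chi2_distances(clauses)
D_half = power_transform(D, 0.5)
```

Clauses 0 and 1 have the same relative profile, so their distance is zero regardless of clause length, which is the property that makes this distance suited to comparing profiles rather than raw counts.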
Abstract:
BACKGROUND: The trithorax group (trxG) and Polycomb group (PcG) proteins are responsible for the maintenance of stable transcriptional patterns of many developmental regulators. They bind to specific regions of DNA and direct the post-translational modifications of histones, playing a role in the dynamics of chromatin structure. RESULTS: We have performed genome-wide expression studies of trx and ash2 mutants in Drosophila melanogaster. Using computational analysis of our microarray data, we have identified 25 clusters of genes potentially regulated by TRX. Most of these clusters consist of genes that encode structural proteins involved in cuticle formation. This organization appears to be a distinctive feature of the regulatory networks of TRX and other chromatin regulators, since we have observed the same arrangement in clusters after experiments performed with ASH2, as well as in experiments performed by others with NURF, dMyc, and ASH1. We have also found many of these clusters to be significantly conserved in D. simulans, D. yakuba, D. pseudoobscura and partially in Anopheles gambiae. CONCLUSION: The analysis of genes governed by chromatin regulators has led to the identification of clusters of functionally related genes conserved in other insect species, suggesting this chromosomal organization is biologically important. Moreover, our results indicate that TRX and other chromatin regulators may act globally on chromatin domains that contain transcriptionally co-regulated genes.
Abstract:
Acquiring lexical information is a complex problem, typically approached by relying on a number of contexts to contribute information for classification. One of the first issues to address in this domain is the determination of such contexts. The work presented here proposes the use of automatically obtained FORMAL role descriptors as features used to draw nouns from the same lexical semantic class together in an unsupervised clustering task. We have dealt with three lexical semantic classes (HUMAN, LOCATION and EVENT) in English. The results obtained show that it is possible to discriminate between elements from different lexical semantic classes using only FORMAL role information, hence validating our initial hypothesis. Also, iterating our method accurately accounts for fine-grained distinctions within lexical classes, namely distinctions involving ambiguous expressions. Moreover, a filtering and bootstrapping strategy employed in extracting FORMAL role descriptors proved to minimize effects of sparse data and noise in our task.
Abstract:
AIMS/HYPOTHESIS: The metabolic syndrome comprises a clustering of cardiovascular risk factors, but the underlying mechanism is not known. Mice with targeted disruption of endothelial nitric oxide synthase (eNOS) are hypertensive and insulin resistant. We asked whether eNOS deficiency in mice is associated with a phenotype mimicking the human metabolic syndrome. METHODS AND RESULTS: In addition to arterial pressure and insulin sensitivity (euglycaemic hyperinsulinaemic clamp), we measured the plasma concentrations of leptin, insulin, cholesterol, triglycerides, free fatty acids, fibrinogen and uric acid in 10- to 12-week-old eNOS-/- and wild-type mice. We also assessed glucose tolerance under basal conditions and following metabolic stress with a high-fat diet. As expected, eNOS-/- mice were hypertensive and insulin resistant, as evidenced by fasting hyperinsulinaemia and a roughly 30% lower steady-state glucose infusion rate during the clamp. eNOS-/- mice had a 1.5- to 2-fold elevation of cholesterol, triglyceride and free fatty acid plasma concentrations. Even though body weight was comparable, the leptin plasma level was 30% higher in eNOS-/- than in wild-type mice. Finally, uric acid and fibrinogen were elevated in the eNOS-/- mice. Whereas glucose tolerance was comparable in knockout and control mice under basal conditions, knockout mice on a high-fat diet became significantly more glucose intolerant than control mice. CONCLUSIONS: A single gene defect, eNOS deficiency, causes a clustering of cardiovascular risk factors in young mice. We speculate that defective nitric oxide synthesis could trigger many of the abnormalities making up the metabolic syndrome in humans.
Abstract:
The article examines the structure of the collaboration networks of research groups in which Slovenian and Spanish PhD students are pursuing their doctorates. The units of analysis are student-supervisor dyads. We use duocentred networks, a novel network structure appropriate for networks that are centred around a dyad. A cluster analysis reveals three typical clusters of research groups. Those that are large and belong to several institutions are labelled bridging social capital groups. Those that are small and centred in a single institution but have high cohesion are labelled bonding social capital groups. Those that are small and have low cohesion are called weak social capital groups. The academic performance of both PhD students and supervisors is highest in bridging groups and lowest in weak groups. Other variables are also found to differ according to the type of research group. Finally, some recommendations regarding academic and research policy are drawn.
Abstract:
Canonical correspondence analysis and redundancy analysis are two methods of constrained ordination regularly used in the analysis of ecological data when several response variables (for example, species abundances) are related linearly to several explanatory variables (for example, environmental variables, spatial positions of samples). In this report I demonstrate the advantages of the fuzzy coding of explanatory variables: first, nonlinear relationships can be diagnosed; second, more variance in the responses can be explained; and third, in the presence of categorical explanatory variables (for example, years, regions) the interpretation of the resulting triplot ordination is unified because all explanatory variables are measured at a categorical level.
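As an illustration of the fuzzy coding mentioned above, here is a minimal sketch that recodes one continuous explanatory variable into fuzzy categories using triangular membership functions. The abstract does not specify the membership functions or the hinge points; the triangular form, the function name `fuzzy_code` and the hinge values below are assumptions chosen for the example.

```python
def fuzzy_code(x, hinges):
    """Triangular fuzzy coding of a value x on strictly increasing hinge points.

    Returns one membership per hinge; the memberships are non-negative and
    sum to 1. Values outside the hinge range are clamped to the nearest end.
    """
    x = min(max(x, hinges[0]), hinges[-1])
    codes = [0.0] * len(hinges)
    for k in range(len(hinges) - 1):
        lo, hi = hinges[k], hinges[k + 1]
        if lo <= x <= hi:
            w = (x - lo) / (hi - lo)  # position of x between the two hinges
            codes[k] = 1.0 - w
            codes[k + 1] = w
            break
    return codes


# Hypothetical hinges for a temperature variable: "low", "medium", "high".
hinges = [10.0, 20.0, 30.0]
low_mid = fuzzy_code(15.0, hinges)    # halfway between "low" and "medium"
mid_high = fuzzy_code(25.0, hinges)   # halfway between "medium" and "high"
```

Because each value is spread over at most two adjacent categories, the coded columns can be treated like the dummy columns of a categorical variable while still retaining information about where the value lies within its interval.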
Abstract:
BACKGROUND: Little is known about engagement in multiple health behaviours in childhood cancer survivors. METHODS: Using latent class analysis, we identified health behaviour patterns in 835 adult survivors of childhood cancer (age 20-35 years) and 1670 age- and sex-matched controls from the general population. Behaviour groups were determined from replies to questions on smoking, drinking, cannabis use, sporting activities, diet, sun protection and skin examination. RESULTS: The model identified four health behaviour patterns: 'risk-avoidance', with a generally healthy behaviour; 'moderate drinking', with higher levels of sporting activities, but moderate alcohol-consumption; 'risk-taking', engaging in several risk behaviours; and 'smoking', smoking but not drinking. Similar proportions of survivors and controls fell into the 'risk-avoiding' (42% vs 44%) and the 'risk-taking' cluster (14% vs 12%), but more survivors were in the 'moderate drinking' (39% vs 28%) and fewer in the 'smoking' cluster (5% vs 16%). Determinants of health behaviour clusters were gender, migration background, income and therapy. CONCLUSION: A comparable proportion of childhood cancer survivors as in the general population engage in multiple health-compromising behaviours. Because of increased vulnerability of survivors, multiple risk behaviours should be addressed in targeted health interventions.
Abstract:
When dealing with the design of service networks, such as health and EMS services, banking or distributed ticket-selling services, the location of service centers has a strong influence on the congestion at each of them and, consequently, on the quality of service. In this paper, several models are presented to consider service congestion. The first model addresses the location of the least number of single-server centers such that all the population is served within a standard distance, and nobody stands in line for longer than a given time limit or with more than a predetermined number of other clients. We then formulate several maximal coverage models, with one or more servers per service center. A new heuristic is developed to solve the models and is tested on a 30-node network.
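The abstract does not state which queueing model underlies the congestion constraints. As a rough sketch, suppose each single-server center behaves as an M/M/1 queue (an assumption made here, not a claim about the paper). Then the requirement that at most `b` customers are in the system with probability at least `alpha` turns into an upper bound on the arrival rate a center can accept. The function names and numbers below are hypothetical.

```python
def prob_at_most(arrival_rate, service_rate, b):
    """P(at most b customers in an M/M/1 system), counting the one in service."""
    rho = arrival_rate / service_rate
    if rho >= 1.0:
        raise ValueError("queue is unstable: arrival rate must be below service rate")
    # Steady state: P(N = n) = (1 - rho) * rho**n, so P(N <= b) = 1 - rho**(b + 1).
    return 1.0 - rho ** (b + 1)


def max_arrival_rate(service_rate, b, alpha):
    """Largest arrival rate for which P(N <= b) >= alpha still holds."""
    rho_max = (1.0 - alpha) ** (1.0 / (b + 1))
    return service_rate * rho_max


# Hypothetical center: serves 10 clients/hour, at most 2 in the system 95% of the time.
limit = max_arrival_rate(service_rate=10.0, b=2, alpha=0.95)
check = prob_at_most(limit, 10.0, 2)
```

In a location model, a bound of this kind would cap the total demand that may be assigned to any one center, which is how a probabilistic service-quality target can become a linear capacity constraint.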
Abstract:
The paper presents a new model based on the basic Maximum Capture model, MAXCAP. The New Chance Constrained Maximum Capture model introduces a stochastic threshold constraint, which recognises that a facility can be open only if a minimum level of demand is captured. A metaheuristic based on the MAX MIN ANT system and a TABU search procedure is presented to solve the model. This is the first time that the MAX MIN ANT system has been adapted to solve a location problem. Computational experience and an application to a 55-node network are also presented.
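To make the idea of a stochastic threshold constraint concrete, the sketch below checks whether a facility's captured demand reaches a minimum threshold with a required probability. It assumes, purely for illustration, that captured demand is normally distributed; the abstract does not state the distribution, and the function names and figures are hypothetical.

```python
from statistics import NormalDist


def meets_threshold(mean, sd, threshold, alpha):
    """True if P(captured demand >= threshold) >= alpha under Normal(mean, sd)."""
    return 1.0 - NormalDist(mean, sd).cdf(threshold) >= alpha


def min_mean_demand(sd, threshold, alpha):
    """Smallest mean demand that satisfies the chance constraint.

    Deterministic equivalent: mean >= threshold + z_alpha * sd,
    where z_alpha is the alpha-quantile of the standard normal.
    """
    return threshold + NormalDist().inv_cdf(alpha) * sd


# Hypothetical facility: needs at least 100 units of demand with 95% probability.
ok = meets_threshold(mean=120.0, sd=10.0, threshold=100.0, alpha=0.95)
too_low = meets_threshold(mean=110.0, sd=10.0, threshold=100.0, alpha=0.95)
needed = min_mean_demand(sd=10.0, threshold=100.0, alpha=0.95)
```

The second function shows why such constraints are tractable: under the normality assumption, the probabilistic requirement reduces to an ordinary inequality on the mean captured demand.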
Abstract:
This work proposes an original contribution to the understanding of fishermen's spatial behavior, based on the behavioral ecology and movement ecology paradigms. Through the analysis of Vessel Monitoring System (VMS) data, we characterized the spatial behavior of Peruvian anchovy fishermen at different scales: (1) the behavioral modes within fishing trips (i.e., searching, fishing and cruising); (2) the behavioral patterns among fishing trips; (3) the behavioral patterns by fishing season conditioned by ecosystem scenarios; and (4) the computation of maps of a proxy for anchovy presence from the spatial patterns of behavioral mode positions. At the first scale considered, we compared several Markovian models (hidden Markov and semi-Markov models) and discriminative models (random forests, support vector machines and artificial neural networks) for inferring the behavioral modes associated with VMS tracks. The models were trained under a supervised setting and validated using tracks for which behavioral modes were known (from on-board observers' records). Hidden semi-Markov models performed best and were retained for inferring the behavioral modes on the entire VMS dataset. At the second scale considered, each fishing trip was characterized by several features, including the time spent within each behavioral mode. Using a clustering analysis, fishing trip patterns were classified into groups associated with management zones, fleet segments and skippers' personalities. At the third scale considered, we analyzed how ecological conditions shaped fishermen's behavior. By means of co-inertia analyses, we found significant associations between fishermen, anchovy and environmental spatial dynamics, and fishermen's behavioral responses were characterized according to contrasting environmental scenarios. At the fourth scale considered, we investigated whether the spatial behavior of fishermen reflected to some extent the spatial distribution of anchovy. 
Finally, this work provides a wider view of fishermen's behavior: fishermen are not only economic agents but also foragers, constrained by ecosystem variability. To conclude, we discuss how these findings may be of importance for fisheries management, collective behavior analyses and end-to-end models.
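For readers unfamiliar with inferring behavioral modes from a track, the following is a minimal sketch of Viterbi decoding with a plain hidden Markov model. It is not the hidden semi-Markov model retained in the work above, and every number in it is made up: the three states come from the abstract, but the discretized speed classes and all probabilities are hypothetical placeholders rather than fitted values.

```python
from math import log

STATES = ("searching", "fishing", "cruising")

# Hypothetical parameters, for illustration only (not estimated from VMS data).
START = {"searching": 0.2, "fishing": 0.1, "cruising": 0.7}
TRANS = {
    "searching": {"searching": 0.7, "fishing": 0.2, "cruising": 0.1},
    "fishing": {"searching": 0.2, "fishing": 0.7, "cruising": 0.1},
    "cruising": {"searching": 0.2, "fishing": 0.05, "cruising": 0.75},
}
EMIT = {
    "searching": {"slow": 0.2, "medium": 0.6, "fast": 0.2},
    "fishing": {"slow": 0.8, "medium": 0.15, "fast": 0.05},
    "cruising": {"slow": 0.05, "medium": 0.15, "fast": 0.8},
}


def viterbi(observations):
    """Most likely sequence of behavioral modes for a list of speed classes."""
    # Log-probabilities avoid numerical underflow on long tracks.
    score = {s: log(START[s]) + log(EMIT[s][observations[0]]) for s in STATES}
    back = []
    for obs in observations[1:]:
        prev = score
        score, pointers = {}, {}
        for s in STATES:
            best = max(STATES, key=lambda p: prev[p] + log(TRANS[p][s]))
            score[s] = prev[best] + log(TRANS[best][s]) + log(EMIT[s][obs])
            pointers[s] = best
        back.append(pointers)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]


# One made-up trip: vessel speed class at successive VMS positions.
track = ["fast", "fast", "medium", "slow", "slow", "fast"]
modes = viterbi(track)
```

A semi-Markov variant would additionally model how long the vessel stays in each mode, which a plain HMM can only represent through the self-transition probabilities.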
Abstract:
Multisensory interactions are observed in species from single-cell organisms to humans. Important early work was primarily carried out in the cat superior colliculus and a set of critical parameters for their occurrence were defined. Primary among these were temporal synchrony and spatial alignment of bisensory inputs. Here, we assessed whether spatial alignment was also a critical parameter for the temporally earliest multisensory interactions that are observed in lower-level sensory cortices of the human. While multisensory interactions in humans have been shown behaviorally for spatially disparate stimuli (e.g. the ventriloquist effect), it is not clear if such effects are due to early sensory level integration or later perceptual level processing. In the present study, we used psychophysical and electrophysiological indices to show that auditory-somatosensory interactions in humans occur via the same early sensory mechanism both when stimuli are in and out of spatial register. Subjects more rapidly detected multisensory than unisensory events. At just 50 ms post-stimulus, neural responses to the multisensory 'whole' were greater than the summed responses from the constituent unisensory 'parts'. For all spatial configurations, this effect followed from a modulation of the strength of brain responses, rather than the activation of regions specifically responsive to multisensory pairs. Using the local auto-regressive average source estimation, we localized the initial auditory-somatosensory interactions to auditory association areas contralateral to the side of somatosensory stimulation. Thus, multisensory interactions can occur across wide peripersonal spatial separations remarkably early in sensory processing and in cortical regions traditionally considered unisensory.
Abstract:
The coverage and volume of geo-referenced datasets are extensive and incessantly growing. The systematic capture of geo-referenced information generates large volumes of spatio-temporal data to be analyzed. Clustering and visualization play a key role in exploratory data analysis and the extraction of knowledge embedded in these data. However, new challenges in visualization and clustering are posed when dealing with the special characteristics of these data, for instance, their complex structures, large quantity of samples, variables involved in a temporal context, high dimensionality and large variability in cluster shapes. The central aim of my thesis is to propose new algorithms and methodologies for clustering and visualization, in order to assist knowledge extraction from spatio-temporal geo-referenced data, thus improving decision-making processes. I present two original algorithms, one for clustering, the Fuzzy Growing Hierarchical Self-Organizing Networks (FGHSON), and the second for exploratory visual data analysis, the Tree-structured Self-organizing Maps Component Planes. In addition, I present methodologies that, combined with FGHSON and the Tree-structured SOM Component Planes, allow the integration of space and time seamlessly and simultaneously in order to extract knowledge embedded in a temporal context. The originality of the FGHSON lies in its capability to reflect the underlying structure of a dataset in a hierarchical fuzzy way. A hierarchical fuzzy representation of clusters is crucial when data include complex structures with large variability of cluster shapes, variances, densities and number of clusters. The most important characteristics of the FGHSON include: (1) It does not require an a priori setup of the number of clusters. (2) The algorithm executes several self-organizing processes in parallel. Hence, when dealing with large datasets, the processes can be distributed, reducing the computational cost. 
(3) Only three parameters are necessary to set up the algorithm. In the case of the Tree-structured SOM Component Planes, the novelty of this algorithm lies in its ability to create a structure that allows the visual exploratory data analysis of large high-dimensional datasets. This algorithm creates a hierarchical structure of Self-Organizing Map Component Planes, arranging similar variables' projections in the same branches of the tree. Hence, similarities in variables' behavior can be easily detected (e.g. local correlations, maximal and minimal values and outliers). Both FGHSON and the Tree-structured SOM Component Planes were applied to several agroecological problems, proving to be very efficient in the exploratory analysis and clustering of spatio-temporal datasets. In this thesis I also tested three soft competitive learning algorithms. Two of them are well-known unsupervised soft competitive algorithms, namely the Self-Organizing Maps (SOMs) and the Growing Hierarchical Self-Organizing Maps (GHSOMs); the third is our original contribution, the FGHSON. Although the algorithms presented here have been used in several areas, to my knowledge there is no work applying and comparing the performance of those techniques when dealing with spatio-temporal geospatial data, as is presented in this thesis. I propose original methodologies to explore spatio-temporal geo-referenced datasets through time. Our approach uses time windows to capture temporal similarities and variations by using the FGHSON clustering algorithm. The developed methodologies are used in two case studies. 
In the first, the objective was to find similar agroecozones through time, and in the second, it was to find similar environmental patterns shifted in time. Several results presented in this thesis have led to new contributions to agroecological knowledge, for instance, in sugar cane and blackberry production. Finally, in the framework of this thesis we developed several software tools: (1) a Matlab toolbox that implements the FGHSON algorithm, and (2) a program called BIS (Bio-inspired Identification of Similar agroecozones), an interactive graphical user interface tool that integrates the FGHSON algorithm with Google Earth in order to show zones with similar agroecological characteristics.
Abstract:
The sandstone-hosted Beverley uranium deposit is located in terrestrial sediments in the Lake Frome basin in the North Flinders Ranges, South Australia. The deposit is 13 km from the U-rich Mesoproterozoic basement of the Mount Painter inlier, which is being uplifted 100 to 200 m above the basin by neotectonic activity that probably initiated in the early Pliocene. The mineralization was deposited mainly in organic matter-poor Miocene lacustrine sands and partly in the underlying reductive strata comprising organic matter-rich clays and silts. The bulk of the mineralization consists of coffinite and/or uraninite nodules, growing around Co-rich pyrite with an S isotope composition (δ³⁴S = 1.0 ± 0.3‰) suggestive of an early diagenetic lacustrine origin. In contrast, authigenic sulfides in the bulk of the sediments have a negative S isotope signature (δ³⁴S ranges from -26.2 to -35.5‰), indicative of an origin via bacterially mediated sulfate reduction. Minor amounts of Zn-bearing native copper and native lead also support the presence of specific, reducing microenvironments in the ore zone. Small amounts of carnotite are associated with the coffinite ore and also occur beneath a paleosol horizon overlying the uranium deposit. Provenance studies suggest that the host Miocene sediments were derived from the reworking of Early Cretaceous glacial or glaciolacustrine sediments ultimately derived from Paleozoic terranes in eastern Australia. In contrast, the overlying Pliocene strata were in part derived from the Mesoproterozoic basement inlier. Mass-balance and geochemical data confirm that granites of the Mount Painter domain were the ultimate source of U and REE at Beverley. U-Pb dating of coffinite and carnotite suggests that the U mineralization is Pliocene (6.7-3.4 Ma). 
The suitability of the Beverley deposit for efficient mining via in situ leaching, and hence its economic value, are determined by the nature of the hosting sand unit, which provides the permeability and low reactivity required for high fluid flow and low chemical consumption. These favorable sedimentologic and geometrical features result from a complex conjunction of factors, including deposition in a lacustrine shore environment, reworking of angular sands of glacial origin, deep Pliocene weathering, and proximity to an active fault exposing extremely U-rich rocks.