853 results for Unsupervised clustering
Abstract:
There is interest in the potential of companion animal surveillance to provide data to improve pet health and to provide early warning of environmental hazards to people. We implemented a companion animal surveillance system in Calgary, Alberta, and the surrounding communities. Informatics technologies automatically extracted electronic medical records from participating veterinary practices and identified cases of enteric syndrome in the warehoused records. The data were analysed using time-series analyses and a retrospective space-time permutation scan statistic. We identified a seasonal pattern in reported enteric syndrome cases in companion animals and four statistically significant clusters of enteric syndrome cases. The cases within each cluster were examined, and information was collected about the animals involved (species, age, sex), their vaccination history, possible exposure or risk behaviour history, disease severity, and the aetiological diagnosis. We then assessed whether the cases within each cluster were unusual and whether they represented an animal or public health threat. There was often insufficient information recorded in the medical record to characterize the clusters by aetiology or exposures. Space-time analysis of companion animal enteric syndrome cases found evidence of clustering. Collection of more epidemiologically relevant data would enhance the utility of practice-based companion animal surveillance.
Abstract:
Polydnaviruses (genera Ichnovirus and Bracovirus) have a segmented genome of circular double-stranded DNA molecules, replicate in the ovary of parasitic wasps and are essential for successful parasitism of the host. Here we show the first detailed analysis of various segments of a bracovirus, the Chelonus inanitus virus (CiV). Four segments were sequenced and two of them, CiV12 and CiV14, were found to be closely related while CiV14.5 and CiV16.8 were unrelated. CiV12, CiV14.5 and CiV16.8 are unique while CiV14 occurs also nested in another larger segment. All four segments are predicted to contain genes and predictions could be substantiated in most cases. Comparison with databases revealed no significant similarities at either the nucleotide or amino acid level. Inverted repeats with identities between 77% and 92% and lengths between 26 bp and 100 bp were found on all segments outside of predicted genes. Hybridization experiments indicate that CiV12 and CiV14 are both flanked by other virus segments, suggesting that proviral CiV segments are clustered in the genome of the wasp. The integration/excision site of CiV14 was analysed and compared to that of CiV12. On both termini of proviral CiV12 and CiV14 as well as in the excised circular molecule and the rejoined DNA a very similar repeat of 14 bp was found. A model to illustrate where the terminal repeats might recombine to yield the circular molecule is presented. Excision of CiV12 and CiV14 is restricted to the female and sets in at a very specific time-point in pupal-adult development.
Abstract:
The aetiology of childhood cancers remains largely unknown. It has been hypothesized that infections may be involved and that mini-epidemics thereof could result in space-time clustering of incident cases. Most previous studies support spatio-temporal clustering for leukaemia, while results for other diagnostic groups remain mixed. Few studies have corrected for uneven regional population shifts which can lead to spurious detection of clustering. We examined whether there is space-time clustering of childhood cancers in Switzerland identifying cases diagnosed at age <16 years between 1985 and 2010 from the Swiss Childhood Cancer Registry. Knox tests were performed on geocoded residence at birth and diagnosis separately for leukaemia, acute lymphoid leukaemia (ALL), lymphomas, tumours of the central nervous system, neuroblastomas and soft tissue sarcomas. We used Baker's Max statistic to correct for multiple testing and randomly sampled time-, sex- and age-matched controls from the resident population to correct for uneven regional population shifts. We observed space-time clustering of childhood leukaemia at birth (Baker's Max p = 0.045) but not at diagnosis (p = 0.98). Clustering was strongest for a spatial lag of <1 km and a temporal lag of <2 years (Observed/expected close pairs: 124/98; p Knox test = 0.003). A similar clustering pattern was observed for ALL though overall evidence was weaker (Baker's Max p = 0.13). Little evidence of clustering was found for other diagnostic groups (p > 0.2). Our study suggests that childhood leukaemia tends to cluster in space-time due to an etiologic factor present in early life.
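The Knox test mentioned in this abstract can be sketched as follows. This is a minimal illustration only, assuming planar coordinates, numeric times, and a plain Monte Carlo permutation of times; it does not reproduce the study's Baker's Max correction or its control sampling for population shifts. The function names `knox_statistic` and `knox_test` are hypothetical.

```python
import math
import random

def knox_statistic(points, times, max_dist, max_lag):
    """Count pairs of cases that are close in both space and time."""
    close = 0
    n = len(points)
    for i in range(n):
        for j in range(i + 1, n):
            near_space = math.dist(points[i], points[j]) < max_dist
            near_time = abs(times[i] - times[j]) < max_lag
            if near_space and near_time:
                close += 1
    return close

def knox_test(points, times, max_dist, max_lag, n_perm=999, seed=0):
    """Monte Carlo p-value: shuffling times across locations breaks any
    space-time link while keeping both marginal distributions fixed."""
    rng = random.Random(seed)
    observed = knox_statistic(points, times, max_dist, max_lag)
    shuffled = list(times)
    at_least = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if knox_statistic(points, shuffled, max_dist, max_lag) >= observed:
            at_least += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return observed, (at_least + 1) / (n_perm + 1)
```

A call such as `knox_test(points, times, max_dist=1.0, max_lag=2.0)` would correspond to the "<1 km, <2 years" thresholds if distances are in kilometres and times in years.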
Abstract:
Lipid rafts are small laterally mobile cell membrane structures that are highly enriched in lymphocyte signaling molecules. Lipid rafts can form from the assembly of specialized lipids and proteins through hydrophobic associations from saturated acyl chains. GM1 gangliosides are a common lipid raft component and have been shown to be essential in many T cell functions. Current lipid raft theory hypothesizes that certain aspects of T cell signaling can be initiated from the coalescence of these signaling-enriched lipid rafts to sites of receptor engagement. We have described how the specific aggregation of GM1 lipid rafts can cause a reorganization of cell surface molecular associations which include dynamic associations of β1 integrins with GM1 lipid rafts. These associations had pronounced effects on T cell adhesive and migratory states. We show that GM1 lipid raft aggregation can dramatically inhibit T cell migration and chemotaxis on the extracellular matrix constituent fibronectin. This inhibition of migration function was shown to be dependent on the Src kinase Lck and PKC-regulated F-actin polymerization to extending pseudopods. Furthermore, GM1 lipid raft clustering could activate T cell adhesion-strengthening mechanisms. These include an increase in cellular rigidity, the creation of polymerized cortical F-actin structures, the induction of high affinity integrin states, an increase in surface area and symmetry of the contact plane, and resistance to shear flow detachment while adherent to fibronectin. This indicates that GM1 lipid raft aggregation defines a novel stimulus to regulate lymphocyte motility and cellular adhesion which could have important implications in T cell homing mechanisms.
Abstract:
The small leucine-rich repeat proteoglycans (or SLRPs) are a group of extracellular matrix (ECM) proteins that belong to the leucine-rich repeat (LRR) superfamily of proteins. The LRR is a protein folding motif composed of 20–30 amino acids with leucines in conserved positions. LRR-containing proteins are present in a broad spectrum of organisms and possess diverse cellular functions and localization. In mammals, the SLRPs are abundant in connective tissues, such as bones, cartilage, tendons, skin, and blood vessels. We have discovered a new member of the class I small leucine-rich repeat proteoglycan (SLRP) family which is distinct from the other class I SLRPs since it possesses a unique stretch of aspartate residues at its N-terminus. For this reason, we called the molecule asporin. The deduced amino acid sequence is about 50% identical (and 70% similar) to decorin and biglycan. However, asporin does not contain a serine/glycine dipeptide sequence required for the assembly of O-linked glycosaminoglycans and is probably not a proteoglycan. The tissue expression of asporin partially overlaps with the expression of decorin and biglycan. During mouse embryonic development, asporin mRNA expression was detected primarily in the skeleton and other specialized connective tissues; very little asporin message was detected in the major parenchymal organs. The mouse asporin gene structure is similar to that of biglycan and decorin with 8 exons. The asporin gene is localized to human chromosome 9q22-9q21.3 where asporin is part of a SLRP gene cluster that includes ECM2, osteoadherin, and osteoglycin. This gene cluster of four LRR-encoding genes is embedded in a 238 kilobase intron of another novel gene named Tes9orf that is expressed primarily in the testes of the adult mouse. The SLRP genes are not present in Drosophila or C. elegans, but reside in three separate gene clusters in the puffer fish, mice and humans.
Targeted disruption of individual mouse SLRP genes produces minor connective tissue defects such as skin fragility, tendon laxity, minor growth plate defects, and mild osteoporosis. However, double and triple knockouts of SLRP genes exacerbate these phenotypes. Both the double epiphycan/biglycan and the triple PRELP/fibromodulin/biglycan knockout mice exhibit premature osteoarthritis.
Abstract:
This study subdivides Potter Cove, King George Island, Antarctica, into seafloor regions using multivariate statistical methods. These regions serve as categories for comparing, contrasting and quantifying biogeochemical processes and biodiversity, both between geographically distinct ocean regions and between regions that are changing under global change. The division obtained is characterized by the dominating components and interpreted in terms of the prevailing environmental conditions. The analysis includes a total of 42 environmental variables, interpolated from samples taken during the austral summer seasons 2010/2011 and 2011/2012. The statistical errors of several interpolation methods (e.g. IDW, Indicator, Ordinary and Co-Kriging) with varying settings were compared and the most reasonable method was applied. The multivariate mathematical procedures used are regionalized classification via k-means cluster analysis, canonical-correlation analysis and multidimensional scaling. Canonical-correlation analysis identifies the influencing factors in the different parts of the cove. Several methods for identifying the optimum number of clusters were tested, and 4, 7, 10 and 12 were identified as reasonable numbers of clusters for Potter Cove. The 10- and 12-cluster results in particular identify marine-influenced regions which can be clearly separated from those determined by the geological catchment area and those dominated by river discharge.
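The k-means step named in this abstract can be illustrated with a short sketch. It assumes standard Lloyd iterations on tuples of numeric variables and reports the within-cluster sum of squares, which is one common input to heuristics for choosing the number of clusters. The abstract does not say which selection methods were tested, so nothing here should be read as the study's procedure; the function name `kmeans` is hypothetical.

```python
import math
import random

def kmeans(data, k, n_iter=50, seed=0):
    """Lloyd's algorithm on a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    centers = rng.sample(data, k)  # initial centres drawn from the data
    labels = [0] * len(data)
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest centre.
        labels = [min(range(k), key=lambda c: math.dist(x, centers[c]))
                  for x in data]
        # Update step: each centre moves to the mean of its members.
        new_centers = []
        for c in range(k):
            members = [x for x, lab in zip(data, labels) if lab == c]
            if members:
                new_centers.append(
                    tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centers.append(centers[c])  # keep an empty cluster's centre
        if new_centers == centers:
            break
        centers = new_centers
    # Within-cluster sum of squares for the final labels and centres.
    wss = sum(math.dist(x, centers[lab]) ** 2 for x, lab in zip(data, labels))
    return labels, centers, wss
```

Running `kmeans` for several values of `k` and comparing the returned `wss` values gives the raw material for an elbow-style inspection of candidate cluster counts.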
Abstract:
The Self-Organizing Map (SOM) is a neural network model that performs an ordered projection of a high-dimensional input space onto a low-dimensional topological structure. The process by which such a mapping is formed is defined by the SOM algorithm, which is a competitive, unsupervised and nonparametric method, since it does not make any assumption about the input data distribution. The feature maps provided by this algorithm have been successfully applied to vector quantization, clustering and high-dimensional data visualization. However, the initialization of the network topology and the selection of the SOM training parameters are two difficult tasks because the distribution of the input signals is unknown. A misconfiguration of these parameters can generate a low-quality feature map, so it is necessary to have some measure of the degree of adaptation of the SOM network to the input data model. Topology preservation is the most common concept used to implement this measure. Several qualitative and quantitative methods have been proposed for measuring the degree of SOM topology preservation, particularly using Kohonen's model. In this work, two methods for measuring the topology preservation of the Growing Cell Structures (GCSs) model are proposed: the topographic function and the topology preserving map.
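The competitive, unsupervised update described in this abstract can be sketched for a basic one-dimensional Kohonen map. This is not the Growing Cell Structures model and does not compute any topology-preservation measure. The linear decay schedules, the Gaussian neighbourhood, and the name `train_som` are assumptions made for illustration.

```python
import math
import random

def train_som(data, n_units, n_epochs=100, lr0=0.5, sigma0=None, seed=0):
    """Train a 1-D chain of units on a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    dim = len(data[0])
    sigma0 = sigma0 or n_units / 2  # assumed initial neighbourhood width
    weights = [[rng.random() for _ in range(dim)] for _ in range(n_units)]
    total_steps = n_epochs * len(data)
    step = 0
    for _ in range(n_epochs):
        for x in rng.sample(data, len(data)):  # visit inputs in random order
            frac = step / total_steps
            lr = lr0 * (1 - frac)                    # decaying learning rate
            sigma = max(sigma0 * (1 - frac), 0.01)   # shrinking neighbourhood
            # Competition: find the best-matching unit (BMU).
            bmu = min(range(n_units), key=lambda u: math.dist(x, weights[u]))
            # Cooperation: units near the BMU on the chain move towards x.
            for u in range(n_units):
                h = math.exp(-((u - bmu) ** 2) / (2 * sigma ** 2))
                for d in range(dim):
                    weights[u][d] += lr * h * (x[d] - weights[u][d])
            step += 1
    return weights
```

Because neighbouring units on the chain are pulled together early in training, nearby units tend to end up representing nearby inputs, which is the ordering property that topology-preservation measures try to quantify.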
Abstract:
This paper presents an algorithm for generating scale-free networks with adjustable clustering coefficient. The algorithm is based on a random walk procedure combined with a triangle generation scheme which takes into account genetic factors; this way, preferential attachment and clustering control are implemented using only local information. Simulations are presented which support the validity of the scheme, characterizing its tuning capabilities.
Abstract:
A new method for detecting microcalcifications in regions of interest (ROIs) extracted from digitized mammograms is proposed. The top-hat transform is a technique based on mathematical morphology operations and, in this paper, is used to perform contrast enhancement of the microcalcifications. To improve microcalcification detection, a novel image sub-segmentation approach based on the possibilistic fuzzy c-means algorithm is used. From the original ROIs, window-based features, such as the mean and standard deviation, were extracted; these features were used as an input vector in a classifier. The classifier is based on an artificial neural network to identify patterns belonging to microcalcifications and healthy tissue. Our results show that the proposed method is a good alternative for automatically detecting microcalcifications, a stage that is an important part of early breast cancer detection.
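The top-hat transform used here for contrast enhancement can be sketched as the image minus its morphological opening. The example below assumes a greyscale image stored as a list of rows, a square structuring element, and windows clipped at the image border; the abstract does not state the structuring element or border handling the authors used. All function names are hypothetical.

```python
def _window_filter(img, size, reduce_fn):
    """Apply reduce_fn (min or max) over a size x size window at each pixel."""
    h, w = len(img), len(img[0])
    r = size // 2
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [img[j][i]
                      for j in range(max(0, y - r), min(h, y + r + 1))
                      for i in range(max(0, x - r), min(w, x + r + 1))]
            row.append(reduce_fn(window))
        out.append(row)
    return out

def erode(img, size):
    return _window_filter(img, size, min)

def dilate(img, size):
    return _window_filter(img, size, max)

def white_tophat(img, size=3):
    """Image minus its opening: keeps bright details smaller than the window."""
    opened = dilate(erode(img, size), size)
    return [[p - o for p, o in zip(prow, orow)]
            for prow, orow in zip(img, opened)]
```

Small bright spots that do not survive the erosion are removed by the opening, so they remain in the difference image while the smoother background is suppressed.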
Abstract:
Industrial applications of computer vision sometimes require detection of atypical objects that occur as small groups of pixels in digital images. These objects are difficult to single out because they are small and randomly distributed. In this work we propose an image segmentation method using the novel Ant System-based Clustering Algorithm (ASCA). ASCA models the foraging behaviour of ants, which move through the data space searching for high data-density regions and leave pheromone trails on their path. The pheromone map is used to identify the exact number of clusters and to assign the pixels to these clusters using the pheromone gradient. We applied ASCA to the detection of microcalcifications in digital mammograms and compared its performance with state-of-the-art clustering algorithms such as 1D Self-Organizing Map, k-Means, Fuzzy c-Means and Possibilistic Fuzzy c-Means. The main advantage of ASCA is that the number of clusters need not be known a priori. The experimental results show that ASCA is more efficient than the other algorithms in detecting small clusters of atypical data.
Abstract:
This paper proposes a method for identifying different partial discharge (PD) sources through the analysis of a collection of PD signals acquired with a PD measurement system. The method, robust and sensitive enough to cope with noisy data and external interference, combines the characterization of each signal in the collection with a clustering procedure, the CLARA algorithm. Several features are proposed for characterizing the signals; the wavelet variances, the frequency estimated with the Prony method, and the energy are the most relevant for the performance of the clustering procedure. The result of the unsupervised classification is a set of clusters, each containing signals that are more similar to each other than to those in other clusters. Analysis of the classification results permits both the identification of different PD sources and the discrimination between original PD signals, reflections, noise and external interference. The methods and graphical tools detailed in this paper have been coded and published as a contributed package of the R environment under a GNU/GPL license.
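The general idea behind CLARA can be sketched as follows: run a k-medoids search on several random subsamples, then keep the medoid set that gives the lowest total distance over the full data set. This is an illustration only, written in Python rather than R, and it is not the code of the package the abstract mentions. The greedy swap search, the default sample size of 40 + 2k, and the use of plain numeric tuples in place of the wavelet, Prony and energy features are all assumptions.

```python
import math
import random

def total_cost(data, medoids):
    """Sum of distances from each point to its nearest medoid."""
    return sum(min(math.dist(x, m) for m in medoids) for x in data)

def pam(sample, k, rng):
    """Greedy swap search for k medoids on a small sample."""
    medoids = rng.sample(sample, k)
    best = total_cost(sample, medoids)
    improved = True
    while improved:
        improved = False
        for i in range(k):
            for cand in sample:
                if cand in medoids:
                    continue
                trial = medoids[:i] + [cand] + medoids[i + 1:]
                cost = total_cost(sample, trial)
                if cost < best:  # accept only strict improvements
                    medoids, best, improved = trial, cost, True
    return medoids

def clara(data, k, n_samples=5, sample_size=None, seed=0):
    """Cluster on subsamples, then score each medoid set on all the data."""
    rng = random.Random(seed)
    sample_size = min(len(data), sample_size or 40 + 2 * k)  # assumed default
    best_medoids, best_cost = None, math.inf
    for _ in range(n_samples):
        sample = rng.sample(data, sample_size)
        medoids = pam(sample, k, rng)
        cost = total_cost(data, medoids)
        if cost < best_cost:
            best_medoids, best_cost = medoids, cost
    labels = [min(range(k), key=lambda c: math.dist(x, best_medoids[c]))
              for x in data]
    return best_medoids, labels
```

Working on subsamples keeps the expensive swap search small, which is why this family of methods suits large collections of signals; the medoids are always actual observations, so each cluster has a representative signal that can be inspected.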