856 results for Data Driven Clustering
Abstract:
Neural stem cells (NSCs) are early precursors of neuronal and glial cells. NSCs are capable of generating identical progeny through virtually unlimited numbers of cell divisions (cell proliferation), producing daughter cells committed to differentiation. Nuclear factor kappa B (NF-kappaB) is an inducible, ubiquitous transcription factor also expressed in neurones, glia and neural stem cells. Recently, several lines of evidence have pointed to a central role of NF-kappaB in NSC proliferation control. Here, we propose a novel mathematical model for NF-kappaB-driven proliferation of NSCs. We have been able to reconstruct the molecular pathway of activation and inactivation of NF-kappaB and its influence on cell proliferation by a system of nonlinear ordinary differential equations. Then we use a combination of analytical and numerical techniques to study the model dynamics. The results obtained are illustrated by computer simulations and are, in general, in accordance with biological findings reported by several independent laboratories. The model is able to both explain and predict experimental data. Understanding the mechanisms of NSC proliferation may provide a novel outlook for both therapeutic applications and basic research.
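The abstract does not state the equations themselves; the following is a minimal sketch of the kind of model described, assuming a hypothetical two-variable system (active NF-kappaB fraction and NSC population) with invented rate constants, for illustration only.

```python
# A minimal sketch of an NF-kappaB-driven proliferation model.
# The abstract does not give the actual equations; the two-variable
# system below (active NF-kappaB fraction n, NSC population p) and all
# rate constants are hypothetical placeholders.
from scipy.integrate import solve_ivp

def rhs(t, y, k_act=1.0, k_inact=0.5, k_prolif=0.8, k_death=0.1, K=2.0):
    n, p = y
    dn = k_act * (1.0 - n) - k_inact * n           # activation/inactivation balance
    dp = k_prolif * n / (K + n) * p - k_death * p  # NF-kB-dependent proliferation
    return [dn, dp]

sol = solve_ivp(rhs, (0.0, 50.0), [0.1, 1.0], dense_output=True)
print(sol.y[:, -1])  # late-time values of (n, p)
```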
Abstract:
This paper is concerned with tensor clustering with the assistance of dimensionality reduction approaches. A class of formulations for tensor clustering is introduced based on tensor Tucker decomposition models. In this formulation, an extra tensor mode is formed by a collection of tensors of the same dimensions and then used to assist a Tucker decomposition in order to achieve data dimensionality reduction. We design two types of clustering models for the tensors: the PCA Tensor Clustering model and the Non-negative Tensor Clustering model, utilizing different regularizations. The tensor clustering can thus be solved by an optimization method based on an alternating coordinate scheme. Interestingly, our experiments show that the proposed models yield comparable or even better performance compared to recent clustering algorithms based on matrix factorization.
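As a rough sketch of Tucker-based tensor clustering (not the paper's exact PCA or Non-negative formulations): stack same-sized tensors along an extra sample mode, Tucker-decompose for dimensionality reduction, then cluster the sample-mode factor. Shapes, ranks, and the random data are illustrative.

```python
# Tucker decomposition over a stacked "sample" mode, then k-means on
# the sample-mode factor. A simplified stand-in for the paper's models.
import numpy as np
import tensorly as tl
from tensorly.decomposition import tucker
from sklearn.cluster import KMeans

samples = np.random.rand(100, 8, 8)         # 100 tensors of size 8x8
X = tl.tensor(samples)                      # first mode = samples
core, factors = tucker(X, rank=[10, 4, 4])  # reduce all modes
labels = KMeans(n_clusters=3, n_init=10).fit_predict(factors[0])
print(labels[:10])
```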
Abstract:
In this article, along with others, we take the position that the Null-Subject Parameter (NSP) (Chomsky 1981; Rizzi 1982) cluster of properties is narrower in scope than some originally contended. We test for the resetting of the NSP by English L2 learners of Spanish at the intermediate level, including poverty-of-the-stimulus knowledge of the Overt Pronoun Constraint (Montalbetti 1984). Our participants are tested before and after five months' residency in Spain in an effort to see if increased amounts of native exposure are particularly beneficial for parameter resetting. Although we demonstrate NSP resetting for some of the L2 learners, our data essentially demonstrate that even with additional time and exposure to native input, there is no immediate gainful effect for NSP resetting.
Abstract:
We propose a geoadditive negative binomial model (Geo-NB-GAM) for regional count data that allows us to address simultaneously some important methodological issues, such as spatial clustering, nonlinearities, and overdispersion. This model is applied to the study of location determinants of inward greenfield investments that occurred during 2003–2007 in 249 European regions. After presenting the data set and showing the presence of overdispersion and spatial clustering, we review the theoretical framework that motivates the choice of the location determinants included in the empirical model, and we highlight some reasons why the relationship between some of the covariates and the dependent variable might be nonlinear. The subsequent section first describes the solutions proposed by previous literature to tackle spatial clustering, nonlinearities, and overdispersion, and then presents the Geo-NB-GAM. The empirical analysis shows the good performance of Geo-NB-GAM. Notably, the inclusion of a geoadditive component (a smooth spatial trend surface) permits us to control for spatial unobserved heterogeneity that induces spatial clustering. Allowing for nonlinearities reveals, in keeping with theoretical predictions, that the positive effect of agglomeration economies fades as the density of economic activities reaches some threshold value. However, no matter how dense the economic activity becomes, our results suggest that congestion costs never overcome positive agglomeration externalities.
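As a rough Python analogue of a geoadditive negative binomial GAM (the paper's Geo-NB-GAM itself is not reproduced, and was presumably fit with other software); the variable names and synthetic data below are hypothetical, and the additive lon/lat smooths only approximate a true spatial trend surface.

```python
# A negative binomial GAM with spline smooths for space and density,
# using statsmodels. All data and variable names are invented.
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.gam.api import GLMGam, BSplines

rng = np.random.default_rng(0)
df = pd.DataFrame({
    "fdi_count": rng.poisson(3, 249),      # greenfield investment counts
    "lon": rng.uniform(-10, 30, 249),
    "lat": rng.uniform(35, 60, 249),
    "density": rng.lognormal(0, 1, 249),   # economic activity density
})
# Smooths: additive lon/lat terms (crude spatial trend) + nonlinear density.
smoother = BSplines(df[["lon", "lat", "density"]], df=[6, 6, 5], degree=[3, 3, 3])
model = GLMGam.from_formula("fdi_count ~ 1", data=df, smoother=smoother,
                            family=sm.families.NegativeBinomial())
res = model.fit()
print(res.summary())
```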
Abstract:
Parkinson's disease is a neurodegenerative disease in which tremor is the main symptom. This paper investigates the use of different classification methods to identify tremors experienced by Parkinsonian patients. Some previous research has focussed tremor analysis on external body signals (e.g., electromyography, accelerometer signals, etc.). Our advantage is that we have access to sub-cortical data, which facilitates the applicability of the obtained results to real medical devices, since we are dealing with brain signals directly. Local field potentials (LFP) were recorded in the subthalamic nucleus of 7 Parkinsonian patients through the implanted electrodes of a deep brain stimulation (DBS) device prior to its internalization. Measured LFP signals were preprocessed by means of splitting, down-sampling, filtering, normalization and rectification. Then, feature extraction was conducted through a multi-level decomposition via a wavelet transform. Finally, artificial intelligence techniques were applied to feature selection, clustering of tremor types, and tremor detection. The key contribution of this paper is to present initial results which indicate, to a high degree of certainty, that there appear to be two distinct subgroups of patients within group 1 of patients according to the Consensus Statement of the Movement Disorder Society on Tremor. Such results may well lead to different treatments for the patients involved, depending on how their tremor has been classified. Moreover, we propose a new approach for demand-driven stimulation, in which tremor detection is also based on the subtype of tremor the patient has. Applying this knowledge to the tremor detection problem, it can be concluded that the results improve when patient clustering is applied prior to detection.
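A sketch of the kind of pipeline described: band-pass filter an LFP trace, extract wavelet sub-band energies, and cluster patients. The filter settings, wavelet choice ('db4'), cluster count, and synthetic signals are illustrative assumptions, not the paper's configuration.

```python
# Wavelet sub-band energy features from (synthetic) LFP traces,
# followed by k-means clustering of patients.
import numpy as np
import pywt
from scipy.signal import butter, filtfilt
from sklearn.cluster import KMeans

def wavelet_energy_features(lfp, fs=1000, wavelet="db4", level=5):
    b, a = butter(4, [2, 45], btype="band", fs=fs)  # tremor-relevant band
    filtered = filtfilt(b, a, lfp)
    rectified = np.abs(filtered - filtered.mean())  # mean-remove and rectify
    coeffs = pywt.wavedec(rectified, wavelet, level=level)
    return np.array([np.sum(c ** 2) for c in coeffs])  # energy per sub-band

rng = np.random.default_rng(1)
feats = np.vstack([wavelet_energy_features(rng.standard_normal(5000))
                   for _ in range(7)])              # one trace per patient
labels = KMeans(n_clusters=2, n_init=10).fit_predict(feats)
print(labels)  # candidate tremor subgroups
```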
Abstract:
A strong correlation between the speed of the eddy-driven jet and the width of the Hadley cell is found to exist in the Southern Hemisphere, both in reanalysis data and in twenty-first-century integrations from the Intergovernmental Panel on Climate Change (IPCC) Fifth Assessment Report multimodel archive. Analysis of the space–time spectra of eddy momentum flux reveals that variations in eddy-driven jet speed are related to changes in the mean phase speed of midlatitude eddies. An increase in eddy phase speeds induces a poleward shift of the critical latitudes and a poleward expansion of the region of subtropical wave breaking. The associated changes in eddy momentum flux convergence are balanced by anomalous meridional winds consistent with a wider Hadley cell. At the same time, faster eddies are also associated with a strengthened poleward eddy momentum flux, sustaining a stronger westerly jet in midlatitudes. The proposed mechanism is consistent with the seasonal dependence of the interannual variability of the Hadley cell width and appears to explain at least part of the projected twenty-first-century trends.
Abstract:
The concentrations of sulfate, black carbon (BC) and other aerosols in the Arctic are characterized by high values in late winter and spring (so-called Arctic Haze) and low values in summer. Models have long been struggling to capture this seasonality and especially the high concentrations associated with Arctic Haze. In this study, we evaluate sulfate and BC concentrations from eleven different models driven with the same emission inventory against a comprehensive pan-Arctic measurement data set over a time period of 2 years (2008–2009). The set of models consisted of one Lagrangian particle dispersion model, four chemistry transport models (CTMs), one atmospheric chemistry-weather forecast model and five chemistry climate models (CCMs), of which two were nudged to meteorological analyses and three were running freely. The measurement data set consisted of surface measurements of equivalent BC (eBC) from five stations (Alert, Barrow, Pallas, Tiksi and Zeppelin), elemental carbon (EC) from Station Nord and Alert and aircraft measurements of refractory BC (rBC) from six different campaigns. We find that the models generally captured the measured eBC or rBC and sulfate concentrations quite well, compared to previous comparisons. However, the aerosol seasonality at the surface is still too weak in most models. Concentrations of eBC and sulfate averaged over three surface sites are underestimated in winter/spring in all but one model (model means for January–March underestimated by 59 and 37 % for BC and sulfate, respectively), whereas concentrations in summer are overestimated in the model mean (by 88 and 44 % for July–September), but with overestimates as well as underestimates present in individual models. The most pronounced eBC underestimates, not included in the above multi-site average, are found for the station Tiksi in Siberia where the measured annual mean eBC concentration is 3 times higher than the average annual mean for all other stations. This suggests an underestimate of BC sources in Russia in the emission inventory used. Based on the campaign data, biomass burning was identified as another cause of the modeling problems. For sulfate, very large differences were found in the model ensemble, with an apparent anti-correlation between modeled surface concentrations and total atmospheric columns. There is a strong correlation between observed sulfate and eBC concentrations with consistent sulfate/eBC slopes found for all Arctic stations, indicating that the sources contributing to sulfate and BC are similar throughout the Arctic and that the aerosols are internally mixed and undergo similar removal. However, only three models reproduced this finding, whereas sulfate and BC are weakly correlated in the other models. Overall, no class of models (e.g., CTMs, CCMs) performed better than the others and differences are independent of model resolution.
Abstract:
Blanket bog occupies approximately 6 % of the area of the UK today. The Holocene expansion of this hyperoceanic biome has previously been explained as a consequence of Neolithic forest clearance. However, the present distribution of blanket bog in Great Britain can be predicted accurately with a simple model (PeatStash) based on summer temperature and moisture index thresholds, and the same model correctly predicts the highly disjunct distribution of blanket bog worldwide. This finding suggests that climate, rather than land-use history, controls blanket-bog distribution in the UK and everywhere else. We set out to test this hypothesis for blanket bogs in the UK using bioclimate envelope modelling compared with a database of peat initiation age estimates. We used both pollen-based reconstructions and climate model simulations of climate changes between the mid-Holocene (6000 yr BP, 6 ka) and modern climate to drive PeatStash and predict areas of blanket bog. We compiled data on the timing of blanket-bog initiation, based on 228 age determinations at sites where peat directly overlies mineral soil. The model predicts that large areas of northern Britain would have had blanket bog by 6000 yr BP, and the area suitable for peat growth extended to the south after this time. A similar pattern is shown by the basal peat ages, and new blanket bog appeared over a larger area during the late Holocene, the greatest expansion being in Ireland, Wales and southwest England, as the model predicts. The expansion was driven by a summer cooling of about 2 °C, shown by both pollen-based reconstructions and climate models. The data show early Holocene (pre-Neolithic) blanket-bog initiation at over half of the sites in the core areas of Scotland and northern England. The temporal patterns and concurrence of the bioclimate model predictions and initiation data suggest that climate change provides a parsimonious explanation for the early Holocene distribution and later expansion of blanket bogs in the UK, and it is not necessary to invoke anthropogenic activity as a driver of this major landscape change.
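As a toy illustration of a threshold-based bioclimate envelope in the spirit of PeatStash: classify grid cells as suitable for blanket bog when summer temperature and a moisture index pass thresholds. The cutoff values and example fields below are invented, not those of the actual model.

```python
# Threshold-based suitability mask over gridded climate fields.
import numpy as np

def blanket_bog_suitable(summer_temp_c, moisture_index,
                         temp_max=14.5, mi_min=1.0):
    """True where summer climate permits blanket-bog growth (toy rule)."""
    return (summer_temp_c <= temp_max) & (moisture_index >= mi_min)

summer_temp = np.array([[12.0, 15.2], [13.9, 16.0]])  # degrees C
moisture = np.array([[1.4, 0.8], [1.1, 0.9]])         # P/E-style index
print(blanket_bog_suitable(summer_temp, moisture))
```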
Abstract:
With the fast development of wireless communications, ZigBee and semiconductor devices, home automation networks have recently become very popular. Since typical consumer products deployed in home automation networks are often powered by tiny and limited batteries, one of the most challenging research issues concerns energy reduction and the balancing of energy consumption across the network in order to prolong the home network lifetime for consumer devices. The introduction of clustering and sink mobility techniques into home automation networks has been shown to be an efficient way to improve network performance and has received significant research attention. Taking inspiration from nature, this paper proposes an Ant Colony Optimization (ACO) based clustering algorithm specifically with mobile sink support for home automation networks. In this work, the network is divided into several clusters and cluster heads are selected within each cluster. Then, a mobile sink communicates with each cluster head to collect data directly through short range communications. The ACO algorithm has been utilized in this work in order to find the optimal mobility trajectory for the mobile sink. Extensive simulation results from this research show that the proposed algorithm significantly improves home network performance when using mobile sinks in terms of energy consumption and network lifetime, as compared to other routing algorithms currently deployed for home automation networks.
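As a compact sketch of how an ACO search could plan a mobile-sink tour over cluster heads (a TSP-style trajectory); the parameters, ant counts, and random cluster-head layout are illustrative, not taken from the paper.

```python
# Ant-colony search for a short tour over cluster-head positions.
import numpy as np

rng = np.random.default_rng(2)
heads = rng.uniform(0, 100, size=(6, 2))             # cluster-head coordinates
dist = np.linalg.norm(heads[:, None] - heads[None, :], axis=-1) + 1e-9
n = len(heads)
pher = np.ones((n, n))                               # pheromone levels
best_tour, best_len = None, np.inf

for _ in range(50):                                  # ACO iterations
    for _ in range(10):                              # ants per iteration
        tour = [0]
        while len(tour) < n:
            i = tour[-1]
            mask = np.ones(n, bool)
            mask[tour] = False                       # exclude visited heads
            w = pher[i] * (1.0 / dist[i]) ** 2.0     # pheromone x heuristic
            w = np.where(mask, w, 0.0)
            tour.append(int(rng.choice(n, p=w / w.sum())))
        length = sum(dist[a, b] for a, b in zip(tour, tour[1:] + tour[:1]))
        if length < best_len:
            best_tour, best_len = tour, length
    pher *= 0.9                                      # evaporation
    for a, b in zip(best_tour, best_tour[1:] + best_tour[:1]):
        pher[a, b] += 1.0 / best_len                 # reinforce best tour

print(best_tour, round(best_len, 1))
```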
Abstract:
This article contains raw and processed data related to research published by Bryant et al. [1]. Data were obtained by MS-based proteomics, analysing trichome-enriched, trichome-depleted and whole leaf samples taken from the medicinal plant Artemisia annua and searching the acquired MS/MS data against a recently published contig database [2] and other genomic and proteomic sequence databases for comparison. The processed data show that an order of magnitude more proteins have been identified from trichome-enriched Artemisia annua samples in comparison to previously published data. Proteins known to have a role in the biosynthesis of artemisinin were found, along with other highly abundant proteins that imply additional enzymatically driven processes occurring within the trichomes that are significant for the biosynthesis of artemisinin.
Abstract:
Subspace clustering groups a set of samples from a union of several linear subspaces into clusters, so that the samples in the same cluster are drawn from the same linear subspace. In the majority of the existing work on subspace clustering, clusters are built based on feature information, while sample correlations in their original spatial structure are simply ignored. Moreover, the original high-dimensional feature vectors contain noisy/redundant information, and the time complexity grows exponentially with the number of dimensions. To address these issues, we propose a tensor low-rank representation (TLRR) and sparse coding-based (TLRRSC) subspace clustering method that simultaneously considers feature information and spatial structures. TLRR seeks the lowest-rank representation over the original spatial structure along all spatial directions. Sparse coding learns a dictionary along the feature spaces, so that each sample can be represented by a few atoms of the learned dictionary. The affinity matrix used for spectral clustering is built from the joint similarities in both the spatial and feature spaces. TLRRSC can capture the global structure and inherent feature information of the data well, and provides a robust subspace segmentation from corrupted data. Experimental results on both synthetic and real-world data sets show that TLRRSC outperforms several established state-of-the-art methods.
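As a simplified sketch of the final stage described: combine a spatial similarity and a feature similarity into one affinity matrix and apply spectral clustering. The similarity construction below is a stand-in; the paper's TLRR and sparse-coding steps are not reproduced, and all data are synthetic.

```python
# Joint spatial/feature affinity feeding spectral clustering.
import numpy as np
from sklearn.cluster import SpectralClustering
from sklearn.metrics.pairwise import rbf_kernel

rng = np.random.default_rng(3)
feats = rng.standard_normal((60, 20))   # per-sample feature codes
coords = rng.uniform(0, 1, (60, 2))     # original spatial positions

affinity = rbf_kernel(feats) * rbf_kernel(coords)   # joint similarity
labels = SpectralClustering(n_clusters=3, affinity="precomputed",
                            random_state=0).fit_predict(affinity)
print(np.bincount(labels))
```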
Abstract:
Tensor clustering is an important tool that exploits intrinsically rich structures in real-world multiarray or tensor datasets. In dealing with such datasets, standard practice is to use subspace clustering based on vectorizing the multiarray data. However, vectorization of tensorial data does not exploit the complete structural information. In this paper, we propose a subspace clustering algorithm without adopting any vectorization process. Our approach is based on a novel heterogeneous Tucker decomposition model that takes cluster membership information into account. We propose a new clustering algorithm that alternates between the different modes of the proposed heterogeneous tensor model. All but the last mode have closed-form updates. Updating the last mode reduces to optimizing over the multinomial manifold, for which we investigate second-order Riemannian geometry and propose a trust-region algorithm. Numerical experiments show that our proposed algorithm competes effectively with state-of-the-art clustering algorithms based on tensor factorization.
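A loose sketch of the idea of a cluster-membership mode: Tucker-decompose with the last mode indexing samples, then force that factor to be row-stochastic. The projection below is a crude stand-in for the paper's Riemannian trust-region step on the multinomial manifold, and the data and ranks are invented.

```python
# Last-mode factor as (approximate) cluster membership probabilities.
import numpy as np
import tensorly as tl
from tensorly.decomposition import tucker

X = tl.tensor(np.random.rand(8, 8, 50))     # last mode = 50 samples
core, factors = tucker(X, rank=[4, 4, 3])   # rank 3 = number of clusters
M = np.abs(factors[-1])
M = M / M.sum(axis=1, keepdims=True)        # push rows onto the simplex
labels = M.argmax(axis=1)                   # hard cluster assignments
print(np.bincount(labels, minlength=3))
```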
Abstract:
Recurrent submicroscopic genomic copy number changes are the result of nonallelic homologous recombination (NAHR). Nonrecurrent aberrations, however, can result from different nonexclusive recombination-repair mechanisms. We previously described small microduplications at Xq28 containing MECP2 in four male patients with a severe neurological phenotype. Here, we report on the fine-mapping and breakpoint analysis of 16 unique microduplications. The size of the overlapping copy number changes varies between 0.3 and 2.3 Mb, and FISH analysis on three patients demonstrated a tandem orientation. Although eight of the 32 breakpoint regions coincide with low-copy repeats, none of the duplications are the result of NAHR. Bioinformatics analysis of the breakpoint regions demonstrated a 2.5-fold higher frequency of Alu interspersed repeats as compared with control regions, as well as a very high GC content (53%). Unexpectedly, we obtained the junction in only one patient by long-range PCR, which revealed nonhomologous end joining as the mechanism. Breakpoint analysis in two other patients by inverse PCR and subsequent array comparative genomic hybridization analysis demonstrated the presence of a second duplicated region more telomeric at Xq28, of which one copy was inserted in between the duplicated MECP2 regions. These data suggest a two-step mechanism in which part of Xq28 is first inserted near the MECP2 locus, followed by breakage-induced replication with strand invasion of the normal sister chromatid. Our results indicate that the mechanism by which copy number changes occur in regions with a complex genomic architecture can yield complex rearrangements.
Abstract:
Motivation: DNA assembly programs classically perform an all-against-all comparison of reads to identify overlaps, followed by a multiple sequence alignment and generation of a consensus sequence. If the aim is to assemble a particular segment, instead of a whole genome or transcriptome, a target-specific assembly is a more sensible approach. GenSeed is a Perl program that implements a seed-driven recursive assembly consisting of cycles comprising a similarity search, read selection and assembly. The iterative process results in a progressive extension of the original seed sequence. GenSeed was tested and validated on many applications, including the reconstruction of nuclear genes or segments, full-length transcripts, and extrachromosomal genomes. The robustness of the method was confirmed through the use of a variety of DNA and protein seeds, including short sequences derived from SAGE and proteome projects.
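As a toy illustration of the seed-driven recursive cycle (similarity search, read selection, extension): real GenSeed delegates the similarity search to tools such as BLAST and uses an external assembler, so the naive substring matching and single-read extension below are only stand-ins.

```python
# Iteratively extend a seed sequence using reads that overlap its tail.
def extend_seed(seed, reads, k=6, max_cycles=100):
    """Grow the seed while some read overlaps its last k bases (toy model)."""
    for _ in range(max_cycles):
        grown = False
        for read in reads:                   # "similarity search"
            idx = read.find(seed[-k:])       # overlap with the seed tail
            if idx != -1 and idx + k < len(read):
                seed += read[idx + k:]       # extend with the overhang
                grown = True
                break
        if not grown:                        # no read extends the seed
            return seed
    return seed

reads = ["ACGTACGTGGTTAAC", "GGTTAACCTTGGAC", "TTGGACTTAGCAT"]
print(extend_seed("ACGTACGTGGT", reads))
```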