947 results for association rule mining
Abstract:
Users can rarely reveal their information need in full detail to a search engine within 1-2 words, so search engines need to "hedge their bets" and present diverse results within the precious 10 response slots. Diversity in ranking is of much recent interest. Most existing solutions estimate the marginal utility of an item given a set of items already in the response, and then use variants of greedy set cover. Others design graphs with the items as nodes and choose diverse items based on visit rates (PageRank). Here we introduce a radically new and natural formulation of diversity as finding centers in resistive graphs. Unlike in PageRank, we do not specify the edge resistances (equivalently, conductances) and ask for node visit rates. Instead, we look for a sparse set of center nodes so that the effective conductance from the center to the rest of the graph has maximum entropy. We give a cogent semantic justification for turning PageRank thus on its head. In marked deviation from prior work, our edge resistances are learnt from training data. Inference and learning are NP-hard, but we give practical solutions. In extensive experiments with subtopic retrieval, social network search, and document summarization, our approach convincingly surpasses recently published diversity algorithms like subtopic cover, max-marginal relevance (MMR), Grasshopper, DivRank, and SVMdiv.
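As an illustration of the greedy marginal-utility baselines this abstract contrasts itself with, here is a minimal sketch of max-marginal relevance (MMR). It is not the resistive-graph method of the paper. The function name `mmr`, the trade-off parameter `lam`, and the toy relevance and similarity values are assumptions made for the example.

```python
def mmr(relevance, similarity, k, lam=0.5):
    """Greedily pick k items, trading relevance against redundancy.

    relevance[i]     : relevance of item i to the query (assumed given)
    similarity[i][j] : similarity between items i and j (assumed given)
    lam              : 1.0 means pure relevance, 0.0 means pure diversity
    """
    selected = []
    candidates = set(range(len(relevance)))
    while candidates and len(selected) < k:
        def score(i):
            # Redundancy is the similarity to the closest already-chosen item.
            redundancy = max((similarity[i][j] for j in selected), default=0.0)
            return lam * relevance[i] - (1 - lam) * redundancy
        # Iterate in sorted order so ties are broken deterministically.
        best = max(sorted(candidates), key=score)
        selected.append(best)
        candidates.remove(best)
    return selected

# Toy data: items 0 and 1 are near-duplicates, item 2 is different.
relevance = [0.9, 0.85, 0.5]
similarity = [
    [1.0, 0.95, 0.1],
    [0.95, 1.0, 0.1],
    [0.1, 0.1, 1.0],
]
diverse = mmr(relevance, similarity, k=2, lam=0.5)
by_relevance = mmr(relevance, similarity, k=2, lam=1.0)
```

With `lam=0.5` the near-duplicate item 1 is skipped in favour of item 2; with `lam=1.0` the ranking falls back to plain relevance order.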
Abstract:
Theoretical and computational frameworks for synaptic plasticity and learning have a long and cherished history, with few parallels within the well-established literature for plasticity of voltage-gated ion channels. In this study, we derive rules for plasticity in the hyperpolarization-activated cyclic nucleotide-gated (HCN) channels, and assess the synergy between synaptic and HCN channel plasticity in establishing stability during synaptic learning. To do this, we employ a conductance-based model for the hippocampal pyramidal neuron, and incorporate synaptic plasticity through the well-established Bienenstock-Cooper-Munro (BCM)-like rule for synaptic plasticity, wherein the direction and strength of the plasticity is dependent on the concentration of calcium influx. Under this framework, we derive a rule for HCN channel plasticity to establish homeostasis in synaptically-driven firing rate, and incorporate such plasticity into our model. In demonstrating that this rule for HCN channel plasticity helps maintain firing rate homeostasis after bidirectional synaptic plasticity, we observe a linear relationship between synaptic plasticity and HCN channel plasticity for maintaining firing rate homeostasis. Motivated by this linear relationship, we derive a calcium-dependent rule for HCN-channel plasticity, and demonstrate that firing rate homeostasis is maintained in the face of synaptic plasticity when moderate and high levels of cytosolic calcium influx induced depression and potentiation of the HCN-channel conductance, respectively. Additionally, we show that such synergy between synaptic and HCN-channel plasticity enhances the stability of synaptic learning through metaplasticity in the BCM-like synaptic plasticity profile. Finally, we demonstrate that the synergistic interaction between synaptic and HCN-channel plasticity preserves robustness of information transfer across the neuron under a rate-coding schema. 
Our results establish specific physiological roles for experimentally observed plasticity in HCN channels accompanying synaptic plasticity in hippocampal neurons, and uncover potential links between HCN-channel plasticity and calcium influx, dynamic gain control and stable synaptic learning.
Abstract:
In the underlay mode of cognitive radio, secondary users are allowed to transmit when the primary is transmitting, but under tight interference constraints that protect the primary. However, these constraints limit the secondary system performance. Antenna selection (AS)-based multiple antenna techniques, which exploit spatial diversity with less hardware, help improve secondary system performance. We develop a novel and optimal transmit AS rule that minimizes the symbol error probability (SEP) of an average interference-constrained multiple-input-single-output secondary system that operates in the underlay mode. We show that the optimal rule is a non-linear function of the power gain of the channel from the secondary transmit antenna to the primary receiver and from the secondary transmit antenna to the secondary receive antenna. We also propose a simpler, tractable variant of the optimal rule that performs as well as the optimal rule. We then analyze its SEP with L transmit antennas, and extensively benchmark it with several heuristic selection rules proposed in the literature. We also enhance these rules in order to provide a fair comparison, and derive new expressions for their SEPs. The results bring out new inter-relationships between the various rules, and show that the optimal rule can significantly reduce the SEP.
Abstract:
The rapid growth in the field of data mining has led to the development of various methods for outlier detection. Though detection of outliers has been well explored in the context of numerical data, dealing with categorical data is still evolving. In this paper, we propose a two-phase algorithm for detecting outliers in categorical data based on a novel definition of outliers. In the first phase, this algorithm explores a clustering of the given data, followed by the ranking phase for determining the set of most likely outliers. The proposed algorithm is expected to perform better as it can identify different types of outliers, employing two independent ranking schemes based on the attribute value frequencies and the inherent clustering structure in the given data. Unlike some existing methods, the computational complexity of this algorithm is not affected by the number of outliers to be detected. The efficacy of this algorithm is demonstrated through experiments on various public domain categorical data sets.
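To make the idea of a frequency-based ranking concrete, the following is a minimal sketch of a generic attribute-value-frequency score for categorical records. It is an assumption-laden illustration, not the two-phase algorithm of the paper: the clustering-based ranking is omitted, and the names `avf_scores` and `rank_outliers` and the toy records are hypothetical.

```python
from collections import Counter

def avf_scores(records):
    """Mean relative frequency of each record's attribute values.

    A low score means the record is built from rare values,
    which makes it a more likely outlier under this heuristic.
    """
    n = len(records)
    n_attrs = len(records[0])
    # Count how often each value occurs in each attribute (column).
    counts = [Counter(row[j] for row in records) for j in range(n_attrs)]
    return [
        sum(counts[j][row[j]] / n for j in range(n_attrs)) / n_attrs
        for row in records
    ]

def rank_outliers(records, top=1):
    """Indices of the `top` records with the lowest frequency score."""
    scores = avf_scores(records)
    order = sorted(range(len(records)), key=lambda i: scores[i])
    return order[:top]

# Toy categorical data: (colour, size).
data = [
    ("red", "small"),
    ("red", "small"),
    ("red", "large"),
    ("blue", "small"),
    ("green", "huge"),
]
most_unusual = rank_outliers(data, top=1)
```

Here the last record combines two values that each occur only once, so it receives the lowest score and is ranked first.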
Abstract:
This paper primarily intends to develop a GIS (geographical information system)-based data mining approach for optimally selecting the locations and determining installed capacities for setting up distributed biomass power generation systems in the context of decentralized energy planning for rural regions. The optimal locations within a cluster of villages are obtained by matching the installed capacity needed with the demand for power, minimizing the cost of transportation of biomass from dispersed sources to power generation system, and cost of distribution of electricity from the power generation system to demand centers or villages. The methodology was validated by using it for developing an optimal plan for implementing distributed biomass-based power systems for meeting the rural electricity needs of Tumkur district in India consisting of 2700 villages. The approach uses a k-medoid clustering algorithm to divide the total region into clusters of villages and locate biomass power generation systems at the medoids. The optimal value of k is determined iteratively by running the algorithm for the entire search space for different values of k along with demand-supply matching constraints. The optimal value of the k is chosen such that it minimizes the total cost of system installation, costs of transportation of biomass, and transmission and distribution. A smaller region, consisting of 293 villages was selected to study the sensitivity of the results to varying demand and supply parameters. The results of clustering are represented on a GIS map for the region.
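The procedure described above (cluster the villages with k-medoids, place a plant at each medoid, and pick the k with the lowest total cost) can be sketched as follows. This is a simplified illustration under stated assumptions: straight-line distances replace GIS road distances, demand-supply matching constraints are not modelled, village locations are assumed distinct, and `install_cost`, `haul_cost`, and the toy coordinates are hypothetical.

```python
import math

def dist(a, b):
    """Straight-line distance between two (x, y) locations."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

def k_medoids(points, k, max_iter=100):
    """Alternating k-medoids; returns the indices of the chosen medoids."""
    medoids = list(range(k))  # deterministic start: the first k points
    for _ in range(max_iter):
        # Assignment step: attach every point to its nearest medoid.
        clusters = {m: [] for m in medoids}
        for i, p in enumerate(points):
            nearest = min(medoids, key=lambda m: dist(p, points[m]))
            clusters[nearest].append(i)
        # Update step: the new medoid is the member closest to all others.
        new_medoids = []
        for m, members in clusters.items():
            if not members:  # guard for empty clusters
                new_medoids.append(m)
                continue
            best = min(
                members,
                key=lambda c: sum(dist(points[c], points[i]) for i in members),
            )
            new_medoids.append(best)
        if sorted(new_medoids) == sorted(medoids):
            break
        medoids = new_medoids
    return sorted(medoids)

def total_cost(points, demands, medoids, install_cost, haul_cost):
    """Installation cost plus demand-weighted distance to the nearest plant."""
    transport = sum(
        d * min(dist(p, points[m]) for m in medoids)
        for p, d in zip(points, demands)
    )
    return install_cost * len(medoids) + haul_cost * transport

def best_k(points, demands, install_cost, haul_cost):
    """Try every k and return (cost, k, medoids) for the cheapest plan."""
    options = []
    for k in range(1, len(points) + 1):
        medoids = k_medoids(points, k)
        cost = total_cost(points, demands, medoids, install_cost, haul_cost)
        options.append((cost, k, medoids))
    return min(options)

# Toy region: two groups of three villages with unit demand.
villages = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
demands = [1.0] * len(villages)
cost, k, medoids = best_k(villages, demands, install_cost=10.0, haul_cost=1.0)
```

On this toy region the search settles on two plants, one per village group: adding a third plant costs more to install than it saves in transport.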
Abstract:
Background: Recent research on glioblastoma (GBM) has focused on deducing gene signatures predicting prognosis. The present study evaluated the mRNA expression of selected genes and correlated it with outcome to arrive at a prognostic gene signature. Methods: Patients with GBM (n = 123) were prospectively recruited, treated with a uniform protocol and followed up. Expression of 175 genes in GBM tissue was determined using qRT-PCR. A supervised principal component analysis followed by derivation of gene signature was performed. Independent validation of the signature was done using TCGA data. Gene Ontology and KEGG pathway analysis was carried out among patients from the TCGA cohort. Results: A 14-gene signature was identified that predicted outcome in GBM. A weighted gene (WG) score was found to be an independent predictor of survival in multivariate analysis in the present cohort (HR = 2.507; B = 0.919; p < 0.001) and in the TCGA cohort. Risk stratification by standardized WG score classified patients into low and high risk groups, predicting survival both in our cohort (p < 0.001) and the TCGA cohort (p = 0.001). Pathway analysis using the most differentially regulated genes (n = 76) between the low and high risk groups revealed association of activated inflammatory/immune response pathways and mesenchymal subtype in the high risk group. Conclusion: We have identified a 14-gene expression signature that can predict survival in GBM patients. A network analysis revealed activation of the inflammatory response pathway specifically in the high risk group. These findings may have implications for the understanding of gliomagenesis, development of targeted therapies and selection of high risk cancer patients for alternate adjuvant therapies.
Abstract:
Background: Insulin-like growth factor binding proteins modulate the mitogenic and pro-survival effects of IGF. Elevated expression of IGFBP2 is associated with progression of tumors including prostate, ovarian, and glioma, among others. Though implicated in the progression of breast cancer, the molecular mechanisms involved in IGFBP2 actions are not well defined. This study investigates the molecular targets and biological pathways targeted by IGFBP2 in breast cancer. Methods: Transcriptome analysis of breast tumor cells (BT474) with stable knockdown of IGFBP2 and breast tumors having differential expression of IGFBP2 by immunohistochemistry was performed using microarray. Differential gene expression was established using the R-Bioconductor package. For validation, gene expression was determined by qPCR. Inhibitors of the IGF1R and integrin pathways were utilized to study the mechanism of regulation of beta-catenin. Immunohistochemical and immunocytochemical staining was performed on breast tumors and experimental cells, respectively, for beta-catenin and IGFBP2 expression. Results: Knockdown of IGFBP2 resulted in differential expression of 2067 upregulated and 2002 downregulated genes in breast cancer cells. Downregulated genes principally belong to cell cycle, DNA replication, repair, p53 signaling, oxidative phosphorylation, and Wnt signaling. Whole genome expression analysis of breast tumors with or without IGFBP2 expression indicated changes in genes belonging to focal adhesion, MAP kinase and Wnt signaling pathways. Interestingly, IGFBP2 knockdown clones showed reduced expression of beta-catenin compared to control cells, which was restored upon IGFBP2 re-expression. The regulation of beta-catenin by IGFBP2 was found to be IGF1R and integrin pathway dependent. Furthermore, IGFBP2 and beta-catenin are co-ordinately overexpressed in breast tumors and correlate with lymph node metastasis.
Conclusion: This study highlights regulation of beta-catenin by IGFBP2 in breast cancer cells and most importantly, combined expression of IGFBP2 and beta-catenin is associated with lymph node metastasis of breast tumors.
Abstract:
A causative agent in approximately 40% of diarrhea cases still remains unidentified. Though many enteroviruses (EVs) are transmitted through the fecal-oral route and replicate in the intestinal cells, their association with acute diarrhea has not so far been recognized due to lack of detailed epidemiological investigations. This long-term, detailed molecular epidemiological study aims to conclusively determine the association of non-polio enteroviruses (NPEVs) with acute diarrhea in comparison with rotavirus (RV) in children. Diarrheal stool specimens from 2161 children aged 0-2 years and 169 children between 2 and 9 years, and 1800 normal stool samples from age-matched healthy children between 0 and 9 years were examined during 2008-2012 for enterovirus (oral polio vaccine strains (OPVs) and NPEVs). Enterovirus serotypes were identified by complete VP1 gene sequence analysis. Enterovirus and rotavirus were detected in 19.01% (380/2330) and 13.82% (322/2330) diarrheal stools. During the study period, annual prevalence of EV- and RV-associated diarrhea ranged between 8% and 22%, but with contrasting seasonal prevalence, with RV predominating during winter months and NPEV prevailing in other seasons. NPEVs are associated with epidemic-like outbreaks during which they are detected in up to 50% of diarrheic children, and in non-epidemic seasons in 0-10% of the patients. After subtraction of OPV-positive diarrheal cases (1.81%), while NPEVs are associated with about 17% of acute diarrhea, about 6% of healthy children showed asymptomatic NPEV excretion. Of 37 NPEV serotypes detected in diarrheal children, seven echovirus types 1, 7, 11, 13, 14, 30 and 33 are frequently observed, with E11 being more prevalent followed by E30. In conclusion, NPEVs are significantly associated with acute diarrhea, and NPEVs and rotavirus exhibit contrasting seasonal predominance.
This study signifies the need for a new direction of research on enteroviruses involving systematic analysis of their contribution to diarrheal burden. (C) 2013 Elsevier B.V. All rights reserved.
Abstract:
Mycobacterium tuberculosis owes its high pathogenic potential to its ability to evade host immune responses and thrive inside the macrophage. The outcome of infection is largely determined by the cellular response comprising a multitude of molecular events. The complexity and inter-relatedness in the processes makes it essential to adopt systems approaches to study them. In this work, we construct a comprehensive network of infection-related processes in a human macrophage comprising 1888 proteins and 14,016 interactions. We then compute response networks based on available gene expression profiles corresponding to states of health, disease and drug treatment. We use a novel formulation for mining response networks that has led to identifying highest activities in the cell. Highest activity paths provide mechanistic insights into pathogenesis and response to treatment. The approach used here serves as a generic framework for mining dynamic changes in genome-scale protein interaction networks.
Abstract:
There is a growing recognition of the need to integrate non-trophic interactions into ecological networks for a better understanding of whole-community organization. To achieve this, the first step is to build networks of individual non-trophic interactions. In this study, we analyzed a network of interdependencies among bird species that participated in heterospecific foraging associations (flocks) in an evergreen forest site in the Western Ghats, India. We found the flock network to contain a small core of highly important species that other species are strongly dependent on, a pattern seen in many other biological networks. Further, we found that structural importance of species in the network was strongly correlated to functional importance of species at the individual flock level. Finally, comparisons with flock networks from other Asian forests showed that the same taxonomic groups were important in general, suggesting that species importance was an intrinsic trait and not dependent on local ecological conditions. Hence, given a list of species in an area, it may be possible to predict which ones are likely to be important. Our study provides a framework for the investigation of other heterospecific foraging associations and associations among species in other non-trophic contexts.
Abstract:
We analytically study the role played by the network topology in sustaining cooperation in a society of myopic agents in an evolutionary setting. In our model, each agent plays the Prisoner's Dilemma (PD) game with its neighbors, as specified by a network. Cooperation is the incumbent strategy, whereas defectors are the mutants. Starting with a population of cooperators, some agents are switched to defection. The agents then play the PD game with their neighbors and compute their fitness. After this, an evolutionary rule, or imitation dynamic is used to update the agent strategy. A defector switches back to cooperation if it has a cooperator neighbor with higher fitness. The network is said to sustain cooperation if almost all defectors switch to cooperation. Earlier work on the sustenance of cooperation has largely consisted of simulation studies, and we seek to complement this body of work by providing analytical insight for the same. We find that in order to sustain cooperation, a network should satisfy some properties such as small average diameter, densification, and irregularity. Real-world networks have been empirically shown to exhibit these properties, and are thus candidates for the sustenance of cooperation. We also analyze some specific graphs to determine whether or not they sustain cooperation. In particular, we find that scale-free graphs belonging to a certain family sustain cooperation, whereas Erdos-Renyi random graphs do not. To the best of our knowledge, ours is the first analytical attempt to determine which networks sustain cooperation in a population of myopic agents in an evolutionary setting.
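The model described above (play the Prisoner's Dilemma with each neighbor, sum the payoffs into a fitness, then let a defector revert if a cooperating neighbor is fitter) can be sketched as a single update step. This is an illustration only: the payoff values (T=5, R=3, P=1, S=0), the function names, and the two toy graphs are assumptions, and the paper's analytical results are not reproduced here.

```python
# Assumed payoffs for (my move, neighbor's move), with T > R > P > S.
PAYOFF = {
    ("C", "C"): 3,  # reward for mutual cooperation (R)
    ("C", "D"): 0,  # sucker's payoff (S)
    ("D", "C"): 5,  # temptation to defect (T)
    ("D", "D"): 1,  # punishment for mutual defection (P)
}

def fitness(graph, strategy, payoff=PAYOFF):
    """Fitness of each agent: total payoff from one game per neighbor."""
    return {
        v: sum(payoff[(strategy[v], strategy[u])] for u in graph[v])
        for v in graph
    }

def imitation_step(graph, strategy, payoff=PAYOFF):
    """A defector switches to C if some cooperating neighbor is fitter."""
    fit = fitness(graph, strategy, payoff)
    new_strategy = dict(strategy)
    for v in graph:
        if strategy[v] == "D" and any(
            strategy[u] == "C" and fit[u] > fit[v] for u in graph[v]
        ):
            new_strategy[v] = "C"
    return new_strategy

# Irregular graph: a star with hub 0 and leaves 1..5; leaf 1 defects.
star = {0: [1, 2, 3, 4, 5], 1: [0], 2: [0], 3: [0], 4: [0], 5: [0]}
star_start = {v: "C" for v in star}
star_start[1] = "D"
star_after = imitation_step(star, star_start)

# Regular graph: a ring of six agents; agent 0 defects.
ring = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
ring_start = {v: "C" for v in ring}
ring_start[0] = "D"
ring_after = imitation_step(ring, ring_start)
```

In the star, the high-degree hub collects enough payoff from its cooperating leaves to out-earn the defecting leaf, so the defector reverts. In the ring, the defector out-earns both of its neighbors and stays. This toy contrast is in the spirit of the irregularity property mentioned in the abstract but does not demonstrate it in general.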
Abstract:
In many systems, nucleation of a stable solid may occur in the presence of other (often more than one) metastable phases. These may be polymorphic solids or even liquid phases. Sometimes, the metastable phase might have a lower free energy minimum than the liquid but higher than the stable-solid-phase minimum and have characteristics in between the parent liquid and the globally stable solid phase. In such cases, nucleation of the solid phase from the melt may be facilitated by the metastable phase because the latter can "wet" the interface between the parent and the daughter phases, even though there may be no signature of the existence of the metastable phase in the thermodynamic properties of the parent liquid and the stable solid phase. Straightforward application of classical nucleation theory (CNT) is flawed here as it overestimates the nucleation barrier because surface tension is overestimated (by neglecting the metastable phases of intermediate order) while the thermodynamic free energy gap between daughter and parent phases remains unchanged. In this work, we discuss a density functional theory (DFT)-based statistical mechanical approach to explore and quantify such facilitation. We construct a simple order-parameter-dependent free energy surface that we then use in DFT to calculate (i) the order parameter profile, (ii) the overall nucleation free energy barrier, and (iii) the surface tension between the parent liquid and the metastable solid and also parent liquid and stable solid phases. The theory indeed finds that the nucleation free energy barrier can decrease significantly in the presence of wetting. This approach can provide a microscopic explanation of the Ostwald step rule and the well-known phenomenon of "disappearing polymorphs" that depends on temperature and other thermodynamic conditions. Theory reveals a diverse scenario for phase transformation kinetics, some of which may be explored via modern nanoscopic synthetic methods.
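For reference, the standard textbook CNT expressions for a spherical nucleus show why an overestimated surface tension inflates the barrier. The symbols are assumptions introduced here, not notation from the paper: $\gamma$ is the surface tension between parent and daughter phases, $\rho$ the number density of the daughter phase, and $\Delta\mu$ the chemical potential difference between the two phases.

```latex
\begin{align}
  \Delta G(r) &= 4\pi r^{2}\gamma \;-\; \tfrac{4}{3}\pi r^{3}\rho\,|\Delta\mu| , \\
  r^{*} &= \frac{2\gamma}{\rho\,|\Delta\mu|} , \\
  \Delta G^{*} &= \Delta G(r^{*}) = \frac{16\pi\gamma^{3}}{3\,\rho^{2}\,|\Delta\mu|^{2}} .
\end{align}
```

Because the barrier scales as $\gamma^{3}$ at fixed $\Delta\mu$, even a modest overestimate of the surface tension produces a large overestimate of the nucleation barrier, which is the effect the abstract attributes to neglecting the wetting metastable phase.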
Abstract:
In a typical enterprise WLAN, a station has a choice of multiple access points to associate with. The default association policy uses metrics such as Received Signal Strength (RSS) and “link quality” to choose a particular access point among many. Such an approach can lead to unequal load sharing and diminished system performance. We consider the RAT (Rate And Throughput) policy [1], which leads to better system performance. The RAT policy has been implemented on a home-grown centralized WLAN controller, ADWISER [2], and we demonstrate that the RAT policy indeed provides better system performance.
Abstract:
Bird species are hypothesized to join mixed-species flocks (flocks hereon) either for direct foraging or anti-predation-related benefits. In this study, conducted in a tropical evergreen forest in the Western Ghats of India, we used intra-flock association patterns to generate a community-wide assessment of flocking benefits for different species. We assumed that individuals needed to be physically proximate to particular heterospecific individuals within flocks to obtain any direct foraging benefit (flushed prey, kleptoparasitism, copying foraging locations). Alternatively, for anti-predation benefits, physical proximity to particular heterospecifics is not required, i.e. just being in the flock vicinity can suffice. Therefore, we used choice of locations within flocks to infer whether individual species are obtaining direct foraging or anti-predation benefits. A small subset of the bird community (5/29 species), composed of all members of the sallying guild, showed non-random physical proximity to heterospecifics within flocks. All preferred associates were from non-sallying guilds, suggesting that the sallying species were likely obtaining direct foraging benefits either in the form of flushed or kleptoparasitized prey. The majority of the species (24/29) chose locations randomly with respect to heterospecifics within flocks and, thus, were likely obtaining antipredation benefits. In summary, our study indicates that direct foraging benefits are important for only a small proportion of species in flocks and that predation is likely to be the main driver of flocking for most participants. Our findings apart, our study provides methodological advances that might be useful in understanding asymmetric interactions in social groups of single and multiple species.
Abstract:
The influence of the flow rule on the bearing capacity of strip foundations placed on sand was investigated using a new kinematic approach of upper-bound limit analysis. The method of stress characteristics was first used to find the mechanism of the failure and to compute the stress field by using the Mohr-Coulomb yield criterion. Once the failure mechanism had been established, the kinematics of the plastic deformation was established, based on the requirements of the upper-bound limit theorem. Both associated and nonassociated plastic flows were considered, and the bearing capacity was obtained by equating the rate of external plastic work to the rate of the internal energy dissipation for both smooth and rough base foundations. The results obtained from the analysis were compared with those available from the literature. (C) 2014 American Society of Civil Engineers.