68 results for Association mining
Abstract:
A new chromium(III)-Schiff base complex, [Cr(5-chlorosalprn)(H2O)2]ClO4, where salprn = N,N'-propylenebis(salicylideneimine), has been prepared and characterized by electrospray ionization mass spectrometric (ESIMS) analysis and other spectroscopic techniques. Single crystal X-ray data reveal that the complex assumes a trans-diaquo structure, [Cr(C17H18Cl2N2O4)]ClO4·H2O. The effect of phenyl ring substituents on the rate of formation of [O=Cr(V) Schiff base]+ has been investigated. The bimolecular rate constants for the formation of the O=Cr(V) species from [Cr(Schiff base)(H2O)2]ClO4, where the Schiff base is salprn (1) or 5-chlorosalprn (2), on reaction with PhOI were compared. For (2) the rate was found to be faster by an order of magnitude at pH = 4 than for (1). The introduction of a chloro substituent on the phenyl ring influences not only the rate of redox reactivity but also the pKa values of the aquo ligands of the complexes, indicating a difference in the electronic environment around the metal ion in (1) and (2).
Abstract:
In data mining, an important goal is to generate an abstraction of the data. Such an abstraction helps in reducing the space and search time requirements of the overall decision making process. Further, it is important that the abstraction is generated from the data with a small number of disk scans. We propose a novel data structure, pattern count tree (PC-tree), that can be built by scanning the database only once. PC-tree is a minimal size complete representation of the data and it can be used to represent dynamic databases with the help of knowledge that is either static or changing. We show that further compactness can be achieved by constructing the PC-tree on segmented patterns. We exploit the flexibility offered by rough sets to realize a rough PC-tree and use it for efficient and effective rough classification. To be consistent with the sizes of the branches of the PC-tree, we use upper and lower approximations of feature sets in a manner different from the conventional rough set theory. We conducted experiments using the proposed classification scheme on a large-scale hand-written digit data set. We use the experimental results to establish the efficacy of the proposed approach. (C) 2002 Elsevier Science B.V. All rights reserved.
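The single-scan construction described in this abstract can be illustrated with a count-annotated prefix tree. The sketch below is not the authors' PC-tree implementation: the node layout, the names `PCNode`, `build_pc_tree`, `prefix_count` and `node_total`, and the sample patterns are all assumptions made for illustration. It only shows how patterns that share a prefix can share nodes while their counts are kept.

```python
from dataclasses import dataclass, field


@dataclass
class PCNode:
    """One node of a count-annotated prefix tree (hypothetical layout)."""
    count: int = 0
    children: dict = field(default_factory=dict)


def build_pc_tree(patterns):
    """Insert every pattern in one pass; patterns with a common prefix share nodes."""
    root = PCNode()
    for pattern in patterns:
        root.count += 1  # the root counts all inserted patterns
        node = root
        for feature in pattern:
            node = node.children.setdefault(feature, PCNode())
            node.count += 1
    return root


def prefix_count(root, prefix):
    """Return how many inserted patterns start with `prefix`."""
    node = root
    for feature in prefix:
        node = node.children.get(feature)
        if node is None:
            return 0
    return node.count


def node_total(node):
    """Count the nodes below `node`, as a rough measure of compactness."""
    return sum(1 + node_total(child) for child in node.children.values())


# Hypothetical patterns: each tuple lists the features present in one record.
patterns = [(1, 2, 3), (1, 2, 4), (1, 2, 3), (2, 5)]
tree = build_pc_tree(patterns)
print(prefix_count(tree, (1, 2)), node_total(tree))
```

In this toy input the four patterns contain eleven features in total, yet prefix sharing stores them in fewer nodes, which is the kind of compaction the abstract refers to.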
Abstract:
With the emergence of large-volume, high-speed streaming data, recent techniques for stream mining of CFIs (closed frequent itemsets) will become inefficient. When concept drift occurs at a slow rate in high-speed data streams, the rate of change of information across different sliding windows is negligible. The user therefore will not miss any change in information if the window slides by multiple transactions at a time. We propose a novel approach for mining CFIs cumulatively by using a sliding width (≥1) over high-speed data streams. However, it is nontrivial to mine CFIs cumulatively over a stream, because such growth may lead to the generation of an exponential number of candidates for closure checking. In this study, we develop an efficient algorithm, stream-close, for mining CFIs over a stream by exploiting some interesting properties. Our performance study reveals that stream-close achieves good scalability and has promising results.
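To make the terms in this abstract concrete, the sketch below mines closed frequent itemsets by brute force inside a window that advances several transactions at a time. It is not the stream-close algorithm, whose details the abstract does not give; the function names, parameters and sample stream are assumptions, and the exhaustive enumeration is only practical for tiny inputs.

```python
from collections import deque
from itertools import combinations


def closed_frequent_itemsets(window, min_support):
    """Brute-force closed frequent itemsets of one window (min_support >= 1)."""
    transactions = [frozenset(t) for t in window]
    items = sorted(set().union(*transactions)) if transactions else []
    result = {}
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            itemset = frozenset(candidate)
            covering = [t for t in transactions if itemset <= t]
            if not covering or len(covering) < min_support:
                continue
            # An itemset is closed when it equals the intersection of
            # all transactions that contain it.
            if frozenset.intersection(*covering) == itemset:
                result[itemset] = len(covering)
    return result


def slide_and_mine(stream, window_size, slide_width, min_support):
    """Mine each full window, advancing `slide_width` transactions per step."""
    window = deque(maxlen=window_size)
    results = []
    for i, transaction in enumerate(stream, start=1):
        window.append(transaction)
        if i >= window_size and (i - window_size) % slide_width == 0:
            results.append(closed_frequent_itemsets(window, min_support))
    return results


# Hypothetical stream of five transactions, window of 3, slide width of 2.
stream = [{"a", "b"}, {"a", "b", "c"}, {"a", "c"}, {"b", "c"}, {"a", "b"}]
for closed in slide_and_mine(stream, 3, 2, 2):
    print({tuple(sorted(k)): v for k, v in closed.items()})
```

With a slide width of 2 the five-transaction stream is mined twice rather than three times, which is the saving the abstract argues for when the data drift slowly.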
Abstract:
Rapid urbanisation in India has posed serious challenges to decision makers in regional planning, involving a plethora of issues including the provision of basic amenities (such as electricity, water, sanitation and transport). Urban planning entails an understanding of landscape and urban dynamics and their causal factors. Identifying, delineating and mapping landscapes on a temporal scale provides an opportunity to monitor the changes, which is important for natural resource management and sustainable planning activities. Multi-source, multi-sensor, multi-temporal, multi-frequency or multi-polarization remote sensing data, together with efficient classification algorithms and pattern recognition techniques, aid in capturing these dynamics. This paper analyses the landscape dynamics of Greater Bangalore by: (i) characterisation of direct impervious surface, (ii) computation of forest fragmentation indices and (iii) modeling to quantify and categorise urban changes. Linear unmixing is used to solve the mixed pixel problem of coarse resolution super spectral MODIS data for impervious surface characterisation. Fragmentation indices were used to classify forests as interior, perforated, edge, transitional, patch and undetermined. Based on this, an urban growth model was developed to determine the type of urban growth: infill, expansion and outlying growth. This helped in visualising urban growth poles and the consequences of earlier policy decisions, which can help in evolving strategies for effective land use policies.
Abstract:
In this paper, we consider the problem of association of wireless stations (STAs) with an access network served by a wireless local area network (WLAN) and a 3G cellular network. There is a set of WLAN Access Points (APs) and a set of 3G Base Stations (BSs) and a number of STAs each of which needs to be associated with one of the APs or one of the BSs. We concentrate on downlink bulk elastic transfers. Each association provides each STA with a certain transfer rate. We evaluate an association on the basis of the sum log utility of the transfer rates and seek the utility maximizing association. We also obtain the optimal time scheduling of service from a 3G BS to the associated STAs. We propose a fast iterative heuristic algorithm to compute an association. Numerical results show that our algorithm converges in a few steps yielding an association that is within 1% (in objective value) of the optimal (obtained through exhaustive search); in most cases the algorithm yields an optimal solution.
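The sum log utility objective and the exhaustive-search baseline mentioned in this abstract can be sketched for a toy instance. The sharing model here is an assumption, not the paper's: every AP or BS is taken to split its time equally among the STAs associated with it, so an STA with link rate r at a station serving n STAs receives r / n. The rate table and function names are hypothetical, and the paper's iterative heuristic is not reproduced.

```python
import math
from itertools import product


def sum_log_utility(assignment, rates):
    """Sum of log transfer rates; assignment[i] is the station chosen by STA i."""
    load = {}
    for station in assignment:
        load[station] = load.get(station, 0) + 1
    # Assumed sharing model: each station splits its time equally.
    return sum(
        math.log(rates[sta][station] / load[station])
        for sta, station in enumerate(assignment)
    )


def best_association(rates):
    """Exhaustive search over all associations (feasible only for tiny cases)."""
    n_stations = len(rates[0])
    candidates = product(range(n_stations), repeat=len(rates))
    return max(candidates, key=lambda a: sum_log_utility(a, rates))


# Hypothetical link rates: rates[i][j] is the rate of STA i at station j.
rates = [
    [10.0, 1.0],
    [10.0, 1.0],
    [1.0, 8.0],
]
best = best_association(rates)
print(best, sum_log_utility(best, rates))
```

The search space grows as the number of stations raised to the number of STAs, which is why a fast heuristic is attractive for realistic network sizes.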
Abstract:
Dimeric banana lectin and calsepa, tetrameric artocarpin and octameric heltuba are mannose-specific beta-prism I fold lectins of nearly the same tertiary structure. MD simulations on individual subunits and the oligomers provide insights into the changes in the structure brought about in the protomers on oligomerization, including swapping of the N-terminal stretch in one instance. The regions that undergo changes also tend to exhibit dynamic flexibility during MD simulations. The internal symmetries of individual oligomers are substantially retained during the calculations. Energy minimization and simulations were also carried out on models using all possible oligomers by employing the four different protomers. The unique dimerization pattern observed in calsepa could be traced to unique substitutions in a peptide stretch involved in dimerization. The impossibility of a specific mode of oligomerization involving a particular protomer is often expressed in terms of unacceptable steric contacts or dissociation of the oligomer during simulations. The calculations also led to a rationale for the observation of a heltuba tetramer in solution although the lectin exists as an octamer in the crystal, in addition to providing insights into relations among evolution, oligomerization and ligand binding.
Abstract:
Over the past decade, many powerful data mining techniques have been developed to analyze temporal and sequential data. The time is now fertile for addressing problems of larger scope under the purview of temporal data mining. The fourth SIGKDD workshop on temporal data mining focused on the question: What can we infer about the structure of a complex dynamical system from observed temporal data? The goals of the workshop were to critically evaluate the need in this area by bringing together leading researchers from industry and academia, and to identify promising technologies and methodologies for doing the same. We provide a brief summary of the workshop proceedings and ideas arising out of the discussions.
Abstract:
Data mining is concerned with analysing large volumes of (often unstructured) data to automatically discover interesting regularities or relationships which in turn lead to better understanding of the underlying processes. The field of temporal data mining is concerned with such analysis in the case of ordered data streams with temporal interdependencies. Over the last decade many interesting techniques of temporal data mining were proposed and shown to be useful in many applications. Since temporal data mining brings together techniques from different fields such as statistics, machine learning and databases, the literature is scattered among many different sources. In this article, we present an overview of techniques of temporal data mining. We mainly concentrate on algorithms for pattern discovery in sequential data streams. We also describe some recent results regarding statistical analysis of pattern discovery methods.
Abstract:
A method, system, and computer program product for fault data correlation in a diagnostic system are provided. The method includes receiving the fault data including a plurality of faults collected over a period of time, and identifying a plurality of episodes within the fault data, where each episode includes a sequence of the faults. The method further includes calculating a frequency of the episodes within the fault data, calculating a correlation confidence of the faults relative to the episodes as a function of the frequency of the episodes, and outputting a report of the faults with the correlation confidence.
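The abstract does not define how episodes are delimited or how the correlation confidence is computed, so the following is only one plausible reading. It assumes an episode is an ordered pair of distinct fault codes in which the second fault follows the first within a fixed time window, and it assumes the confidence is the episode frequency divided by the number of occurrences of the leading fault. The function name, the window parameter and the sample log are hypothetical.

```python
from collections import Counter


def episode_confidence(faults, window):
    """Report (frequency, confidence) for each two-fault episode.

    `faults` is a list of (timestamp, code) tuples sorted by timestamp.
    """
    fault_counts = Counter(code for _, code in faults)
    episode_counts = Counter()
    for i, (t_first, first) in enumerate(faults):
        seen = set()
        for t_next, nxt in faults[i + 1:]:
            if t_next - t_first > window:
                break  # later faults are even further away
            if nxt != first and nxt not in seen:
                seen.add(nxt)  # count each follower once per leading fault
                episode_counts[(first, nxt)] += 1
    # Assumed confidence: episode frequency over occurrences of the first fault.
    return {
        episode: (freq, freq / fault_counts[episode[0]])
        for episode, freq in episode_counts.items()
    }


# Hypothetical fault log: (timestamp, fault code).
log = [(0, "A"), (1, "B"), (10, "A"), (11, "B"), (20, "A"), (25, "C")]
for episode, (freq, conf) in episode_confidence(log, window=2).items():
    print(episode, freq, round(conf, 3))
```

Under these assumptions, a fault that is usually followed by the same second fault receives a confidence near 1, which is the kind of relationship a diagnostic report would surface.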
Abstract:
A system for temporal data mining includes a computer readable medium having an application configured to receive at an input module a temporal data series having events with start times and end times, a set of allowed dwelling times and a threshold frequency. The system is further configured to identify, using a candidate identification and tracking module, one or more occurrences in the temporal data series of a candidate episode and increment a count for each identified occurrence. The system is also configured to produce at an output module an output for those episodes whose count of occurrences results in a frequency exceeding the threshold frequency.
Abstract:
We consider several WLAN stations associated at rates r(1), r(2), ... r(k) with an Access Point. Each station (STA) is downloading a long file from a local server, located on the LAN to which the Access Point (AP) is attached, using TCP. We assume that a TCP ACK will be produced after the reception of d packets at an STA. We model these simultaneous TCP-controlled transfers using a semi-Markov process. Our analytical approach leads to a procedure to compute aggregate download, as well as per-STA throughputs, numerically, and the results match simulations very well. (C) 2012 Elsevier B.V. All rights reserved.
Abstract:
The solubilities of various solid pollutants in supercritical carbon dioxide were investigated. The intermolecular interactions play a significant role in determining the solubilities of solids in supercritical carbon dioxide. A new model equation was derived by using the concepts of association and activity coefficient model to correlate the solubilities of solids. The model equation combines the association and Wilson activity coefficient models and includes the interaction potentials between the molecules, which are useful in understanding the behavior of the solid solutes in SCCO2. The new model equation involves five adjustable parameters to correlate the solubilities of solids by incorporating the interactions between the molecules. The equation correlated 75 solid systems with an average AARD of around 9%, which was better than the correlations obtained from standard models such as Mendez Santiago-Teja (MT) model and association model. (C) 2012 Elsevier B.V. All rights reserved.
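The AARD figure quoted in this abstract is, under its usual definition, the mean of the absolute relative deviations between calculated and experimental values, expressed as a percentage. The sketch below computes that quantity; the abstract does not spell out the formula, so the definition is an assumption, and the function name and solubility values are hypothetical rather than data from the paper.

```python
def aard_percent(experimental, calculated):
    """Average absolute relative deviation, in percent."""
    if not experimental or len(experimental) != len(calculated):
        raise ValueError("need two non-empty sequences of equal length")
    total = sum(abs(c - e) / e for e, c in zip(experimental, calculated))
    return 100.0 * total / len(experimental)


# Hypothetical solubility values (arbitrary units), not taken from the paper.
measured = [1.0, 2.0, 4.0]
predicted = [1.1, 1.8, 4.0]
print(round(aard_percent(measured, predicted), 2))
```

A lower value means the correlation reproduces the measurements more closely, which is the sense in which the abstract compares the new model with the MT and association models.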
Abstract:
IDH1 mutations are frequent genetic alterations in low-grade diffuse gliomas and secondary glioblastoma (GBM). To validate mutation frequency, the IDH1 gene at codon 132 was sequenced in 74 diffusely infiltrating astrocytomas: diffuse astrocytoma (DA; World Health Organization [WHO] grade II), anaplastic astrocytoma (AA; WHO grade III), and GBM (WHO grade IV). All cases were immunostained with IDH1-R132H monoclonal antibody. Mutational status was correlated with mutant protein expression, patient age, duration of symptoms, and prognosis of patients with GBM. We detected 31 (41.9%) heterozygous IDH1 mutations resulting in arginine-to-histidine substitution (R132H; CGT-CAT). All 12 DAs (100%), 13 of 14 AAs (92.9%), and 6 of 48 GBMs (12.5%) (5/6 [83.3%] secondary and 1/42 [2.4%] primary) harbored IDH1 mutations. The correlation between mutational status and protein expression was significant (P < .001). IDH1 mutation status, though not associated with prognosis of patients with GBM, showed significant association with younger age and longer duration of symptoms in the whole cohort (P < .001). Our study validates IDH1 mutant protein expression across various grades of astrocytoma, and demonstrates a high incidence of IDH1 mutations in DA, AA, and secondary GBM.