968 results for Discovery learning
Abstract:
An understanding of application I/O access patterns is useful in several situations. First, gaining insight into what applications are doing with their data at a semantic level helps in designing efficient storage systems. Second, it helps create benchmarks that closely mimic realistic application behavior. Third, it enables autonomic systems, as the information obtained can be used to adapt the system in a closed loop. All these use cases require the ability to extract the application-level semantics of I/O operations. Methods such as modifying application code to associate I/O operations with semantic tags are intrusive. It is well known that network file system traces are an important source of information that can be obtained non-intrusively and analyzed either online or offline. These traces are a sequence of primitive file system operations and their parameters. Simple counting, statistical analysis, or deterministic search techniques are inadequate for discovering application-level semantics in the general case, because of the inherent variation and noise in realistic traces. In this paper, we describe a trace analysis methodology based on Profile Hidden Markov Models. We show that the methodology has powerful discriminatory capabilities that enable it to recognize applications based on the patterns in the traces, and to mark out regions in a long trace that encapsulate sets of primitive operations that represent higher-level application actions. It is robust enough to work around discrepancies between training and target traces, such as differences in length and interleaving with other operations. We demonstrate the feasibility of recognizing patterns based on a small sampling of the trace, enabling faster trace analysis. Preliminary experiments show that the method is capable of learning accurate profile models on live traces in an online setting.
We present a detailed evaluation of this methodology in a UNIX environment using NFS traces of selected commonly used applications such as compilations as well as on industrial strength benchmarks such as TPC-C and Postmark, and discuss its capabilities and limitations in the context of the use cases mentioned above.
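The idea of recognizing an application from its trace can be illustrated with a much simpler model than the Profile HMMs the paper uses. The following is a minimal sketch under stated assumptions: it scores a sequence of operation names under two ordinary discrete HMMs with the forward algorithm and picks the model with the higher log-likelihood. The operation alphabet, the state names, the model names, and every probability are invented for illustration and are not taken from the paper.

```python
import math

def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def forward_log_likelihood(trace, start, trans, emit):
    """Log P(trace | model) for a discrete HMM, via the forward algorithm.

    start[s]    : P(first state = s)
    trans[s][t] : P(next state = t | current state = s)
    emit[s][op] : P(observing op | state = s)
    All probabilities are assumed to be strictly positive.
    """
    states = list(start)
    # alpha[s] = log P(observations so far, current state = s)
    alpha = {s: math.log(start[s]) + math.log(emit[s][trace[0]]) for s in states}
    for op in trace[1:]:
        alpha = {
            t: log_sum_exp([alpha[s] + math.log(trans[s][t]) for s in states])
            + math.log(emit[t][op])
            for t in states
        }
    return log_sum_exp(list(alpha.values()))

# Hypothetical two-state models; every number below is made up.
READ_HEAVY = {
    "start": {"scan": 0.9, "update": 0.1},
    "trans": {"scan": {"scan": 0.9, "update": 0.1},
              "update": {"scan": 0.5, "update": 0.5}},
    "emit": {"scan": {"lookup": 0.3, "read": 0.6, "write": 0.1},
             "update": {"lookup": 0.2, "read": 0.3, "write": 0.5}},
}
WRITE_HEAVY = {
    "start": {"scan": 0.1, "update": 0.9},
    "trans": {"scan": {"scan": 0.5, "update": 0.5},
              "update": {"scan": 0.1, "update": 0.9}},
    "emit": {"scan": {"lookup": 0.3, "read": 0.5, "write": 0.2},
             "update": {"lookup": 0.1, "read": 0.1, "write": 0.8}},
}
MODELS = {"read_heavy": READ_HEAVY, "write_heavy": WRITE_HEAVY}

def classify(trace, models):
    """Return the name of the model that gives the trace the highest likelihood."""
    scores = {name: forward_log_likelihood(trace, **m) for name, m in models.items()}
    return max(scores, key=scores.get)
```

A Profile HMM adds match, insert, and delete states arranged along a consensus sequence, which is what lets the real method tolerate length differences and interleaved operations; the plain HMM above does not capture that.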
Abstract:
Today's programming languages are supported by powerful third-party APIs. For a given application domain, it is common to have many competing APIs that provide similar functionality. Programmer productivity therefore depends heavily on the programmer's ability to discover suitable APIs, both during the initial coding phase and during software maintenance. The aim of this work is to support the discovery and migration of math APIs. Math APIs are at the heart of many application domains, ranging from machine learning to scientific computation. Our approach, called MATHFINDER, combines executable specifications of mathematical computations with unit tests (operational specifications) of API methods. Given a math expression, MATHFINDER synthesizes pseudo-code composed of API methods to compute the expression by mining unit tests of the API methods. We present a sequential version of our unit test mining algorithm and also design a more scalable data-parallel version. We perform an extensive evaluation of MATHFINDER (1) for API discovery, where math algorithms are to be implemented from scratch, and (2) for API migration, where client programs utilizing a math API are to be migrated to another API. We evaluated the precision and recall of MATHFINDER on a diverse collection of math expressions, culled from algorithms used in a wide range of application areas such as control systems and structural dynamics. In a user study to evaluate the productivity gains obtained by using MATHFINDER for API discovery, the programmers who used MATHFINDER finished their programming tasks twice as fast as their counterparts who used the usual techniques such as web and code search, IDE code completion, and manual inspection of library documentation. For the problem of API migration, as a case study, we used MATHFINDER to migrate Weka, a popular machine learning library.
Overall, our evaluation shows that MATHFINDER is easy to use, provides highly precise results across several math APIs and application domains even with a small number of unit tests per method, and scales to large collections of unit tests.
Abstract:
M. Galea and Q. Shen. Simultaneous ant colony optimisation algorithms for learning linguistic fuzzy rules. A. Abraham, C. Grosan and V. Ramos (Eds.), Swarm Intelligence in Data Mining, pages 75-99.
Abstract:
Mapping novel terrain from sparse, complex data often requires the resolution of conflicting information from sensors working at different times, locations, and scales, and from experts with different goals and situations. Information fusion methods help resolve inconsistencies in order to distinguish correct from incorrect answers, as when evidence variously suggests that an object's class is car, truck, or airplane. The methods developed here consider a complementary problem, supposing that information from sensors and experts is reliable though inconsistent, as when evidence suggests that an object's class is car, vehicle, or man-made. Underlying relationships among objects are assumed to be unknown to the automated system or the human user. The ARTMAP information fusion system uses distributed code representations that exploit the neural network's capacity for one-to-many learning in order to produce self-organizing expert systems that discover hierarchical knowledge structures. The system infers multi-level relationships among groups of output classes, without any supervised labeling of these relationships. The procedure is illustrated with two image examples.
Abstract:
Classifying novel terrain or objects from sparse, complex data may require the resolution of conflicting information from sensors working at different times, locations, and scales, and from sources with different goals and situations. Information fusion methods can help resolve inconsistencies, as when evidence variously suggests that an object's class is car, truck, or airplane. The methods described here consider a complementary problem, supposing that information from sensors and experts is reliable though inconsistent, as when evidence suggests that an object's class is car, vehicle, and man-made. Underlying relationships among objects are assumed to be unknown to the automated system or the human user. The ARTMAP information fusion system uses distributed code representations that exploit the neural network's capacity for one-to-many learning in order to produce self-organizing expert systems that discover hierarchical knowledge structures. The system infers multi-level relationships among groups of output classes, without any supervised labeling of these relationships.
Abstract:
Classifying novel terrain or objects from sparse, complex data may require the resolution of conflicting information from sensors working at different times, locations, and scales, and from sources with different goals and situations. Information fusion methods can help resolve inconsistencies, as when evidence variously suggests that an object's class is car, truck, or airplane. The methods described here address a complementary problem, supposing that information from sensors and experts is reliable though inconsistent, as when evidence suggests that an object's class is car, vehicle, and man-made. Underlying relationships among classes are assumed to be unknown to the automated system or the human user. The ARTMAP information fusion system uses distributed code representations that exploit the neural network's capacity for one-to-many learning in order to produce self-organizing expert systems that discover hierarchical knowledge structures. The fusion system infers multi-level relationships among groups of output classes, without any supervised labeling of these relationships. The procedure is illustrated with two image examples, but is not limited to the image domain.
Abstract:
An enterprise information system (EIS) is an integrated data-applications platform characterized by diverse, heterogeneous, and distributed data sources. For many enterprises, a number of business processes still depend heavily on static rule-based methods and extensive human expertise. Enterprises are faced with the need for optimizing operation scheduling, improving resource utilization, discovering useful knowledge, and making data-driven decisions.
This thesis research is focused on real-time optimization and knowledge discovery that addresses workflow optimization, resource allocation, as well as data-driven predictions of process-execution times, order fulfillment, and enterprise service-level performance. In contrast to prior work on data analytics techniques for enterprise performance optimization, the emphasis here is on realizing scalable and real-time enterprise intelligence based on a combination of heterogeneous system simulation, combinatorial optimization, machine-learning algorithms, and statistical methods.
On-demand digital-print service is a representative enterprise requiring a powerful EIS. We use real-life data from Reischling Press, Inc. (RPI), a digital-print-service provider (PSP), to evaluate our optimization algorithms.
In order to handle the increase in volume and diversity of demands, we first present a high-performance, scalable, and real-time production scheduling algorithm for production automation based on an incremental genetic algorithm (IGA). The objective of this algorithm is to optimize the order dispatching sequence and balance resource utilization. Compared to prior work, this solution is scalable for a high volume of orders and it provides fast scheduling solutions for orders that require complex fulfillment procedures. Experimental results highlight its potential benefit in reducing production inefficiencies and enhancing the productivity of an enterprise.
We next discuss analysis and prediction of different attributes involved in hierarchical components of an enterprise. We start from a study of the fundamental processes related to real-time prediction. Our process-execution time and process status prediction models integrate statistical methods with machine-learning algorithms. In addition to improving prediction accuracy compared to stand-alone machine-learning algorithms, these models also provide a probabilistic estimate of the predicted status. An order generally consists of multiple series and parallel processes. We next introduce an order-fulfillment prediction model that combines the advantages of multiple classification models by incorporating flexible decision-integration mechanisms. Experimental results show that adopting due dates recommended by the model can significantly reduce the enterprise late-delivery ratio. Finally, we investigate service-level attributes that reflect the overall performance of an enterprise. We analyze and decompose time-series data into different components according to their hierarchical periodic nature, perform correlation analysis, and develop univariate prediction models for each component as well as multivariate models for correlated components. Predictions for the original time series are aggregated from the predictions of its components. In addition to a significant increase in mid-term prediction accuracy, this distributed modeling strategy also improves short-term time-series prediction accuracy.
In summary, this thesis research has led to a set of characterization, optimization, and prediction tools for an EIS to derive insightful knowledge from data and use them as guidance for production management. It is expected to provide solutions for enterprises to increase reconfigurability, accomplish more automated procedures, and obtain data-driven recommendations or effective decisions.
Abstract:
Previous studies suggest that selective antagonists of specific subtypes of muscarinic acetylcholine receptors (mAChRs) may provide a novel approach for the treatment of certain central nervous system (CNS) disorders, including epileptic disorders, Parkinson's disease, and dystonia. Unfortunately, previously reported antagonists are not highly selective for specific mAChR subtypes, making it difficult to definitively establish the functional roles and therapeutic potential for individual subtypes of this receptor subfamily. The M-1 mAChR is of particular interest as a potential target for treatment of CNS disorders. We now report the discovery of a novel selective antagonist of M-1 mAChRs, termed VU0255035 [N-(3-oxo-3-(4-(pyridine-4-yl)piperazin-1-yl)propyl)benzo[c][1,2,5]thiadiazole-4-sulfonamide]. Equilibrium radioligand binding and functional studies demonstrate a greater than 75-fold selectivity of VU0255035 for M-1 mAChRs relative to M-2-M-5. Molecular pharmacology and mutagenesis studies indicate that VU0255035 is a competitive orthosteric antagonist of M-1 mAChRs, a surprising finding given the high level of M-1 mAChR selectivity relative to other orthosteric antagonists. Whole-cell patch-clamp recordings demonstrate that VU0255035 inhibits potentiation of N-methyl-D-aspartate receptor currents by the muscarinic agonist carbachol in hippocampal pyramidal cells. VU0255035 has excellent brain penetration in vivo and is efficacious in reducing pilocarpine-induced seizures in mice. We were surprised to find that doses of VU0255035 that reduce pilocarpine-induced seizures do not induce deficits in contextual freezing, a measure of hippocampus-dependent learning that is disrupted by nonselective mAChR antagonists. Taken together, these data suggest that selective antagonists of M-1 mAChRs do not induce the severe cognitive deficits seen with nonselective mAChR antagonists and could provide a novel approach for the treatment of certain CNS disorders.
Abstract:
Highly selective positive allosteric modulators (PAMs) of metabotropic glutamate receptor subtype 5 (mGluR5) have emerged as a potential approach to treat positive symptoms associated with schizophrenia. mGluR5 plays an important role in both long-term potentiation (LTP) and long-term depression (LTD), suggesting that mGluR5 PAMs may also have utility in improving impaired cognitive function. However, if mGluR5 PAMs shift the balance of LTP and LTD or induce a state in which afferent activity induces lasting changes in synaptic function that are not appropriate for a given pattern of activity, this could disrupt rather than enhance cognitive function. We determined the effect of selective mGluR5 PAMs on the induction of LTP and LTD at the Schaffer collateral-CA1 synapse in the hippocampus. mGluR5-selective PAMs significantly enhanced threshold theta-burst stimulation (TBS)-induced LTP. In addition, mGluR5 PAMs enhanced both DHPG-induced LTD and LTD induced by the delivery of paired-pulse low-frequency stimulation. Selective potentiation of mGluR5 had no effect on LTP induced by suprathreshold TBS or on saturated LTP. The finding that potentiation of mGluR5-mediated responses to stimulation of glutamatergic afferents enhances both LTP and LTD supports the hypothesis that the activation of mGluR5 by endogenous glutamate contributes to both forms of plasticity. Furthermore, two systemically active mGluR5 PAMs enhanced performance in the Morris water maze, a measure of hippocampus-dependent spatial learning. The discovery of small molecules that enhance both LTP and LTD in an activity-appropriate manner reveals a unique action on synaptic plasticity that may provide a novel approach for the treatment of impaired cognitive function. Neuropsychopharmacology (2009) 34, 2057-2071; doi:10.1038/npp.2009.30; published online 18 March 2009
Abstract:
This paper presents a preliminary study on developing a novel distributed adaptive real-time learning framework for wide area monitoring of power systems integrated with distributed generation using synchrophasor technology. The framework comprises distributed agents (synchrophasors) for autonomous local condition monitoring and fault detection, and a central unit for generating a global view for situation awareness and decision making. Key technologies that can be integrated into this hierarchical distributed learning scheme are discussed to enable real-time information extraction and knowledge discovery for decision making, without the central unit explicitly accumulating and storing all raw data. Based on this, the configuration of a wide area monitoring system for power systems using synchrophasor technology, and the functionalities of locally installed open-phasor-measurement-units (OpenPMUs) and of the central unit, are presented. Initial results on anti-islanding protection using the proposed approach are given to illustrate its effectiveness.
Abstract:
This work presents two new score functions based on the Bayesian Dirichlet equivalent uniform (BDeu) score for learning Bayesian network structures. They consider the sensitivity of BDeu to varying parameters of the Dirichlet prior. The scores take on the most adversarial and the most beneficial priors among those within a contamination set around the symmetric one. We build these scores in such a way that they are decomposable and can be computed efficiently. Because of that, they can be integrated into any state-of-the-art structure learning method that explores the space of directed acyclic graphs and allows decomposable scores. Empirical results suggest that our scores outperform the standard BDeu score in terms of the likelihood of unseen data and in terms of edge discovery with respect to the true network, at least when the training sample size is small. We discuss the relation between these new scores and the accuracy of inferred models. Moreover, our new criteria can be used to identify the amount of data after which learning is saturated, that is, additional data are of little help to improve the resulting model.
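For reference, the standard BDeu local score that these new scores build on can be computed directly from contingency counts. The sketch below covers only the usual BDeu score with a symmetric Dirichlet prior, not the contaminated-prior variants proposed in the paper; the function names, the layout of the count table, and the default equivalent sample size of 1.0 are assumptions made for illustration.

```python
from math import lgamma

def bdeu_local_score(counts, ess=1.0):
    """Log BDeu score of one node given its parents.

    counts[j][k] is the number of samples in which the parents take their
    j-th joint configuration and the node takes its k-th state.
    ess is the equivalent sample size of the symmetric Dirichlet prior.
    """
    q = len(counts)        # number of parent configurations
    r = len(counts[0])     # number of states of the node
    a_j = ess / q          # prior mass per parent configuration
    a_jk = ess / (q * r)   # prior mass per (configuration, state) cell
    score = 0.0
    for row in counts:
        n_j = sum(row)
        score += lgamma(a_j) - lgamma(a_j + n_j)
        for n_jk in row:
            score += lgamma(a_jk + n_jk) - lgamma(a_jk)
    return score

def bdeu_score(family_counts, ess=1.0):
    """Decomposable network score: the sum of the local scores of all nodes.

    family_counts maps each node name to its counts table (hypothetical layout).
    """
    return sum(bdeu_local_score(c, ess) for c in family_counts.values())
```

Because the network score is a plain sum of per-node terms, a structure search only has to recompute the local score of the node whose parent set changed, which is the decomposability property the abstract relies on.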
Abstract:
With over 50 billion downloads and more than 1.3 million apps in Google's official market, Android has continued to gain popularity amongst smartphone users worldwide. At the same time there has been a rise in malware targeting the platform, with more recent strains employing highly sophisticated detection avoidance techniques. As traditional signature-based methods become less potent in detecting unknown malware, alternatives are needed for timely zero-day discovery. Thus this paper proposes an approach that utilizes ensemble learning for Android malware detection. It combines the advantages of static analysis with the efficiency and performance of ensemble machine learning to improve Android malware detection accuracy. The machine learning models are built using a large repository of malware samples and benign apps from a leading antivirus vendor. The experimental results and analysis presented show that the proposed method, which uses a large feature space to leverage the power of ensemble learning, is capable of 97.3% to 99% detection accuracy with very low false positive rates.
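The abstract does not say which ensemble method is used, so the following minimal sketch shows only the general idea, using plain majority voting over static features. The three base classifiers, the feature names they inspect, and the `ENSEMBLE` list are hypothetical and stand in for models that would in practice be learned from labeled samples.

```python
from collections import Counter

def majority_vote(classifiers, features):
    """Return the label chosen by the largest number of base classifiers."""
    votes = Counter(clf(features) for clf in classifiers)
    return votes.most_common(1)[0][0]

# Hypothetical hand-written base classifiers over a set of static feature names.
def sms_rule(features):
    return "malware" if "SEND_SMS" in features else "benign"

def boot_rule(features):
    needed = {"RECEIVE_BOOT_COMPLETED", "INTERNET"}
    return "malware" if needed <= features else "benign"

def exec_rule(features):
    return "malware" if "Runtime.exec" in features else "benign"

# An odd number of binary classifiers, so a vote can never tie.
ENSEMBLE = [sms_rule, boot_rule, exec_rule]
```

For example, `majority_vote(ENSEMBLE, {"SEND_SMS", "Runtime.exec"})` is flagged because two of the three rules fire, while an app that triggers only one rule is not.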
Abstract:
The purpose of this article is to investigate the involvement of Information and Learning Services (ILS) staff in the delivery of the Research Training Programme at the University of Worcester, UK, with a focus on researcher receptivity. I believe that by constantly reflecting on the development of that part of the programme delivered by ILS and by examining feedback from the sessions, it is possible to improve and increase the level of researcher receptivity. It is hoped that such examination and reflection will be of value and relevance to the IL community, since reflecting on success and failure in a local context, and mapping this reflection to existing research, enables librarians to improve the support provided to researchers within their institutions. This article outlines the support given to research students at the University of Worcester in the past, examines the changes leading to present programme delivery, and reflects on considerations for future support. The article is underpinned by reference to current research undertaken in international (albeit Western-centric) contexts. I note that the rationale behind the changes is embedded in current adult learning and teaching theory. In an increasingly competitive research environment where funding is dependent on a statistically monitored research output, the aim of such support is to integrate any IL contribution into the wider research training programme. Thus resource discovery becomes part of the reflexive research cycle. Implicit in this investigative reflection is the desire of the IL community to constantly strive towards the positive reception of IL into research support programmes which are perceived by researchers as highly valuable to the process and progress of their work.