783 results for Data Mining and Machine Learning
Abstract:
In this paper, a co-operative distributed process mining system (CDPMS) is developed to streamline the workflow along the supply chain in order to offer shorter delivery times, more flexibility and higher customer satisfaction with learning ability. The proposed system is equipped with a ‘distributed process mining’ feature, which is used to discover the hidden relationships among working decisions in a distributed manner. This method incorporates the concepts of data mining and knowledge refinement into the decision-making process to ensure ‘doing the right things’ within the workflow. An example implementation is given, based on the case of a slider manufacturer.
Abstract:
Some researchers argue that the top team, rather than the CEO, is a better predictor of an organisation’s fate (Finkelstein & Hambrick, 1996; Knight et al., 1999). However, others suggest that the importance of the top management team (TMT) composition literature is exaggerated (West & Schwenk, 1996). This has stimulated a need for further research on TMTs. While the importance of the TMT is well documented in the innovation literature, the organisational environment also plays a key role in determining organisational outcomes. Therefore, the inclusion of both TMT characteristics and organisational variables (climate and organisational learning) in this study provides a more holistic picture of innovation. The research methodologies employed include (i) interviews with TMT members in 35 Irish software companies, (ii) a survey completed by managerial respondents and core workers in these companies, and (iii) in-depth interviews with TMT members from five companies. Data were gathered in two phases, time 1 (1998-2000) and time 2 (2003). The TMT played an important part in fostering innovation. However, it was a group process, rather than team demography, that was most strongly associated with innovation. Task reflexivity was an important predictor of innovation (time 1, time 2). Only one measure of TMT diversity, tenure diversity, was associated with innovation, and in time 2 only. Organisational context played an important role in determining innovation. This was positively associated with innovation, but with one dimension of organisational learning only. The ability to share information (access to information) was not associated with innovation, but the motivation to share information was (perceiving the sharing of information to be valuable). Innovative climate was also associated with innovation. This study suggests that such a climate will lead to innovative outcomes if employees perceive the organisation to support risk, experimentation and other innovative behaviours.
Abstract:
We address the important bioinformatics problem of predicting protein function from a protein's primary sequence. We consider the functional classification of G-Protein-Coupled Receptors (GPCRs), whose functions are specified in a class hierarchy. We tackle this task using a novel top-down hierarchical classification system where, for each node in the class hierarchy, the predictor attributes to be used in that node and the classifier to be applied to the selected attributes are chosen in a data-driven manner. Compared with a previous hierarchical classification system selecting classifiers only, our new system significantly reduced processing time without significantly sacrificing predictive accuracy.
Abstract:
DUE TO COPYRIGHT RESTRICTIONS ONLY AVAILABLE FOR CONSULTATION AT ASTON UNIVERSITY LIBRARY AND INFORMATION SERVICES WITH PRIOR ARRANGEMENT
Abstract:
The purpose of this paper is to explain the notion of clustering and a concrete clustering method, the agglomerative hierarchical clustering algorithm. It shows how a data mining method like clustering can be applied to the analysis of stocks traded on the Bulgarian Stock Exchange in order to identify similar temporal behavior of the traded stocks. This problem is solved with the aid of a data mining tool called XLMiner™ for Microsoft Excel.
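As a rough illustration of the agglomerative hierarchical clustering idea mentioned in this abstract, the following minimal Python sketch merges the two closest clusters repeatedly until a requested number remains. It is not the paper's XLMiner™ workflow: the average-linkage criterion, the Euclidean distance, the ticker names and the return values are all assumptions chosen for illustration.

```python
from itertools import combinations
from math import sqrt


def euclidean(a, b):
    """Euclidean distance between two equally long series."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(c1, c2, series):
    """Mean pairwise distance between the members of two clusters."""
    dists = [euclidean(series[i], series[j]) for i in c1 for j in c2]
    return sum(dists) / len(dists)


def agglomerative(series, n_clusters):
    """Start with one cluster per series and merge the closest pair
    until only n_clusters remain (bottom-up clustering)."""
    clusters = [[name] for name in sorted(series)]
    while len(clusters) > n_clusters:
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: average_linkage(clusters[p[0]], clusters[p[1]], series),
        )
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)] + [merged]
    return sorted(sorted(c) for c in clusters)


# Made-up daily returns (percent) for four hypothetical tickers.
returns = {
    "AAA": [1.0, -0.5, 0.8, 0.2],
    "BBB": [0.9, -0.4, 0.7, 0.3],
    "CCC": [-1.0, 0.6, -0.9, 0.1],
    "DDD": [-1.1, 0.5, -0.8, 0.0],
}
result = agglomerative(returns, 2)
print(result)
```

With these invented series, the two tickers that move together end up in one cluster and the two that move in the opposite direction in the other. Other linkage rules (single, complete, Ward) or a correlation-based distance could give different groupings on real data.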
Abstract:
Dimensionality reduction is a very important step in the data mining process. In this paper, we consider feature extraction for classification tasks as a technique to overcome problems caused by “the curse of dimensionality”. Three different eigenvector-based feature extraction approaches are discussed, and three different kinds of applications with respect to classification tasks are considered. A summary of the results obtained on the accuracy of classification schemes is presented, with a conclusion about the search for the most appropriate feature extraction method. The problem of how to discover the knowledge needed to integrate the feature extraction and classification processes is stated. A decision support system to aid in the integration of the feature extraction and classification processes is proposed. The goals and requirements set for the decision support system and its basic structure are defined. The means of knowledge acquisition needed to build the proposed system are considered.
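The abstract does not say which three eigenvector-based approaches are compared. As a generic example of the family, the sketch below performs principal component analysis on a tiny data set: it builds the sample covariance matrix, finds its leading eigenvector by power iteration, and projects the centered data onto it. The function names and the data are hypothetical, and power iteration is used here only to keep the example free of external libraries.

```python
from math import sqrt


def covariance_matrix(rows):
    """Return the sample covariance matrix and the mean-centered rows."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    return cov, centered


def leading_eigenvector(matrix, iterations=200):
    """Power iteration: repeatedly multiply and renormalize.
    Assumes the start vector is not orthogonal to the leading eigenvector."""
    d = len(matrix)
    v = [1.0 / sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return v


# Made-up 2-D points lying exactly on the line y = 2x.
data = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
cov, centered = covariance_matrix(data)
component = leading_eigenvector(cov)

# One extracted feature per point: its coordinate along the component.
features = [sum(c * v for c, v in zip(row, component)) for row in centered]
print(component, features)
```

Because the invented points lie on a single line, one extracted feature preserves all of their variance, and the component points along the direction (1, 2). On real data, several components would typically be kept, and a supervised method may suit classification better than plain PCA.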
Abstract:
The present paper is devoted to the creation of cryptographic data security and the realization of the packet mode in a distributed information measurement and control system that implements methods of optical spectroscopy for plasma physics research and atomic collisions. The system provides remote access to information and hardware resources for the natural sciences within the Intranet/Internet networks. Access to physical equipment is realized through the standard interface servers (PXI, CAMAC, and GPIB), the server providing access to Ethernet devices, and the communication server, which integrates the equipment servers into a uniform information system. The system is used to carry out research tasks in optical spectroscopy, as well as to support the process of education at the Department of Physics and Engineering of Petrozavodsk State University.
Abstract:
The real purpose of collecting big data is to identify causality in the hope that this will facilitate credible predictivity. But the search for causality can trap one into infinite regress, and thus one takes refuge in seeking associations between variables in data sets. Regrettably, the mere knowledge of associations does not enable predictivity. Associations need to be embedded within the framework of probability calculus to make coherent predictions. This is so because associations are a feature of probability models, and hence they do not exist outside the framework of a model. Measures of association, like correlation, regression, and mutual information, merely refute a preconceived model. Estimated measures of association do not lead to a probability model; a model is the product of pure thought. This paper discusses these and other fundamentals that are germane to seeking associations in particular, and machine learning in general. ACM Computing Classification System (1998): H.1.2, H.2.4, G.3.
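To make two of the association measures named in this abstract concrete, the sketch below computes the Pearson correlation and the empirical mutual information of a small made-up sample. It is only an illustration, not the paper's analysis: the data are invented, and the mutual information is a plug-in estimate that treats the observed values as discrete categories.

```python
from collections import Counter
from math import log2, sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def mutual_information(xs, ys):
    """Plug-in estimate of mutual information (in bits) from
    empirical frequencies of discrete values."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )


# Made-up sample in which y is fully determined by x (y = x squared).
xs = [-2, -1, 0, 1, 2]
ys = [x * x for x in xs]

r = pearson(xs, ys)
mi = mutual_information(xs, ys)
print(r, mi)
```

In this invented sample the correlation is zero although y is a deterministic function of x, while the mutual information is clearly positive. Each measure answers a question posed relative to an assumed model (linear dependence in one case, a joint discrete distribution in the other), which is in the spirit of the abstract's remark that associations do not exist outside the framework of a model.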
Abstract:
Computer software plays an important role in business, government, society and the sciences. To solve real-world problems, it is very important to measure quality and reliability throughout the software development life cycle (SDLC). Software Engineering (SE) is the computing field concerned with designing, developing, implementing, maintaining and modifying software. The present paper gives an overview of the Data Mining (DM) techniques that can be applied to various types of SE data in order to solve the challenges posed by SE tasks such as programming, bug detection, debugging and maintenance. A specific DM software tool is discussed, namely one of the analytical tools for analyzing data and summarizing the relationships that have been identified. The paper concludes that the proposed DM techniques within the domain of SE could be applied well in fields such as Customer Relationship Management (CRM), eCommerce and eGovernment. ACM Computing Classification System (1998): H.2.8.
Abstract:
Report published in the Proceedings of the National Conference on "Education and Research in the Information Society", Plovdiv, May, 2015
Abstract:
Report published in the Proceedings of the National Conference on "Education and Research in the Information Society", Plovdiv, May, 2015
Abstract:
Background: Major Depressive Disorder (MDD) is among the most prevalent and disabling medical conditions worldwide. Identification of clinical and biological markers ("biomarkers") of treatment response could personalize clinical decisions and lead to better outcomes. This paper describes the aims, design, and methods of a discovery study of biomarkers in antidepressant treatment response, conducted by the Canadian Biomarker Integration Network in Depression (CAN-BIND). The CAN-BIND research program investigates and identifies biomarkers that help to predict outcomes in patients with MDD treated with antidepressant medication. The primary objective of this initial study (known as CAN-BIND-1) is to identify individual and integrated neuroimaging, electrophysiological, molecular, and clinical predictors of response to sequential antidepressant monotherapy and adjunctive therapy in MDD. Methods: CAN-BIND-1 is a multisite initiative involving 6 academic health centres working collaboratively with other universities and research centres. In the 16-week protocol, patients with MDD are treated with a first-line antidepressant (escitalopram 10-20 mg/d) that, if clinically warranted after eight weeks, is augmented with an evidence-based, add-on medication (aripiprazole 2-10 mg/d). Comprehensive datasets are obtained using clinical rating scales; behavioural, dimensional, and functioning/quality of life measures; neurocognitive testing; genomic, genetic, and proteomic profiling from blood samples; combined structural and functional magnetic resonance imaging; and electroencephalography. De-identified data from all sites are aggregated within a secure neuroinformatics platform for data integration, management, storage, and analyses. Statistical analyses will include multivariate and machine-learning techniques to identify predictors, moderators, and mediators of treatment response. 
Discussion: From June 2013 to February 2015, a cohort of 134 participants (85 outpatients with MDD and 49 healthy participants) was evaluated at baseline. The clinical characteristics of this cohort are similar to those in other studies of MDD. Recruitment at all sites is ongoing to a target sample of 290 participants. CAN-BIND will identify biomarkers of treatment response in MDD through extensive clinical, molecular, and imaging assessments, in order to improve treatment practice and clinical outcomes. It will also create an innovative, robust platform and database for future research. Trial registration: ClinicalTrials.gov identifier NCT01655706. Registered July 27, 2012.