848 resultados para databases and data mining


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Data mining can be defined as the extraction of implicit, previously un-known, and potentially useful information from data. Numerous re-searchers have been developing security technology and exploring new methods to detect cyber-attacks with the DARPA 1998 dataset for Intrusion Detection and the modified versions of this dataset KDDCup99 and NSL-KDD, but until now no one have examined the performance of the Top 10 data mining algorithms selected by experts in data mining. The compared classification learning algorithms in this thesis are: C4.5, CART, k-NN and Naïve Bayes. The performance of these algorithms are compared with accuracy, error rate and average cost on modified versions of NSL-KDD train and test dataset where the instances are classified into normal and four cyber-attack categories: DoS, Probing, R2L and U2R. Additionally the most important features to detect cyber-attacks in all categories and in each category are evaluated with Weka’s Attribute Evaluator and ranked according to Information Gain. The results show that the classification algorithm with best performance on the dataset is the k-NN algorithm. The most important features to detect cyber-attacks are basic features such as the number of seconds of a network connection, the protocol used for the connection, the network service used, normal or error status of the connection and the number of data bytes sent. The most important features to detect DoS, Probing and R2L attacks are basic features and the least important features are content features. Unlike U2R attacks, where the content features are the most important features to detect attacks.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Following the workshop on new developments in daily licensing practice in November 2011, we brought together fourteen representatives from national consortia (from Denmark, Germany, Netherlands and the UK) and publishers (Elsevier, SAGE and Springer) met in Copenhagen on 9 March 2012 to discuss provisions in licences to accommodate new developments. The one day workshop aimed to: present background and ideas regarding the provisions KE Licensing Expert Group developed; introduce and explain the provisions the invited publishers currently use;ascertain agreement on the wording for long term preservation, continuous access and course packs; give insight and more clarity about the use of open access provisions in licences; discuss a roadmap for inclusion of the provisions in the publishers’ licences; result in report to disseminate the outcome of the meeting. Participants of the workshop were: United Kingdom: Lorraine Estelle (Jisc Collections) Denmark: Lotte Eivor Jørgensen (DEFF), Lone Madsen (Southern University of Denmark), Anne Sandfær (DEFF/Knowledge Exchange) Germany: Hildegard Schaeffler (Bavarian State Library), Markus Brammer (TIB) The Netherlands: Wilma Mossink (SURF), Nol Verhagen (University of Amsterdam), Marc Dupuis (SURF/Knowledge Exchange) Publishers: Alicia Wise (Elsevier), Yvonne Campfens (Springer), Bettina Goerner (Springer), Leo Walford (Sage) Knowledge Exchange: Keith Russell The main outcome of the workshop was that it would be valuable to have a standard set of clauses which could used in negotiations, this would make concluding licences a lot easier and more efficient. The comments on the model provisions the Licensing Expert group had drafted will be taken into account and the provisions will be reformulated. Data and text mining is a new development and demand for access to allow for this is growing. It would be easier if there was a simpler way to access materials so they could be more easily mined. However there are still outstanding questions on how authors of articles that have been mined can be properly attributed.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The incredible rapid development to huge volumes of air travel, mainly because of jet airliners that appeared to the sky in the 1950s, created the need for systematic research for aviation safety and collecting data about air traffic. The structured data can be analysed easily using queries from databases and running theseresults through graphic tools. However, in analysing narratives that often give more accurate information about the case, mining tools are needed. The analysis of textual data with computers has not been possible until data mining tools have been developed. Their use, at least among aviation, is still at a moderate level. The research aims at discovering lethal trends in the flight safety reports. The narratives of 1,200 flight safety reports from years 1994 – 1996 in Finnish were processed with three text mining tools. One of them was totally language independent, the other had a specific configuration for Finnish and the third originally created for English, but encouraging results had been achieved with Spanish and that is why a Finnish test was undertaken, too. The global rate of accidents is stabilising and the situation can now be regarded as satisfactory, but because of the growth in air traffic, the absolute number of fatal accidents per year might increase, if the flight safety will not be improved. The collection of data and reporting systems have reached their top level. The focal point in increasing the flight safety is analysis. The air traffic has generally been forecasted to grow 5 – 6 per cent annually over the next two decades. During this period, the global air travel will probably double also with relatively conservative expectations of economic growth. This development makes the airline management confront growing pressure due to increasing competition, signify cant rise in fuel prices and the need to reduce the incident rate due to expected growth in air traffic volumes. All this emphasises the urgent need for new tools and methods. All systems provided encouraging results, as well as proved challenges still to be won. Flight safety can be improved through the development and utilisation of sophisticated analysis tools and methods, like data mining, using its results supporting the decision process of the executives.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Il presente elaborato esplora l’attitudine delle organizzazioni nei confronti dei processi di business che le sostengono: dalla semi-assenza di struttura, all’organizzazione funzionale, fino all’avvento del Business Process Reengineering e del Business Process Management, nato come superamento dei limiti e delle problematiche del modello precedente. All’interno del ciclo di vita del BPM, trova spazio la metodologia del process mining, che permette un livello di analisi dei processi a partire dagli event data log, ossia dai dati di registrazione degli eventi, che fanno riferimento a tutte quelle attività supportate da un sistema informativo aziendale. Il process mining può essere visto come naturale ponte che collega le discipline del management basate sui processi (ma non data-driven) e i nuovi sviluppi della business intelligence, capaci di gestire e manipolare l’enorme mole di dati a disposizione delle aziende (ma che non sono process-driven). Nella tesi, i requisiti e le tecnologie che abilitano l’utilizzo della disciplina sono descritti, cosi come le tre tecniche che questa abilita: process discovery, conformance checking e process enhancement. Il process mining è stato utilizzato come strumento principale in un progetto di consulenza da HSPI S.p.A. per conto di un importante cliente italiano, fornitore di piattaforme e di soluzioni IT. Il progetto a cui ho preso parte, descritto all’interno dell’elaborato, ha come scopo quello di sostenere l’organizzazione nel suo piano di improvement delle prestazioni interne e ha permesso di verificare l’applicabilità e i limiti delle tecniche di process mining. Infine, nell’appendice finale, è presente un paper da me realizzato, che raccoglie tutte le applicazioni della disciplina in un contesto di business reale, traendo dati e informazioni da working papers, casi aziendali e da canali diretti. Per la sua validità e completezza, questo documento è stata pubblicato nel sito dell'IEEE Task Force on Process Mining.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The ability to accurately predict the lifetime of building components is crucial to optimizing building design, material selection and scheduling of required maintenance. This paper discusses a number of possible data mining methods that can be applied to do the lifetime prediction of metallic components and how different sources of service life information could be integrated to form the basis of the lifetime prediction model

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Experience plays an important role in building management. “How often will this asset need repair?” or “How much time is this repair going to take?” are types of questions that project and facility managers face daily in planning activities. Failure or success in developing good schedules, budgets and other project management tasks depend on the project manager's ability to obtain reliable information to be able to answer these types of questions. Young practitioners tend to rely on information that is based on regional averages and provided by publishing companies. This is in contrast to experienced project managers who tend to rely heavily on personal experience. Another aspect of building management is that many practitioners are seeking to improve available scheduling algorithms, estimating spreadsheets and other project management tools. Such “micro-scale” levels of research are important in providing the required tools for the project manager's tasks. However, even with such tools, low quality input information will produce inaccurate schedules and budgets as output. Thus, it is also important to have a broad approach to research at a more “macro-scale.” Recent trends show that the Architectural, Engineering, Construction (AEC) industry is experiencing explosive growth in its capabilities to generate and collect data. There is a great deal of valuable knowledge that can be obtained from the appropriate use of this data and therefore the need has arisen to analyse this increasing amount of available data. Data Mining can be applied as a powerful tool to extract relevant and useful information from this sea of data. Knowledge Discovery in Databases (KDD) and Data Mining (DM) are tools that allow identification of valid, useful, and previously unknown patterns so large amounts of project data may be analysed. These technologies combine techniques from machine learning, artificial intelligence, pattern recognition, statistics, databases, and visualization to automatically extract concepts, interrelationships, and patterns of interest from large databases. The project involves the development of a prototype tool to support facility managers, building owners and designers. This final report presents the AIMMTM prototype system and documents how and what data mining techniques can be applied, the results of their application and the benefits gained from the system. The AIMMTM system is capable of searching for useful patterns of knowledge and correlations within the existing building maintenance data to support decision making about future maintenance operations. The application of the AIMMTM prototype system on building models and their maintenance data (supplied by industry partners) utilises various data mining algorithms and the maintenance data is analysed using interactive visual tools. The application of the AIMMTM prototype system to help in improving maintenance management and building life cycle includes: (i) data preparation and cleaning, (ii) integrating meaningful domain attributes, (iii) performing extensive data mining experiments in which visual analysis (using stacked histograms), classification and clustering techniques, associative rule mining algorithm such as “Apriori” and (iv) filtering and refining data mining results, including the potential implications of these results for improving maintenance management. Maintenance data of a variety of asset types were selected for demonstration with the aim of discovering meaningful patterns to assist facility managers in strategic planning and provide a knowledge base to help shape future requirements and design briefing. Utilising the prototype system developed here, positive and interesting results regarding patterns and structures of data have been obtained.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The construction industry has adapted information technology in its processes in terms of computer aided design and drafting, construction documentation and maintenance. The data generated within the construction industry has become increasingly overwhelming. Data mining is a sophisticated data search capability that uses classification algorithms to discover patterns and correlations within a large volume of data. This paper presents the selection and application of data mining techniques on maintenance data of buildings. The results of applying such techniques and potential benefits of utilising their results to identify useful patterns of knowledge and correlations to support decision making of improving the management of building life cycle are presented and discussed.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This report demonstrates the development of: • Development of software agents for data mining • Link data mining to building model in virtual environments • Link knowledge development with building model in virtual environments • Demonstration of software agents for data mining • Populate with maintenance data

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Experience plays an important role in building management. “How often will this asset need repair?” or “How much time is this repair going to take?” are types of questions that project and facility managers face daily in planning activities. Failure or success in developing good schedules, budgets and other project management tasks depend on the project manager's ability to obtain reliable information to be able to answer these types of questions. Young practitioners tend to rely on information that is based on regional averages and provided by publishing companies. This is in contrast to experienced project managers who tend to rely heavily on personal experience. Another aspect of building management is that many practitioners are seeking to improve available scheduling algorithms, estimating spreadsheets and other project management tools. Such “micro-scale” levels of research are important in providing the required tools for the project manager's tasks. However, even with such tools, low quality input information will produce inaccurate schedules and budgets as output. Thus, it is also important to have a broad approach to research at a more “macro-scale.” Recent trends show that the Architectural, Engineering, Construction (AEC) industry is experiencing explosive growth in its capabilities to generate and collect data. There is a great deal of valuable knowledge that can be obtained from the appropriate use of this data and therefore the need has arisen to analyse this increasing amount of available data. Data Mining can be applied as a powerful tool to extract relevant and useful information from this sea of data. Knowledge Discovery in Databases (KDD) and Data Mining (DM) are tools that allow identification of valid, useful, and previously unknown patterns so large amounts of project data may be analysed. These technologies combine techniques from machine learning, artificial intelligence, pattern recognition, statistics, databases, and visualization to automatically extract concepts, interrelationships, and patterns of interest from large databases. The project involves the development of a prototype tool to support facility managers, building owners and designers. This Industry focused report presents the AIMMTM prototype system and documents how and what data mining techniques can be applied, the results of their application and the benefits gained from the system. The AIMMTM system is capable of searching for useful patterns of knowledge and correlations within the existing building maintenance data to support decision making about future maintenance operations. The application of the AIMMTM prototype system on building models and their maintenance data (supplied by industry partners) utilises various data mining algorithms and the maintenance data is analysed using interactive visual tools. The application of the AIMMTM prototype system to help in improving maintenance management and building life cycle includes: (i) data preparation and cleaning, (ii) integrating meaningful domain attributes, (iii) performing extensive data mining experiments in which visual analysis (using stacked histograms), classification and clustering techniques, associative rule mining algorithm such as “Apriori” and (iv) filtering and refining data mining results, including the potential implications of these results for improving maintenance management. Maintenance data of a variety of asset types were selected for demonstration with the aim of discovering meaningful patterns to assist facility managers in strategic planning and provide a knowledge base to help shape future requirements and design briefing. Utilising the prototype system developed here, positive and interesting results regarding patterns and structures of data have been obtained.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The management of main material prices of provincial highway project quota has problems of lag and blindness. Framework of provincial highway project quota data MIS and main material price data warehouse were established based on WEB firstly. Then concrete processes of provincial highway project main material prices were brought forward based on BP neural network algorithmic. After that standard BP algorithmic, additional momentum modify BP network algorithmic, self-adaptive study speed improved BP network algorithmic were compared in predicting highway project main prices. The result indicated that it is feasible to predict highway main material prices using BP NN, and using self-adaptive study speed improved BP network algorithmic is the relatively best one.