981 results for Meta-Data Management


Relevance: 90.00%

Abstract:

This research investigates what benefits employees expect an organization to gain from organizing data governance, and how organized data governance supports the implementation of automated marketing capabilities. The quality and usability of data are crucial for organizations to meet various business needs, and organizations have ever more data and technology available that can be utilized, for example, in automated marketing. Data governance addresses the organization of decision rights and accountabilities for the management of an organization's data assets. Automated marketing means sending the right message, to the right person, at the right time, automatically. The research is a single case study conducted in a Finnish ICT company that, at the time of the research, was starting to organize data governance and to implement automated marketing capabilities. The empirical material consists of interviews with employees of the case company, and content analysis is used to interpret the interviews and answer the research questions. The theoretical framework of the research is derived from the morphology of data governance. The findings indicate that employees expect the organization of data governance to, among other things, improve customer experience, improve sales, provide the ability to identify an individual customer's life situation, ensure that data is handled in accordance with regulations, and improve operational efficiency. The organization of data governance is also expected to solve the customer data quality problems that currently hinder the implementation of automated marketing capabilities.

Relevance: 90.00%

Abstract:

Many-core systems offer great potential for application performance through their massively parallel structure. Such systems are being integrated into most parts of daily life, from high-end server farms to desktop systems, laptops and mobile devices. Yet these systems face increasing challenges: high temperatures causing physical damage, high electricity bills for both servers and individual users, unpleasant noise levels due to active cooling, and unrealistic battery drain in mobile devices; all of these are directly caused by poor energy efficiency. Power management has traditionally been a research area that provides hardware solutions or runtime power management in the operating system, in the form of frequency governors. Energy awareness in application software is currently non-existent: applications are not involved in power management decisions, nor does any interface exist between applications and the runtime system to provide such facilities. Power management in the operating system is therefore performed purely on the basis of indirect implications of software execution, usually referred to as the workload, which often results in over-allocation of resources and hence wasted power. This thesis discusses power management strategies for many-core systems that increase the energy-efficiency awareness of application software. The presented approach allows meta-data descriptions in the applications and is manifested in two design recommendations: 1) energy-aware mapping and 2) energy-aware execution, which allow applications to directly influence power management decisions. The recommendations eliminate over-allocation of resources and increase the energy efficiency of the computing system. Both recommendations are fully supported by a provided interface in combination with a novel power management runtime system called Bricktop. The work presented in this thesis allows both new and legacy software to execute with the most energy-efficient mapping on a many-core CPU and at the most energy-efficient performance level. A set of case study examples demonstrates real-world energy savings in a wide range of applications without performance degradation.
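
As a rough illustration of the idea of application-declared energy meta-data, the sketch below is our own construction: EnergyMetaData and choose_mapping are hypothetical names, not the Bricktop interface, and it only shows how a declared parallelism limit and a soft deadline could drive an energy-aware mapping choice that avoids over-allocating cores and frequency.

# Hypothetical sketch of application-side energy meta-data (not the Bricktop API).
from dataclasses import dataclass

@dataclass
class EnergyMetaData:
    max_parallelism: int     # how many cores the task can actually exploit
    deadline_ms: float       # soft performance requirement declared by the application
    est_work_mcycles: float  # estimated work, in millions of CPU cycles

def choose_mapping(meta: EnergyMetaData, freq_levels_mhz: list) -> tuple:
    """Pick the lowest frequency, and just enough cores, that still meet the deadline."""
    for freq in sorted(freq_levels_mhz):                  # prefer low frequency (low power)
        for cores in range(1, meta.max_parallelism + 1):  # prefer few cores (less over-allocation)
            runtime_ms = 1000.0 * meta.est_work_mcycles / (freq * cores)
            if runtime_ms <= meta.deadline_ms:
                return cores, freq
    return meta.max_parallelism, max(freq_levels_mhz)     # fall back to the fastest setting

# Example: a task that can use up to 4 cores and must finish within 50 ms.
print(choose_mapping(EnergyMetaData(4, 50.0, 120.0), [400.0, 800.0, 1600.0]))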

Relevance: 90.00%

Abstract:

A Traffic Management System (TMS) comprises four major subsystems: the Network Database Management System, which provides information to passengers; the Transit Facility Management System, for service planning and the scheduling of vehicles and crews; the Congestion Management System, for traffic forecasting and planning; and the Safety Management System, concerned with the safety of passengers and the environment. This work has opened a rather wide framework of model structures for application to traffic. The facets of these theories are so wide that it seems impossible to present all necessary models in this work. It can nevertheless be deduced from the study that the best Traffic Management System is one that is realistic in all aspects, easy to understand and easy to apply. As it is practically difficult to devise an ideal fool-proof model, the attempt here has been to make some progress in that direction.

Relevance: 90.00%

Abstract:

The increasing interconnection of information and communication systems leads to further growth in complexity and thus to a further increase in security vulnerabilities. Classical protection mechanisms such as firewall systems and anti-malware solutions have long since ceased to offer adequate protection against intrusions into IT infrastructures. Intrusion detection systems (IDS) have established themselves as a very effective instrument for protection against cyber attacks. Such systems collect and analyse information from network components and hosts in order to detect unusual behaviour and security violations automatically. While signature-based approaches can only detect already known attack patterns, anomaly-based IDS are also able to recognise new, previously unknown attacks (zero-day attacks) at an early stage. The core problem of intrusion detection systems, however, lies in the optimal processing of the enormous volume of network data and in the development of an adaptive detection model that works in real time. To address these challenges, this dissertation provides a framework consisting of two main parts. The first part, called OptiFilter, uses a dynamic queuing concept to process the continuously arriving network data, builds network connections continuously, and exports structured input data for the IDS. The second part is an adaptive classifier comprising a classifier model based on an Enhanced Growing Hierarchical Self-Organizing Map (EGHSOM), a model of normal network behaviour (NNB) and an update model. In OptiFilter, tcpdump and SNMP traps are used to aggregate network packets and host events continuously. These aggregated network packets and host events are further analysed and converted into connection vectors. To improve the detection rate of the adaptive classifier, the artificial neural network GHSOM is studied intensively and substantially extended. Different approaches are proposed and discussed in this dissertation: a classification-confidence margin threshold is defined to uncover unknown malicious connections; the stability of the growth topology is increased through novel approaches for initialising the weight vectors and through strengthening the winner neurons; and a self-adaptive procedure is introduced so that the model can be updated continuously. In addition, the main task of the NNB model is to examine further the unknown connections detected by the EGHSOM and to verify whether they are normal. However, because of the concept drift phenomenon, network traffic data change constantly, which leads to non-stationary network data in real time; this phenomenon is handled by the update model. The EGHSOM model detects new anomalies effectively and the NNB model adapts optimally to the changes in the network data. In the experimental evaluation, the framework showed promising results. In the first experiment the framework was evaluated in offline mode: OptiFilter was evaluated with offline, synthetic and realistic data, and the adaptive classifier was evaluated using 10-fold cross-validation to estimate its accuracy.
In the second experiment the framework was installed on a 1 to 10 GB network link and evaluated online in real time. OptiFilter successfully converted the enormous volume of network data into structured connection vectors, and the adaptive classifier classified them precisely. The comparative study between the developed framework and other well-known IDS approaches shows that the proposed IDS framework outperforms all other approaches. This can be attributed to the following key points: the processing of the collected network data, the achievement of the best performance (e.g. overall accuracy), the detection of unknown connections, and the development of a real-time detection model for intrusion attempts.
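
A minimal sketch of the classification-confidence margin idea follows; the relative-margin formula and the threshold value are our assumptions, not the thesis implementation. A connection vector whose best-matching unit is not clearly better than the runner-up is returned as "unknown" and would then be handed to the normal-behaviour (NNB) check.

# Illustrative sketch of a confidence-margin test on a trained SOM-like map
# (our own simplification, not the EGHSOM code from the dissertation).
import numpy as np

def classify_connection(vec, units, labels, margin_threshold=0.15):
    """vec: connection vector; units: (n_units, dim) weight vectors of a trained map;
    labels: per-unit label such as 'normal' or an attack class."""
    d = np.linalg.norm(units - vec, axis=1)      # distance to every map unit
    order = np.argsort(d)
    best, second = d[order[0]], d[order[1]]
    margin = (second - best) / (second + 1e-12)  # relative confidence margin in [0, 1]
    if margin < margin_threshold:
        return "unknown"   # low confidence: pass on to the normal-behaviour (NNB) model
    return labels[order[0]]

# Example with a toy two-unit map.
units = np.array([[0.0, 0.0], [1.0, 1.0]])
print(classify_connection(np.array([0.1, 0.0]), units, ["normal", "attack"]))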

Relevance: 90.00%

Abstract:

This paper describes a case study of an electronic data management system developed in-house by the Facilities Management Directorate (FMD) of an educational institution in the UK. The FMD Maintenance and Business Services department is responsible for the maintenance of the built estate owned by the university. The department needs a clear definition of the type of work undertaken and of the administration that enables any maintenance work to be carried out, including the management of resources, budget, cash flow and the workflow of reactive, preventative and planned maintenance of the campus. In order to support its business process more efficiently, the FMD decided to move from a paper-based information system to an electronic system, WREN. Some of the main advantages of WREN are that it is tailor-made to fit the purpose of its users, it is cost-effective when it comes to modifications to the system, and its database can also be used as a knowledge management tool. There is a trade-off: because WREN is tailored to the specific requirements of the FMD, it may not be easy to implement within a different institution without extensive modifications. However, WREN not only allows the FMD to carry out the tasks of maintaining and looking after the built estate of the university, but has also achieved its aim of minimising costs and maximising efficiency.

Relevance: 90.00%

Abstract:

OBJECTIVES: The prediction of protein structure and the precise understanding of protein folding and unfolding processes remain among the greatest challenges in structural biology and bioinformatics. Computer simulations based on molecular dynamics (MD) are at the forefront of the effort to gain a deeper understanding of these complex processes. Currently, these MD simulations are usually on the order of tens of nanoseconds, generate a large amount of conformational data and are computationally expensive. More and more groups run such simulations and generate a myriad of data, which raises new challenges in managing and analyzing these data. Because of the vast range of proteins researchers want to study and simulate, the computational effort needed to generate data, the large data volumes involved, and the different types of analyses scientists need to perform, it is desirable to provide a public repository allowing researchers to pool and share protein unfolding data. METHODS: To adequately organize, manage, and analyze the data generated by unfolding simulation studies, we designed a data warehouse system that is embedded in a grid environment to facilitate the seamless sharing of available computer resources and thus enable many groups to share complex molecular dynamics simulations on a more regular basis. RESULTS: To gain insight into the conformational fluctuations and stability of the monomeric forms of the amyloidogenic protein transthyretin (TTR), molecular dynamics unfolding simulations of the monomer of human TTR have been conducted. Trajectory data and meta-data of the wild-type (WT) protein and the highly amyloidogenic variant L55P-TTR represent the test case for the data warehouse. CONCLUSIONS: Web and grid services, especially pre-defined data mining services that can run on or 'near' the data repository of the data warehouse, are likely to play a pivotal role in the analysis of molecular dynamics unfolding data.
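
The toy layout below illustrates, under our own assumptions, how trajectory meta-data and per-frame properties could be organised for warehouse-style queries; the table and column names are hypothetical and do not reflect the actual schema of the warehouse described above.

# Toy relational layout for unfolding-trajectory meta-data (illustrative only).
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE simulation (
    sim_id    INTEGER PRIMARY KEY,
    protein   TEXT,   -- e.g. 'TTR-WT' or 'TTR-L55P'
    temp_k    REAL,   -- simulation temperature (K)
    length_ns REAL    -- trajectory length (ns)
);
CREATE TABLE frame_property (
    sim_id  INTEGER REFERENCES simulation(sim_id),
    time_ps REAL,
    rmsd_nm REAL,     -- per-frame conformational properties
    rgyr_nm REAL
);
""")
conn.execute("INSERT INTO simulation VALUES (1, 'TTR-L55P', 498.0, 10.0)")
conn.execute("INSERT INTO frame_property VALUES (1, 500.0, 0.42, 1.31)")

# Example analysis query: average RMSD per protein variant after an equilibration cut-off.
rows = conn.execute("""
    SELECT s.protein, AVG(f.rmsd_nm)
    FROM simulation s JOIN frame_property f ON s.sim_id = f.sim_id
    WHERE f.time_ps > 100
    GROUP BY s.protein
""").fetchall()
print(rows)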

Relevance: 90.00%

Abstract:

The P-found protein folding and unfolding simulation repository is designed to allow scientists to perform data mining and other analyses across large, distributed simulation data sets. There are two storage components in P-found: a primary repository of simulation data that is used to populate the second component, and a data warehouse that contains important molecular properties. These properties may be used for data mining studies. Here we demonstrate how grid technologies can support multiple, distributed P-found installations. In particular, we look at two aspects: firstly, how grid data management technologies can be used to access the distributed data warehouses; and secondly, how the grid can be used to transfer analysis programs to the primary repositories — this is an important and challenging aspect of P-found, due to the large data volumes involved and the desire of scientists to maintain control of their own data. The grid technologies we are developing with the P-found system will allow new large data sets of protein folding simulations to be accessed and analysed in novel ways, with significant potential for enabling scientific discovery.
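
The back-of-the-envelope sketch below (our own assumptions, not part of P-found) makes the motivation concrete: shipping a small analysis program across the grid is far cheaper than copying terabytes of trajectories, and the data never have to leave the owners' site.

# Toy "move the code, not the data" comparison; numbers and names are made up.
def plan_execution(data_bytes, program_bytes, link_bytes_per_s):
    transfer_data_s = data_bytes / link_bytes_per_s
    transfer_code_s = program_bytes / link_bytes_per_s
    return "ship program to repository" if transfer_code_s < transfer_data_s else "copy data locally"

# 2 TB of trajectories versus a 5 MB analysis program over a 100 MB/s link.
print(plan_execution(2e12, 5e6, 100e6))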

Relevance: 90.00%

Abstract:

The Environmental Data Abstraction Library (EDAL) is a modular data management library for bringing new and diverse data types together for visualisation within numerous software packages, including the ncWMS viewing service, which already has very wide international uptake. The structure of EDAL is presented, along with examples of its use to compare satellite, model and in situ data types within the same visualisation framework. We emphasise the value of this capability for the cross-calibration of datasets and the evaluation of model products against observations, including preparation for data assimilation.
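
As a hypothetical illustration of the value of such an abstraction layer (this is Python pseudocode of the idea, not the EDAL Java API), the sketch below shows how gridded model output and point in situ records could be sampled through one common interface and then differenced for evaluation.

# Hypothetical common interface over two environmental data types.
from abc import ABC, abstractmethod

class EnvironmentalDataset(ABC):
    @abstractmethod
    def value_at(self, lon, lat, time):
        ...

class GriddedModelDataset(EnvironmentalDataset):
    def __init__(self, grid):               # grid: {(lon, lat, time): value}, already interpolated
        self.grid = grid
    def value_at(self, lon, lat, time):
        return self.grid[(lon, lat, time)]

class InSituDataset(EnvironmentalDataset):
    def __init__(self, records):            # records: [(lon, lat, time, value), ...]
        self.records = records
    def value_at(self, lon, lat, time):
        # toy match-up rule: nearest observation in space at the requested time
        same_time = [r for r in self.records if r[2] == time]
        return min(same_time, key=lambda r: (r[0] - lon) ** 2 + (r[1] - lat) ** 2)[3]

def model_minus_obs(model, obs, lon, lat, time):
    """Model-minus-observation difference at one point, usable for evaluation plots."""
    return model.value_at(lon, lat, time) - obs.value_at(lon, lat, time)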

Relevance: 90.00%

Abstract:

ISO19156 Observations and Measurements (O&M) provides a standardised framework for organising information about the collection of data about the environment. Here we describe the implementation of a specialisation of O&M for environmental data, the Metadata Objects for Linking Environmental Sciences (MOLES3). MOLES3 provides support for organising information about data and for user navigation around data holdings. The implementation described here, “CEDA-MOLES”, also supports data management functions for the Centre for Environmental Data Archival (CEDA). The previous iteration of MOLES (MOLES2) saw active use over five years before being replaced by CEDA-MOLES in late 2014. During that period important lessons were learnt, both about the information needed and about how to design and maintain the necessary information systems. In this paper we review the problems encountered in MOLES2; how and why CEDA-MOLES was developed and engineered; the migration of information holdings from MOLES2 to CEDA-MOLES; and, finally, we provide an early assessment of MOLES3 (as implemented in CEDA-MOLES) and its limitations. Key drivers for the MOLES3 development included the need for improved data provenance, for further structured information to support ISO19115 discovery metadata export (for EU INSPIRE compliance), and for appropriate fixed landing pages for Digital Object Identifiers (DOIs) in the presence of evolving datasets. Key lessons learned included the importance of minimising information structure in free-text fields, and the necessity of supporting as much agility in the information infrastructure as possible without compromising maintainability, both for those using the systems internally and externally (e.g. citing into the information infrastructure) and for those responsible for the systems themselves. The migration itself needed to ensure continuity of service and traceability of archived assets.
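
A much-simplified O&M-style record is sketched below to make the linkage concrete; the field names and example values are our own and do not reproduce the CEDA-MOLES data model.

# Simplified sketch of an ISO19156 O&M-style observation record (not CEDA-MOLES).
from dataclasses import dataclass

@dataclass
class Observation:
    observed_property: str    # what was measured, e.g. "air_temperature"
    procedure: str            # instrument, sensor or simulation that produced the result
    feature_of_interest: str  # the real-world entity that was sampled
    result_uri: str           # where the actual data object is archived
    phenomenon_time: str      # when the observed phenomenon occurred

# Hypothetical example values.
obs = Observation(
    observed_property="air_temperature",
    procedure="radiosonde-ascent",
    feature_of_interest="atmospheric-column-site-A",
    result_uri="https://example.org/data/obs-0001.nc",
    phenomenon_time="2014-11-01T12:00:00Z",
)
print(obs.observed_property, "->", obs.result_uri)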

Relevance: 90.00%

Abstract:

Regional accreditors often desire the same metrics and data collected by professional program reviews, but may use different terminologies to describe this information; as a result, some schools must manually translate or even recollect data already stored. This report profiles strategies to proactively consolidate the language and policies of accreditation to avoid duplication of labor and to efficiently route program accreditation data that will be repurposed in regional review. It also suggests ways to select new technologies that can streamline data collection, storage, and presentation.

Relevance: 90.00%

Abstract:

Nowadays companies generate great amounts of data from different sources, and some of them produce more data than they can analyze. Big Data refers to data sets that grow very fast and are collected many times over a short period of time. This work focuses on the importance of correctly managing Big Data in an industrial plant. Through a case study of a company in the pulp and paper sector, the resolution of its problems through appropriate data management is presented. The final chapters discuss the results achieved by the company, showing how the correct choice of data to be monitored and analyzed brought benefits, and recommend best practices for Big Data management.


Relevance: 90.00%

Abstract:

Large-scale wireless ad hoc networks of computers, sensors, PDAs, etc. (i.e. nodes) are revolutionizing connectivity and leading to a paradigm shift from centralized systems to highly distributed and dynamic environments. One example of ad hoc networks are sensor networks, which are usually composed of small units able to sense and transmit to a sink elementary data that are subsequently processed by an external machine. Recent improvements in the memory and computational power of sensors, together with reduced energy consumption, are rapidly changing the potential of such systems, moving the attention towards data-centric sensor networks. A plethora of routing and data management algorithms have been proposed for network path discovery, ranging from broadcasting/flooding-based approaches to those using global positioning systems (GPS). We studied WGrid, a novel decentralized infrastructure that organizes wireless devices in an ad hoc manner, where each node has one or more virtual coordinates through which both message routing and data management occur without relying on flooding/broadcasting operations or on GPS. The resulting ad hoc network does not suffer from the dead-end problem, which occurs in geographic routing when a node is unable to locate a neighbor closer to the destination than itself. WGrid provides multidimensional data management capability, since the nodes' virtual coordinates can act as a distributed database without needing any special implementation or reorganization; any kind of data (both single- and multidimensional) can be distributed, stored and managed. We show how a location service can be easily implemented so that any search is reduced to a simple query, as for any other data type. WGrid has then been extended by adopting a replication methodology; we call the resulting algorithm WRGrid. Just like WGrid, WRGrid acts as a distributed database without needing any special implementation or reorganization, and any kind of data can be distributed, stored and managed. We have evaluated the benefits of replication on data management and found, from experimental results, that it can halve the average number of hops in the network. The direct consequences are a significant improvement in energy consumption and a balancing of the workload among sensors (the number of messages routed by each node). Finally, thanks to the replicas, whose number can be chosen arbitrarily, the resulting sensor network can cope with sensor disconnections and reconnections, due to sensor failures, without data loss. A further extension of WGrid is W*Grid, which strongly improves network recovery from link and/or device failures that may happen due to crashes, battery exhaustion of devices or temporary obstacles. W*Grid guarantees, by construction, at least two disjoint paths between each pair of nodes. This implies that recovery in W*Grid occurs without broadcast transmissions and guarantees robustness while drastically reducing energy consumption. An extensive set of simulations shows the efficiency, robustness and traffic load of the resulting networks under several scenarios of device density and number of coordinates. The performance has been compared to existing algorithms in order to validate the results.
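
The sketch below shows generic greedy forwarding over virtual coordinates; it is our simplification rather than the actual WGrid routing rule, and it makes explicit the dead-end case that geographic greedy routing can hit and that WGrid's coordinate construction is designed to avoid.

# Generic greedy forwarding over virtual coordinates (simplified illustration).
def greedy_route(source, dest, neighbours, distance):
    """neighbours: dict mapping a node to the list of its neighbours;
    distance: function comparing two coordinates/nodes."""
    path, current = [source], source
    while current != dest:
        nxt = min(neighbours[current], key=lambda n: distance(n, dest))
        if distance(nxt, dest) >= distance(current, dest):
            return path, False   # dead end: no neighbour makes progress towards dest
        path.append(nxt)
        current = nxt
    return path, True

# Toy example on a line of three nodes with virtual coordinates 0, 1, 2.
neighbours = {0: [1], 1: [0, 2], 2: [1]}
print(greedy_route(0, 2, neighbours, distance=lambda a, b: abs(a - b)))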

Relevance: 90.00%

Abstract:

The CMS experiment at the LHC collected huge volumes of data during Run-1 and is exploiting the long shutdown period (LS1) to evolve its computing system. Among the possible improvements, there is ample room for optimization in the use of storage at the Tier-2 computing centres, which constitute, within the Worldwide LHC Computing Grid (WLCG), the core of the resources dedicated to distributed analysis on the Grid. This thesis presents a study of the popularity of CMS data in distributed Grid analysis at the Tier-2 sites. The goal of the work is to equip the CMS computing system with a way to systematically evaluate the amount of disk space that is written but never accessed at the Tier-2 centres, contributing to the construction of an advanced dynamic data management system able to adapt elastically to changing operating conditions, removing unnecessary data replicas or adding replicas of the most "popular" data, and thus, ultimately, to increase the overall analysis throughput. Chapter 1 gives an overview of the CMS experiment at the LHC. Chapter 2 describes the CMS Computing Model in general terms, focusing mainly on data management and the infrastructures connected to it. Chapter 3 describes the CMS Popularity Service, providing an overview of the data popularity services already present in CMS before this work began. Chapter 4 describes the architecture of the toolkit developed for this thesis, laying the groundwork for the following chapter. Chapter 5 presents and discusses the data popularity studies carried out on the data collected through the previously developed infrastructure. Appendix A collects two examples of the code written to drive the toolkit through which the data are collected and processed.
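
As a toy illustration of the central metric (not the toolkit developed in the thesis; site and dataset names below are placeholders), the sketch sums, per site, the disk volume of replicas that received fewer accesses than a chosen threshold during the monitoring window.

# Toy "written but not accessed" volume metric for dataset replicas at a site.
def unpopular_volume(replica_sizes_gb, access_counts, min_accesses=1):
    """replica_sizes_gb: {(site, dataset): size in GB};
    access_counts: {(site, dataset): accesses in the window}."""
    per_site = {}
    for (site, dataset), size_gb in replica_sizes_gb.items():
        if access_counts.get((site, dataset), 0) < min_accesses:
            per_site[site] = per_site.get(site, 0.0) + size_gb
    return per_site   # candidate space to reclaim, or to refill with more popular data

print(unpopular_volume(
    {("T2_SITE_A", "/DatasetX/Run1/AOD"): 1200.0,
     ("T2_SITE_A", "/DatasetY/Run1/AOD"): 800.0},
    {("T2_SITE_A", "/DatasetX/Run1/AOD"): 42},
))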

Relevance: 90.00%

Abstract:

SMARTDIAB is a platform designed to support the monitoring, management, and treatment of patients with type 1 diabetes mellitus (T1DM), by combining state-of-the-art approaches in the fields of database (DB) technologies, communications, simulation algorithms, and data mining. SMARTDIAB consists mainly of two units: 1) the patient unit (PU); and 2) the patient management unit (PMU), which communicate with each other for data exchange. The PMU can be accessed by the PU through the internet using devices such as PCs/laptops with direct internet access or mobile phones via a Wi-Fi/General Packet Radio Service access network. The PU consists of an insulin pump for subcutaneous insulin infusion to the patient and a continuous glucose measurement system. These devices, running a user-friendly application, gather patient-related information and transmit it to the PMU. The PMU consists of a diabetes data management system (DDMS), a decision support system (DSS) that provides risk assessment for long-term diabetes complications, and an insulin infusion advisory system (IIAS), which reside on a Web server. The DDMS can be accessed by both medical personnel and patients, with appropriate security access rights and front-end interfaces. The DDMS, apart from being used for data storage/retrieval, also provides advanced tools for the intelligent processing of the patient's data, supporting the physician in decision making regarding the patient's treatment. The IIAS is used to close the loop between the insulin pump and the continuous glucose monitoring system, by providing the pump with the appropriate insulin infusion rate in order to keep the patient's glucose levels within predefined limits. The pilot version of SMARTDIAB has already been implemented, while the platform's evaluation in a clinical environment is in progress.
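
The toy sketch below illustrates only the closed-loop data flow that the IIAS implements, a glucose reading in and a clamped infusion rate out; the proportional rule and all numbers are our assumptions, not the SMARTDIAB algorithm, and are purely illustrative.

# Toy advisory loop: CGM reading in, clamped infusion rate out (illustrative only,
# not the SMARTDIAB IIAS algorithm and not medical advice).
def advisory_rate(glucose_mgdl, target_mgdl=110.0, basal_u_per_h=0.8,
                  gain=0.01, max_u_per_h=3.0):
    error = glucose_mgdl - target_mgdl
    rate = basal_u_per_h + gain * error      # infuse more above target, less below
    return min(max(rate, 0.0), max_u_per_h)  # clamp to the pump's safe range

for reading in (85, 110, 180, 240):          # successive CGM samples in mg/dL
    print(reading, "mg/dL ->", round(advisory_rate(reading), 2), "U/h")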