916 results for Data Mining and its Application


Relevance:

100.00%

Publisher:

Abstract:

Over the last decade, a new idea challenging the classical self-non-self viewpoint has become popular amongst immunologists: the Danger Theory. In this conceptual paper, we look at this theory from the perspective of Artificial Immune System practitioners. An overview of the Danger Theory is presented, with particular emphasis on analogies in the Artificial Immune Systems world. A number of potential application areas are then used to frame a critical assessment of the concept and its relevance for Artificial Immune Systems. Notes: Uwe Aickelin, Department of Computing, University of Bradford, Bradford, BD7 1DP
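
To make the immunological analogy concrete for AIS practitioners, the toy sketch below shows how a danger signal might gate a classical non-self match; the signal names, thresholds and two-signal rule are illustrative assumptions, not the paper's model.

```python
# Toy illustration of the Danger Theory analogy: a detector that has
# matched "non-self" only reacts when a co-located danger signal
# (e.g. abnormal resource usage) is also present. All names and
# thresholds here are illustrative assumptions, not the paper's model.

def is_non_self(pattern: str, self_set: set[str]) -> bool:
    """Classical negative selection: anything outside the self set."""
    return pattern not in self_set

def danger_signal(cpu_load: float, error_rate: float) -> bool:
    """A stand-in for tissue distress: the system is visibly suffering."""
    return cpu_load > 0.9 or error_rate > 0.05

def respond(pattern, self_set, cpu_load, error_rate):
    # Self-non-self view: react to any non-self pattern.
    classical = is_non_self(pattern, self_set)
    # Danger Theory view: non-self alone is harmless; react only
    # when a danger signal accompanies the match.
    danger_based = classical and danger_signal(cpu_load, error_rate)
    return classical, danger_based

self_set = {"GET /index", "GET /login"}
print(respond("GET /unknown", self_set, cpu_load=0.3, error_rate=0.0))
# -> (True, False): non-self, but no danger, so no response
print(respond("GET /unknown", self_set, cpu_load=0.95, error_rate=0.1))
# -> (True, True): non-self plus danger triggers a response
```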

Relevance:

100.00%

Publisher:

Abstract:

Tsunamis are rare events, but their impact can be devastating and may extend over large geographical areas. For low-probability, high-impact events like tsunamis, it is crucial to implement all possible actions to mitigate the risk. Tsunami hazard assessment is the result of a scientific process that integrates traditional geological methods, numerical modelling, and the analysis of tsunami sources and historical records. Analysing past events and understanding how they interacted with the land is therefore the only way to inform tsunami source and propagation models and to quantitatively test forecast models such as hazard analyses. The primary objective of this thesis is to establish an explicit relationship between the macroscopic intensity, derived from historical descriptions, and the quantitative physical parameters measuring tsunami waves. This is done first by defining an approximate estimation method, based on a simplified 1D physical onshore propagation model, to convert the available observations into one reference physical metric. Wave height at the coast was chosen as the reference because of its stability and its independence from inland effects. The method was then applied to a set of well-known past events to build a homogeneous dataset containing both macroscopic intensity and wave height. By performing an orthogonal regression, a direct and invertible empirical relationship could be established between the two parameters, accounting for their relevant uncertainties. The resulting relationship is extensively tested and finally applied to the Italian Tsunami Effect Database (ITED), providing a homogeneous estimate of wave height for all existing tsunami observations in Italy. This enables meaningful comparison with models and simulations, as well as quantitative testing of tsunami hazard models for the Italian coasts, and informs tsunami risk management initiatives.
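
As a minimal sketch of the orthogonal-regression step, the following uses scipy.odr to fit an invertible linear relation between intensity and log wave height; the synthetic data, functional form and uncertainty values are assumptions for illustration, not results from the thesis.

```python
# Sketch of an orthogonal (errors-in-variables) regression between
# macroscopic intensity I and log wave height, using scipy.odr.
# The synthetic data, linear model and uncertainty values are
# illustrative assumptions, not values from the thesis.
import numpy as np
from scipy import odr

rng = np.random.default_rng(0)
intensity = np.arange(2, 7, dtype=float)            # intensity grades
log_h = 0.4 * intensity - 1.0 + rng.normal(0, 0.1, intensity.size)

def linear(beta, x):
    return beta[0] * x + beta[1]

# RealData lets both variables carry uncertainties, which is what
# distinguishes orthogonal regression from ordinary least squares.
data = odr.RealData(intensity, log_h,
                    sx=0.5 * np.ones_like(intensity),
                    sy=0.15 * np.ones_like(log_h))
fit = odr.ODR(data, odr.Model(linear), beta0=[0.5, -1.0]).run()

a, b = fit.beta
print(f"log10(H) = {a:.2f} * I + {b:.2f}")
# Because the fit treats both axes symmetrically, it can be inverted:
# I = (log10(H) - b) / a, giving intensity from a modelled wave height.
```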

Relevance:

100.00%

Publisher:

Abstract:

Citizens demand more and more data for making decisions in their daily lives. Mechanisms that allow citizens to understand and analyse linked open data (LOD) in a user-friendly manner are therefore highly needed. To this end, the concept of Open Business Intelligence (OpenBI) is introduced in this position paper. OpenBI enables non-expert users to (i) analyse and visualise LOD, generating actionable information by means of reporting, OLAP analysis, dashboards or data mining; and (ii) share the newly acquired information as LOD to be reused by anyone. One of the most challenging issues of OpenBI is data mining, since non-experts (such as citizens) need guidance during the preprocessing and application of mining algorithms, owing to the complexity of the mining process and the low quality of the data sources. This is even worse when dealing with LOD, not only because of the different kinds of links among data, but also because of its high dimensionality. Consequently, in this position paper we advocate that data mining for OpenBI requires data quality-aware mechanisms for guiding non-expert users in obtaining and sharing the most reliable knowledge from the available LOD.
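
As one illustration of the quality-aware guidance advocated here, a small sketch that profiles LOD triples for completeness before a mining step; the metric, threshold and warning text are our own illustrative assumptions, not a published OpenBI interface.

```python
# Sketch of a data-quality gate for non-expert users: profile LOD
# triples for per-predicate completeness before mining. The metric,
# threshold and messages are illustrative assumptions only.
from collections import defaultdict

triples = [  # (subject, predicate, object): toy linked open data
    ("city:A", "pop", 120_000), ("city:A", "crimeRate", 3.1),
    ("city:B", "pop", 80_000),
    ("city:C", "pop", 50_000),  ("city:C", "crimeRate", 1.2),
]

def completeness(triples):
    """Fraction of subjects that carry a value for each predicate."""
    subjects = {s for s, _, _ in triples}
    by_pred = defaultdict(set)
    for s, p, _ in triples:
        by_pred[p].add(s)
    return {p: len(subs) / len(subjects) for p, subs in by_pred.items()}

for pred, score in completeness(triples).items():
    if score < 0.8:   # illustrative threshold for warning the user
        print(f"warning: '{pred}' is only {score:.0%} complete; "
              "mining results on this attribute may be unreliable")
```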

Relevance:

100.00%

Publisher:

Abstract:

This paper develops an interactive approach for exploratory spatial data analysis. Measures of attribute similarity and spatial proximity are combined in a clustering model to support the identification of patterns in spatial information. Relationships between the developed clustering approach, spatial data mining and choropleth display are discussed. An analysis of property crime rates in Brisbane, Australia, is presented. A surprising finding of this research is that there are substantial inconsistencies in the standard choropleth display options found in two widely used commercial geographical information systems, both in terms of definition and performance. The comparative results demonstrate the usefulness and appeal of the developed approach for exploratory spatial data analysis in a geographical information system environment.
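
A compact sketch of the core idea, assuming a 50/50 blend of normalised attribute and spatial distances before hierarchical clustering; the weights and toy data are illustrative, not the paper's calibrated model.

```python
# Sketch: blend attribute similarity and spatial proximity into a
# single dissimilarity, then cluster. The 50/50 weight and toy data
# are illustrative assumptions, not the paper's model.
import numpy as np
from scipy.spatial.distance import pdist
from scipy.cluster.hierarchy import linkage, fcluster

coords = np.array([[0.0, 0.0], [0.1, 0.2], [5.0, 5.1], [5.2, 4.9]])
crime_rate = np.array([[3.2], [3.0], [1.1], [3.1]])

# Normalise each condensed distance vector to [0, 1] so the two
# components are commensurable before mixing.
def unit(d):
    return d / d.max()

d_spatial = unit(pdist(coords))
d_attr = unit(pdist(crime_rate))

alpha = 0.5  # weight on attribute similarity (assumed, tunable)
d_combined = alpha * d_attr + (1 - alpha) * d_spatial

labels = fcluster(linkage(d_combined, method="average"),
                  t=2, criterion="maxclust")
print(labels)  # regions that are both near AND similar cluster together
```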

Relevance:

100.00%

Publisher:

Abstract:

Recently, major processor manufacturers have announced a dramatic shift in their paradigm to increase computing power over the coming years. Instead of focusing on faster clock speeds and more powerful single-core CPUs, the trend clearly goes towards multi-core systems. This will also result in a paradigm shift for the development of algorithms for computationally expensive tasks, such as data mining applications. Obviously, work on parallel algorithms is not new per se, but concentrated efforts in the many application domains are still missing. Multi-core systems, but also clusters of workstations and even large-scale distributed computing infrastructures, provide new opportunities and pose new challenges for the design of parallel and distributed algorithms. Since data mining and machine learning systems rely on high-performance computing, research on the corresponding algorithms must be at the forefront of parallel algorithm research in order to keep pushing data mining and machine learning applications to be more powerful and, especially for the former, interactive. To bring together researchers and practitioners working in this exciting field, a workshop on parallel data mining was organized as part of PKDD/ECML 2006 (Berlin, Germany). The six contributions selected for the program describe various aspects of data mining and machine learning approaches featuring low to high degrees of parallelism. The first contribution addresses the classic problem of distributed association rule mining, focusing on communication efficiency to improve on the state of the art. After this, a parallelization technique for speeding up decision tree construction by means of thread-level parallelism for shared-memory systems is presented. The next paper discusses the design of a parallel approach to the frequent subgraph mining problem for distributed-memory systems; this approach is based on a hierarchical communication topology to solve issues related to multi-domain computational environments. The fourth paper describes the combined use and customization of software packages to facilitate top-down parallelism in the tuning of Support Vector Machines (SVMs), and the next contribution presents an interesting idea concerning the parallel training of Conditional Random Fields (CRFs), motivating their use in labelling sequential data. The last contribution focuses on very efficient feature selection: it describes a parallel algorithm for feature selection from random subsets. Selecting the papers included in this volume would not have been possible without the help of an international Program Committee that provided detailed reviews of each paper. We would also like to thank Matthew Otey, who helped with publicity for the workshop.
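
To make the thread-level parallelism theme concrete, here is a minimal sketch of concurrently scored candidate splits for decision tree construction; the impurity measure, data and pool size are illustrative assumptions, not taken from the workshop papers.

```python
# Toy sketch of thread-level parallelism in decision tree induction:
# candidate split thresholds are scored concurrently in a shared-memory
# setting. The impurity measure and data are illustrative assumptions.
from concurrent.futures import ThreadPoolExecutor
import numpy as np

x = np.random.default_rng(1).normal(size=10_000)
y = (x > 0.3).astype(int)

def gini_of_split(threshold: float) -> tuple[float, float]:
    """Weighted Gini impurity of the two children of a split."""
    left, right = y[x <= threshold], y[x > threshold]
    def gini(part):
        if part.size == 0:
            return 0.0
        p = part.mean()
        return 2 * p * (1 - p)
    w = left.size / y.size
    return w * gini(left) + (1 - w) * gini(right), threshold

candidates = np.linspace(x.min(), x.max(), 256)
with ThreadPoolExecutor(max_workers=8) as pool:  # shared-memory threads
    best = min(pool.map(gini_of_split, candidates))
print(f"best split at x <= {best[1]:.3f} (impurity {best[0]:.4f})")
```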

Relevance:

100.00%

Publisher:

Abstract:

The P-found protein folding and unfolding simulation repository is designed to allow scientists to perform data mining and other analyses across large, distributed simulation data sets. There are two storage components in P-found: a primary repository of simulation data that is used to populate the second component, and a data warehouse that contains important molecular properties. These properties may be used for data mining studies. Here we demonstrate how grid technologies can support multiple, distributed P-found installations. In particular, we look at two aspects: firstly, how grid data management technologies can be used to access the distributed data warehouses; and secondly, how the grid can be used to transfer analysis programs to the primary repositories — this is an important and challenging aspect of P-found, due to the large data volumes involved and the desire of scientists to maintain control of their own data. The grid technologies we are developing with the P-found system will allow new large data sets of protein folding simulations to be accessed and analysed in novel ways, with significant potential for enabling scientific discovery.
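
The "ship the analysis to the data" pattern can be sketched generically; the Repository class below is a hypothetical stand-in for the grid middleware, not the actual P-found interface.

```python
# Generic sketch of the "ship code to data" pattern described above:
# the analysis function travels to the repository holding the large
# simulation files, and only the small result travels back. The
# Repository class is a hypothetical stand-in, not the P-found API.
from statistics import mean

class Repository:
    """Hypothetical local proxy for one P-found installation."""
    def __init__(self, name, trajectories):
        self.name = name
        self._trajectories = trajectories  # large data stays put

    def run(self, analysis):
        # Execute user-supplied analysis next to the data and
        # return only the (small) result.
        return analysis(self._trajectories)

def mean_radius_of_gyration(trajectories):
    return mean(frame["rg"] for traj in trajectories for frame in traj)

sites = [
    Repository("site-A", [[{"rg": 1.2}, {"rg": 1.4}]]),
    Repository("site-B", [[{"rg": 1.1}, {"rg": 1.3}]]),
]
# Only results cross the network; the bulky trajectories do not.
print({s.name: s.run(mean_radius_of_gyration) for s in sites})
```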

Relevance:

100.00%

Publisher:

Abstract:

The organo-clay used in this work was prepared from a Na-montmorillonite (Wyoming, USA deposit) by treatment with an aqueous solution of hexadecyltrimethylammonium cations. As organo-clays exhibit strong sorptive capabilities for organic molecules, 2-mercapto-5-amino-1,3,4-thiadiazole organofunctional groups, with potential usefulness in chemical analysis, were incorporated on its solid surface. The physically adsorbed reagent showed no restriction in coordinating with several metal ions on the surface. The resulting organo-clay complex exhibited a strong sorptive capability for removing mercury ions from water in which other metals and ions were also present. The purpose of this work is to study the selective separation of mercury(II) from aqueous solution using the organo-clay complex, measured by batch and chromatographic column techniques, and its application as a preconcentration agent in a chemically modified carbon paste electrode for the determination of mercury(II) in aqueous solution.

Relevance:

100.00%

Publisher:

Abstract:

The increase in the amount of spatial data collected has motivated the development of geovisualisation techniques, which aim to provide an important resource to support knowledge extraction and decision making. One such technique is the 3D graph, which offers a dynamic and flexible way to enrich the analysis of results produced by spatial data mining algorithms, particularly when several georeferenced objects coincide at the same location. The original contribution of this work is the enhancement of visual resources in a computational spatial data mining environment; the efficiency of these techniques is then demonstrated on a real database. The application proved very useful for interpreting the results obtained, such as patterns occurring at the same locality, and for supporting activities that follow from the visualisation of results. © 2013 Springer-Verlag.
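
A minimal matplotlib sketch of the 3D-graph idea, stacking co-located pattern counts as vertical bars; the grid cells and counts are illustrative assumptions.

```python
# Sketch of the 3D-graph idea: when several mined patterns fall on
# the same location, a vertical bar shows the count that a flat 2D
# map would hide. Coordinates and counts are illustrative assumptions.
import numpy as np
import matplotlib.pyplot as plt

# (x, y) cell of each georeferenced pattern; duplicates = same place
hits = [(1, 1), (1, 1), (1, 1), (2, 3), (4, 2), (4, 2)]
cells, counts = np.unique(hits, axis=0, return_counts=True)

fig = plt.figure()
ax = fig.add_subplot(projection="3d")
ax.bar3d(cells[:, 0], cells[:, 1], np.zeros(len(counts)),
         dx=0.5, dy=0.5, dz=counts)
ax.set_xlabel("x (grid cell)")
ax.set_ylabel("y (grid cell)")
ax.set_zlabel("patterns at location")
plt.show()
```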