916 resultados para Data Mining and its Application
Resumo:
This paper describes a proposed new approach to the Computer Network Security Intrusion Detection Systems (NIDS) application domain knowledge processing focused on a topic map technology-enabled representation of features of the threat pattern space as well as the knowledge of situated efficacy of alternative candidate algorithms for pattern recognition within the NIDS domain. Thus an integrative knowledge representation framework for virtualisation, data intelligence and learning loop architecting in the NIDS domain is described together with specific aspects of its deployment.
Resumo:
Knowledge-elicitation is a common technique used to produce rules about the operation of a plant from the knowledge that is available from human expertise. Similarly, data-mining is becoming a popular technique to extract rules from the data available from the operation of a plant. In the work reported here knowledge was required to enable the supervisory control of an aluminium hot strip mill by the determination of mill set-points. A method was developed to fuse knowledge-elicitation and data-mining to incorporate the best aspects of each technique, whilst avoiding known problems. Utilisation of the knowledge was through an expert system, which determined schedules of set-points and provided information to human operators. The results show that the method proposed in this paper was effective in producing rules for the on-line control of a complex industrial process. (C) 2005 Elsevier Ltd. All rights reserved.
Resumo:
Svalgaard and Cliver (2010) recently reported a consensus between the various reconstructions of the heliospheric field over recent centuries. This is a significant development because, individually, each has uncertainties introduced by instrument calibration drifts, limited numbers of observatories, and the strength of the correlations employed. However, taken collectively, a consistent picture is emerging. We here show that this consensus extends to more data sets and methods than reported by Svalgaard and Cliver, including that used by Lockwood et al. (1999), when their algorithm is used to predict the heliospheric field rather than the open solar flux. One area where there is still some debate relates to the existence and meaning of a floor value to the heliospheric field. From cosmogenic isotope abundances, Steinhilber et al. (2010) have recently deduced that the near-Earth IMF at the end of the Maunder minimum was 1.80 ± 0.59 nT which is considerably lower than the revised floor of 4nT proposed by Svalgaard and Cliver. We here combine cosmogenic and geomagnetic reconstructions and modern observations (with allowance for the effect of solar wind speed and structure on the near-Earth data) to derive an estimate for the open solar flux of (0.48 ± 0.29) × 1014 Wb at the end of the Maunder minimum. By way of comparison, the largest and smallest annual means recorded by instruments in space between 1965 and 2010 are 5.75 × 1014 Wb and 1.37 × 1014 Wb, respectively, set in 1982 and 2009, and the maximum of the 11 year running means was 4.38 × 1014 Wb in 1986. Hence the average open solar flux during the Maunder minimum is found to have been 11% of its peak value during the recent grand solar maximum.
Resumo:
The phylogenetics of Sternbergia (Amaryllidaceae) were studied using DNA sequences of the plastid ndhF and matK genes and nuclear internal transcribed spacer (ITS) ribosomal region for 38, 37 and 32 ingroup and outgroup accessions, respectively. All members of Sternbergia were represented by at least one accession, except S. minoica and S. schubertii, with additional taxa from Narcissus and Pancratium serving as principal outgroups. Sternbergia was resolved and supported as sister to Narcissus and composed of two primary subclades: S. colchiciflora sister to S. vernalis, S. candida and S. clusiana, with this clade in turn sister to S. lutea and its allies in both Bayesian and bootstrap analyses. A clear relationship between the two vernal flowering members of the genus was recovered, supporting the hypothesis of a single origin of vernal flowering in Sternbergia. However, in the S. lutea complex, the DNA markers examined did not offer sufficient resolving power to separate taxa, providing some support for the idea that S. sicula and S. greuteriana are conspecific with S. lutea
Resumo:
OBJECTIVES: The prediction of protein structure and the precise understanding of protein folding and unfolding processes remains one of the greatest challenges in structural biology and bioinformatics. Computer simulations based on molecular dynamics (MD) are at the forefront of the effort to gain a deeper understanding of these complex processes. Currently, these MD simulations are usually on the order of tens of nanoseconds, generate a large amount of conformational data and are computationally expensive. More and more groups run such simulations and generate a myriad of data, which raises new challenges in managing and analyzing these data. Because the vast range of proteins researchers want to study and simulate, the computational effort needed to generate data, the large data volumes involved, and the different types of analyses scientists need to perform, it is desirable to provide a public repository allowing researchers to pool and share protein unfolding data. METHODS: To adequately organize, manage, and analyze the data generated by unfolding simulation studies, we designed a data warehouse system that is embedded in a grid environment to facilitate the seamless sharing of available computer resources and thus enable many groups to share complex molecular dynamics simulations on a more regular basis. RESULTS: To gain insight into the conformational fluctuations and stability of the monomeric forms of the amyloidogenic protein transthyretin (TTR), molecular dynamics unfolding simulations of the monomer of human TTR have been conducted. Trajectory data and meta-data of the wild-type (WT) protein and the highly amyloidogenic variant L55P-TTR represent the test case for the data warehouse. CONCLUSIONS: Web and grid services, especially pre-defined data mining services that can run on or 'near' the data repository of the data warehouse, are likely to play a pivotal role in the analysis of molecular dynamics unfolding data.
Resumo:
Pocket Data Mining (PDM) is our new term describing collaborative mining of streaming data in mobile and distributed computing environments. With sheer amounts of data streams are now available for subscription on our smart mobile phones, the potential of using this data for decision making using data stream mining techniques has now been achievable owing to the increasing power of these handheld devices. Wireless communication among these devices using Bluetooth and WiFi technologies has opened the door wide for collaborative mining among the mobile devices within the same range that are running data mining techniques targeting the same application. This paper proposes a new architecture that we have prototyped for realizing the significant applications in this area. We have proposed using mobile software agents in this application for several reasons. Most importantly the autonomic intelligent behaviour of the agent technology has been the driving force for using it in this application. Other efficiency reasons are discussed in details in this paper. Experimental results showing the feasibility of the proposed architecture are presented and discussed.
Resumo:
Pocket Data Mining (PDM) describes the full process of analysing data streams in mobile ad hoc distributed environments. Advances in mobile devices like smart phones and tablet computers have made it possible for a wide range of applications to run in such an environment. In this paper, we propose the adoption of data stream classification techniques for PDM. Evident by a thorough experimental study, it has been proved that running heterogeneous/different, or homogeneous/similar data stream classification techniques over vertically partitioned data (data partitioned according to the feature space) results in comparable performance to batch and centralised learning techniques.
Resumo:
This article clarifies what was done with the sub-7-man positions in data-mining Harold van der Heijden's 'HHdbIV' database of chess studies prior to its publication. It emphasises that only positions in the main lines of studies were examined and that the information about uniqueness of move was not incorporated in HHdbIV. There is some reflection on the separate technical and artistic dimensions of study evaluation.
Resumo:
The interpretation of Neotropical fossil phytolith assemblages for palaeoenvironmental and archaeological reconstructions relies on the development of appropriate modern analogues. We analyzed modern phytolith assemblages from the soils of ten distinctive tropical vegetation communities in eastern lowland Bolivia, ranging from terra firme humid evergreen forest to seasonally-inundated savannah. Results show that broad ecosystems – evergreen tropical forest, semi-deciduous dry tropical forest, and savannah – can be clearly differentiated by examination of their phytolith spectra and the application of Principal Component Analysis (PCA). Differences in phytolith assemblages between particular vegetation communities within each of these ecosystems are more subtle, but can still be identified. Comparison of phytolith assemblages with pollen rain data and stable carbon isotope analyses from the same vegetation plots show that these proxies are not only complementary, but significantly improve taxonomic and ecosystem resolution, and therefore our ability to interpret palaeoenvironmental and archaeological records. Our data underline the utility of phytolith analyses for reconstructing Amazon Holocene vegetation histories and pre-Columbian land use, particularly the high spatial resolution possible with terrestrial soil-based phytolith studies.
Resumo:
The Water and Global Change (WATCH) project evaluation of the terrestrial water cycle involves using land surface models and general hydrological models to assess hydrologically important variables including evaporation, soil moisture, and runoff. Such models require meteorological forcing data, and this paper describes the creation of the WATCH Forcing Data for 1958–2001 based on the 40-yr ECMWF Re-Analysis (ERA-40) and for 1901–57 based on reordered reanalysis data. It also discusses and analyses modelindependent estimates of reference crop evaporation. Global average annual cumulative reference crop evaporation was selected as a widely adopted measure of potential evapotranspiration. It exhibits no significant trend from 1979 to 2001 although there are significant long-term increases in global average vapor pressure deficit and concurrent significant decreases in global average net radiation and wind speed. The near-constant global average of annual reference crop evaporation in the late twentieth century masks significant decreases in some regions (e.g., the Murray–Darling basin) with significant increases in others.
Resumo:
In this article, we review the state-of-the-art techniques in mining data streams for mobile and ubiquitous environments. We start the review with a concise background of data stream processing, presenting the building blocks for mining data streams. In a wide range of applications, data streams are required to be processed on small ubiquitous devices like smartphones and sensor devices. Mobile and ubiquitous data mining target these applications with tailored techniques and approaches addressing scarcity of resources and mobility issues. Two categories can be identified for mobile and ubiquitous mining of streaming data: single-node and distributed. This survey will cover both categories. Mining mobile and ubiquitous data require algorithms with the ability to monitor and adapt the working conditions to the available computational resources. We identify the key characteristics of these algorithms and present illustrative applications. Distributed data stream mining in the mobile environment is then discussed, presenting the Pocket Data Mining framework. Mobility of users stimulates the adoption of context-awareness in this area of research. Context-awareness and collaboration are discussed in the Collaborative Data Stream Mining, where agents share knowledge to learn adaptive accurate models.