957 resultados para DATABASES
Resumo:
Emerging high-dimensional data mining applications needs to find interesting clusters embeded in arbitrarily aligned subspaces of lower dimensionality. It is difficult to cluster high-dimensional data objects, when they are sparse and skewed. Updations are quite common in dynamic databases and they are usually processed in batch mode. In very large dynamic databases, it is necessary to perform incremental cluster analysis only to the updations. We present a incremental clustering algorithm for subspace clustering in very high dimensions, which handles both insertion and deletions of datapoints to the backend databases.
Resumo:
Practical usage of machine learning is gaining strategic importance in enterprises looking for business intelligence. However, most enterprise data is distributed in multiple relational databases with expert-designed schema. Using traditional single-table machine learning techniques over such data not only incur a computational penalty for converting to a flat form (mega-join), even the human-specified semantic information present in the relations is lost. In this paper, we present a practical, two-phase hierarchical meta-classification algorithm for relational databases with a semantic divide and conquer approach. We propose a recursive, prediction aggregation technique over heterogeneous classifiers applied on individual database tables. The proposed algorithm was evaluated on three diverse datasets. namely TPCH, PKDD and UCI benchmarks and showed considerable reduction in classification time without any loss of prediction accuracy. (C) 2012 Elsevier Ltd. All rights reserved.
Resumo:
NrichD
Resumo:
In this paper we derive an approach for the effective utilization of thermodynamic data in phase-field simulations. While the most widely used methodology for multi-component alloys is following the work by Eiken et al. (2006), wherein, an extrapolative scheme is utilized in conjunction with the TQ interface for deriving the driving force for phase transformation, a corresponding simplistic method based on the formulation of a parabolic free-energy model incorporating all the thermodynamics has been laid out for binary alloys in the work by Folch and Plapp (2005). In the following, we extend this latter approach for multi-component alloys in the framework of the grand-potential formalism. The coupling is applied for the case of the binary eutectic solidification in the Cr-Ni alloy and two-phase solidification in the ternary eutectic alloy (Al-Cr-Ni). A thermodynamic justification entails the basis of the formulation and places it in context of the bigger picture of Integrated Computational Materials Engineering. (C) 2015 Elsevier Ltd. All rights reserved.
Resumo:
Reliable turbulent channel flow databases at several Reynolds numbers have been established by large eddy simulation (LES), with two of them validated by comparing with typical direct numerical simulation (DNS) results. Furthermore, the statistics, such as velocity profile, turbulent intensities and shear stress, were obtained as well as the temporal and spatial structure of turbulent bursts. Based on the LES databases available, the conditional sampling methods are used to detect the structures of burst events. A method to deterimine the grouping parameter from the probability distribution function (pdf) curve of the time separation between ejection events is proposed to avoid the errors in detected results. And thus, the dependence of average burst period on thresholds is considerably weakened. Meanwhile, the average burst-to-bed area ratios are detected. It is found that the Reynolds number exhibits little effect on the burst period and burst-to-bed area ratio.
Resumo:
Tesis leida en la Universidad de Aberdeen. 178 p.
Resumo:
The US Fish and Wildlife Service Cape Romain National Wildlife Refuge (CRNWR) and the Center for Coastal Environmental Health and Biomolecular Research (CCEHBR) at Charleston are interested in assessing the status of our coastal resources in light of increased coastal development and recreational use. Through an Interagency Agreement (FWS #1448-40181-00-H-001), an ecological characterization was undertaken to describe the status of and potential impacts to resources at CRNWR. This report describes historic fisheries-independent or non-commercial data relevant to CRNWR that can be used to evaluate the role of the Refuge as habitat for nearshore and offshore fish species. The purpose of this document is two-fold, first to give resource managers an understanding of fisheries data that have been collected over the years and, second, to illustrate how these data can be applied to address specific management issues. This report provides an overview of historic fisheries data collected along the southeast coast, as well as basic summaries of that data relevant to CRNWR, indicating how these data can be used to address specific questions of interest to Refuge managers and biologists.
Resumo:
This paper reports the availability of a database of protein structural domains (DDBASE), an alignment database of homologous proteins (HOMSTRAD) and a database of structurally aligned superfamilies (CAMPASS) on the World Wide Web (WWW). DDBASE contains information on the organization of structural domains and their boundaries; it includes only one representative domain from each of the homologous families. This database has been derived by identifying the presence of structural domains in proteins on the basis of inter-secondary structural distances using the program DIAL [Sowdhamini & Blundell (1995), Protein Sci. 4, 506-520]. The alignment of proteins in superfamilies has been performed on the basis of the structural features and relationships of individual residues using the program COMPARER [Sali & Blundell (1990), J. Mol. Biol. 212, 403-428]. The alignment databases contain information on the conserved structural features in homologous proteins and those belonging to superfamilies. Available data include the sequence alignments in structure-annotated formats and the provision for viewing superposed structures of proteins using a graphical interface. Such information, which is freely accessible on the WWW, should be of value to crystallographers in the comparison of newly determined protein structures with previously identified protein domains or existing families.
Resumo:
Images represent a valuable source of information for the construction industry. Due to technological advancements in digital imaging, the increasing use of digital cameras is leading to an ever-increasing volume of images being stored in construction image databases and thus makes it hard for engineers to retrieve useful information from them. Content-Based Search Engines are tools that utilize the rich image content and apply pattern recognition methods in order to retrieve similar images. In this paper, we illustrate several project management tasks and show how Content-Based Search Engines can facilitate automatic retrieval, and indexing of construction images in image databases.
Resumo:
In the modern and dynamic construction environment it is important to access information in a fast and efficient manner in order to improve the decision making processes for construction managers. This capability is, in most cases, straightforward with today’s technologies for data types with an inherent structure that resides primarily on established database structures like estimating and scheduling software. However, previous research has demonstrated that a significant percentage of construction data is stored in semi-structured or unstructured data formats (text, images, etc.) and that manually locating and identifying such data is a very hard and time-consuming task. This paper focuses on construction site image data and presents a novel image retrieval model that interfaces with established construction data management structures. This model is designed to retrieve images from related objects in project models or construction databases using location, date, and material information (extracted from the image content with pattern recognition techniques).
Resumo:
One of the most important kinds of queries in Spatial Network Databases (SNDB) to support location-based services (LBS) is the shortest path query. Given an object in a network, e.g. a location of a car on a road network, and a set of objects of interests, e.g. hotels,gas station, and car, the shortest path query returns the shortest path from the query object to interested objects. The studies of shortest path query have two kinds of ways, online processing and preprocessing. The studies of preprocessing suppose that the interest objects are static. This paper proposes a shortest path algorithm with a set of index structures to support the situation of moving objects. This algorithm can transform a dynamic problem to a static problem. In this paper we focus on road networks. However, our algorithms do not use any domain specific information, and therefore can be applied to any network. This algorithm’s complexity is O(klog2 i), and traditional Dijkstra’s complexity is O((i + k)2).
Resumo:
中国计算机学会