14 resultados para Structured data
em Cambridge University Engineering Department Publications Database
Resumo:
Compared with structured data sources that are usually stored and analyzed in spreadsheets, relational databases, and single data tables, unstructured construction data sources such as text documents, site images, web pages, and project schedules have been less intensively studied due to additional challenges in data preparation, representation, and analysis. In this paper, our vision for data management and mining addressing such challenges are presented, together with related research results from previous work, as well as our recent developments of data mining on text-based, web-based, image-based, and network-based construction databases.
Resumo:
The Architecture, Engineering, Construction and Facilities Management (AEC/FM) industry is rapidly becoming a multidisciplinary, multinational and multi-billion dollar economy, involving large numbers of actors working concurrently at different locations and using heterogeneous software and hardware technologies. Since the beginning of the last decade, a great deal of effort has been spent within the field of construction IT in order to integrate data and information from most computer tools used to carry out engineering projects. For this purpose, a number of integration models have been developed, like web-centric systems and construction project modeling, a useful approach in representing construction projects and integrating data from various civil engineering applications. In the modern, distributed and dynamic construction environment it is important to retrieve and exchange information from different sources and in different data formats in order to improve the processes supported by these systems. Previous research demonstrated that a major hurdle in AEC/FM data integration in such systems is caused by its variety of data types and that a significant part of the data is stored in semi-structured or unstructured formats. Therefore, new integrative approaches are needed to handle non-structured data types like images and text files. This research is focused on the integration of construction site images. These images are a significant part of the construction documentation with thousands stored in site photographs logs of large scale projects. However, locating and identifying such data needed for the important decision making processes is a very hard and time-consuming task, while so far, there are no automated methods for associating them with other related objects. Therefore, automated methods for the integration of construction images are important for construction information management. During this research, processes for retrieval, classification, and integration of construction images in AEC/FM model based systems have been explored. Specifically, a combination of techniques from the areas of image and video processing, computer vision, information retrieval, statistics and content-based image and video retrieval have been deployed in order to develop a methodology for the retrieval of related construction site image data from components of a project model. This method has been tested on available construction site images from a variety of sources like past and current building construction and transportation projects and is able to automatically classify, store, integrate and retrieve image data files in inter-organizational systems so as to allow their usage in project management related tasks.
Resumo:
Many data are naturally modeled by an unobserved hierarchical structure. In this paper we propose a flexible nonparametric prior over unknown data hierarchies. The approach uses nested stick-breaking processes to allow for trees of unbounded width and depth, where data can live at any node and are infinitely exchangeable. One can view our model as providing infinite mixtures where the components have a dependency structure corresponding to an evolutionary diffusion down a tree. By using a stick-breaking approach, we can apply Markov chain Monte Carlo methods based on slice sampling to perform Bayesian inference and simulate from the posterior distribution on trees. We apply our method to hierarchical clustering of images and topic modeling of text data.
Resumo:
In this paper we present a new, compact derivation of state-space formulae for the so-called discretisation-based solution of the H∞ sampled-data control problem. Our approach is based on the established technique of continuous time-lifting, which is used to isometrically map the continuous-time, linear, periodically time-varying, sampled-data problem to a discretetime, linear, time-invariant problem. State-space formulae are derived for the equivalent, discrete-time problem by solving a set of two-point, boundary-value problems. The formulae accommodate a direct feed-through term from the disturbance inputs to the controlled outputs of the original plant and are simple, requiring the computation of only a single matrix exponential. It is also shown that the resultant formulae can be easily re-structured to give a numerically robust algorithm for computing the state-space matrices. © 1997 Elsevier Science Ltd. All rights reserved.
Resumo:
CAD software can be structured as a set of modular 'software tools' only if there is some agreement on the data structures which are to be passed between tools. Beyond this basic requirement, it is desirable to give the agreed structures the status of 'data types' in the language used for interactive design. The ultimate refinement is to have a data management capability which 'understands' how to manipulate such data types. In this paper the requirements of CACSD are formulated from the point of view of Database Management Systems. Progress towards meeting these requirements in both the DBMS and the CACSD community is reviewed. The conclusion reached is that there has been considerable movement towards the realisation of software tools for CACSD, but that this owes more to modern ideas about programming languages, than to DBMS developments. The DBMS field has identified some useful concepts, but further significant progress is expected to come from the exploitation of concepts such as object-oriented programming, logic programming, or functional programming.
Resumo:
We present a novel, implementation friendly and occlusion aware semi-supervised video segmentation algorithm using tree structured graphical models, which delivers pixel labels alongwith their uncertainty estimates. Our motivation to employ supervision is to tackle a task-specific segmentation problem where the semantic objects are pre-defined by the user. The video model we propose for this problem is based on a tree structured approximation of a patch based undirected mixture model, which includes a novel time-series and a soft label Random Forest classifier participating in a feedback mechanism. We demonstrate the efficacy of our model in cutting out foreground objects and multi-class segmentation problems in lengthy and complex road scene sequences. Our results have wide applicability, including harvesting labelled video data for training discriminative models, shape/pose/articulation learning and large scale statistical analysis to develop priors for video segmentation. © 2011 IEEE.
Resumo:
This article reports a case study application of a systematic approach to modelling complex organisations, centred on simulation modelling (SM). The approach leads to populated instances of complementary model types, in ways that systematically capture, validate and facilitate various uses of organisational understandings, knowledge and data normally distributed amongst multiple knowledge holders. The model-driven approach to decision making enables improved manufacturing responsiveness. Literature on modelling technologies relevant to manufacturing systems organisation design and change is presented, as is literature on production planning and control. This provides a rationale for the development of a new modelling methodology which combines the use of enterprise, causal loop and SM. Subsequently, this article describes how in the case of a specific manufacturing enterprise the combined modelling techniques have informed the choice of alternative production planning and control policies. An example enterprise model of a capacitor manufacturing company is illustrated as derivative causal-loop models that structure and enable the design and use of a general purpose simulation model.
Resumo:
A fundamental problem in the analysis of structured relational data like graphs, networks, databases, and matrices is to extract a summary of the common structure underlying relations between individual entities. Relational data are typically encoded in the form of arrays; invariance to the ordering of rows and columns corresponds to exchangeable arrays. Results in probability theory due to Aldous, Hoover and Kallenberg show that exchangeable arrays can be represented in terms of a random measurable function which constitutes the natural model parameter in a Bayesian model. We obtain a flexible yet simple Bayesian nonparametric model by placing a Gaussian process prior on the parameter function. Efficient inference utilises elliptical slice sampling combined with a random sparse approximation to the Gaussian process. We demonstrate applications of the model to network data and clarify its relation to models in the literature, several of which emerge as special cases.
Resumo:
© Springer International Publishing Switzerland 2015. Making sound asset management decisions, such as whether to replace or maintain an ageing underground water pipe, are critical to ensure that organisations maximise the performance of their assets. These decisions are only as good as the data that supports them, and hence many asset management organisations are in desperate need to improve the quality of their data. This chapter reviews the key academic research on data quality (DQ) and Information Quality (IQ) (used interchangeably in this chapter) in asset management, combines this with the current DQ problems faced by asset management organisations in various business sectors, and presents a classification of the most important DQ problems that need to be tackled by asset management organisations. In this research, eleven semi structured interviews were carried out with asset management professionals in a range of business sectors in the UK. The problems described in the academic literature were cross checked against the problems found in industry. In order to support asset management professionals in solving these problems, we categorised them into seven different DQ dimensions, used in the academic literature, so that it is clear how these problems fit within the standard frameworks for assessing and improving data quality. Asset management professionals can therefore now use these frameworks to underpin their DQ improvement initiatives while focussing on the most critical DQ problems.