168 results for data movement problem

in Queensland University of Technology - ePrints Archive


Relevance:

90.00%

Publisher:

Abstract:

In this paper, we present WebPut, a prototype system that adopts a novel web-based approach to the data imputation problem. Towards this, WebPut utilizes the available information in an incomplete database in conjunction with the data consistency principle. Moreover, WebPut extends effective Information Extraction (IE) methods to formulate web search queries that are capable of retrieving missing values with high accuracy. WebPut employs a confidence-based scheme that efficiently leverages our suite of data imputation queries to automatically select the most effective imputation query for each missing value. A greedy iterative algorithm is also proposed to schedule the imputation order of the different missing values in a database, and in turn the issuing of their corresponding imputation queries, to improve the accuracy and efficiency of WebPut. Experiments based on several real-world data collections demonstrate that WebPut outperforms existing approaches.

Relevance:

90.00%

Publisher:

Abstract:

In this paper, we present WebPut, a prototype system that adopts a novel web-based approach to the data imputation problem. Towards this, WebPut utilizes the available information in an incomplete database in conjunction with the data consistency principle. Moreover, WebPut extends effective Information Extraction (IE) methods to formulate web search queries that are capable of retrieving missing values with high accuracy. WebPut employs a confidence-based scheme that efficiently leverages our suite of data imputation queries to automatically select the most effective imputation query for each missing value. A greedy iterative algorithm is proposed to schedule the imputation order of the different missing values in a database, and in turn the issuing of their corresponding imputation queries, to improve the accuracy and efficiency of WebPut. Several optimization techniques are also proposed to reduce the cost of estimating the confidence of imputation queries at both the tuple level and the database level. Experiments based on several real-world data collections demonstrate not only the effectiveness of WebPut compared to existing approaches, but also the efficiency of our proposed algorithms and optimization techniques.
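The greedy, confidence-driven scheduling loop described in these two abstracts can be sketched as follows. This is a minimal illustration under assumed names — `MissingValue`, `ImputationQuery`, `execute_web_query` and the confidence scores are all hypothetical; the actual query formulation and confidence estimation are detailed in the paper, not here:

```python
from dataclasses import dataclass

@dataclass
class MissingValue:
    tuple_id: int
    attribute: str

@dataclass
class ImputationQuery:
    target: MissingValue
    query_string: str
    confidence: float  # estimated probability the retrieved value is correct

def execute_web_query(query_string: str) -> str:
    """Placeholder for WebPut's web search + information extraction step."""
    return "extracted-value"

def greedy_impute(pending: list, candidate_queries) -> dict:
    """Repeatedly impute the missing value whose best query has the highest
    confidence; values filled earlier can raise the confidence of later
    queries (the data consistency principle)."""
    filled = {}
    while pending:
        # Over all still-missing values, pick the single highest-confidence query.
        best = max(
            (q for mv in pending for q in candidate_queries(mv, filled)),
            key=lambda q: q.confidence,
            default=None,
        )
        if best is None:
            break  # no usable query left for the remaining values
        value = execute_web_query(best.query_string)
        filled[(best.target.tuple_id, best.target.attribute)] = value
        pending.remove(best.target)
    return filled
```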

Relevance:

90.00%

Publisher:

Abstract:

Increasingly large-scale applications are generating an unprecedented amount of data. However, the widening gap between computation and I/O capacity on High End Computing (HEC) machines creates a severe bottleneck for data analysis. Instead of moving data from its source to the output storage, in-situ analytics processes output data while simulations are running. However, in-situ data analysis incurs significant contention for computing resources with the simulations, and such contention severely damages simulation performance on HEC machines. Since different data processing strategies have different impacts on performance and cost, there is a consequent need for flexibility in the location of data analytics. In this paper, we explore and analyze several potential data-analytics placement strategies along the I/O path. To determine the best strategy for reducing data movement in a given situation, we propose a flexible data analytics (FlexAnalytics) framework. Based on this framework, a FlexAnalytics prototype system is developed for analytics placement. The FlexAnalytics system enhances the scalability and flexibility of the current I/O stack on HEC platforms and is useful for data pre-processing, runtime data analysis and visualization, as well as for large-scale data transfer. Two use cases – scientific data compression and remote visualization – have been applied in the study to verify the performance of FlexAnalytics. Experimental results demonstrate that the FlexAnalytics framework increases data transmission bandwidth and improves application end-to-end transfer performance.
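The placement trade-off the abstract describes can be made concrete with a toy cost model. The strategy names and cost terms below are illustrative assumptions, not the actual FlexAnalytics decision procedure:

```python
def placement_cost(data_bytes, reduction, analysis_flops,
                   compute_rate, net_bandwidth, strategy):
    """Estimated time (s) charged to the simulation's critical path."""
    if strategy == "in-situ":
        # Analyse on the compute nodes (contending with the simulation),
        # then ship only the reduced output.
        return analysis_flops / compute_rate + (data_bytes * reduction) / net_bandwidth
    if strategy == "in-transit":
        # Ship the raw output to staging nodes and analyse it there.
        return data_bytes / net_bandwidth + analysis_flops / compute_rate
    if strategy == "offline":
        # Ship the raw output to storage; analysis happens off the critical path.
        return data_bytes / net_bandwidth
    raise ValueError(strategy)

best = min(
    ["in-situ", "in-transit", "offline"],
    key=lambda s: placement_cost(
        data_bytes=1e12,      # 1 TB of simulation output
        reduction=0.05,       # analytics keeps 5% of the data
        analysis_flops=1e13,
        compute_rate=1e12,    # FLOP/s available to analytics
        net_bandwidth=1e10,   # 10 GB/s along the I/O path
        strategy=s),
)
print(best)  # 'in-situ' under these assumptions
```

Under such a model, in-situ analysis wins when the data reduction is large relative to the compute it consumes, which is precisely why a fixed placement is suboptimal and flexibility pays off.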

Relevance:

80.00%

Publisher:

Abstract:

Active Grids are a form of grid infrastructure in which the grid network is active and programmable. These grids directly support applications with value-added services such as data migration, compression, adaptation and monitoring. Services such as these are particularly important for eResearch applications, which by their very nature are performance critical and data intensive. We propose an architecture for improving the flexibility of Active Grids through web services, which enable Active Grid services to be easily and flexibly configured, monitored and deployed from practically any platform or application. The architecture is called WeSPNI ('Web Services based on Programmable Networks Infrastructure'). We present the architecture together with some early experimental results on using web services to monitor data movement in an active grid.
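Purely as an illustration of the monitoring idea (the actual WeSPNI interface is not described in this abstract, and likely used contemporary web-service standards rather than plain HTTP/JSON), a grid node exposing its data-movement counters so that any platform can poll them might look like this; all names and the payload are invented:

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical counters updated by the node's data-migration service.
transfer_stats = {"bytes_moved": 0, "active_transfers": 0}

class MonitorHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Report current data-movement statistics to any HTTP client.
        if self.path == "/stats":
            body = json.dumps(transfer_stats).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

if __name__ == "__main__":
    HTTPServer(("localhost", 8080), MonitorHandler).serve_forever()
```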

Relevance:

80.00%

Publisher:

Abstract:

This research addresses problems in the field of asset management relating to risk analysis and decision making based on data from a Supervisory Control and Data Acquisition (SCADA) system. Determining risk likelihood in risk analysis is difficult, especially when historical information is unreliable. This relates to a problem in SCADA data analysis arising from nested data. A further problem is providing beneficial information from a SCADA system to a managerial-level information system (e.g. Enterprise Resource Planning, ERP). A Hierarchical Model is developed to address these problems. The model is composed of three analyses: Hierarchical Analysis, Failure Mode and Effect Analysis, and Interdependence Analysis. The significant contributions of the model include: (a) a new risk analysis model, namely an Interdependence Risk Analysis Model, which does not rely on the existence of historical information because it utilises interdependence relationships to determine risk likelihood; (b) improvement of SCADA data analysis by addressing the nested data problem through the Hierarchical Analysis; and (c) a framework for providing beneficial information from SCADA systems to ERP systems. A case study of a water treatment plant is utilised for model validation.
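The thesis' Interdependence Risk Analysis Model is not specified in this abstract. Purely as an illustration of the idea — deriving risk likelihood from interdependence relationships rather than from unreliable failure history — a toy propagation might look like the following, where the components, weights and combination rule are all invented for the example:

```python
# Toy sketch: estimate a component's risk likelihood from the likelihoods
# of the components it depends on, instead of from historical failures.

dependencies = {
    # component: [(upstream_component, influence_weight), ...]
    "pump_station": [("power_supply", 0.6), ("control_loop", 0.4)],
    "power_supply": [],
    "control_loop": [("scada_link", 1.0)],
    "scada_link":   [],
}

# Leaf components with directly assessed likelihoods (hypothetical values).
base_likelihood = {"power_supply": 0.02, "scada_link": 0.05}

def likelihood(component: str) -> float:
    """Probability-like score propagated through the interdependencies."""
    ups = dependencies.get(component, [])
    if not ups:
        return base_likelihood.get(component, 0.0)
    # A component is at risk if any upstream dependency is at risk,
    # scaled by the strength of that interdependence.
    p_ok = 1.0
    for upstream, weight in ups:
        p_ok *= 1.0 - weight * likelihood(upstream)
    return 1.0 - p_ok

print(round(likelihood("pump_station"), 4))  # 0.0318 with these inputs
```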

Relevance:

80.00%

Publisher:

Abstract:

As part of vital infrastructure and transportation networks, bridge structures must function safely at all times. Bridges are designed to have a long life span, yet at any point in time some bridges are ageing. Given the rapidly growing demand for heavy and fast inter-city passage and the continuous increase in freight transportation, the ageing of bridge structures requires diligence from bridge owners to ensure that the infrastructure remains healthy at reasonable cost. In recent decades, a new technique, structural health monitoring (SHM), has emerged to meet this challenge. In this new engineering discipline, structural modal identification and damage detection form a vital component. As witnessed by an increasing number of publications, changes in vibration characteristics have been widely and deeply investigated as a means of assessing structural damage. Although a number of publications have addressed the feasibility of various methods through experimental verification, few have focused on steel truss bridges. Finding a feasible vibration-based damage indicator for steel truss bridges, and solving the difficulties in practical modal identification that support damage detection, motivated this research project. This research derives an innovative method for assessing structural damage in steel truss bridges. First, it proposes a new damage indicator that relies on optimising the correlation between theoretical and measured modal strain energy. The optimisation is powered by a newly proposed multilayer genetic algorithm. In addition, a selection criterion for damage-sensitive modes has been studied to achieve more efficient and accurate damage detection results. Second, in order to support the proposed damage indicator, the research studied the application of two state-of-the-art modal identification techniques under practical difficulties: limited instrumentation, the influence of environmental noise, the difficulties of finite element model updating, and the data selection problem in output-only modal identification methods. Numerical (a planar truss model) and experimental (a laboratory through-truss bridge) verifications have proved the effectiveness and feasibility of the proposed damage detection scheme. The modal strain energy-based indicator was found to be sensitive to damage in steel truss bridges with incomplete measurement, demonstrating its potential in practical applications on steel truss bridges. Lastly, the achievements and limitations of this study, and the lessons learnt from the modal analysis, are summarised.
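The exact indicator is developed in the thesis; as a hedged sketch of the general idea only, a MAC-style correlation between theoretical and measured element-wise modal strain energy (the quantity a genetic algorithm would then optimise over candidate damage parameters) can be written as:

```python
import numpy as np

def modal_strain_energy(phi: np.ndarray, k_elems: list) -> np.ndarray:
    """Element-wise modal strain energy phi^T K_e phi for one mode shape."""
    return np.array([phi @ k @ phi for k in k_elems])

def mse_correlation(mse_theory: np.ndarray, mse_measured: np.ndarray) -> float:
    """MAC-like correlation in [0, 1]; 1 means perfect agreement between
    the theoretical (model-predicted) and measured strain-energy patterns."""
    num = (mse_theory @ mse_measured) ** 2
    den = (mse_theory @ mse_theory) * (mse_measured @ mse_measured)
    return float(num / den)
```

The multilayer genetic algorithm and the damage-sensitive mode selection criterion described in the abstract are not reproduced here.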

Relevance:

80.00%

Publisher:

Abstract:

In this paper, we propose an approach to the problem of surveillance event detection, assuming that the definition of the events is known. To facilitate the discussion, we first define two concepts: the event of interest refers to the event that the user requests the system to detect, and the background activities are any other events in the video corpus. This is an unsolved problem due to many factors, as listed below: 1) Occlusions and clustering: surveillance scenes of significant interest, at locations such as airports, railway stations and shopping centers, are often crowded, and occlusions and clustering of people are frequently encountered. This significantly affects the feature extraction step; for instance, trajectories generated by object tracking algorithms are usually not robust in such situations. 2) The requirement for real-time detection: the system should process the video fast enough in both the feature extraction and detection steps to facilitate real-time operation. 3) The massive size of the training data set: suppose there is an event that lasts for 1 minute in a video with a frame rate of 25 fps; the number of frames for this event is 60 × 25 = 1500. If we want a training data set with many positive instances of the event, the video is likely to be very large (i.e. hundreds of thousands of frames or more). How to handle such a large data set is a problem frequently encountered in this application. 4) The difficulty of separating the event of interest from background activities: the events of interest often co-exist with a set of background activities. Temporal groundtruth is typically very ambiguous, as it does not distinguish the event of interest from the wide range of co-existing background activities, yet it is not practical to annotate the locations of the events in large amounts of video data. This problem becomes more serious in the detection of multi-agent interactions, since the location of these events often cannot be constrained to a bounding box. 5) Challenges in determining the temporal boundaries of the events: an event can occur at any time with an arbitrary duration. The temporal segmentation of events is difficult and ambiguous, and is also affected by other factors such as occlusions.

Relevance:

80.00%

Publisher:

Abstract:

Recent road safety statistics show that the decades-long decreasing trend in fatalities is stagnating. Statistics further show that crashes are mostly driven by human error, compared to other factors such as environmental conditions and mechanical defects. Within human error, the dominant source is perceptive error, which represents about 50% of the total; the next two sources, interpretation and evaluation, together with perception account for more than 75% of human-error-related crashes. These statistics show that allowing drivers to perceive and understand their environment better, or supplementing them when they are clearly at fault, is a path to a good assessment of road risk and, as a consequence, to further decreasing fatalities. To address this problem, currently deployed driving assistance systems combine more and more information from diverse sources (sensors) to enhance the driver's perception of the environment. However, because of inherent limitations in range and field of view, these systems' perception remains largely limited to a small interest zone around a single vehicle. Such limitations can be overcome by enlarging the interest zone through a cooperative process. Cooperative Systems (CS), a specific subset of Intelligent Transportation Systems (ITS), aim at compensating for local systems' limitations by combining embedded information technology with intervehicular communication technology (IVC). With CS, information sources are no longer limited to a single vehicle. From this distribution arises the concept of extended, or augmented, perception. Augmented perception extends an actor's perceptive horizon beyond its "natural" limits by fusing information not only from multiple in-vehicle sensors but also from remote sensors. The end result of an augmented perception and data fusion chain is an augmented map: a repository where any relevant information about objects in the environment, and the environment itself, can be stored in a layered architecture. This thesis aims at demonstrating that augmented perception performs better than non-cooperative approaches and that it can be used to successfully identify road risk. We found it necessary to evaluate the performance of augmented perception in order to better understand its limitations. Indeed, while many promising results have already been obtained, the feasibility of building an augmented map from exchanged local perception information, and then using this information beneficially for road users, has not yet been thoroughly assessed, nor have the limitations of augmented perception and its underlying technologies. Most notably, many questions remain unanswered as to IVC performance and its ability to deliver the quality of service needed to support life-saving critical systems. This is especially true as the road environment is a complex, highly variable setting where many sources of imperfection and error exist, not limited to IVC. We first provide a discussion of these limitations and a performance model built to incorporate them, created from empirical data collected on test tracks. Our results are more pessimistic than the existing literature, suggesting that IVC limitations have been underestimated. Then, we develop a new CS-applications simulation architecture. This architecture is used to obtain new results on the safety benefits of a cooperative safety application (EEBL), and then to support further study of augmented perception. We first confirm earlier results in terms of a decrease in the number of crashes, but raise doubts about benefits in terms of crash severity. Next, we implement an augmented perception architecture tasked with creating an augmented map. Our approach aims at providing a generalist architecture that can use many different types of sensors to create the map and is not limited to any specific application. The data association problem is tackled with a multiple-hypothesis tracking (MHT) approach based on belief theory. Then, augmented and single-vehicle perception are compared in a reference driving scenario for risk assessment, taking into account the IVC limitations obtained earlier; we show their impact on the augmented map's performance. Our results show that augmented perception performs better than non-cooperative approaches, almost tripling the advance warning time before a crash. IVC limitations appear to have no significant effect on this performance, although this may be valid only for our specific scenario. Finally, we propose a new approach that uses augmented perception to identify road risk through a surrogate: near-miss events. A CS-based approach is designed and validated to detect near-miss events, and then compared to a non-cooperative approach based on vehicles equipped with local sensors only. The cooperative approach shows a significant improvement in the number of events that can be detected, especially at higher rates of system deployment.
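As a rough illustration of the layered augmented map described above (layer names and the fusion rule are invented; the thesis handles data association with an MHT approach based on belief theory, which this sketch replaces with a naive freshest-report-wins rule):

```python
from dataclasses import dataclass

@dataclass
class Report:
    obj_id: str
    layer: str        # e.g. "static", "dynamic", "hazard" (illustrative)
    position: tuple   # (x, y) in a common reference frame
    timestamp: float
    source: str       # "local" or a remote vehicle id delivered over IVC

class AugmentedMap:
    """Layered repository fusing local and remote perception reports."""

    def __init__(self):
        self.layers = {}  # layer name -> {obj_id: Report}

    def insert(self, r: Report) -> None:
        layer = self.layers.setdefault(r.layer, {})
        cur = layer.get(r.obj_id)
        if cur is None or r.timestamp > cur.timestamp:
            layer[r.obj_id] = r  # naive fusion: the newest report wins

    def objects(self, layer: str) -> list:
        return list(self.layers.get(layer, {}).values())
```

The point of the layered structure is that remote reports extend the perceptive horizon: a hazard inserted by another vehicle becomes visible to the ego vehicle well before its own sensors could detect it.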

Relevance:

80.00%

Publisher:

Abstract:

This paper addresses the problem of joint identification of infinite-frequency added mass and fluid-memory models of marine structures from finite-frequency data. This problem is relevant for cases where the code used to compute the hydrodynamic coefficients of the marine structure does not give the infinite-frequency added mass. This case is typical of codes based on 2D potential theory, since most 3D potential-theory codes solve the boundary-value problem associated with the infinite frequency. The method proposed in this paper presents a simpler alternative to other methods previously presented in the literature. Its advantage is that the same identification procedure can be used to identify the fluid-memory models with or without access to the infinite-frequency added mass coefficient; it therefore provides an extension that puts the two identification problems into the same framework. The method also exploits constraints related to the relative degree and low-frequency asymptotic values of the hydrodynamic coefficients derived from the physics of the problem, which are used as prior information to refine the obtained models.
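For background (these are the standard Ogilvie relations underlying this identification problem, not the paper's own derivation), the frequency-dependent added mass $A(\omega)$ and damping $B(\omega)$ are linked to the infinite-frequency added mass $A_\infty$ and the retardation (fluid-memory) kernel $K(t)$ by:

```latex
A(\omega) = A_\infty - \frac{1}{\omega}\int_0^\infty K(t)\,\sin(\omega t)\,dt,
\qquad
B(\omega) = \int_0^\infty K(t)\,\cos(\omega t)\,dt,
```

so the fluid-memory frequency response can be written as $K(j\omega) = B(\omega) + j\omega\,[A(\omega) - A_\infty]$. This makes clear why $A_\infty$ is normally needed before the fluid-memory model can be identified, and hence why a procedure that works with or without it is useful.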

Relevance:

80.00%

Publisher:

Abstract:

The development and maintenance of large and complex ontologies are often time-consuming and error-prone. Thus, automated ontology learning and revision have attracted intensive research interest. In data-centric applications where ontologies are designed by hand or learnt automatically from the data, when new data instances are added that contradict the ontology, it is often desirable to incrementally revise the ontology according to the added data. This problem can be intuitively formulated as the problem of revising a TBox by an ABox. In this paper we introduce a model-theoretic approach to this ontology revision problem, using a novel alternative semantic characterisation of DL-Lite ontologies. We show that our ontology revision satisfies several desirable properties. We have also developed an algorithm for reasoning with the ontology revision without computing the revision result. The algorithm is efficient, as its computational complexity is in coNP in the worst case and in PTIME when the size of the new data is bounded.
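As a hypothetical illustration of the revision setting (not an example from the paper): given the TBox and new ABox

```latex
\mathcal{T} = \{\, \mathsf{Student} \sqsubseteq \lnot\,\mathsf{Staff} \,\}, \qquad
\mathcal{A} = \{\, \mathsf{Student}(a),\ \mathsf{Staff}(a) \,\},
```

the new instance data contradicts the disjointness axiom, and revising $\mathcal{T}$ by $\mathcal{A}$ amounts to weakening the TBox just enough that the added data becomes consistent, rather than discarding the data itself.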

Relevance:

80.00%

Publisher:

Abstract:

Distributed computing of all-to-all comparison (ATAC) problems in heterogeneous systems is increasingly important in various domains. Though Hadoop-based solutions are widely used, they are inefficient for the ATAC pattern, which is fundamentally different from the MapReduce pattern for which Hadoop is designed: they exhibit poor data locality and unbalanced allocation of comparison tasks, particularly in heterogeneous systems. This results in massive data movement at runtime and ineffective utilization of computing resources, significantly affecting overall computing performance. To address these problems, a scalable and efficient data and task distribution strategy is presented in this paper for processing large-scale ATAC problems in heterogeneous systems. It not only saves storage space but also achieves load balancing and good data locality for all comparison tasks. Experiments with bioinformatics examples show that about 89% of the ideal performance capacity of the multiple machines is achieved using the approach presented in this paper.
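A toy version of the distribution problem (not the paper's actual strategy) makes the locality/balance trade-off concrete: assign each pairwise comparison to a machine, preferring machines that already store both inputs, and break ties by capacity-normalised load. All names and the heuristic below are illustrative:

```python
from itertools import combinations

def distribute(items: list, capacities: dict):
    """Assign each comparison pair (i, j) to a machine, trading off
    data locality against capacity-weighted load balance."""
    total = sum(capacities.values())
    load = {m: 0.0 for m in capacities}
    stored = {m: set() for m in capacities}  # items each machine must hold
    assignment = {}
    for i, j in combinations(items, 2):
        def cost(m):
            missing = len({i, j} - stored[m])          # data to move
            balance = (load[m] + 1.0) / (capacities[m] / total)
            return (missing, balance)
        m = min(capacities, key=cost)
        assignment[(i, j)] = m
        load[m] += 1.0
        stored[m].update((i, j))
    return assignment, {m: len(s) for m, s in stored.items()}

assignment, storage = distribute(
    ["g1", "g2", "g3", "g4", "g5"],
    {"fast_node": 2.0, "slow_node": 1.0})
print(storage)  # distinct items each machine ends up storing
```

Counting the distinct items each machine stores shows how locality-aware assignment limits runtime data movement, which is the core inefficiency of forcing the ATAC pattern into MapReduce.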

Relevance:

40.00%

Publisher:

Abstract:

In daily activities people use a number of available means for the achievement of balance, such as the use of the hands and the co-ordination of balance. One approach that explains this relationship between perception and action is the ecological theory, which is based on the work of a) Bernstein (1967), who posed the problem of 'the degrees of freedom'; b) Gibson (1979), who developed the theory of perception and the way in which information is picked up from the environment in order for a certain movement to be achieved; c) Newell (1986), who proposed that movement derives from the interaction of the constraints imposed by the environment and the organism; and d) Kugler, Kelso and Turvey (1982), who showed the way in which 'the degrees of freedom' are connected and interact. According to these theories, the development of movement co-ordination results from the different constraints imposed on the organism-environment system. The close relation between environmental and organismic constraints, as well as their interaction, determines the movement system that will be activated. These constraints, apart from shaping the co-ordination of specific movements, can be a rate-limiting factor, to a certain degree, in the acquisition and mastering of a new skill. This framework can be an essential tool for the study of catching an object (e.g., a ball). The importance of such a study becomes obvious given that the movements involved in catching an object are representative of everyday actions and characteristic of the interaction between perception and action.

Relevance:

40.00%

Publisher:

Abstract:

In the study of complex neurobiological movement systems, measurement indeterminacy has typically been overcome by imposing artificial modelling constraints to reduce the number of unknowns (e.g., reducing all muscle, bone and ligament forces crossing a joint to a single vector). However, this approach prevents human movement scientists from investigating more fully the role, functionality and ubiquity of coordinative structures or functional motor synergies. Advancements in measurement methods and analysis techniques are required if the contribution of individual component parts or degrees of freedom of these task-specific structural units is to be established, thereby effectively solving the indeterminacy problem by reducing the number of unknowns. A further benefit of establishing more of the unknowns is that human movement scientists will be able to gain greater insight into ubiquitous processes of physical self-organising that underpin the formation of coordinative structures and the confluence of organismic, environmental and task constraints that determine the exact morphology of these special-purpose devices.

Relevance:

40.00%

Publisher:

Abstract:

In this paper, the train scheduling problem is modelled as a blocking parallel-machine job shop scheduling (BPMJSS) problem. In the model, trains, single-track sections and multiple-track sections are synonymous with jobs, single machines and parallel machines, respectively, and an operation is regarded as the movement/traversal of a train across a section. Due to the lack of buffer space, the real-life case must consider blocking or hold-while-wait constraints, which means that a track section cannot be released and must hold the train until the next section on the route becomes available. Based on a literature review and our analysis, it is very hard to find a feasible complete schedule directly for BPMJSS problems. Firstly, a parallel-machine job shop scheduling (PMJSS) problem is solved by an improved shifting bottleneck procedure (SBP) algorithm without considering blocking conditions. Inspired by the proposed SBP algorithm, a feasibility satisfaction procedure (FSP) algorithm is then developed to solve and analyse the BPMJSS problem, using an alternative graph model that extends the classical disjunctive graph models. The proposed algorithms have been implemented and validated using real-world data from Queensland Rail. Sensitivity analysis has been applied by considering train length, upgrading track sections, increasing train speed and changing bottleneck sections. The outcomes show that the proposed methodology would be a very useful tool for real-life train scheduling problems.
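The blocking (hold-while-wait) constraint can be illustrated with a few lines of toy simulation in the trains-as-jobs analogy; the routes and names below are invented for the example and this is not the paper's SBP/FSP algorithm:

```python
def simulate(routes: dict):
    """Advance trains one section at a time under blocking constraints:
    a train cannot release its current section until the next section
    on its route is free."""
    position = {t: 0 for t in routes}                  # index into each route
    occupied = {routes[t][0]: t for t in routes}       # section -> train
    moved = True
    while moved:
        moved = False
        for t, route in routes.items():
            i = position[t]
            if i + 1 >= len(route):
                continue                               # train has finished
            nxt = route[i + 1]
            if nxt not in occupied:                    # next section free:
                del occupied[route[i]]                 # release and advance
                occupied[nxt] = t
                position[t] = i + 1
                moved = True
            # else: hold-while-wait, the train keeps blocking route[i]
    return position                                    # final route indices

print(simulate({"T1": ["s1", "s2", "s3"], "T2": ["s3", "s2", "s1"]}))
```

With the two opposing trains above, the naive simulation deadlocks after T1 enters s2: each train blocks the section the other needs. This is exactly the kind of infeasibility that makes BPMJSS hard and that the FSP algorithm must detect and resolve.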

Relevance:

40.00%

Publisher:

Abstract:

Problem-based learning (PBL) is a pedagogical methodology that presents the learner with a problem to be solved in order to stimulate and situate learning. This paper presents key characteristics of a problem-based learning environment that determine its suitability as a data source for work-related research studies. To date, little has been written about the availability and validity of PBL environments as a data source or their suitability for work-related research. We describe problem-based learning and use a research project case study to illustrate the challenges associated with industry work samples. We then describe the PBL course used in our research case study and use this example to illustrate the key attributes of problem-based learning environments, showing how the chosen PBL environment met the work-related research requirements of the case study. We propose that the more realistic the PBL work context and work group composition, the better the PBL environment serves as a data source for work-related research. The work context is more realistic when relevant and complex project-based problems are tackled in industry-like work conditions over longer time frames. Work group composition is more realistic when participants with industry-level education and experience enact specialised roles in different disciplines within a professional community.