940 resultados para Remediation time prediction


Relevância:

80.00% 80.00%

Publicador:

Resumo:

An Automatic Vehicle Location (AVL) system is a computer-based vehicle tracking system that is capable of determining a vehicle's location in real time. As a major technology of the Advanced Public Transportation System (APTS), AVL systems have been widely deployed by transit agencies for purposes such as real-time operation monitoring, computer-aided dispatching, and arrival time prediction. AVL systems make a large amount of transit performance data available that are valuable for transit performance management and planning purposes. However, the difficulties of extracting useful information from the huge spatial-temporal database have hindered off-line applications of the AVL data. ^ In this study, a data mining process, including data integration, cluster analysis, and multiple regression, is proposed. The AVL-generated data are first integrated into a Geographic Information System (GIS) platform. The model-based cluster method is employed to investigate the spatial and temporal patterns of transit travel speeds, which may be easily translated into travel time. The transit speed variations along the route segments are identified. Transit service periods such as morning peak, mid-day, afternoon peak, and evening periods are determined based on analyses of transit travel speed variations for different times of day. The seasonal patterns of transit performance are investigated by using the analysis of variance (ANOVA). Travel speed models based on the clustered time-of-day intervals are developed using important factors identified as having significant effects on speed for different time-of-day periods. ^ It has been found that transit performance varied from different seasons and different time-of-day periods. The geographic location of a transit route segment also plays a role in the variation of the transit performance. The results of this research indicate that advanced data mining techniques have good potential in providing automated techniques of assisting transit agencies in service planning, scheduling, and operations control. ^

Relevância:

80.00% 80.00%

Publicador:

Resumo:

This dissertation presents and evaluates a methodology for scheduling medical application workloads in virtualized computing environments. Such environments are being widely adopted by providers of "cloud computing" services. In the context of provisioning resources for medical applications, such environments allow users to deploy applications on distributed computing resources while keeping their data secure. Furthermore, higher level services that further abstract the infrastructure-related issues can be built on top of such infrastructures. For example, a medical imaging service can allow medical professionals to process their data in the cloud, easing them from the burden of having to deploy and manage these resources themselves. In this work, we focus on issues related to scheduling scientific workloads on virtualized environments. We build upon the knowledge base of traditional parallel job scheduling to address the specific case of medical applications while harnessing the benefits afforded by virtualization technology. To this end, we provide the following contributions: (1) An in-depth analysis of the execution characteristics of the target applications when run in virtualized environments. (2) A performance prediction methodology applicable to the target environment. (3) A scheduling algorithm that harnesses application knowledge and virtualization-related benefits to provide strong scheduling performance and quality of service guarantees. In the process of addressing these pertinent issues for our target user base (i.e. medical professionals and researchers), we provide insight that benefits a large community of scientific application users in industry and academia. Our execution time prediction and scheduling methodologies are implemented and evaluated on a real system running popular scientific applications. We find that we are able to predict the execution time of a number of these applications with an average error of 15%. Our scheduling methodology, which is tested with medical image processing workloads, is compared to that of two baseline scheduling solutions and we find that it outperforms them in terms of both the number of jobs processed and resource utilization by 20–30%, without violating any deadlines. We conclude that our solution is a viable approach to supporting the computational needs of medical users, even if the cloud computing paradigm is not widely adopted in its current form.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

While news stories are an important traditional medium to broadcast and consume news, microblogging has recently emerged as a place where people can dis- cuss, disseminate, collect or report information about news. However, the massive information in the microblogosphere makes it hard for readers to keep up with these real-time updates. This is especially a problem when it comes to breaking news, where people are more eager to know “what is happening”. Therefore, this dis- sertation is intended as an exploratory effort to investigate computational methods to augment human effort when monitoring the development of breaking news on a given topic from a microblog stream by extractively summarizing the updates in a timely manner. More specifically, given an interest in a topic, either entered as a query or presented as an initial news report, a microblog temporal summarization system is proposed to filter microblog posts from a stream with three primary concerns: topical relevance, novelty, and salience. Considering the relatively high arrival rate of microblog streams, a cascade framework consisting of three stages is proposed to progressively reduce quantity of posts. For each step in the cascade, this dissertation studies methods that improve over current baselines. In the relevance filtering stage, query and document expansion techniques are applied to mitigate sparsity and vocabulary mismatch issues. The use of word embedding as a basis for filtering is also explored, using unsupervised and supervised modeling to characterize lexical and semantic similarity. In the novelty filtering stage, several statistical ways of characterizing novelty are investigated and ensemble learning techniques are used to integrate results from these diverse techniques. These results are compared with a baseline clustering approach using both standard and delay-discounted measures. In the salience filtering stage, because of the real-time prediction requirement a method of learning verb phrase usage from past relevant news reports is used in conjunction with some standard measures for characterizing writing quality. Following a Cranfield-like evaluation paradigm, this dissertation includes a se- ries of experiments to evaluate the proposed methods for each step, and for the end- to-end system. New microblog novelty and salience judgments are created, building on existing relevance judgments from the TREC Microblog track. The results point to future research directions at the intersection of social media, computational jour- nalism, information retrieval, automatic summarization, and machine learning.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Natural language processing has achieved great success in a wide range of ap- plications, producing both commercial language services and open-source language tools. However, most methods take a static or batch approach, assuming that the model has all information it needs and makes a one-time prediction. In this disser- tation, we study dynamic problems where the input comes in a sequence instead of all at once, and the output must be produced while the input is arriving. In these problems, predictions are often made based only on partial information. We see this dynamic setting in many real-time, interactive applications. These problems usually involve a trade-off between the amount of input received (cost) and the quality of the output prediction (accuracy). Therefore, the evaluation considers both objectives (e.g., plotting a Pareto curve). Our goal is to develop a formal understanding of sequential prediction and decision-making problems in natural language processing and to propose efficient solutions. Toward this end, we present meta-algorithms that take an existent batch model and produce a dynamic model to handle sequential inputs and outputs. Webuild our framework upon theories of Markov Decision Process (MDP), which allows learning to trade off competing objectives in a principled way. The main machine learning techniques we use are from imitation learning and reinforcement learning, and we advance current techniques to tackle problems arising in our settings. We evaluate our algorithm on a variety of applications, including dependency parsing, machine translation, and question answering. We show that our approach achieves a better cost-accuracy trade-off than the batch approach and heuristic-based decision- making approaches. We first propose a general framework for cost-sensitive prediction, where dif- ferent parts of the input come at different costs. We formulate a decision-making process that selects pieces of the input sequentially, and the selection is adaptive to each instance. Our approach is evaluated on both standard classification tasks and a structured prediction task (dependency parsing). We show that it achieves similar prediction quality to methods that use all input, while inducing a much smaller cost. Next, we extend the framework to problems where the input is revealed incremen- tally in a fixed order. We study two applications: simultaneous machine translation and quiz bowl (incremental text classification). We discuss challenges in this set- ting and show that adding domain knowledge eases the decision-making problem. A central theme throughout the chapters is an MDP formulation of a challenging problem with sequential input/output and trade-off decisions, accompanied by a learning algorithm that solves the MDP.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Steady-state and time-resolved fluorescence measurements are reported for several crude oils and their saturates, aromatics, resins, and asphaltenes (SARA) fractions (saturates, aromatics and resins), isolated from maltene after pentane precipitation of the asphaltenes. There is a clear relationship between the American Petroleum Institute (API) grade of the crude oils and their fluorescence emission intensity and maxima. Dilution of the crude oil samples with cyclohexane results in a significant increase of emission intensity and a blue shift, which is a clear indication of the presence of energy-transfer processes between the emissive chromophores present in the crude oil. Both the fluorescence spectra and the mean fluorescence lifetimes of the three SARA fractions and their mixtures indicate that the aromatics and resins are the major contributors to the emission of crude oils. Total synchronous fluorescence scan (TSFS) spectral maps are preferable to steady-state fluorescence spectra for discriminating between the fractions, making TSFS maps a particularly interesting choice for the development of fluorescence-based methods for the characterization and classification of crude oils. More detailed studies, using a much wider range of excitation and emission wavelengths, are necessary to determine the utility of time-resolved fluorescence (TRF) data for this purpose. Preliminary models constructed using TSFS spectra from 21 crude oil samples show a very good correlation (R(2) > 0.88) between the calculated and measured values of API and the SARA fraction concentrations. The use of models based on a fast fluorescence measurement may thus be an alternative to tedious and time-consuming chemical analysis in refineries.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Dormancy release was studied in four populations of annual ryegrass (Lolium rigidum) seeds to determine whether loss of dormancy in the field can be predicted from temperature alone or whether seed water content (WC) must also be considered. Freshly matured seeds were after-ripened at the northern and southern extremes of the Western Australian cereal cropping region and at constant 37degreesC. Seed WC was allowed to fluctuate with prevailing humidity, but full hydration was avoided by excluding rainfall. Dormancy was measured regularly during after-ripening by germinating seeds with 12-hourly light or in darkness. Germination was lower in darkness than in light/dark and dormancy release was slower when germination was tested in darkness. Seeds were consistently drier, and dormancy release was slower, during after-ripening at 37degreesC than under field conditions. However, within each population, the rate of dormancy release in the field (north and south) in terms of thermal time was unaffected by after-ripening site. While low seed WC slowed dormancy release in seeds held at 37degreesC, dormancy release in seeds after-ripened under Western Australian field conditions was adequately described by thermal after-ripening time, without the need to account for changes in WC elicited by fluctuating environmental humidity.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Nonlinear Dynamics, Vol. 29

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Data Mining (DM) methods are being increasingly used in prediction with time series data, in addition to traditional statistical approaches. This paper presents a literature review of the use of DM with time series data, focusing on short- time stocks prediction. This is an area that has been attracting a great deal of attention from researchers in the field. The main contribution of this paper is to provide an outline of the use of DM with time series data, using mainly examples related with short-term stocks prediction. This is important to a better understanding of the field. Some of the main trends and open issues will also be introduced.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

We present simple procedures for the prediction of a real valued sequence. The algorithms are based on a combinationof several simple predictors. We show that if the sequence is a realization of a bounded stationary and ergodic random process then the average of squared errors converges, almost surely, to that of the optimum, given by the Bayes predictor. We offer an analog result for the prediction of stationary gaussian processes.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

We present a simple randomized procedure for the prediction of a binary sequence. The algorithm uses ideas from recent developments of the theory of the prediction of individual sequences. We show that if thesequence is a realization of a stationary and ergodic random process then the average number of mistakes converges, almost surely, to that of the optimum, given by the Bayes predictor.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Cardiovascular risk assessment might be improved with the addition of emerging, new tests derived from atherosclerosis imaging, laboratory tests or functional tests. This article reviews relative risk, odds ratios, receiver-operating curves, posttest risk calculations based on likelihood ratios, the net reclassification improvement and integrated discrimination. This serves to determine whether a new test has an added clinical value on top of conventional risk testing and how this can be verified statistically. Two clinically meaningful examples serve to illustrate novel approaches. This work serves as a review and basic work for the development of new guidelines on cardiovascular risk prediction, taking into account emerging tests, to be proposed by members of the 'Taskforce on Vascular Risk Prediction' under the auspices of the Working Group 'Swiss Atherosclerosis' of the Swiss Society of Cardiology in the future.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

A new structure of Radial Basis Function (RBF) neural network called the Dual-orthogonal RBF Network (DRBF) is introduced for nonlinear time series prediction. The hidden nodes of a conventional RBF network compare the Euclidean distance between the network input vector and the centres, and the node responses are radially symmetrical. But in time series prediction where the system input vectors are lagged system outputs, which are usually highly correlated, the Euclidean distance measure may not be appropriate. The DRBF network modifies the distance metric by introducing a classification function which is based on the estimation data set. Training the DRBF networks consists of two stages. Learning the classification related basis functions and the important input nodes, followed by selecting the regressors and learning the weights of the hidden nodes. In both cases, a forward Orthogonal Least Squares (OLS) selection procedure is applied, initially to select the important input nodes and then to select the important centres. Simulation results of single-step and multi-step ahead predictions over a test data set are included to demonstrate the effectiveness of the new approach.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

The calculation of interval forecasts for highly persistent autoregressive (AR) time series based on the bootstrap is considered. Three methods are considered for countering the small-sample bias of least-squares estimation for processes which have roots close to the unit circle: a bootstrap bias-corrected OLS estimator; the use of the Roy–Fuller estimator in place of OLS; and the use of the Andrews–Chen estimator in place of OLS. All three methods of bias correction yield superior results to the bootstrap in the absence of bias correction. Of the three correction methods, the bootstrap prediction intervals based on the Roy–Fuller estimator are generally superior to the other two. The small-sample performance of bootstrap prediction intervals based on the Roy–Fuller estimator are investigated when the order of the AR model is unknown, and has to be determined using an information criterion.