Biblioteca Digital

872 resultados para Elements, High Trhoughput Data, elettrofisiologia, elaborazione dati, analisi Real Time

Finding frequent itemsets in high-speed data streams

Relevância:

100.00% 100.00%

Publicador:

Veja mais

A principled approach to interactive hierarchical non-linear visualization of high-dimensional data

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Hierarchical visualization systems are desirable because a single two-dimensional visualization plot may not be sufficient to capture all of the interesting aspects of complex high-dimensional data sets. We extend an existing locally linear hierarchical visualization system PhiVis [1] in several directions: bf(1) we allow for em non-linear projection manifolds (the basic building block is the Generative Topographic Mapping -- GTM), bf(2) we introduce a general formulation of hierarchical probabilistic models consisting of local probabilistic models organized in a hierarchical tree, bf(3) we describe folding patterns of low-dimensional projection manifold in high-dimensional data space by computing and visualizing the manifold's local directional curvatures. Quantities such as magnification factors [3] and directional curvatures are helpful for understanding the layout of the nonlinear projection manifold in the data space and for further refinement of the hierarchical visualization plot. Like PhiVis, our system is statistically principled and is built interactively in a top-down fashion using the EM algorithm. We demonstrate the visualization system principle of the approach on a complex 12-dimensional data set and mention possible applications in the pharmaceutical industry.

Veja mais

High level data fusion

Relevância:

100.00% 100.00%

Publicador:

Resumo:

We address the question of how to obtain effective fusion of identification information such that it is robust to the quality of this information. As well as technical issues data fusion is encumbered with a collection of (potentially confusing) practical considerations. These considerations are described during the early chapters in which a framework for data fusion is developed. Following this process of diversification it becomes clear that the original question is not well posed and requires more precise specification. We use the framework to focus on some of the technical issues relevant to the question being addressed. We show that fusion of hard decisions through use of an adaptive version of the maximum a posteriori decision rule yields acceptable performance. Better performance is possible using probability level fusion as long as the probabilities are accurate. Of particular interest is the prevalence of overconfidence and the effect it has on fused performance. The production of accurate probabilities from poor quality data forms the latter part of the thesis. Two approaches are taken. Firstly the probabilities may be moderated at source (either analytically or numerically). Secondly, the probabilities may be transformed at the fusion centre. In each case an improvement in fused performance is demonstrated. We therefore conclude that in order to obtain robust fusion care should be taken to model the probabilities accurately; either at the source or centrally.

Veja mais

Techniques for equalisation of speech channels for high-speed data transmission

Relevância:

100.00% 100.00%

Publicador:

Veja mais

Analysis of the Bovespa Futures and Spot Indexes With High Frequency Data. Chinese Business Review

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Data from the World Federation of Exchanges show that Brazil’s Sao Paulo stock exchange is one of the largest worldwide in terms of market value. Thus, the objective of this study is to obtain univariate and bivariate forecasting models based on intraday data from the futures and spot markets of the BOVESPA index. The interest is to verify if there exist arbitrage opportunities in Brazilian financial market. To this end, three econometric forecasting models were built: ARFIMA, vector autoregressive (VAR), and vector error correction (VEC). Furthermore, it presents the results of a Granger causality test for the aforementioned series. This type of study shows that it is important to identify arbitrage opportunities in financial markets and, in particular, in the application of these models on data of this nature. In terms of the forecasts made with these models, VEC showed better results. The causality test shows that futures BOVESPA index Granger causes spot BOVESPA index. This result may indicate arbitrage opportunities in Brazil.

Veja mais

Measuring, Modeling, and Forecasting Volatility and Correlations from High-Frequency Data

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This dissertation contains four essays that all share a common purpose: developing new methodologies to exploit the potential of high-frequency data for the measurement, modeling and forecasting of financial assets volatility and correlations. The first two chapters provide useful tools for univariate applications while the last two chapters develop multivariate methodologies. In chapter 1, we introduce a new class of univariate volatility models named FloGARCH models. FloGARCH models provide a parsimonious joint model for low frequency returns and realized measures, and are sufficiently flexible to capture long memory as well as asymmetries related to leverage effects. We analyze the performances of the models in a realistic numerical study and on the basis of a data set composed of 65 equities. Using more than 10 years of high-frequency transactions, we document significant statistical gains related to the FloGARCH models in terms of in-sample fit, out-of-sample fit and forecasting accuracy compared to classical and Realized GARCH models. In chapter 2, using 12 years of high-frequency transactions for 55 U.S. stocks, we argue that combining low-frequency exogenous economic indicators with high-frequency financial data improves the ability of conditionally heteroskedastic models to forecast the volatility of returns, their full multi-step ahead conditional distribution and the multi-period Value-at-Risk. Using a refined version of the Realized LGARCH model allowing for time-varying intercept and implemented with realized kernels, we document that nominal corporate profits and term spreads have strong long-run predictive ability and generate accurate risk measures forecasts over long-horizon. The results are based on several loss functions and tests, including the Model Confidence Set. Chapter 3 is a joint work with David Veredas. We study the class of disentangled realized estimators for the integrated covariance matrix of Brownian semimartingales with finite activity jumps. These estimators separate correlations and volatilities. We analyze different combinations of quantile- and median-based realized volatilities, and four estimators of realized correlations with three synchronization schemes. Their finite sample properties are studied under four data generating processes, in presence, or not, of microstructure noise, and under synchronous and asynchronous trading. The main finding is that the pre-averaged version of disentangled estimators based on Gaussian ranks (for the correlations) and median deviations (for the volatilities) provide a precise, computationally efficient, and easy alternative to measure integrated covariances on the basis of noisy and asynchronous prices. Along these lines, a minimum variance portfolio application shows the superiority of this disentangled realized estimator in terms of numerous performance metrics. Chapter 4 is co-authored with Niels S. Hansen, Asger Lunde and Kasper V. Olesen, all affiliated with CREATES at Aarhus University. We propose to use the Realized Beta GARCH model to exploit the potential of high-frequency data in commodity markets. The model produces high quality forecasts of pairwise correlations between commodities which can be used to construct a composite covariance matrix. We evaluate the quality of this matrix in a portfolio context and compare it to models used in the industry. We demonstrate significant economic gains in a realistic setting including short selling constraints and transaction costs.

Veja mais

Acoustic keyword spotting in speech with applications to data mining

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Keyword Spotting is the task of detecting keywords of interest within continu- ous speech. The applications of this technology range from call centre dialogue systems to covert speech surveillance devices. Keyword spotting is particularly well suited to data mining tasks such as real-time keyword monitoring and unre- stricted vocabulary audio document indexing. However, to date, many keyword spotting approaches have su®ered from poor detection rates, high false alarm rates, or slow execution times, thus reducing their commercial viability. This work investigates the application of keyword spotting to data mining tasks. The thesis makes a number of major contributions to the ¯eld of keyword spotting. The ¯rst major contribution is the development of a novel keyword veri¯cation method named Cohort Word Veri¯cation. This method combines high level lin- guistic information with cohort-based veri¯cation techniques to obtain dramatic improvements in veri¯cation performance, in particular for the problematic short duration target word class. The second major contribution is the development of a novel audio document indexing technique named Dynamic Match Lattice Spotting. This technique aug- ments lattice-based audio indexing principles with dynamic sequence matching techniques to provide robustness to erroneous lattice realisations. The resulting algorithm obtains signi¯cant improvement in detection rate over lattice-based audio document indexing while still maintaining extremely fast search speeds. The third major contribution is the study of multiple veri¯er fusion for the task of keyword veri¯cation. The reported experiments demonstrate that substantial improvements in veri¯cation performance can be obtained through the fusion of multiple keyword veri¯ers. The research focuses on combinations of speech background model based veri¯ers and cohort word veri¯ers. The ¯nal major contribution is a comprehensive study of the e®ects of limited training data for keyword spotting. This study is performed with consideration as to how these e®ects impact the immediate development and deployment of speech technologies for non-English languages.

Veja mais

What if you're really different? Case studies of children with high functioning autism participating in the Get REAL programme who had atypical learning trajectories.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Evaluation of the Get REAL programme in an inclusive primary school setting has indicated its effectiveness in promoting pro-social behaviour for children with high functioning Autism. However, two children with co-morbid diagnoses and complex personal circumstances showed less consistent improvements. In order to explain their unique trajectories, not readily derived from quantitative studies, an exploratory case study approach was used to examine contextual influences on patterns of progress. Multiple data sources included coded video footage from the Get REAL programme, school reports on conduct, and parents and classroom teacher reports using the Strengths and Difficulties Questionnaire. While results provide support for the efficacy of the Get REAL programme for the two children, they also highlight the value of co-ordinated strategies and collaborative individualised approaches in more complex cases. This paper outlines the Get REAL intervention and a range of other school and support agency strategies impacting progress.

Veja mais

Assessment of precision timing and real-time data networks for digital substation automation

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This project researched the performance of emerging digital technology for high voltage electricity substations that significantly improves safety for staff and reduces the potential impact on the environment of equipment failure. The experimental evaluation used a scale model of a substation control system that incorporated real substation control and networking equipment with real-time simulation of the power system. The outcomes confirm that it is possible to implement Ethernet networks in high voltage substations that meet the needs of utilities; however component-level testing of devices is necessary to achieve this. The assessment results have been used to further develop international standards for substation communication and precision timing.

Veja mais

Achieving high reliability for ambiguity resolutions with multiple GNSS constellations

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Global Navigation Satellite Systems (GNSS)-based observation systems can provide high precision positioning and navigation solutions in real time, in the order of subcentimetre if we make use of carrier phase measurements in the differential mode and deal with all the bias and noise terms well. However, these carrier phase measurements are ambiguous due to unknown, integer numbers of cycles. One key challenge in the differential carrier phase mode is to fix the integer ambiguities correctly. On the other hand, in the safety of life or liability-critical applications, such as for vehicle safety positioning and aviation, not only is high accuracy required, but also the reliability requirement is important. This PhD research studies to achieve high reliability for ambiguity resolution (AR) in a multi-GNSS environment. GNSS ambiguity estimation and validation problems are the focus of the research effort. Particularly, we study the case of multiple constellations that include initial to full operations of foreseeable Galileo, GLONASS and Compass and QZSS navigation systems from next few years to the end of the decade. Since real observation data is only available from GPS and GLONASS systems, the simulation method named Virtual Galileo Constellation (VGC) is applied to generate observational data from another constellation in the data analysis. In addition, both full ambiguity resolution (FAR) and partial ambiguity resolution (PAR) algorithms are used in processing single and dual constellation data. Firstly, a brief overview of related work on AR methods and reliability theory is given. Next, a modified inverse integer Cholesky decorrelation method and its performance on AR are presented. Subsequently, a new measure of decorrelation performance called orthogonality defect is introduced and compared with other measures. Furthermore, a new AR scheme considering the ambiguity validation requirement in the control of the search space size is proposed to improve the search efficiency. With respect to the reliability of AR, we also discuss the computation of the ambiguity success rate (ASR) and confirm that the success rate computed with the integer bootstrapping method is quite a sharp approximation to the actual integer least-squares (ILS) method success rate. The advantages of multi-GNSS constellations are examined in terms of the PAR technique involving the predefined ASR. Finally, a novel satellite selection algorithm for reliable ambiguity resolution called SARA is developed. In summary, the study demonstrats that when the ASR is close to one, the reliability of AR can be guaranteed and the ambiguity validation is effective. The work then focuses on new strategies to improve the ASR, including a partial ambiguity resolution procedure with a predefined success rate and a novel satellite selection strategy with a high success rate. The proposed strategies bring significant benefits of multi-GNSS signals to real-time high precision and high reliability positioning services.

Veja mais

Developing a dashboard for real-time data stream composition and visualization

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Technological advances have led to an influx of affordable hardware that supports sensing, computation and communication. This hardware is increasingly deployed in public and private spaces, tracking and aggregating a wealth of real-time environmental data. Although these technologies are the focus of several research areas, there is a lack of research dealing with the problem of making these capabilities accessible to everyday users. This thesis represents a first step towards developing systems that will allow users to leverage the available infrastructure and create custom tailored solutions. It explores how this notion can be utilized in the context of energy monitoring to improve conventional approaches. The project adopted a user-centered design process to inform the development of a flexible system for real-time data stream composition and visualization. This system features an extensible architecture and defines a unified API for heterogeneous data streams. Rather than displaying the data in a predetermined fashion, it makes this information available as building blocks that can be combined and shared. It is based on the insight that individual users have diverse information needs and presentation preferences. Therefore, it allows users to compose rich information displays, incorporating personally relevant data from an extensive information ecosystem. The prototype was evaluated in an exploratory study to observe its natural use in a real-world setting, gathering empirical usage statistics and conducting semi-structured interviews. The results show that a high degree of customization does not warrant sustained usage. Other factors were identified, yielding recommendations for increasing the impact on energy consumption.

Veja mais

Real-time data processing of interferential imaging spectrometer based on FPGA

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Based on the data processing technologies of interferential spectrometer, a sort of real-time data processing system on chip of interferential imaging spectrometer was studied based on large capacitance and high speed field programmable gate array( FPGA) device. The system integrates both interferograrn sampling and spectrum rebuilding on a single chip of FPGA and makes them being accomplished in real-time with advantages such as small cubage, fast speed and high reliability. It establishes a good technical foundation in the applications of imaging spectrometer on target detection and recognition in real-time.

Veja mais

Real-Time and Data-Driven Operation Optimization and Knowledge Discovery for an Enterprise Information System

Relevância:

100.00% 100.00%

Publicador:

Resumo:

An enterprise information system (EIS) is an integrated data-applications platform characterized by diverse, heterogeneous, and distributed data sources. For many enterprises, a number of business processes still depend heavily on static rule-based methods and extensive human expertise. Enterprises are faced with the need for optimizing operation scheduling, improving resource utilization, discovering useful knowledge, and making data-driven decisions.

This thesis research is focused on real-time optimization and knowledge discovery that addresses workflow optimization, resource allocation, as well as data-driven predictions of process-execution times, order fulfillment, and enterprise service-level performance. In contrast to prior work on data analytics techniques for enterprise performance optimization, the emphasis here is on realizing scalable and real-time enterprise intelligence based on a combination of heterogeneous system simulation, combinatorial optimization, machine-learning algorithms, and statistical methods.

On-demand digital-print service is a representative enterprise requiring a powerful EIS.We use real-life data from Reischling Press, Inc. (RPI), a digit-print-service provider (PSP), to evaluate our optimization algorithms.

In order to handle the increase in volume and diversity of demands, we first present a high-performance, scalable, and real-time production scheduling algorithm for production automation based on an incremental genetic algorithm (IGA). The objective of this algorithm is to optimize the order dispatching sequence and balance resource utilization. Compared to prior work, this solution is scalable for a high volume of orders and it provides fast scheduling solutions for orders that require complex fulfillment procedures. Experimental results highlight its potential benefit in reducing production inefficiencies and enhancing the productivity of an enterprise.

We next discuss analysis and prediction of different attributes involved in hierarchical components of an enterprise. We start from a study of the fundamental processes related to real-time prediction. Our process-execution time and process status prediction models integrate statistical methods with machine-learning algorithms. In addition to improved prediction accuracy compared to stand-alone machine-learning algorithms, it also performs a probabilistic estimation of the predicted status. An order generally consists of multiple series and parallel processes. We next introduce an order-fulfillment prediction model that combines advantages of multiple classification models by incorporating flexible decision-integration mechanisms. Experimental results show that adopting due dates recommended by the model can significantly reduce enterprise late-delivery ratio. Finally, we investigate service-level attributes that reflect the overall performance of an enterprise. We analyze and decompose time-series data into different components according to their hierarchical periodic nature, perform correlation analysis,

and develop univariate prediction models for each component as well as multivariate models for correlated components. Predictions for the original time series are aggregated from the predictions of its components. In addition to a significant increase in mid-term prediction accuracy, this distributed modeling strategy also improves short-term time-series prediction accuracy.

In summary, this thesis research has led to a set of characterization, optimization, and prediction tools for an EIS to derive insightful knowledge from data and use them as guidance for production management. It is expected to provide solutions for enterprises to increase reconfigurability, accomplish more automated procedures, and obtain data-driven recommendations or effective decisions.

Veja mais

Real-time sea surface air temperature retrieval over Southern Ocean using AMSR-E data

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The AMSR-E satellite data and in-situ data were applied to retrieve sea surface air temperature (Ta) over the Southern Ocean. The in-situ data were obtained from the 24~(th) -26~(th) Chinese Antarctic Expeditions during 2008-2010. First, Ta was used to analyze the relativity with the bright temperature (Tb) from the twelve channels of AMSR-E, and no high relativity was found between Ta and Tb from any of the channels. The highest relativity was 0.38 (with 23.8 GHz). The dataset for the modeling was obtained by using in-situ data to match up with Tb, and two methods were applied to build the retrieval model. In multi-parameters regression method, the Tbs from 12 channels were used to the model and the region was divided into two parts according to the latitude of 50Â°S. The retrieval results were compared with the in-situ data. The Root Mean Square Error (RMS) and relativity of high latitude zone were 0.96â„ƒand 0.93, respectively. And those of low latitude zone were 1.29 â„ƒ and 0.96, respectively. Artificial neural network (ANN) method was applied to retrieve Ta.The RMS and relativity were 1.26 â„ƒ and 0.98, respectively.

Veja mais

NanoStreams: A Hardware and Software Stack for Real-Time Analytics on Fast Data Streams

Relevância:

100.00% 100.00%

Publicador:

Veja mais

872 resultados para Elements, High Trhoughput Data, elettrofisiologia, elaborazione dati, analisi Real Time

Filtro por publicador