919 results for data-driven Stochastic Subspace Identification (SSI-data)
Abstract:
Vehicle activated signs (VAS) display a warning message when drivers exceed a particular threshold. VAS are often installed on local roads to display a warning message depending on the speed of the approaching vehicles. VAS are usually powered by electricity; however, battery and solar powered VAS are also commonplace. This thesis investigated the development of an automatic trigger speed for vehicle activated signs in order to influence driver behaviour, the effect of which has been measured in terms of reduced mean speed and low standard deviation. A comprehensive understanding of the effectiveness of the trigger speed of the VAS on driver behaviour was established by systematically collecting data. Specifically, data on time of day, speed, length and direction of the vehicle were collected using Doppler radar installed at the road. A data-driven calibration method for the radar used in the experiment has also been developed and evaluated. Results indicate that the trigger speed of the VAS had a variable effect on drivers' speed at different sites and at different times of the day. It is evident that the optimal trigger speed should be set near the 85th percentile speed in order to lower the standard deviation. In the case of battery and solar powered VAS, trigger speeds between the 50th and 85th percentile offered the best compromise between safety and power consumption. Results also indicate that different classes of vehicles show differences in mean speed and standard deviation; on a highway, the mean speed of cars differs slightly from the mean speed of trucks, whereas a significant difference was observed between the classes of vehicles on local roads. A differential trigger speed was therefore investigated for completeness. A data-driven approach using random forest was found to be appropriate for predicting trigger speeds for different types of vehicles and traffic conditions.
The fact that the predicted trigger speed was found to be consistently around the 85th percentile speed justifies the choice of the automatic model.
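As a rough illustration of the percentile-based trigger speeds mentioned above, the following sketch computes the 50th and 85th percentile speeds from a list of radar readings. The function name `percentile_speed` and the sample speeds are hypothetical and are not taken from the thesis; the interpolation rule is the one used by Python's `statistics.quantiles`, which may differ from the thesis's own procedure.

```python
from statistics import quantiles


def percentile_speed(speeds, pct):
    """Return the pct-th percentile (1..99) of the observed speeds."""
    # quantiles(n=100) returns the 99 cut points for percentiles 1..99
    cuts = quantiles(speeds, n=100, method="inclusive")
    return cuts[pct - 1]


# Hypothetical radar readings in km/h (illustrative only, not thesis data)
speeds = [38, 41, 43, 44, 45, 46, 47, 48, 49, 50,
          51, 52, 53, 55, 57, 58, 60, 62, 65, 71]

trigger_50 = percentile_speed(speeds, 50)  # lower end of the battery/solar range
trigger_85 = percentile_speed(speeds, 85)  # candidate trigger speed near the 85th percentile
print(trigger_50, trigger_85)
```

A sign that must save power could then pick a threshold between `trigger_50` and `trigger_85`, which is the range the abstract reports as the best compromise.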
Abstract:
Using the Pricing Equation in a panel-data framework, we construct a novel consistent estimator of the stochastic discount factor (SDF) mimicking portfolio which relies on the fact that its logarithm is the "common feature" in every asset return of the economy. Our estimator is a simple function of asset returns and does not depend on any parametric function representing preferences, making it suitable for testing different preference specifications or investigating intertemporal substitution puzzles.
Abstract:
Using the Pricing Equation in a panel-data framework, we construct a novel consistent estimator of the stochastic discount factor (SDF) which relies on the fact that its logarithm is the "common feature" in every asset return of the economy. Our estimator is a simple function of asset returns and does not depend on any parametric function representing preferences. The techniques discussed in this paper were applied to two relevant issues in macroeconomics and finance: the first asks what type of parametric preference representation could be validated by asset-return data, and the second asks whether our SDF estimator can price returns in an out-of-sample forecasting exercise. In formal testing, we cannot reject standard preference specifications used in the macro/finance literature. Estimates of the relative risk-aversion coefficient are between 1 and 2, and statistically equal to unity. We also show that our SDF proxy can price the returns of stocks with a higher capitalization level reasonably well, whereas it shows some difficulty in pricing stocks with a lower level of capitalization.
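For reference, the Pricing Equation that these two abstracts build on is usually stated as follows. The symbols are an assumption made here for illustration, since the abstracts do not fix a notation: $M_{t+1}$ denotes the SDF, $R_{i,t+1}$ the gross return on asset $i$, and $\mathbb{E}_t$ the expectation conditional on information at time $t$.

```latex
\mathbb{E}_t\left[ M_{t+1}\, R_{i,t+1} \right] = 1, \qquad i = 1, \dots, N .
```

Because the same $M_{t+1}$ appears in the equation for every asset, it is natural to treat it (or its logarithm) as a component shared by all returns, which is the sense of "common feature" in the abstracts.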
Abstract:
This work presents a computational methodology for determining synchronous machine parameters from load rejection test data. The quadrature-axis parameters are obtained from a load rejection under an arbitrary reference, which reduces the difficulties of existing procedures.
Abstract:
This work presents a computational methodology for determining synchronous machine parameters from load rejection test data. Through machine modeling, the quadrature-axis parameters can be obtained from a load rejection under an arbitrary reference, which reduces the difficulties of existing procedures. The proposed method is applied to a real machine.
Abstract:
This paper proposes a simulated-annealing method for identifying building roof contours from a LiDAR-derived digital elevation model. Our method is based on the concept of first extracting aboveground objects and then identifying those objects that are building roof contours. First, to detect aboveground objects (buildings, trees, etc.), the digital elevation model is segmented through a recursive splitting technique followed by a region merging process. Vectorization and polygonization are used to obtain polyline representations of the detected aboveground objects. Second, building roof contours are identified from among the aboveground objects by optimizing a Markov-random-field-based energy function that embodies roof contour attributes and spatial constraints. The solution of this function is a polygon set corresponding to building roof contours and is found by using a minimization technique such as the simulated annealing algorithm. Experiments carried out with a laser-scanning digital elevation model showed that the methodology works properly, as it provides roof contour information with approximately 90% shape accuracy and no verified false positives.
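The second step described above, minimizing an energy function over candidate objects with simulated annealing, can be sketched roughly as follows. This is a minimal toy under stated assumptions, not the paper's energy function: the roof-likeness scores in `SCORES`, the neighbour pairs in `NEIGHBOURS`, the weights, and the cooling schedule are all hypothetical values chosen for illustration.

```python
import math
import random

# Hypothetical roof-likeness score per aboveground object (1.0 = very roof-like)
SCORES = [0.9, 0.8, 0.2, 0.1, 0.7]
# Hypothetical pairs of spatially adjacent objects
NEIGHBOURS = [(0, 1), (2, 3)]


def energy(labels):
    """Toy MRF-style energy: unary attribute term plus pairwise smoothness term."""
    unary = sum((0.5 - s) * x for s, x in zip(SCORES, labels))
    pairwise = sum(0.1 for i, j in NEIGHBOURS if labels[i] != labels[j])
    return unary + pairwise


def simulated_annealing(energy_fn, state, t_start=2.0, t_end=1e-3,
                        cooling=0.95, sweeps=50, seed=0):
    """Minimize energy_fn over binary label vectors by single-label flips."""
    rng = random.Random(seed)
    current = list(state)
    e_cur = energy_fn(current)
    best, e_best = list(current), e_cur
    t = t_start
    while t > t_end:
        for _ in range(sweeps):
            i = rng.randrange(len(current))
            current[i] ^= 1  # propose flipping one label
            e_new = energy_fn(current)
            # Metropolis rule: always accept improvements, sometimes accept worse moves
            if e_new <= e_cur or rng.random() < math.exp((e_cur - e_new) / t):
                e_cur = e_new
                if e_cur < e_best:
                    best, e_best = list(current), e_cur
            else:
                current[i] ^= 1  # reject: undo the flip
        t *= cooling  # geometric cooling schedule
    return best, e_best


best_labels, best_energy = simulated_annealing(energy, [0] * len(SCORES))
print(best_labels, best_energy)
```

Here a label of 1 marks an object as a roof contour. Accepting some uphill moves at high temperature is what lets the search escape poor local configurations before the schedule cools.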
Abstract:
Gravitational waves from a variety of sources are predicted to superpose to create a stochastic background. This background is expected to contain unique information from throughout the history of the Universe that is unavailable through standard electromagnetic observations, making its study of fundamental importance to understanding the evolution of the Universe. We carry out a search for the stochastic background with the latest data from the LIGO and Virgo detectors. Consistent with predictions from most stochastic gravitational-wave background models, the data display no evidence of a stochastic gravitational-wave signal. Assuming a gravitational-wave spectrum of $\Omega_{\rm GW}(f) = \Omega_\alpha (f/f_{\rm ref})^\alpha$, we place 95% confidence level upper limits on the energy density of the background in each of four frequency bands spanning 41.5-1726 Hz. In the frequency band of 41.5-169.25 Hz for a spectral index of $\alpha = 0$, we constrain the energy density of the stochastic background to be $\Omega_{\rm GW}(f) < 5.6 \times 10^{-6}$. For the 600-1000 Hz band, $\Omega_{\rm GW}(f) < 0.14\,(f/900\ \mathrm{Hz})^3$, a factor of 2.5 lower than the best previously reported upper limits. We find $\Omega_{\rm GW}(f) < 1.8 \times 10^{-4}$ using a spectral index of zero for 170-600 Hz and $\Omega_{\rm GW}(f) < 1.0\,(f/1300\ \mathrm{Hz})^3$ for 1000-1726 Hz, bands in which no previous direct limits have been placed. The limits in these four bands are the lowest direct measurements to date on the stochastic background. We discuss the implications of these results in light of the recent claim by the BICEP2 experiment of possible evidence for inflationary gravitational waves.
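To make the power-law form of the limits concrete, the short sketch below evaluates the quoted upper limits at chosen frequencies. The function name `omega_gw_limit` is hypothetical; the amplitudes, reference frequencies and spectral indices are the ones quoted in the abstract.

```python
def omega_gw_limit(f_hz, omega_alpha, f_ref_hz, alpha):
    """Evaluate the power law Omega_alpha * (f / f_ref)**alpha at frequency f_hz."""
    return omega_alpha * (f_hz / f_ref_hz) ** alpha


# 600-1000 Hz band: limit 0.14 (f / 900 Hz)^3, evaluated at the band edges
limit_600 = omega_gw_limit(600.0, 0.14, 900.0, 3)
limit_1000 = omega_gw_limit(1000.0, 0.14, 900.0, 3)

# 41.5-169.25 Hz band: flat spectrum (alpha = 0), so the reference frequency drops out
limit_flat = omega_gw_limit(100.0, 5.6e-6, 100.0, 0)

print(limit_600, limit_1000, limit_flat)
```

With a spectral index of 3 the limit varies by more than a factor of four across the 600-1000 Hz band, whereas for a spectral index of zero it is the same at every frequency in the band.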
Abstract:
Complexity in time series is an intriguing feature of living dynamical systems, with potential use for identification of system state. Although various methods have been proposed for measuring physiologic complexity, uncorrelated time series are often assigned high values of complexity, erroneously classifying them as complex physiological signals. Here, we propose and discuss a method for complex system analysis based on generalized statistical formalism and surrogate time series. Sample entropy (SampEn) was rewritten, inspired by Tsallis generalized entropy, as a function of the parameter q (qSampEn). We calculated qSDiff curves, which are the differences between the qSampEn of the original series and that of the surrogate series. We evaluated qSDiff for 125 real heart rate variability (HRV) dynamics, divided into groups of 70 healthy, 44 congestive heart failure (CHF), and 11 atrial fibrillation (AF) subjects, and for simulated series of stochastic and chaotic processes. The evaluations showed that, for nonperiodic signals, qSDiff curves have a maximum point ($\mathrm{qSDiff}_{\max}$) for $q \neq 1$. Values of q where the maximum point occurs and where qSDiff is zero were also evaluated. Only the $\mathrm{qSDiff}_{\max}$ values were capable of distinguishing the HRV groups (p-values $5.10 \times 10^{-3}$, $1.11 \times 10^{-7}$, and $5.50 \times 10^{-7}$ for healthy vs. CHF, healthy vs. AF, and CHF vs. AF, respectively), consistent with the concept of physiologic complexity, which suggests a potential use for chaotic system analysis. (C) 2012 American Institute of Physics. [http://dx.doi.org/10.1063/1.4758815]
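The building blocks named above, sample entropy and a comparison against a surrogate series, can be sketched as follows. This is a minimal sketch of standard SampEn only (the $q = 1$ case); the Tsallis-based generalization qSampEn is not reproduced, because the abstract does not give its formula. The tolerance `r` is treated as an absolute value, the sine test signal is hypothetical, and a random shuffle is assumed as the surrogate.

```python
import math
import random


def sample_entropy(series, m=2, r=0.2):
    """Standard SampEn: -ln(A/B), with self-matches excluded."""
    n = len(series)

    def count_matches(length):
        # Use the same number of templates (n - m) for both lengths
        templates = [series[i:i + length] for i in range(n - m)]
        count = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):
                # Chebyshev distance between the two templates
                dist = max(abs(a - b) for a, b in zip(templates[i], templates[j]))
                if dist <= r:
                    count += 1
        return count

    b = count_matches(m)      # matches of length m
    a = count_matches(m + 1)  # matches of length m + 1
    if a == 0 or b == 0:
        return float("inf")
    return -math.log(a / b)


# Hypothetical correlated signal and its shuffled surrogate
signal = [math.sin(0.3 * i) for i in range(200)]
surrogate = signal[:]
random.Random(0).shuffle(surrogate)  # destroys temporal correlation, keeps the values

original_entropy = sample_entropy(signal)
surrogate_entropy = sample_entropy(surrogate)
print(original_entropy, surrogate_entropy)
```

Shuffling keeps the amplitude distribution but removes temporal structure, so the gap between the two entropies reflects how much of the signal's regularity comes from its ordering.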
Abstract:
The reproductive performance of cattle may be influenced by several factors, but mineral imbalances are crucial in terms of direct effects on reproduction. Several studies have shown that elements such as calcium, copper, iron, magnesium, selenium, and zinc are essential for reproduction and can prevent oxidative stress. However, toxic elements such as lead, nickel, and arsenic can have adverse effects on reproduction. In this paper, we applied a simple and fast method of multi-element analysis to bovine semen samples from Zebu and European classes used in reproduction programs and artificial insemination. Samples were analyzed by inductively coupled plasma mass spectrometry (ICP-MS) using aqueous medium calibration, and the samples were diluted in a proportion of 1:50 in a solution containing 0.01% (vol/vol) Triton X-100 and 0.5% (vol/vol) nitric acid. Rhodium, iridium, and yttrium were used as the internal standards for ICP-MS analysis. To develop a reliable method of tracing the class of bovine semen, we used data mining techniques that make it possible to classify unknown samples after checking the differentiation of known-class samples. Based on the determination of 15 elements in 41 samples of bovine semen, 3 machine-learning tools for classification were applied to determine cattle class. Our results demonstrate the potential of support vector machine (SVM), multilayer perceptron (MLP), and random forest (RF) chemometric tools to identify cattle class. Moreover, the selection tools made it possible to reduce the number of chemical elements needed from 15 to just 8.
Abstract:
A common interest in gene expression data analysis is to identify, from a large pool of candidate genes, the genes that present significant changes in expression levels between a treatment and a control biological condition. This is usually done with a test statistic and a cutoff value that separate differentially expressed genes from nondifferentially expressed ones. In this paper, we propose a Bayesian approach to identify differentially expressed genes by sequentially calculating credibility intervals from predictive densities, which are constructed using the sampled mean treatment effect from all genes under study, excluding the treatment effect of genes previously identified with statistical evidence for difference. We compare our Bayesian approach with the standard ones based on the t-test and modified t-tests via a simulation study, using small sample sizes, which are common in gene expression data analysis. The results provide evidence that the proposed approach performs better than the standard ones, especially for cases with mean differences and increases in treatment variance in relation to control variance. We also apply the methodologies to a well-known publicly available data set on the bacterium Escherichia coli.
Abstract:
Background: One goal of gene expression profiling is to identify signature genes that robustly distinguish different types or grades of tumors. Several tumor classifiers based on expression profiling have been proposed using the microarray technique. Due to important differences in the probabilistic models of microarray and SAGE technologies, it is important to develop suitable techniques to select specific genes from SAGE measurements.
Results: A new framework to select specific genes that distinguish different biological states based on the analysis of SAGE data is proposed. The new framework applies the bolstered error for the identification of strong genes that separate the biological states in a feature space defined by the gene expression of a training set. Credibility intervals defined from a probabilistic model of SAGE measurements are used to identify the genes that distinguish the different states with more reliability among all gene groups selected by the strong genes method. A score that takes into account the credibility and the bolstered error values is proposed in order to rank the groups of considered genes. Results obtained using SAGE data from gliomas are presented, corroborating the introduced methodology.
Conclusion: The model representing counting data, such as SAGE, provides additional statistical information that allows a more robust analysis. The additional statistical information provided by the probabilistic model is incorporated in the methodology described in the paper. The introduced method is suitable for identifying signature genes that lead to a good separation of the biological states using SAGE and may be adapted for other counting methods such as Massively Parallel Signature Sequencing (MPSS) or the recent Sequencing-By-Synthesis (SBS) technique. Some of the genes identified by the proposed method may be useful for generating classifiers.
Abstract:
With the increasing production of information from e-government initiatives, there is also the need to transform a large volume of unstructured data into useful information for society. All this information should be easily accessible and made available in a meaningful and effective way in order to achieve semantic interoperability in electronic government services, which is a challenge to be pursued by governments around the world. Our aim is to discuss the context of e-Government Big Data and to present a framework that promotes semantic interoperability through the automatic generation of ontologies from unstructured information found on the Internet. We propose the use of fuzzy mechanisms to deal with natural language terms and present some related works found in this area. The results achieved in this study consist of the architectural definition and the major components and requirements that compose the proposed framework. With this, it is possible to take advantage of the large volume of information generated from e-Government initiatives and use it to benefit society.
Abstract:
A trans-oceanic section at 24.5°N in the North Atlantic has been sampled at a decadal frequency. This work demonstrates that the wind-driven component of the Meridional Overturning Circulation (MOC) may be monitored using autonomous profiling floats deployed in the eastern North Atlantic Subtropical Gyre. More than 500 CTD vertical profiles from the surface to 2000 m depth, spanning one year (from April 2002 to March 2003), are used to compute the geostrophic transport stream function at 24.5°N. The baroclinic transport obtained from the autonomous profiling floats is not statistically different from that of three hydrographic cruises carried out in 1957, 1981 and 1992. Good agreement is found between the geostrophic transport stream function and the transport derived from the wind field through the Sverdrup relation.
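For readers unfamiliar with it, the Sverdrup relation invoked above is commonly written as below. The notation is an assumption made here, since the abstract gives none: $V$ is the depth-integrated meridional volume transport per unit width, $\beta = \partial f / \partial y$ is the meridional gradient of the Coriolis parameter, $\rho_0$ is a reference density, and $\boldsymbol{\tau}$ is the surface wind stress.

```latex
\beta V = \frac{1}{\rho_0}\, \hat{\mathbf{k}} \cdot \left( \nabla \times \boldsymbol{\tau} \right)
```

Integrating $V$ westward from the eastern boundary gives a wind-derived transport stream function, which is the quantity that can be compared with the geostrophic transport stream function computed from the float profiles.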
Abstract:
Ontology design and population, core aspects of semantic technologies, have recently become fields of great interest due to the increasing need for domain-specific knowledge bases that can boost the use of the Semantic Web. For building such knowledge resources, state-of-the-art tools for ontology design require a lot of human work. Producing meaningful schemas and populating them with domain-specific data is in fact a very difficult and time-consuming task, even more so if the task consists of modelling knowledge at web scale. The primary aim of this work is to investigate a novel and flexible methodology for automatically learning ontologies from textual data, lightening the human workload required for conceptualizing domain-specific knowledge and populating an extracted schema with real data, and speeding up the whole ontology production process. Here computational linguistics plays a fundamental role, from automatically identifying facts in natural language and extracting frames of relations among recognized entities, to producing linked data with which to extend existing knowledge bases or create new ones. In the state of the art, automatic ontology learning systems are mainly based on plain-pipelined linguistic classifiers performing tasks such as named entity recognition, entity resolution, and taxonomy and relation extraction [11]. These approaches present some weaknesses, especially in capturing the structures through which the meaning of complex concepts is expressed [24]. Humans, in fact, tend to organize knowledge in well-defined patterns, which include participant entities and meaningful relations linking entities with each other. In the literature, these structures have been called Semantic Frames by Fillmore [20], or more recently Knowledge Patterns [23]. Some NLP studies have recently shown the possibility of performing more accurate deep parsing with the ability to logically understand the structure of discourse [7].
In this work, some of these technologies have been investigated and employed to produce accurate ontology schemas. The long-term goal is to collect large amounts of semantically structured information from the web of crowds, through an automated process, in order to identify and investigate the cognitive patterns used by humans to organize their knowledge.