934 results for Approximate filtering
Abstract:
The analysis of complex networks is usually based on key properties such as small-worldness and vertex degree distribution. The presence of symmetric motifs, on the other hand, has been related to redundancy and thus to the robustness of networks. In this paper we propose a method for detecting approximate axial symmetries in networks. For each pair of nodes, we define a continuous-time quantum walk which is evolved through time. By measuring the probability that the quantum walker visits each node of the network over this time frame, we can determine whether the two vertices are symmetric with respect to some axis of the graph. Moreover, we show that the method also successfully detects approximate axial symmetries. We show the efficacy of our approach by analysing both synthetic and real-world data. © 2012 Springer-Verlag Berlin Heidelberg.
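The pairwise construction can be prototyped in a few lines. The sketch below is an illustration under my own assumptions (the adjacency matrix as the Hamiltonian, an antisymmetric starting superposition over the pair, and a simple probability-matching score), not the paper's exact algorithm:

```python
# Sketch: test whether nodes u, v are related by an axial symmetry using a
# continuous-time quantum walk started in (|u> - |v>)/sqrt(2).
# Assumptions (mine, not necessarily the paper's): Hamiltonian = adjacency
# matrix; symmetry scored by how closely the visit probabilities at u and v
# track each other over a sampled time window.
import numpy as np
from scipy.linalg import expm

def asymmetry_score(adj, u, v, times=np.linspace(0.1, 5.0, 25)):
    n = adj.shape[0]
    psi0 = np.zeros(n, dtype=complex)
    psi0[u], psi0[v] = 1 / np.sqrt(2), -1 / np.sqrt(2)
    score = 0.0
    for t in times:
        psi_t = expm(-1j * t * adj) @ psi0   # unitary evolution exp(-iHt)
        p = np.abs(psi_t) ** 2               # visit probabilities (Born rule)
        # p[u] == p[v] for all t whenever an axis of the graph swaps u and v.
        score += abs(p[u] - p[v])
    return score / len(times)

# Path graph 0-1-2-3: the mirror axis swaps 0<->3 and 1<->2.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
print(asymmetry_score(A, 0, 3))   # ~0: symmetric pair
print(asymmetry_score(A, 0, 1))   # clearly > 0: no axis swaps 0 and 1
```

A near-zero score is a necessary (not sufficient) signature of an exact axis; approximate symmetries show up as small but nonzero scores.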
Abstract:
We show both numerically and experimentally that dispersion management can be realized by manipulating the dispersion of a filter in a passively mode-locked fibre laser. A programmable filter whose dispersion can be configured in software is employed in the laser. Solitons, stretched pulses, and dissipative solitons can be targeted reliably by controlling only the filter transmission function, while the fibre lengths in the laser remain fixed. This technique offers remarkable advantages for controlling operation regimes in ultrafast fibre lasers, in contrast to the traditional approach, in which dispersion management is achieved by optimizing the relative lengths of fibres with opposite-sign dispersion. Our versatile ultrafast fibre laser will be attractive for applications requiring different pulse profiles, such as optical signal processing and optical communications.
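The core mechanism, a software-set spectral phase standing in for fibre dispersion, can be illustrated numerically. The sketch below is my own toy model (all parameter values are illustrative, not the paper's): it applies a programmable group-delay dispersion (GDD) as a quadratic spectral phase to a Gaussian pulse.

```python
# Sketch: emulate a programmable filter applying configurable group-delay
# dispersion (GDD) to a pulse, the mechanism used instead of adjusting
# fibre lengths.  All numbers are illustrative.
import numpy as np

N, T = 4096, 50e-12                          # samples, time window (s)
t = np.linspace(-T / 2, T / 2, N)
dt = t[1] - t[0]
field = np.exp(-(t / 1e-12) ** 2)            # ~1 ps Gaussian pulse envelope

omega = 2 * np.pi * np.fft.fftfreq(N, dt)    # angular frequency grid (rad/s)
gdd = -1.0e-24                               # programmable GDD: -1 ps^2

spectrum = np.fft.fft(field)
# Filter transfer function: unit amplitude, quadratic spectral phase.
transfer = np.exp(1j * 0.5 * gdd * omega ** 2)
out = np.fft.ifft(spectrum * transfer)

# The output pulse is chirped/stretched according to the software-set GDD;
# re-programming `gdd` (plus an amplitude mask) retargets the regime.
def rms_width(env):
    intensity = np.abs(env) ** 2
    return np.sqrt(np.sum(t ** 2 * intensity) / np.sum(intensity))

print("in/out RMS width (ps):", rms_width(field) * 1e12, rms_width(out) * 1e12)
```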
Abstract:
Sentiment classification on Twitter is usually affected by the noisy nature (abbreviations, irregular forms) of tweet data. A popular way to reduce the noise of textual data is to remove stopwords, either by using pre-compiled stopword lists or with more sophisticated methods for dynamic stopword identification. However, the effectiveness of removing stopwords in the context of Twitter sentiment classification has been debated in recent years. In this paper we investigate whether removing stopwords helps or hampers the effectiveness of Twitter sentiment classification methods. To this end, we apply six different stopword identification methods to Twitter data from six different datasets and observe how removing stopwords affects two well-known supervised sentiment classification methods. We assess the impact of removing stopwords by observing fluctuations in the level of data sparsity, the size of the classifier's feature space, and its classification performance. Our results show that using pre-compiled lists of stopwords negatively impacts the performance of Twitter sentiment classification approaches. On the other hand, dynamically generating stopword lists by removing infrequent terms that appear only once in the corpus appears to be the best method for maintaining high classification performance while reducing data sparsity and substantially shrinking the feature space.
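The winning strategy, dropping corpus singletons rather than using a fixed list, is trivial to implement. A minimal sketch (toy tweets and the document-frequency cut-off are my illustrative choices, approximating "appears only once in the corpus" by "appears in only one tweet"):

```python
# Sketch: dynamic stopword generation by discarding singleton terms,
# the strategy the study found best for Twitter sentiment classification.
from sklearn.feature_extraction.text import CountVectorizer

tweets = [
    "luv this phone so much",
    "this phone is gr8",
    "worst phone ever ugh",
    "luv it, gr8 battery",
]

# min_df=2 keeps only terms seen in at least two tweets, discarding
# singletons and shrinking the feature space before classification.
vec = CountVectorizer(min_df=2)
X = vec.fit_transform(tweets)
print(sorted(vec.get_feature_names_out()))   # ['gr8', 'luv', 'phone', 'this']
print(X.shape)                               # (4, 4): much smaller feature space
```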
Abstract:
This dissertation develops a new figure of merit to measure the similarity (or dissimilarity) of Gaussian distributions through a novel concept that relates the Fisher distance to the percentage of data overlap. The derivations are expanded to provide a generalized mathematical platform for determining an optimal separating boundary between Gaussian distributions in multiple dimensions. Real-world data used for implementation and feasibility studies were provided by Beckman-Coulter. Although the data used are flow cytometric in nature, the mathematics are derived generally and apply to other types of data as long as their statistical behavior approximates a Gaussian distribution. Because this new figure of merit relies heavily on the statistical nature of the data, a new filtering technique is introduced to compensate for the accumulation process involved with histogram data. When data are accumulated into a frequency histogram, they are inherently smoothed in a linear fashion, since an averaging effect takes place as the histogram is generated. This new filtering scheme addresses data accumulated in the uneven resolution of the channels of the frequency histogram. The qualitative interpretation of flow cytometric data is currently a time-consuming and imprecise way of evaluating histogram data. The method developed here offers a broader spectrum of capabilities in histogram analysis, since the figure of merit derived in this dissertation integrates within its mathematics both a measure of similarity and the percentage of overlap between the distributions under analysis.
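The dissertation's exact figure of merit is not reproduced here, but the underlying quantity it ties to the Fisher distance, the percentage of overlap between two Gaussian densities, is easy to compute. A sketch for the one-dimensional case (illustrative parameters; the multivariate extension is not shown):

```python
# Sketch: percentage of overlap between two 1-D Gaussian densities,
# computed as the integral of min(f1, f2) on a grid spanning both.
import numpy as np
from scipy.stats import norm

def gaussian_overlap(mu1, sd1, mu2, sd2, n=100_000):
    lo = min(mu1 - 6 * sd1, mu2 - 6 * sd2)
    hi = max(mu1 + 6 * sd1, mu2 + 6 * sd2)
    x = np.linspace(lo, hi, n)
    f1 = norm.pdf(x, mu1, sd1)
    f2 = norm.pdf(x, mu2, sd2)
    return float(np.minimum(f1, f2).sum() * (x[1] - x[0]))  # in [0, 1]

# Well-separated populations overlap little; similar ones overlap a lot.
print(gaussian_overlap(0.0, 1.0, 4.0, 1.0))  # ~0.046
print(gaussian_overlap(0.0, 1.0, 0.5, 1.0))  # ~0.80
```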
Abstract:
We develop a new autoregressive conditional process to capture both the changes and the persistence of the intraday seasonal (U-shape) pattern of volatility in essay 1. Unlike other procedures, this approach allows the intraday volatility pattern to change over time without the filtering process injecting a spurious pattern of noise into the filtered series. We show that prior deterministic filtering procedures are special cases of the autoregressive conditional filtering process presented here. Lagrange multiplier tests show that the stochastic seasonal variance component is statistically significant. Specification tests using the correlogram and cross-spectral analyses confirm the reliability of the autoregressive conditional filtering process. In essay 2 we develop a new methodology to decompose return variance in order to examine the informativeness embedded in the return series. The variance is decomposed into an information arrival component and a noise component. This decomposition differs from previous studies in that both the informational variance and the noise variance are time-varying. Furthermore, the covariance between the informational and noise components is no longer restricted to be zero. The resulting measure of price informativeness is defined as the informational variance divided by the total variance of the returns. The noisy rational expectations model predicts that uninformed traders react to price changes more than informed traders do, since uninformed traders cannot distinguish between price changes caused by information arrivals and price changes caused by noise. This hypothesis is tested in essay 3 using intraday data with the intraday seasonal volatility component removed, based on the procedure developed in the first essay. The resulting seasonally adjusted variance series is decomposed into components caused by unexpected information arrivals and by noise in order to examine informativeness.
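As a point of reference, the deterministic filters that the essay nests as special cases typically estimate one fixed time-of-day volatility factor per intraday bin and divide it out. A minimal sketch of that baseline on synthetic data (my illustration; the essay's autoregressive conditional process itself is not reproduced):

```python
# Sketch: deterministic intraday seasonal (U-shape) volatility filter,
# the baseline the autoregressive conditional process generalises.
# Synthetic returns; illustration only.
import numpy as np

rng = np.random.default_rng(0)
days, bins = 250, 78                      # e.g. 78 five-minute bins per day
# U-shape: volatility high at the open/close, low at midday.
u_shape = 1.0 + 0.8 * np.cos(np.linspace(0, 2 * np.pi, bins))
returns = rng.standard_normal((days, bins)) * np.sqrt(u_shape)

# Deterministic filter: one seasonal factor per bin, constant across days.
seasonal_var = (returns ** 2).mean(axis=0)       # shape (bins,)
filtered = returns / np.sqrt(seasonal_var)       # deseasonalised series

print(returns.var(axis=0)[:3])    # raw variances trace the U-shape
print(filtered.var(axis=0)[:3])   # filtered variances are ~1 in every bin
```

Because the factors are constant across days, such a filter cannot track a pattern that evolves over time, which is the motivation for the conditional process above.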
Abstract:
With the advent of peer-to-peer networks, and more importantly sensor networks, the desire to extract useful information from continuous and unbounded streams of data has become more prominent. For example, in tele-health applications, sensor-based data streaming systems are used to continuously and accurately monitor Alzheimer's patients and their surrounding environment. Typically, the requirements of such applications necessitate the cleaning and filtering of continuous, corrupted, and incomplete data streams gathered wirelessly under dynamically varying conditions. Yet existing data stream cleaning and filtering schemes are incapable of capturing the dynamics of the environment while simultaneously suppressing the losses and corruption introduced by uncertain environmental, hardware, and network conditions. Consequently, existing data cleaning and filtering paradigms are being challenged. This dissertation develops novel schemes for cleaning data streams received from a wireless sensor network operating under non-linear and dynamically varying conditions. The study establishes a paradigm for validating spatio-temporal associations among data sources to enhance data cleaning. To reduce the complexity of the validation process, the developed solution maps the requirements of the application onto a geometric space and identifies the potential sensor nodes of interest. Additionally, this dissertation models a wireless sensor network data reduction system, demonstrating that separating the data adaptation and prediction processes increases data reduction rates. The schemes presented in this study are evaluated using simulation and information-theoretic concepts. The results demonstrate that the dynamic conditions of the environment are better managed when validation is used for data cleaning. They also show that when a fast-converging adaptation process is deployed, data reduction rates improve significantly. Targeted applications of the developed methodology include machine health monitoring, tele-health, environment and habitat monitoring, intermodal transportation, and homeland security.
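The validation idea, cross-checking a reading against spatially associated neighbours before accepting it, can be sketched as follows. This is a toy illustration of the general principle, with made-up coordinates, readings, and thresholds; the dissertation's actual scheme is considerably richer:

```python
# Sketch: spatial validation of sensor readings.  A reading is flagged as
# corrupted when it deviates too far from the median of sensors within a
# radius.  Toy data and thresholds; not the dissertation's scheme.
import numpy as np

positions = np.array([[0, 0], [1, 0], [0, 1], [1, 1], [5, 5]])  # node coords
readings  = np.array([21.2, 21.0, 20.8, 35.0, 19.5])            # e.g. temperature

def validate(i, radius=1.6, tol=3.0):
    """Accept reading i iff it is within `tol` of its neighbourhood median."""
    dist = np.linalg.norm(positions - positions[i], axis=1)
    neigh = (dist > 0) & (dist <= radius)   # spatially associated nodes
    if not neigh.any():
        return True                          # isolated node: no evidence either way
    return abs(readings[i] - np.median(readings[neigh])) <= tol

for i in range(len(readings)):
    print(i, readings[i], "ok" if validate(i) else "flagged")
# Node 3 (35.0) is flagged: it disagrees with its spatial neighbours.
```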
Abstract:
My work presents a place-specific analysis of how gender paradigms interact across and within spatial scales: the global, the regional, the national, and the personal. It briefly outlines the concepts and measures defining the international gender paradigm and explores how this paradigm filters into assessments and understandings of gender and gender dynamics by and within Barbados. It does this by analyzing the contents of reports by the Barbados government to international bodies assessing the country’s performance in the area of gender equality, and by analyzing the gender-comparative content of local print news media over the 1990s and the first decade of the 2000s. It contextualizes the discussion within the realm of social and economic development. The work shows how the almost singular focus on “women” in the international gender paradigm may devalue valid gender concerns of men and thus hinder the overall goal of achieving gender equality, that is, achieving just, inclusive societies.
Abstract:
There is an enormous amount of information on the Internet about countless topics, and this information expands further every day. In theory, computer programs could benefit from this wealth of available information to establish new connections between concepts, but this information often appears in unstructured formats such as natural language text. For this reason, it is very important to be able to automatically obtain information from sources of different types, and to process, filter, and enrich it, in order to maximize the knowledge we can obtain from the Internet. This project consists of two distinct parts. The first explores information filtering. The input to the system consists of a set of triples provided by the University of Coimbra (the triples were obtained through an information extraction process applied to natural language text). However, owing to the complexity of the extraction task, some of the triples are of dubious quality and must pass through a filtering process. Given these triples about a specific topic, the input is examined to determine which information is relevant to the topic and which should be discarded. To do so, the input is compared against an online knowledge source. The second part of the project explores information enrichment. Different online text sources written in natural language (in English) are used, and information potentially relevant to the specified topic is extracted from them. Some of these knowledge sources are written in ordinary English, while others are written in Simple English, a controlled subset of the language with a reduced vocabulary and simpler syntactic structures. We study how this affects the quality of the extracted triples, and whether the information obtained from Simple English sources is of higher quality than that extracted from ordinary English sources.
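The filtering step, scoring extracted triples for relevance against a reference knowledge source, can be sketched roughly as follows. The toy reference text, triples, and threshold are mine; the project's actual online source and scoring are not specified here:

```python
# Sketch: filter extracted (subject, relation, object) triples by topical
# relevance, scored against a reference text about the topic.
import re

reference_text = """The guitar is a stringed musical instrument. Guitars
typically have six strings and a fretted neck."""
reference_vocab = set(re.findall(r"[a-z]+", reference_text.lower()))

triples = [
    ("guitar", "is_a", "instrument"),
    ("guitar", "has", "strings"),
    ("banana", "is_a", "fruit"),     # off-topic: should be discarded
]

def relevant(triple, min_hits=1):
    """Keep a triple iff enough of its terms appear in the reference."""
    subj, _, obj = triple
    hits = sum(term in reference_vocab for term in (subj, obj))
    return hits >= min_hits

kept = [t for t in triples if relevant(t)]
print(kept)   # the banana triple is filtered out
```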
Abstract:
Acknowledgements: This work has been partially supported by the European project Marrying Ontologies and Software Technologies (EU ICT2008-216691), the European project Knowledge Driven Data Exploitation (EU FP7/IAPP2011-286348), and the UK EPSRC project WhatIf (EP/J014354/1). The authors thank Prof. Ian Horrocks and Dr. Giorgos Stoilos for their helpful discussion on role subsumptions. The authors thank Rafael S. Gonçalves et al. for providing their hotspots ontologies. The authors also thank BoC-group for providing their ADOxx Metamodelling ontologies.
Abstract:
Anthropogenic CO2 emissions are acidifying the world's oceans. A growing body of evidence shows that ocean acidification impacts the growth and developmental rates of marine invertebrates. Here we test the impact of elevated seawater pCO2 (129 Pa, 1271 µatm) on early development and on larval metabolic and feeding rates in a marine model organism, the sea urchin Strongylocentrotus purpuratus. Growth and development were assessed by measuring total body length, body rod length, postoral rod length, and posterolateral rod length. Comparing these parameters between treatments suggests that larvae suffer from a developmental delay (of ca. 8%) rather than from the previously postulated reduction in size at comparable developmental stages. Further, we found increases in respiration rates of up to 100% under elevated pCO2, while body-length-corrected feeding rates did not differ between larvae from the two treatments. Calculating scope for growth shows that larvae raised under high pCO2 spent on average 39 to 45% of the available energy on somatic growth, while control larvae could allocate 78 to 80% of the available energy to growth processes. Our results highlight the importance of defining a standard frame of reference when comparing a given parameter between treatments, as observed differences can easily be due to comparing different larval ages with their specific sets of biological characters.