216 results for Filtering
Abstract:
Big Data presents many challenges related to volume, whether one is interested in studying past datasets or, even more problematically, attempting to work with live streams of data. The most obvious challenge, in a 'noisy' environment such as contemporary social media, is to collect the pertinent information, be that information for a specific study, tweets which can inform emergency services or other responders to an ongoing crisis, or information which gives an advantage to those involved in prediction markets. Such a process is often iterative, with keywords and hashtags changing over time, so both collection and analytic methodologies need to be continually adapted in response. While many of the datasets collected and analyzed are preformed, that is, built around a particular keyword, hashtag, or set of authors, they still contain a large volume of information, much of which is unnecessary for the current purpose and/or potentially useful for future projects. Accordingly, this panel considers methods for separating and combining data to optimize big data research and report findings to stakeholders.

The first paper considers possible coding mechanisms for incoming tweets during a crisis, taking a large stream of incoming tweets and selecting those which need to be placed immediately in front of responders for manual filtering and possible action. The paper suggests two solutions: content analysis and user profiling. In the former, aspects of the tweet are assigned a score to assess its likely relationship to the topic at hand and the urgency of the information, whilst the latter attempts to identify users who either serve as amplifiers of information or are known as authoritative sources. Through these techniques, the information contained in a large dataset can be filtered down to match the expected capacity of emergency responders, and knowledge of the core keywords or hashtags relating to the current event is constantly refined for future data collection.

The second paper is also concerned with identifying significant tweets, in this case tweets relevant to a particular prediction market: tennis betting. As increasing numbers of professional sportsmen and sportswomen create Twitter accounts to communicate with their fans, information is being shared regarding injuries, form, and emotions which has the potential to affect future results. As has already been demonstrated with leading US sports, such information is extremely valuable. Tennis, like American Football (NFL) and Baseball (MLB), has paid subscription services which manually filter incoming news sources, including tweets, for information valuable to gamblers, gambling operators, and fantasy sports players. However, whilst such services remain niche operations, much of the value of the information is lost by the time it reaches one of them. The paper therefore considers how information could be filtered from Twitter user lists and hashtag or keyword monitoring, assessing the value of the source, the information, and the prediction markets to which it may relate.

The third paper examines methods for collecting Twitter data and following changes in an ongoing, dynamic social movement, such as the Occupy Wall Street movement. It involves the development of technical infrastructure to collect tweets and make them available for exploration and analysis. A strategy to respond to changes in the social movement is also required, or the resulting tweets will only reflect the discussions and strategies the movement used at the time the keyword list was created; in this sense, keyword creation is part strategy and part art. The paper describes strategies for the creation of a social media archive, specifically of tweets related to the Occupy Wall Street movement, and methods for continuing to adapt data collection as the movement's presence on Twitter changes over time. It also discusses opportunities and methods for extracting smaller slices of data from an archive of social media data to support a multitude of research projects in multiple fields of study.

The common theme amongst these papers is that of constructing a dataset, filtering it for a specific purpose, and then using the resulting information to aid future data collection. The intention is that, through the papers presented and the subsequent discussion, the panel will inform the wider research community not only about the objectives and limitations of data collection, live analytics, and filtering, but also about current and in-development methodologies that could be adopted by those working with such datasets, and how such approaches could be customized depending on the project stakeholders.
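As a concrete illustration of the first paper's two filtering routes, the sketch below scores incoming tweets by topic relevance and urgency terms, boosts accounts known to be authoritative, and keeps only as many tweets as responders can handle. This is a minimal sketch under assumed data structures; the term lists, weights, and account names are invented placeholders, not the panel's actual coding scheme.

```python
from dataclasses import dataclass

@dataclass
class Tweet:
    author: str
    text: str

# Hypothetical seed lists; in practice these are refined iteratively as the event unfolds.
TOPIC_TERMS = {"flood", "evacuation", "qldfloods"}
URGENCY_TERMS = {"trapped", "help", "urgent", "injured"}
AUTHORITATIVE = {"QPSmedia", "BOM_au"}

def score(tweet: Tweet) -> float:
    """Combine topic relevance, urgency, and author authority into one score."""
    words = {w.strip("#.,!?:").lower() for w in tweet.text.split()}
    relevance = len(words & TOPIC_TERMS)
    urgency = len(words & URGENCY_TERMS)
    authority = 2.0 if tweet.author in AUTHORITATIVE else 0.0
    return relevance + 1.5 * urgency + authority

def triage(stream, capacity: int):
    """Keep only the highest-scoring tweets, matched to responder capacity."""
    return sorted(stream, key=score, reverse=True)[:capacity]

queue = [Tweet("someuser", "URGENT: family trapped by flood near bridge, please help"),
         Tweet("other", "nice weather today")]
print([t.text for t in triage(queue, capacity=1)])
```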
Abstract:
Topic modelling has been widely used in fields such as information retrieval, text mining, and machine learning. In this paper, we propose a novel model, the Pattern Enhanced Topic Model (PETM), which improves topic modelling by semantically representing topics with discriminative patterns, and which also makes an innovative contribution to information filtering by using PETM to determine document relevance based on topic distributions and the maximum matched patterns proposed in this paper. Extensive experiments are conducted to evaluate the effectiveness of PETM using the TREC data collection Reuters Corpus Volume 1. The results show that the proposed model significantly outperforms both state-of-the-art term-based models and pattern-based models.
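The abstract's scoring idea, combining a topic distribution with the best-matching pattern per topic, can be sketched as follows. This is an illustrative reading, not the paper's actual PETM formulation; the data structures and the weighting of matched patterns are assumptions.

```python
def doc_relevance(doc_terms: set,
                  topic_dist: dict,           # topic id -> P(topic | user interest)
                  topic_patterns: dict) -> float:  # topic id -> [(pattern_terms, weight)]
    score = 0.0
    for topic, p_topic in topic_dist.items():
        # "Maximum matched pattern": the highest-weighted pattern fully
        # contained in the document, taken per topic.
        matched = [w for terms, w in topic_patterns.get(topic, [])
                   if terms <= doc_terms]
        if matched:
            score += p_topic * max(matched)
    return score

# Toy usage with invented topics and patterns.
topics = {0: 0.7, 1: 0.3}
patterns = {0: [({"stream", "cipher"}, 0.9), ({"cipher"}, 0.4)],
            1: [({"topic", "model"}, 0.8)]}
print(doc_relevance({"stream", "cipher", "attack"}, topics, patterns))  # 0.63
```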
Abstract:
In vivo confocal microscopy (IVCM) is an emerging technology that provides minimally invasive, high resolution, steady-state assessment of the ocular surface at the cellular level. Several challenges still remain but, at present, IVCM may be considered a promising technique for clinical diagnosis and management. This mini-review summarizes some key findings in IVCM of the ocular surface, focusing on recent and promising attempts to move "from bench to bedside". IVCM allows prompt diagnosis, disease course follow-up, and management of potentially blinding atypical forms of infectious processes, such as Acanthamoeba and fungal keratitis. This technology has improved our knowledge of corneal alterations and some of the processes that affect the visual outcome after lamellar keratoplasty and excimer keratorefractive surgery. In dry eye disease, IVCM has provided new information on the whole-ocular surface morphofunctional unit. It has also improved understanding of pathophysiologic mechanisms and helped in the assessment of prognosis and treatment. IVCM is particularly useful in the study of corneal nerves, enabling description of the morphology, density, and disease- or surgery-induced alterations of nerves, particularly the subbasal nerve plexus. In glaucoma, IVCM constitutes an important aid to evaluate filtering blebs, to better understand the conjunctival wound healing process, and to assess corneal changes induced by topical antiglaucoma medications and their preservatives. IVCM has significantly enhanced our understanding of the ocular response to contact lens wear. It has provided new perspectives at a cellular level on a wide range of contact lens complications, revealing findings that were not previously possible to image in the living human eye. The final section of this mini-review focuses on advances in confocal microscopy imaging, including 2D wide-field mapping, 3D reconstruction of the cornea, and automated image analysis.
Abstract:
The 2008 NASA Astrobiology Roadmap provides one way of theorising this developing field, a way which has become the normative model for the discipline: science- and scholarship-driven funding for space. By contrast, a novel re-evaluation of funding policies is undertaken in this article to reframe astrobiology, terraforming, and associated space travel and research. Textual visualisation, discourse and numeric analytical methods, and value theory are applied to historical data and contemporary sources to re-investigate significant drivers of and constraints on the mechanisms enabling space exploration. Two data sets are identified and compared: the business objectives and outcomes of major 15th-17th century European joint-stock exploration and trading companies, and a case study of a current space industry entrepreneur company. Comparison of these analyses suggests that viable funding policy drivers can exist outside the normative science- and scholarship-driven roadmap. The two drivers identified in this study are (1) the intrinsic value of space as a territory to be experienced and enjoyed, not just studied, and (2) the instrumental, commercial value of exploiting these experiences by developing infrastructure and retail revenues. Filtering of these results also offers an investment rationale for companies operating in, or about to enter, the space business marketplace.
Abstract:
The presence of spam in a document ranking is a major issue for Web search engines. Common approaches that cope with spam remove from the document rankings those pages that are likely to contain spam. These approaches are implemented as post-retrieval processes that filter out spam pages only after documents have been retrieved in response to a user's query. In this paper we propose removing spam pages at indexing time, thereby obtaining a pruned index that is virtually "spam-free". We investigate the benefits of this approach from three points of view: indexing time, index size, and retrieval performance. Not surprisingly, we found that the strategy decreases both the time required by the indexing process and the space required for storing the index. More surprisingly, we found that retrieval performance on a spam-pruned version of a collection's index does not differ from that obtained with traditional post-retrieval spam filtering approaches.
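A minimal sketch of the indexing-time idea: pages judged spammy never enter the inverted index, so no post-retrieval filtering is needed. The toy classifier and threshold below are assumptions standing in for whatever spam detector is actually used.

```python
from collections import defaultdict

def build_pruned_index(docs, spam_score, threshold=0.5):
    """Build an inverted index, skipping pages judged spam at indexing time
    rather than filtering them out after retrieval."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        if spam_score(text) >= threshold:
            continue  # pruned: the page never enters the index
        for term in text.lower().split():
            index[term].add(doc_id)
    return index

# Toy spam scorer (an assumption; any real classifier could stand in here).
def spam_score(text):
    spammy = {"viagra", "casino", "cheap"}
    words = text.lower().split()
    return sum(w in spammy for w in words) / max(len(words), 1)

docs = {1: "cheap viagra casino deals", 2: "information retrieval evaluation"}
print(build_pruned_index(docs, spam_score))  # only doc 2 is indexed
```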
Abstract:
Rakaposhi is a synchronous stream cipher which uses three main components: a non-linear feedback shift register (NLFSR), a dynamic linear feedback shift register (DLFSR), and a non-linear filtering function (NLF). The NLFSR consists of 128 bits and is initialised by the secret key K. The DLFSR holds 192 bits and is initialised by an initial vector (IV). The NLF takes 8-bit inputs and returns a single output bit. This work identifies weaknesses and properties of the cipher. The main observation is that the initialisation procedure has the so-called sliding property, which can be used to launch distinguishing and key recovery attacks. The distinguisher needs four observations of related (K, IV) pairs. The key recovery algorithm recovers the secret key K after observing 2^9 pairs of (K, IV). Based on the proposed related-key attack, the number of related (K, IV) pairs is 2^(128+192)/4. Further, the cipher is studied when the registers enter short cycles. When the NLFSR is set to all ones, the cipher degenerates to a linear feedback shift register with a non-linear filter; consequently, the initial state (and the secret key and IV) can be recovered with complexity 2^63.87. If the DLFSR is set to all zeros, the NLF reduces to a low-non-linearity filter function. As a result, the cipher is insecure, allowing an adversary to distinguish it from a random cipher after 2^17 observations of keystream bits. There is also a key recovery algorithm that finds the secret key with complexity 2^54.
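To make the architecture concrete, here is a toy register-plus-filter keystream generator in the same spirit: a small non-linear feedback register whose state is tapped through a non-linear output filter. The sizes, tap positions, and Boolean functions are invented for illustration and are not Rakaposhi's actual NLFSR, DLFSR, or NLF. Note how, if the AND terms were forced constant (as in the degenerate register states discussed above), both the feedback and the filter would become linear.

```python
def keystream(state, n_bits, taps=(0, 2, 3, 5), filter_taps=(1, 4, 6, 7)):
    """Toy 8-bit NLFSR with a non-linear output filter; state is a list of 0/1 bits."""
    out = []
    for _ in range(n_bits):
        # Non-linear filter: XOR of two state bits plus one AND term.
        z = (state[filter_taps[0]] ^ state[filter_taps[1]]
             ^ (state[filter_taps[2]] & state[filter_taps[3]]))
        out.append(z)
        # Non-linear feedback: XOR of two state bits plus one AND term.
        fb = state[taps[0]] ^ state[taps[1]] ^ (state[taps[2]] & state[taps[3]])
        state = state[1:] + [fb]
    return out

print(keystream([1, 0, 1, 1, 0, 0, 1, 0], 16))
```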
Abstract:
E-mail spam has remained a scourge and a menacing nuisance for users, internet and network service operators, and providers, in spite of the anti-spam techniques available, and spammers relentlessly circumvent the anti-spam techniques embedded or installed as software products on both the client and server sides of both fixed and mobile devices. This continuous evasion degrades the capabilities of these anti-spam techniques, as none of them provides a comprehensive, reliable solution to the problem posed by spam and spammers. A major problem arises, for instance, when these anti-spam techniques misjudge or misclassify legitimate emails as spam (false positives), or fail to block spam on the sending SMTP server (false negatives). The spam then passes on to the receiver, yet the server from which it originates neither notices nor has an auto-alert service to indicate that the spam it was designed to prevent has slipped through to the receiver's SMTP server; the receiver's SMTP server may in turn fail to stop the spam from reaching the user's device, again with no auto-alert mechanism to report this failure, causing staggering costs in lost time, effort, and money. This paper takes a comparative overview of the literature on some of these anti-spam techniques, especially the filtering technologies designed to prevent spam, examining their merits and demerits with a view to enhancing their capabilities, and offers evaluative, analytical recommendations that will be the subject of further research.
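The false-positive/false-negative trade-off the abstract describes is straightforward to quantify. A minimal sketch, assuming boolean "flagged as spam" predictions against ground-truth labels:

```python
def spam_filter_error_rates(predictions, labels):
    """False-positive rate (legitimate mail flagged as spam) and
    false-negative rate (spam delivered as if legitimate)."""
    fp = sum(p and not y for p, y in zip(predictions, labels))
    fn = sum((not p) and y for p, y in zip(predictions, labels))
    legit = sum(not y for y in labels)
    spam = sum(labels)
    return fp / max(legit, 1), fn / max(spam, 1)

# predictions: does the filter say spam?  labels: is it actually spam?
preds = [True, False, True, False]
truth = [True, True, False, False]
print(spam_filter_error_rates(preds, truth))  # (0.5, 0.5)
```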
Abstract:
Autonomous navigation and picture compilation tasks require robust feature descriptions or models. Given the non-Gaussian nature of sensor observations, it will be shown that Gaussian mixture models provide a general probabilistic representation allowing analytical solutions to the update and prediction operations of the general Bayesian filtering problem. Each operation in the Bayesian filter for Gaussian mixture models multiplicatively increases the number of parameters in the representation, leading to the need for a re-parameterisation step. A computationally efficient re-parameterisation step will be demonstrated, resulting in a compact and accurate estimate of the true distribution.
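The parameter growth is easy to see in one dimension: multiplying an M-component prior by an N-component likelihood yields M x N posterior components, which must then be collapsed. In the sketch below, a crude keep-the-heaviest-components rule stands in for the paper's computationally efficient re-parameterisation step; all numbers are toy values.

```python
import math

def gauss_pdf(x, m, v):
    return math.exp(-(x - m) ** 2 / (2 * v)) / math.sqrt(2 * math.pi * v)

def product(mix_a, mix_b):
    """Product of two 1-D Gaussian mixtures: component count grows to M*N."""
    out = []
    for wa, ma, va in mix_a:
        for wb, mb, vb in mix_b:
            v = 1.0 / (1.0 / va + 1.0 / vb)
            m = v * (ma / va + mb / vb)
            w = wa * wb * gauss_pdf(ma, mb, va + vb)  # pairwise agreement weight
            out.append((w, m, v))
    s = sum(w for w, _, _ in out)
    return [(w / s, m, v) for w, m, v in out]

def reduce_mixture(mix, k):
    """Crude re-parameterisation: keep the k heaviest components, renormalise."""
    mix = sorted(mix, reverse=True)[:k]
    s = sum(w for w, _, _ in mix)
    return [(w / s, m, v) for w, m, v in mix]

prior = [(0.6, 0.0, 1.0), (0.4, 3.0, 2.0)]          # (weight, mean, variance)
likelihood = [(0.5, 0.5, 0.5), (0.5, 2.5, 0.5)]
posterior = reduce_mixture(product(prior, likelihood), k=2)  # 4 components -> 2
print(posterior)
```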
Abstract:
This thesis describes the investigation of an Aircraft Dynamic Navigation (ADN) approach, which incorporates an Aircraft Dynamic Model (ADM) directly into the navigation filter of a fixed-wing aircraft or UAV. The result is a novel approach that offers both performance improvements and increased reliability during short-term GPS outages. This is important in allowing future UAVs to achieve routine, unconstrained, and safe operations in commercial environments. The primary contribution of this research is the formulation of an Unscented Kalman Filter (UKF) which incorporates a complex, non-linear, laterally and longitudinally coupled ADM, and a sensor suite consisting of a Global Positioning System (GPS) receiver, an Inertial Measurement Unit (IMU), an Electronic Compass (EC), and an Air Data (AD) pitot-static system.
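For readers unfamiliar with the filter at the core of this approach, a generic UKF predict/update step can be sketched in a few lines. The process and measurement models f and h below are arbitrary placeholders, not the thesis's coupled ADM or its GPS/IMU/EC/AD measurement models.

```python
import numpy as np

def sigma_points(x, P, alpha=1.0, beta=2.0, kappa=0.0):
    """Scaled unscented transform: sigma points and mean/covariance weights."""
    n = len(x)
    lam = alpha ** 2 * (n + kappa) - n
    S = np.linalg.cholesky((n + lam) * P)   # columns are matrix square-root directions
    pts = [x] + [x + S[:, i] for i in range(n)] + [x - S[:, i] for i in range(n)]
    wm = np.full(2 * n + 1, 1.0 / (2 * (n + lam)))
    wc = wm.copy()
    wm[0] = lam / (n + lam)
    wc[0] = wm[0] + (1 - alpha ** 2 + beta)
    return np.array(pts), wm, wc

def ukf_step(x, P, z, f, h, Q, R):
    """One predict/update cycle for process model f and measurement model h."""
    X, wm, wc = sigma_points(x, P)
    Xp = np.array([f(s) for s in X])                       # propagate sigma points
    xp = wm @ Xp
    Pp = Q + sum(w * np.outer(s - xp, s - xp) for w, s in zip(wc, Xp))
    Zp = np.array([h(s) for s in Xp])                      # predicted measurements
    zp = wm @ Zp
    Pzz = R + sum(w * np.outer(s - zp, s - zp) for w, s in zip(wc, Zp))
    Pxz = sum(w * np.outer(sx - xp, sz - zp) for w, sx, sz in zip(wc, Xp, Zp))
    K = Pxz @ np.linalg.inv(Pzz)                           # Kalman gain
    return xp + K @ (z - zp), Pp - K @ Pzz @ K.T

# Toy usage: constant-velocity model with a position-only measurement.
f = lambda s: np.array([s[0] + 0.1 * s[1], s[1]])
h = lambda s: s[:1]
x, P = np.zeros(2), np.eye(2)
x, P = ukf_step(x, P, z=np.array([0.3]), f=f, h=h, Q=0.01 * np.eye(2), R=np.eye(1))
print(x)
```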
Abstract:
As high-throughput genetic marker screening systems are essential for a range of genetics studies and plant breeding applications, the International RosBREED SNP Consortium (IRSC) has utilized the Illumina Infinium® II system to develop a medium- to high-throughput SNP screening tool for genome-wide evaluation of allelic variation in apple (Malus×domestica) breeding germplasm. For genome-wide SNP discovery, 27 apple cultivars were chosen to represent worldwide breeding germplasm and re-sequenced at low coverage with the Illumina Genome Analyzer II. Following alignment of these sequences to the whole genome sequence of 'Golden Delicious', SNPs were identified using SoapSNP. A total of 2,113,120 SNPs were detected, corresponding to one SNP every 288 bp of the genome. The Illumina GoldenGate® assay was then used to validate a subset of 144 SNPs with a range of characteristics, using a set of 160 apple accessions. This validation assay enabled fine-tuning of the final subset of SNPs for the Illumina Infinium® II system. The stringent set of filtering criteria developed allowed selection of SNPs that not only exhibited an even distribution across the apple genome and a range of minor allele frequencies to ensure utility across germplasm, but also were located in putative exonic regions to maximize genotyping success rate. A total of 7867 apple SNPs were established for the IRSC apple 8K SNP array v1, of which 5554 were polymorphic after evaluation in segregating families and a germplasm collection. This publicly available genomics resource will provide an unprecedented resolution of SNP haplotypes, which will enable marker-locus-trait association discovery, description of the genetic architecture of quantitative traits, investigation of genetic variation (neutral and functional), and genomic selection in apple.
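The filtering criteria described (adequate minor allele frequency, even genomic spacing, and exonic location) can be sketched as a single pass over SNP candidates. Field names, thresholds, and the toy records below are illustrative assumptions, not the consortium's actual pipeline.

```python
def filter_snps(snps, min_maf=0.05, min_gap_bp=50_000):
    """Keep exonic SNPs with adequate minor allele frequency, spaced out
    along each chromosome for even genome coverage."""
    kept, last_pos = [], {}
    for snp in sorted(snps, key=lambda s: (s["chrom"], s["pos"])):
        if snp["maf"] < min_maf or not snp["exonic"]:
            continue
        prev = last_pos.get(snp["chrom"], -min_gap_bp)
        if snp["pos"] - prev >= min_gap_bp:
            kept.append(snp)
            last_pos[snp["chrom"]] = snp["pos"]
    return kept

candidates = [
    {"chrom": "1", "pos": 10_000, "maf": 0.21, "exonic": True},
    {"chrom": "1", "pos": 20_000, "maf": 0.33, "exonic": True},   # too close
    {"chrom": "1", "pos": 90_000, "maf": 0.02, "exonic": True},   # too rare
    {"chrom": "2", "pos": 5_000,  "maf": 0.40, "exonic": False},  # intronic
]
print(len(filter_snps(candidates)))  # 1
```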
Abstract:
This paper presents a method for the estimation of thrust model parameters of uninhabited airborne systems using specific flight tests. Particular tests are proposed to simplify the estimation. The proposed estimation method is based on three steps. The first step uses a regression model in which the thrust is assumed constant. This allows us to obtain biased initial estimates of the aerodynamic coefficients of the surge model. In the second step, a robust nonlinear state estimator is implemented using the initial parameter estimates, and the model is augmented by treating the thrust as a random walk. In the third step, the estimate of the thrust obtained by the observer is used to fit a polynomial model in terms of the propeller advance ratio. We consider a numerical example based on Monte-Carlo simulations to quantify the sampling properties of the proposed estimator under realistic flight conditions.
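The third step reduces to an ordinary polynomial regression of the observer's thrust estimates against the propeller advance ratio J = V/(nD). A minimal sketch with synthetic numbers (the airspeeds, propeller data, and thrust values are invented):

```python
import numpy as np

# Airspeed V (m/s), propeller speed n (rev/s), and diameter D (m): toy values.
V = np.array([10.0, 15.0, 20.0, 25.0, 30.0])
n = np.full(5, 100.0)
D = 0.4
J = V / (n * D)                               # propeller advance ratio

T_hat = np.array([9.1, 8.2, 6.8, 5.1, 3.0])   # thrust estimates from the observer (N)

coeffs = np.polyfit(J, T_hat, deg=2)          # quadratic thrust model in J
print(coeffs, np.polyval(coeffs, 0.5))        # fitted coefficients; thrust at J = 0.5
```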
Abstract:
A robust visual tracking system requires an object appearance model that is able to handle occlusion, pose, and illumination variations in the video stream. This can be difficult to accomplish when the model is trained using only a single image. In this paper, we first propose a tracking approach based on affine subspaces (constructed from several images) which are able to accommodate the abovementioned variations. We use affine subspaces not only to represent the object, but also the candidate areas that the object may occupy. We furthermore propose a novel approach to measure affine subspace-to-subspace distance via the use of non-Euclidean geometry of Grassmann manifolds. The tracking problem is then considered as an inference task in a Markov Chain Monte Carlo framework via particle filtering. Quantitative evaluation on challenging video sequences indicates that the proposed approach obtains considerably better performance than several recent state-of-the-art methods such as Tracking-Learning-Detection and MILtrack.
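The subspace-to-subspace distance used here can be illustrated with principal angles: for two subspaces spanned by the columns of matrices A and B, the geodesic distance on the Grassmann manifold is the norm of the vector of principal angles. The sketch below handles linear subspaces only, ignoring the affine offset the paper also models.

```python
import numpy as np

def grassmann_distance(A, B):
    """Geodesic distance between the subspaces spanned by the columns of A
    and B, via principal angles (SVD of the product of orthonormal bases)."""
    Qa, _ = np.linalg.qr(A)
    Qb, _ = np.linalg.qr(B)
    cos_angles = np.clip(np.linalg.svd(Qa.T @ Qb, compute_uv=False), -1.0, 1.0)
    theta = np.arccos(cos_angles)   # principal angles between the subspaces
    return np.linalg.norm(theta)

rng = np.random.default_rng(0)
A = rng.standard_normal((10, 3))   # e.g., a basis built from several object templates
B = rng.standard_normal((10, 3))   # a basis from a candidate image area
print(grassmann_distance(A, A), grassmann_distance(A, B))  # 0.0 and > 0
```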
Abstract:
This paper presents a practical recursive fault detection and diagnosis (FDD) scheme for online identification of actuator faults for unmanned aerial systems (UASs) based on the unscented Kalman filtering (UKF) method. The proposed FDD algorithm aims to monitor the health status of actuators and provide reliable indication of actuator faults, offering the information necessary for designing fault-tolerant flight control systems that compensate for side effects and improve fail-safe capability when actuator faults occur. Fault detection is conducted by designing separate UKFs to detect aileron and elevator faults using a nonlinear six-degree-of-freedom (DOF) UAS model. Fault diagnosis is achieved by isolating true faults using the Bayesian Classifier (BC) method together with a decision criterion to avoid false alarms. High-fidelity simulations with and without measurement noise, with practical constraints considered, are conducted for typical actuator fault scenarios, and the proposed FDD exhibits consistent effectiveness in identifying the occurrence of actuator faults, verifying its suitability for integration into the design of fault-tolerant flight control systems for the emergency landing of UASs.
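A common way to turn UKF residuals into a fault flag is a normalised innovation squared (NIS) test with a persistence requirement; the sketch below uses that simple rule as a stand-in for the paper's Bayesian-classifier decision criterion. The thresholds and the injected bias are arbitrary toy values.

```python
import numpy as np

def fault_monitor(residuals, S, threshold=9.0, persist=5):
    """Declare an actuator fault when the normalised innovation squared
    stays above a threshold for several consecutive steps; the persistence
    requirement guards against false alarms."""
    count = 0
    for k, (r, s) in enumerate(zip(residuals, S)):
        nis = float(r @ np.linalg.inv(s) @ r)
        count = count + 1 if nis > threshold else 0
        if count >= persist:
            return k  # step at which a fault is declared
    return None

# Toy data: healthy residuals, then a bias injected at step 30.
rng = np.random.default_rng(1)
res = rng.standard_normal((60, 2))
res[30:] += 4.0
S = np.tile(np.eye(2), (60, 1, 1))   # innovation covariances
print(fault_monitor(res, S))         # detection shortly after step 30
```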
Abstract:
The development of global navigation satellite systems (GNSS) nowadays provides solutions to many applied problems with ever higher quality and accuracy. Research carried out by the Bavarian Academy of Sciences and Humanities in Munich (BAW) in the field of airborne gravimetry is based on sophisticated processing of data from high-frequency GNSS receivers for kinematic aircraft positioning. The applied algorithms for inertial acceleration determination rely on the high sampling rate (50 Hz) and on reducing factors such as ionospheric scintillation and multipath in the aircraft/antenna near field. The quality of the GNSS-derived kinematic heights is also studied by intercomparison with lift height variations collected by a precise, high-sampling-rate vertical scale [1]. This work addresses ways of determining mini-aircraft altitude more accurately by means of high-frequency GNSS receivers, in particular by considering their dynamic behaviour.
Abstract:
The modern stage of development of automatic control and navigation systems for small, reusable unmanned aerial vehicles (UAVs) imposes demanding requirements on the autonomy, accuracy, and size of these systems. The contradictory nature of these requirements dictates a functional and algorithmic tight coupling of several different onboard sensors into one computational process based on methods of optimal filtering. Strapdown inertial navigation systems (INS) that fuse data from micro-electromechanical inertial measurement units, barometric pressure sensors, and receivers of global navigation satellite systems (GNSS) are now in wide use. However, such systems do not fully meet the requirements for jamming immunity, fault tolerance, autonomy, and accuracy of the navigation solution. At the same time, significant progress has recently been demonstrated by navigation systems that apply the correlation-extremal principle to optical data flow and digital terrain maps. This article proposes a new architecture for an automatic navigation management system (ANMS) for small reusable UAVs, which combines algorithms of a strapdown INS, a satellite navigation system, and an optical navigation system.
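A toy illustration of the proposed fusion idea: an inertial prediction corrected by whichever aiding measurement (satellite position or optical-flow velocity) is available in a given cycle. This one-dimensional Kalman sketch is purely illustrative; the state model, noise values, and function names are assumptions, not the article's ANMS algorithms.

```python
import numpy as np

def fuse_step(x, P, acc, dt, z_gnss=None, z_optical=None,
              q=0.5, r_gnss=4.0, r_opt=0.25):
    """One cycle of a toy loosely-coupled fusion filter over a 1-D
    position/velocity state: inertial prediction, then a correction from
    GNSS position and/or optical-flow velocity when available."""
    F = np.array([[1.0, dt], [0.0, 1.0]])
    x = F @ x + np.array([0.5 * dt ** 2, dt]) * acc     # inertial mechanisation
    P = F @ P @ F.T + q * np.eye(2)
    for z, H, r in ((z_gnss, np.array([[1.0, 0.0]]), r_gnss),      # position aid
                    (z_optical, np.array([[0.0, 1.0]]), r_opt)):   # velocity aid
        if z is None:
            continue  # that aiding source is unavailable (e.g., GNSS jammed)
        S = H @ P @ H.T + r
        K = P @ H.T / S
        x = x + (K * (z - H @ x)).ravel()
        P = (np.eye(2) - K @ H) @ P
    return x, P

x, P = np.array([0.0, 0.0]), np.eye(2)
x, P = fuse_step(x, P, acc=0.1, dt=0.02, z_gnss=0.05)   # GNSS available this cycle
print(x)
```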