899 results for Information Filtering, Pattern Mining, Relevance Feature Discovery, Text Mining
Abstract:
We investigate whether dimensionality reduction using a latent generative model is beneficial for the task of weakly supervised scene classification. In detail, we are given a set of labeled images of scenes (for example, coast, forest, city, river, etc.), and our objective is to classify a new image into one of these categories. Our approach consists of first discovering latent "topics" using probabilistic Latent Semantic Analysis (pLSA), a generative model from the statistical text literature here applied to a bag of visual words representation for each image, and subsequently training a multiway classifier on the topic distribution vector for each image. We compare this approach to that of representing each image by a bag of visual words vector directly and training a multiway classifier on these vectors. To this end, we introduce a novel vocabulary using dense color SIFT descriptors and then investigate the classification performance under changes in the size of the visual vocabulary, the number of latent topics learned, and the type of discriminative classifier used (k-nearest neighbor or SVM). We achieve superior classification performance to recent publications that have used a bag of visual words representation, in all cases using the authors' own data sets and testing protocols. We also investigate the gain from adding spatial information. We show applications to image retrieval with relevance feedback and to scene classification in videos.
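The topic-discovery step described above can be illustrated with a minimal sketch of pLSA fitted by expectation-maximization. It is not the authors' implementation: the function name `plsa`, the toy count matrix, and the choice of two topics are assumptions made for illustration, and each row of `counts` stands in for one image's bag of visual words histogram. The resulting per-image topic distribution is the vector that would then be passed to a k-nearest neighbor or SVM classifier.

```python
import math
import random


def plsa(counts, n_topics, n_iter=50, seed=0):
    """Fit pLSA by EM on a document-by-word count matrix (list of lists).

    Returns P(z|d), P(w|z) and the log-likelihood recorded at each iteration.
    """
    rng = random.Random(seed)
    n_docs, n_words = len(counts), len(counts[0])

    def normalize(row):
        total = sum(row)
        if total == 0:  # guard: fall back to a uniform distribution
            return [1.0 / len(row)] * len(row)
        return [v / total for v in row]

    # Random positive initialisation of P(z|d) and P(w|z).
    p_z_d = [normalize([rng.random() + 0.1 for _ in range(n_topics)])
             for _ in range(n_docs)]
    p_w_z = [normalize([rng.random() + 0.1 for _ in range(n_words)])
             for _ in range(n_topics)]
    log_liks = []

    for _ in range(n_iter):
        new_z_d = [[0.0] * n_topics for _ in range(n_docs)]
        new_w_z = [[0.0] * n_words for _ in range(n_topics)]
        log_lik = 0.0
        for d in range(n_docs):
            for w in range(n_words):
                n = counts[d][w]
                if n == 0:
                    continue
                # E-step: P(z|d,w) is proportional to P(z|d) * P(w|z).
                joint = [p_z_d[d][z] * p_w_z[z][w] for z in range(n_topics)]
                total = sum(joint)
                log_lik += n * math.log(total)
                for z in range(n_topics):
                    resp = n * joint[z] / total
                    new_z_d[d][z] += resp
                    new_w_z[z][w] += resp
        log_liks.append(log_lik)
        # M-step: renormalise the expected counts.
        p_z_d = [normalize(row) for row in new_z_d]
        p_w_z = [normalize(row) for row in new_w_z]
    return p_z_d, p_w_z, log_liks


# Hypothetical toy data: 4 "images", 4 "visual words".
counts = [[5, 4, 0, 0], [4, 6, 0, 0], [0, 0, 5, 5], [0, 1, 4, 6]]
p_z_d, p_w_z, log_liks = plsa(counts, n_topics=2)
for row in p_z_d:
    print([round(v, 3) for v in row])  # topic distribution per image
```

Each row of `p_z_d` is a low-dimensional replacement for the raw histogram, which is the dimensionality reduction the abstract evaluates.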
Abstract:
Recommender systems attempt to predict items in which a user might be interested, given some information about the user's and items' profiles. Most existing recommender systems use content-based or collaborative filtering methods or hybrid methods that combine both techniques (see the sidebar for more details). We created Informed Recommender to address the problem of using consumer opinion about products, expressed online in free-form text, to generate product recommendations. Informed Recommender uses prioritized consumer product reviews to make recommendations. Using text-mining techniques, it maps each piece of each review comment automatically into an ontology.
Abstract:
These slides support students in understanding how to respond to the challenge of: "I’ve been told not to use Google or Wikipedia to research my essay. What else is there?" The PowerPoint guides students in how to identify high-quality, up-to-date and relevant resources on the web that they can reliably draw upon for their academic assignments. The slides were created by the subject liaison librarian who supports the School of Electronics and Computer Science at the University of Southampton, Fiona Nichols.
Abstract:
We monitored physical and chemical parameters, benthic macroinvertebrates, chlorophyll a, primary producers and organic matter over one year (2001-2002) to examine the effects of a point source on taxonomic composition, community structure, functional organization, habitat use and stoichiometry in the Tordera river (Catalonia). Downstream of the point source, nutrient concentrations, discharge and conductivity were higher than in the upstream reach, whereas dissolved oxygen was lower. Macroinvertebrate density was higher in the downstream reach, but biomass was similar in the two reaches. Taxonomic richness in the upstream reach was 20% higher than in the downstream reach. Ordination analyses clearly separated the two reaches on the first axis, while both reaches showed a similar temporal pattern on the second axis. The similarity between the two reaches in taxonomic composition, densities and biomasses after the floods of April and May 2002 indicates that flow disturbances can act as a reset mechanism for the benthic community and play an important role in the restoration of fluvial ecosystems. The two reaches had similar biomasses of periphyton, vascular plants, CPOM and FPOM, whereas chlorophyll a, filamentous algae, mosses and SPOM were higher in the downstream reach. The relative density of shredders was lower below the point source, whereas collectors and filterers were favoured. The relative biomass of shredders was also lower below the point source, but the biomass of collectors and predators increased. Relationships between the density of trophic groups and their resources were rarely significant. The relationship was better explained by macroinvertebrate biomass. The two reaches shared the same relationship for scrapers, collectors and filterers, but not for shredders and predators.
Macroinvertebrate density and biomass were positively correlated with the amount of trophic resources and habitat complexity, whereas taxonomic richness was negatively related to hydraulic parameters. The influence of inorganic substrates was less relevant to macroinvertebrate distribution. Ordination analyses show that the most relevant microhabitat variables were CPOM, chlorophyll a, filamentous algae and velocity. Sand cover was significant only in the upstream reach, and mosses only in the downstream reach. The number of significant correlations between macroinvertebrates and microhabitat variables was higher in the upstream reach than in the downstream reach, mainly because of differences in taxonomic composition. Macroinvertebrate biomass provided information similar to that obtained from density. Periphyton and mosses had similar nutrient contents in the two reaches. The %C and %N of filamentous algae were also similar in the two reaches, but the %P below the point source was double that of the upstream reach. Stoichiometric ratios in CPOM, FPOM and SPOM were considerably lower below the point source. Elemental contents and ratios were highly variable among macroinvertebrate taxa but did not differ significantly between the two reaches. Dipterans, trichopterans and ephemeropterans had similar stoichiometry, whereas C and N were lower in molluscs and P in coleopterans. Predators had higher C and N contents than the other trophic groups, whereas P was higher in filterers. Elemental imbalances between consumers and resources were smaller in the downstream reach. In autumn and winter the main nutrient source was BOM, whereas in spring and summer it was periphyton.
Abstract:
This thesis proposes a solution to the problem of estimating the motion of an Unmanned Underwater Vehicle (UUV). Our approach is based on the integration of the incremental measurements which are provided by a vision system. When the vehicle is close to the underwater terrain, it constructs a visual map (a so-called "mosaic") of the area where the mission takes place while, at the same time, it localizes itself on this map, following the Concurrent Mapping and Localization strategy. The proposed methodology to achieve this goal is based on a feature-based mosaicking algorithm. A down-looking camera is attached to the underwater vehicle. As the vehicle moves, a sequence of images of the sea-floor is acquired by the camera. For every image of the sequence, a set of characteristic features is detected by means of a corner detector. Then, their correspondences are found in the next image of the sequence. Solving the correspondence problem in an accurate and reliable way is a difficult task in computer vision. We consider different alternatives to solve this problem by introducing a detailed analysis of the textural characteristics of the image. This is done in two phases: first comparing different texture operators individually, and next selecting those that best characterize the point/matching pair and using them together to obtain a more robust characterization. Various alternatives are also studied to merge the information provided by the individual texture operators. Finally, the best approach in terms of robustness and efficiency is proposed. After the correspondences have been solved, for every pair of consecutive images we obtain a list of image features in the first image and their matchings in the next frame. Our aim is now to recover the apparent motion of the camera from these features. Although an accurate texture analysis is devoted to the matching procedure, some false matches (known as outliers) could still appear among the right correspondences. 
For this reason, a robust estimation technique is used to estimate the planar transformation (homography) which explains the dominant motion of the image. Next, this homography is used to warp the processed image to the common mosaic frame, constructing a composite image formed by every frame of the sequence. With the aim of estimating the position of the vehicle as the mosaic is being constructed, the 3D motion of the vehicle can be computed from the measurements obtained by a sonar altimeter and the incremental motion computed from the homography. Unfortunately, as the mosaic increases in size, image local alignment errors increase the inaccuracies associated to the position of the vehicle. Occasionally, the trajectory described by the vehicle may cross over itself. In this situation new information is available, and the system can readjust the position estimates. Our proposal consists not only in localizing the vehicle, but also in readjusting the trajectory described by the vehicle when crossover information is obtained. This is achieved by implementing an Augmented State Kalman Filter (ASKF). Kalman filtering appears as an adequate framework to deal with position estimates and their associated covariances. Finally, some experimental results are shown. A laboratory setup has been used to analyze and evaluate the accuracy of the mosaicking system. This setup enables a quantitative measurement of the accumulated errors of the mosaics created in the lab. Then, the results obtained from real sea trials using the URIS underwater vehicle are shown.
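The step in which each frame is warped into the common mosaic frame can be sketched by chaining the incremental homographies. This is an illustrative sketch only, not the thesis code: the helper names, the convention that each incremental homography maps frame k+1 coordinates to frame k coordinates, and the example matrices are all assumptions. Robust estimation of the homographies themselves and the Kalman filtering stage are not shown.

```python
def matmul3(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def apply_homography(h, point):
    """Map a 2D point through homography h using homogeneous coordinates."""
    x, y = point
    xh = h[0][0] * x + h[0][1] * y + h[0][2]
    yh = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return (xh / w, yh / w)


def accumulate(incremental):
    """Chain incremental homographies into frame-to-mosaic transforms.

    Assumed convention: incremental[k] maps frame k+1 coordinates to
    frame k coordinates, and frame 0 defines the mosaic frame.
    """
    identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    to_mosaic = [identity]
    for h in incremental:
        # frame k+1 -> frame k -> ... -> frame 0
        to_mosaic.append(matmul3(to_mosaic[-1], h))
    return to_mosaic


# Hypothetical incremental motions: a shift of 2 pixels in x, then a zoom of 2.
translation = [[1.0, 0.0, 2.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
scale = [[2.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 1.0]]
to_mosaic = accumulate([translation, scale])
point_in_mosaic = apply_homography(to_mosaic[2], (1.0, 1.0))
print(point_in_mosaic)  # (4.0, 2.0)
```

Because every new transform is a product of all previous ones, a small alignment error in any single homography propagates to every later frame, which is the drift that the crossover-based readjustment in the abstract is meant to correct.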
Abstract:
The scientific community working in Artificial Intelligence (AI) has carried out a great deal of work on how AI can help people find what they want on the Internet. The idea of recommender systems has been widely accepted by users. The main task of a recommender system is to locate items, information sources and people related to the interests and preferences of a person or a group of people. This involves building user models and the ability to anticipate and predict user preferences. This thesis focuses on the study of AI techniques that improve the performance of recommender systems. Initially, a detailed analysis of the current state of the art in this field was carried out. This work is organized as a taxonomy in which existing recommender systems on the Internet are classified along 8 general dimensions. This taxonomy provides an indispensable knowledge base for the design of our proposal. Case-based reasoning (CBR) is a paradigm for learning and reasoning from experience that is well suited to recommender systems because of its foundations in human reasoning. This thesis presents a new CBR approach applied to the field of recommendation, together with a forgetting mechanism for case-based profiles that controls the relevance and age of past experiences. Experimental results show that this proposal adapts profiles to users better and solves the utility problem suffered by CBR-based systems. Recommender systems dramatically improve the quality of their results when information about other users is used in making recommendations to a particular user. This thesis proposes the agentification of recommender systems in order to take advantage of interesting agent properties such as proactivity, encapsulation and social ability. 
Collaboration between agents is carried out through the opinion-based filtering method and the trust-based collaborative filtering method. Both methods rely on a social model of trust that makes agents less vulnerable to others when they collaborate. Experimental results show that the proposed collaborative recommender agents improve system performance while preserving the privacy of the user's personal data. Finally, this thesis also proposes a procedure for evaluating recommender systems that allows scientific discussion of the results. This proposal simulates user behaviour over time based on real user profiles. We hope this evaluation methodology will contribute to progress in this area of research.
Abstract:
Germin and germin-like proteins (GLPs) are encoded by a family of genes found in all plants. They are part of the cupin superfamily of biochemically diverse proteins, a superfamily that has a conserved tertiary structure, though with limited similarity in primary sequence. The subgroups of GLPs have different enzyme functions that include the two hydrogen peroxide-generating enzymes, oxalate oxidase (OxO) and superoxide dismutase. This review summarizes the sequence and structural details of GLPs and also discusses their evolutionary progression, particularly their amplification in gene number during the evolution of the land plants. In terms of function, the GLPs are known to be differentially expressed during specific periods of plant growth and development, a pattern of evolutionary subfunctionalization. They are also implicated in the response of plants to biotic (viruses, bacteria, mycorrhizae, fungi, insects, nematodes, and parasitic plants) and abiotic (salt, heat/cold, drought, nutrient, and metal) stress. Most detailed data come from studies of fungal pathogenesis in cereals. This involvement with the protection of plants from environmental stress of various types has led to numerous plant breeding studies that have found links between GLPs and QTLs for disease and stress resistance. In addition the OxO enzyme has considerable commercial significance, based principally on its use in the medical diagnosis of oxalate concentration in plasma and urine. Finally, this review provides information on the nutritional importance of these proteins in the human diet, as several members are known to be allergenic, a feature related to their thermal stability and evolutionary connection to the seed storage proteins, also members of the cupin superfamily.
Abstract:
In any data mining application, automated text and text-and-image retrieval of information is needed. This becomes essential with the growth of the Internet and digital libraries. Our approach is based on latent semantic indexing (LSI) and the corresponding term-by-document matrix suggested by Berry and his co-authors. Instead of using deterministic methods to find the required number of first "k" singular triplets, we propose a stochastic approach. First, we use a Monte Carlo method to sample and build a much smaller term-by-document matrix (e.g. a k x k matrix), from which we then find the first "k" triplets using standard deterministic methods. Second, we investigate how we can reduce the problem to finding the "k" largest eigenvalues using parallel Monte Carlo methods. We apply these methods to the initial matrix and also to the reduced one. The algorithms are run on a cluster of workstations under MPI, and results of experiments arising in textual retrieval of Web documents, as well as a comparison of the proposed stochastic methods, are presented. (C) 2003 IMACS. Published by Elsevier Science B.V. All rights reserved.
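The idea of sampling a smaller matrix and then applying a standard deterministic method can be sketched as follows. The abstract does not specify the sampling scheme, so this sketch swaps in length-squared column sampling with rescaling, and uses plain power iteration as the deterministic step; both choices, the function names, and the toy matrix are assumptions rather than the authors' algorithm. Only the leading singular value is estimated here, not the full set of "k" triplets, and nothing is parallelized.

```python
import math
import random


def sample_columns(matrix, n_samples, seed=0):
    """Build a smaller matrix by sampling columns with probability
    proportional to their squared norm, rescaling each sampled column."""
    rng = random.Random(seed)
    n_rows, n_cols = len(matrix), len(matrix[0])
    sq_norms = [sum(matrix[i][j] ** 2 for i in range(n_rows))
                for j in range(n_cols)]
    total = sum(sq_norms)
    probs = [s / total for s in sq_norms]
    picks = rng.choices(range(n_cols), weights=probs, k=n_samples)
    scales = [1.0 / math.sqrt(n_samples * probs[j]) for j in picks]
    return [[matrix[i][j] * c for j, c in zip(picks, scales)]
            for i in range(n_rows)]


def top_singular_value(matrix, n_iter=100):
    """Estimate the largest singular value by power iteration on M M^T.

    Assumes a nonzero matrix whose leading left singular vector is not
    orthogonal to the all-ones start vector (true for count matrices).
    """
    n_rows, n_cols = len(matrix), len(matrix[0])
    x = [1.0] * n_rows
    for _ in range(n_iter):
        y = [sum(matrix[i][j] * x[i] for i in range(n_rows))
             for j in range(n_cols)]          # y = M^T x
        x_new = [sum(matrix[i][j] * y[j] for j in range(n_cols))
                 for i in range(n_rows)]      # x_new = M y
        norm = math.sqrt(sum(v * v for v in x_new))
        x = [v / norm for v in x_new]
    y = [sum(matrix[i][j] * x[i] for i in range(n_rows))
         for j in range(n_cols)]
    return math.sqrt(sum(v * v for v in y))   # ||M^T x|| for unit x


# Hypothetical rank-1 term-by-document matrix (3 terms, 2 documents).
term_doc = [[3.0, 4.0], [6.0, 8.0], [6.0, 8.0]]
reduced = sample_columns(term_doc, n_samples=5)
print(top_singular_value(term_doc), top_singular_value(reduced))
```

For this rank-1 example the sampled matrix reproduces the leading singular value up to rounding, whatever columns are drawn; for a general term-by-document matrix the sampled estimate is only approximate and depends on the number of samples.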
Abstract:
Knowledge-elicitation is a common technique used to produce rules about the operation of a plant from the knowledge that is available from human expertise. Similarly, data-mining is becoming a popular technique to extract rules from the data available from the operation of a plant. In the work reported here knowledge was required to enable the supervisory control of an aluminium hot strip mill by the determination of mill set-points. A method was developed to fuse knowledge-elicitation and data-mining to incorporate the best aspects of each technique, whilst avoiding known problems. Utilisation of the knowledge was through an expert system, which determined schedules of set-points and provided information to human operators. The results show that the method proposed in this paper was effective in producing rules for the on-line control of a complex industrial process. (C) 2005 Elsevier Ltd. All rights reserved.
Abstract:
This paper provides an extended analysis of livelihood diversification in rural Tanzania, with special emphasis on artisanal and small-scale mining (ASM). Over the past decade, this sector of industry, which is labour-intensive and comprises an array of rudimentary and semi-mechanized operations, has become an indispensable economic activity throughout Sub-Saharan Africa, providing employment to a host of redundant public sector workers, retrenched large-scale mine labourers and poor farmers. In many of the region’s rural areas, it is overtaking subsistence agriculture as the primary industry. Such a pattern appears to be unfolding within the Morogoro and Mbeya regions of southern Tanzania, where findings from recent research suggest that a growing number of smallholder farmers are turning to ASM for employment and financial support. It is imperative that national rural development programmes take this trend into account and provide support to these people.
Abstract:
Howard Barker is a writer who has made several notable excursions into what he calls ‘the charnel house…of European drama.’ David Ian Rabey has observed that a compelling property of these classical works lies in what he calls ‘the incompleteness of [their] prescriptions’, and Barker’s Women Beware Women (1986), Seven Lears (1990) and Gertrude: The Cry (2002), are in turn based around the gaps and interstices found in Thomas Middleton’s Women Beware Women (c1627), Shakespeare’s King Lear (c1604) and Hamlet (c1601) respectively. This extends from representing the missing queen from King Lear, who Barker observes, ‘is barely quoted even in the depths of rage or pity’, to his new ending for Middleton’s Jacobean tragedy and the erotic revivification of Hamlet’s mother. This paper will argue that each modern reappropriation accentuates a hidden but powerful feature in these Elizabethan and Jacobean plays – namely their clash between obsessive desire, sexual transgression and death against the imposed restitution of a prescribed morality. This contradiction acts as the basis for Barker’s own explorations of eroticism, death and tragedy. The paper will also discuss Barker’s project for these ‘antique texts’, one that goes beyond what he derisively calls ‘relevance’, but attempts instead to recover ‘smothered genius’, whereby the transgressive is ‘concealed within structures that lend an artificial elegance.’ Together with Barker’s own rediscovery of tragedy, the paper will assert that these rewritings of Elizabethan and Jacobean drama expose their hidden, yet unsettling and provocative ideologies concerning the relationship between political corruption / justice through the power of sexuality (notably through the allure and danger of the mature woman), and an erotics of death that produces tragedy for the contemporary age.
Abstract:
In a world of almost permanent and rapidly increasing electronic data availability, techniques for filtering, compressing, and interpreting this data to transform it into valuable and easily comprehensible information are of utmost importance. One key topic in this area is the capability to deduce future system behavior from a given data input. This book brings together for the first time the complete theory of data-based neurofuzzy modelling and the linguistic attributes of fuzzy logic in a single cohesive mathematical framework. After introducing the basic theory of data-based modelling, new concepts including extended additive and multiplicative submodels are developed and their extensions to state estimation and data fusion are derived. All these algorithms are illustrated with benchmark and real-life examples to demonstrate their efficiency. Chris Harris and his group have carried out pioneering work which has tied together the fields of neural networks and linguistic rule-based algorithms. This book is aimed at researchers and scientists in time series modeling, empirical data modeling, knowledge discovery, data mining, and data fusion.