879 results for Focused retrieval
Abstract:
Bioacoustic data can be used for monitoring animal species diversity. The deployment of acoustic sensors enables acoustic monitoring at large temporal and spatial scales. We describe a content-based birdcall retrieval algorithm for the exploration of large databases of acoustic recordings. The algorithm combines an event-based searching scheme with compact features. In detail, ridge events are detected from audio files using event detection on spectral ridges. Event alignment is then used to search through audio files to locate candidate instances, and a similarity measure is applied to dimension-reduced spectral ridge feature vectors. Because the event-based searching method processes a smaller list of instances, retrieval is faster. The experimental results demonstrate that our features achieve a better success rate than existing methods and that the feature dimension is greatly reduced.
Abstract:
Several techniques are known for searching an ordered collection of data. The techniques and analyses of retrieval methods based on primary attributes are straightforward. Retrieval using secondary attributes depends on several factors. For secondary attribute retrieval, the linear structures (inverted lists, multilists, doubly linked lists) and the recently proposed nonlinear tree structures (multiple attribute tree (MAT) and K-d tree (kdT)) have their individual merits. It is shown in this paper that, of the two tree structures, MAT possesses several features of a systematic data structure for external file organisation which make it superior to kdT. Analytic estimates for the complexity of node searches in MAT and kdT, for several types of queries, are developed and compared.
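As a rough illustration of secondary-attribute retrieval with inverted lists, one of the linear structures named in the abstract, the sketch below maps each (attribute, value) pair to the set of records that hold it and answers a conjunctive query by intersecting those sets. The record layout, attribute names, and function names are hypothetical and chosen only for this example; the paper's MAT and kdT structures are not reproduced here.

```python
from collections import defaultdict


def build_inverted_lists(records, attributes):
    """Map each (attribute, value) pair to the set of record ids holding it."""
    index = defaultdict(set)
    for rec_id, rec in records.items():
        for attr in attributes:
            index[(attr, rec[attr])].add(rec_id)
    return index


def query(index, **conditions):
    """Return ids of records that match every attribute=value condition."""
    postings = [index.get((attr, value), set()) for attr, value in conditions.items()]
    if not postings:
        return set()
    # Intersect starting from the shortest list to keep intermediate sets small.
    postings.sort(key=len)
    result = set(postings[0])
    for posting in postings[1:]:
        result &= posting
    return result


# Hypothetical records keyed by a primary id, with two secondary attributes.
records = {
    1: {"dept": "physics", "city": "Pune"},
    2: {"dept": "physics", "city": "Delhi"},
    3: {"dept": "maths", "city": "Pune"},
}
index = build_inverted_lists(records, ["dept", "city"])
matches = query(index, dept="physics", city="Pune")
```

Each extra condition costs one more set intersection, which is the kind of per-query cost that tree structures such as MAT and kdT aim to reduce.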
Abstract:
Loads that miss in the L1 or L2 caches and wait for their data at the head of the ROB cause significant slowdown in the form of commit stalls. We identify that most of these commit stalls are caused by a small set of loads, referred to as LIMCOS (Loads Incurring Majority of COmmit Stalls). We propose simple history-based classifiers that track the commit stalls suffered by loads to help us identify this small set of loads. We study an application of these classifiers to prefetching. The classifiers are used to train the prefetcher to focus on the misses suffered by LIMCOS. This technique, referred to as focused prefetching, results in a 9.8% gain in IPC over a naive GHB-based delta correlation prefetcher, along with a 20.3% reduction in memory traffic, for a set of 17 memory-intensive SPEC2000 benchmarks. Another important impact of focused prefetching is a 61% improvement in the accuracy of prefetches. We demonstrate that the proposed classification criterion performs better than other existing criteria such as criticality and delinquent loads. We also show that the criterion of focusing on commit stalls is robust across cache levels and can be applied to any prefetcher without any modifications to the prefetcher.
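The idea of singling out the few loads responsible for most commit stalls can be sketched offline in software. The example below is an assumption-laden analogue, not the hardware classifier from the paper: it takes a hypothetical trace of (load PC, stall cycles) pairs, accumulates stall cycles per PC, and greedily picks the fewest PCs that together cover a chosen fraction of all stall cycles. The function name, trace format, and `coverage` parameter are invented for illustration.

```python
from collections import Counter


def find_limcos(stall_events, coverage=0.5):
    """Return the smallest set of load PCs covering `coverage` of stall cycles.

    stall_events: iterable of (load_pc, stall_cycles) pairs, one per commit stall.
    """
    per_pc = Counter()
    for pc, cycles in stall_events:
        per_pc[pc] += cycles
    total = sum(per_pc.values())
    selected, covered = set(), 0
    # Taking PCs in descending order of stall cycles reaches the target
    # coverage with the fewest PCs.
    for pc, cycles in per_pc.most_common():
        if total == 0 or covered >= coverage * total:
            break
        selected.add(pc)
        covered += cycles
    return selected


# Hypothetical trace: one load PC dominates the stall cycles.
events = [(0x400, 120), (0x404, 5), (0x400, 200), (0x408, 10), (0x40C, 8)]
hot_loads = find_limcos(events)
```

A prefetcher trained only on misses from the selected PCs would then ignore the remaining loads, which is the intuition behind focusing the prefetcher.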
Abstract:
The aim of this dissertation was to explore teaching in higher education from the teachers’ perspective. Two of the four studies analysed the effect of pedagogical training on approaches to teaching and on self-efficacy beliefs of teachers on teaching. Of these two studies, Study I analysed the effect of pedagogical training by applying a cross-sectional setting. The results showed that short training made teachers less student-centred and decreased their self-efficacy beliefs, as reported by the teachers themselves. However, more constant training enhanced the adoption of a student-centred approach to teaching and increased the self-efficacy beliefs of teachers as well. The teacher-focused approach to teaching was more resistant to change. Study II, on the other hand, applied a longitudinal setting. The results implied that among teachers who had not acquired more pedagogical training after Study II there were no changes in the student-focused approach scale between the measurements. However, teachers who had participated in further pedagogical training scored significantly higher on the scale measuring the student-focused approach to teaching. There were positive changes in the self-efficacy beliefs of teachers among teachers who had not participated in further training as well as among those who had. However, the analysis revealed that those teachers had the least teaching experience. Again, the teacher-focused approach was more resistant to change. Study III analysed approaches to teaching qualitatively by using a large and multidisciplinary sample in order to capture the variation in descriptions of teaching. Two broad categories of description were found: the learning-focused and the content-focused approach to teaching. The results implied that the purpose of teaching separates the two categories. In addition, the study aimed to identify different aspects of teaching in the higher-education context. Ten aspects of teaching were identified. 
While Study III explored teaching on a general level, Study IV analysed teaching on an individual level. The aim was to explore consonance and dissonance in the kinds of combinations of approaches to teaching university teachers adopt. The results showed that some teachers were clearly and systematically either learning- or content-focused. On the other hand, profiles of some teachers consisted of combinations of learning- and content-focused approaches or conceptions making their profiles dissonant. Three types of dissonance were identified. The four studies indicated that pedagogical training organised for university teachers is needed in order to enhance the development of their teaching. The results implied that the shift from content-focused or dissonant profiles towards consonant learning-focused profiles is a slow process and that teachers’ conceptions of teaching have to be addressed first in order to promote learning-focused teaching.
Abstract:
Developing a Textile Ontology for the Semantic Web and Connecting It to Museum Cataloging Data
The goal of the Semantic Web is to share concept-based information in a versatile way on the Internet. This is achievable using formal data structures called ontologies. The goal of this research is to increase the usability of museum cataloging data in information retrieval. The work is interdisciplinary, involving craft science, terminology science, computer science, and museology. In the first part of the dissertation an ontology of concepts of textiles, garments, and accessories is developed for museum cataloging work. The ontology work was done with the help of thesauri, vocabularies, research reports, and standards. The basis of the ontology development was the Museoalan asiasanasto MASA, a thesaurus for museum cataloging work which has been enriched by other vocabularies. The focus was on concepts and terms concerning the research object, as well as the material names of textiles, costumes, and accessories. The research method was terminological concept analysis complemented by an ontological view of the Semantic Web. The concept structure was based on the hierarchical generic relation. Attention was also paid to other relations between terms and concepts, and between concepts themselves. Altogether 977 concept classes were created. Issues including how to choose and name concepts for the ontology hierarchy and how deep and broad the hierarchy could be are discussed from the viewpoint of the ontology developer and museum cataloger. The second part of the dissertation analyzes why some of the cataloged terms did not match the developed textile ontology. This problem is significant because it prevents automatic ontological content integration of the cataloged data on the Semantic Web. The research datasets, i.e. the cataloged museum data on textile collections, came from three museums: Espoo City Museum, Lahti City Museum and The National Museum of Finland.
The data included 1803 textile, costume, and accessory objects. Unmatched object and textile material names were analyzed. In the case of the object names six categories (475 cases), and of the material names eight categories (423 cases), were found where automatic annotation was not possible. The most common explanation was that the cataloged field was filled with a long sentence comprised of many terms. Sometimes in the compound term, the object name and material, or the name and the way of usage, were combined. As well, numeric values in the material name cataloging field prevented annotation and so did the absence of a corresponding concept in the ontology. Ready-made drop-down lists of materials used in one cataloging system facilitated the annotation. In the case of naming objects and materials, one should use terms in basic form without attributes. The developed textile ontology has been applied in two cultural portals, MuseumFinland and Culturesampo, where one can search for and browse information based on cataloged data using integrated ontologies in an interoperable way. The textile ontology is also part of the national FinnONTO ontology infrastructure. Keywords: annotation, concept, concept analysis, cataloging, museum collection, ontology, Semantic Web, textile collection, textile material
Abstract:
Recent advances in neural language models have contributed new methods for learning distributed vector representations of words (also called word embeddings). Two such methods are the continuous bag-of-words model and the skipgram model. These methods have been shown to produce embeddings that capture higher order relationships between words that are highly effective in natural language processing tasks involving the use of word similarity and word analogy. Despite these promising results, there has been little analysis of the use of these word embeddings for retrieval. Motivated by these observations, in this paper, we set out to determine how these word embeddings can be used within a retrieval model and what the benefit might be. To this aim, we use neural word embeddings within the well known translation language model for information retrieval. This language model captures implicit semantic relations between the words in queries and those in relevant documents, thus producing more accurate estimations of document relevance. The word embeddings used to estimate neural language models produce translations that differ from previous translation language model approaches; differences that deliver improvements in retrieval effectiveness. The models are robust to choices made in building word embeddings and, even more so, our results show that embeddings do not even need to be produced from the same corpus being used for retrieval.
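A translation language model scores a document by summing, for each query word, the probability of "translating" each document word into it. The sketch below is a minimal, unsmoothed version that assumes translation probabilities are obtained by normalising cosine similarities between embeddings, with negative similarities clipped to zero. The toy two-dimensional vectors, the clipping, and the absence of collection smoothing are all simplifications for illustration and should not be read as the paper's exact estimator.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def translation_prob(q, w, emb):
    """p_t(q | w): similarity of q to w, normalised over the vocabulary."""
    if q not in emb or w not in emb:
        return 0.0
    sims = {t: max(cosine(emb[t], emb[w]), 0.0) for t in emb}
    total = sum(sims.values())
    return sims[q] / total if total else 0.0


def score(query, doc, emb):
    """log p(query | doc) = sum over q of log sum over w of p_t(q | w) p(w | doc)."""
    counts = Counter(t for t in doc if t in emb)
    n = sum(counts.values())
    log_p = 0.0
    for q in query:
        p = sum(translation_prob(q, w, emb) * c / n for w, c in counts.items())
        log_p += math.log(p) if p > 0 else float("-inf")
    return log_p


# Toy embeddings (hypothetical values): "car" and "automobile" point the same way.
emb = {"car": [1.0, 0.1], "automobile": [0.9, 0.2], "banana": [0.0, 1.0]}
score_related = score(["car"], ["automobile", "automobile"], emb)
score_unrelated = score(["car"], ["banana", "banana"], emb)
```

With these toy vectors the document about automobiles scores higher for the query "car" even though it never contains that word, which is the effect the abstract attributes to implicit semantic relations.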
Abstract:
In this paper, we discuss the measurements of spectral surface reflectance (ρ_s(λ)) in the wavelength range 350-2500 nm, measured using a spectroradiometer onboard a low-flying aircraft over Bangalore (12.95° N, 77.65° E), an urban site in southern India. The large discrepancies in the retrieval of aerosol properties over land by the Moderate-Resolution Imaging Spectroradiometer (MODIS), which could be attributed to the inaccurate estimation of surface reflectance at many sites in India and elsewhere, provided motivation for this paper. The aim of this paper was to verify the surface reflectance relationships assumed by the MODIS aerosol algorithm for the estimation of surface reflectance in the visible channels (470 and 660 nm) from the surface reflectance at 2100 nm for aerosol retrieval over land. The variety of surfaces observed in this paper includes green and dry vegetation, bare land, and urban surfaces. The measured reflectance data were first corrected for the radiative effects of the atmosphere lying between the ground and the aircraft using the Second Simulation of Satellite Signal in the Solar Spectrum (6S) radiative transfer code. The corrected surface reflectance in the MODIS blue (ρ_s(470)), red (ρ_s(660)), and shortwave-infrared (SWIR) (ρ_s(2100)) channels was linearly correlated. We found that the slope of the reflectance relationship between 660 and 2100 nm derived from the forward scattering data was 0.53 with an intercept of 0.07, whereas the slope for the relationship between the reflectance at 470 and 660 nm was 0.85. These values are much higher than the slope (~0.49) assumed for either wavelength by the MODIS aerosol algorithm over this region. The reflectance relationship for the backward scattering data has a slope of 0.39, with an intercept of 0.08, for 660 nm, and a slope of 0.65, with an intercept of 0.08, for 470 nm.
The large values of the intercept (which is very small in the MODIS reflectance relationships) result in larger values of absolute surface reflectance in the visible channels. The discrepancy between the measured and assumed surface reflectances could lead to error in the aerosol retrieval. The reflectance ratio (ρ_s(660)/ρ_s(2100)) showed a clear dependence on NDVI_SWIR, with the ratio increasing from 0.5 to 1 as NDVI_SWIR increased from 0 to 0.5. The high correlation between the reflectance at SWIR wavelengths (2100, 1640, and 1240 nm) indicated an opportunity to derive the surface reflectance and, possibly, aerosol properties at these wavelengths. More experiments are needed to characterize the surface reflectance and associated inhomogeneity of land surfaces, which play a critical role in the remote sensing of aerosols over land.
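To make the effect of the slope and intercept concrete, the sketch below evaluates the linear relationship between the 660 nm and 2100 nm reflectances using the coefficients quoted in the abstract. The input reflectance of 0.20 is a hypothetical value, and the MODIS intercept is set to zero purely as a simplification, since the abstract only describes it as very small.

```python
def visible_from_swir(rho_2100, slope, intercept=0.0):
    """Estimate visible-channel surface reflectance from the 2100 nm reflectance."""
    return slope * rho_2100 + intercept


rho_2100 = 0.20  # hypothetical SWIR surface reflectance, for illustration only

# Coefficients for 660 nm versus 2100 nm as reported in the abstract.
measured_forward = visible_from_swir(rho_2100, slope=0.53, intercept=0.07)
measured_backward = visible_from_swir(rho_2100, slope=0.39, intercept=0.08)

# Slope assumed by the MODIS algorithm; a zero intercept is an assumption here.
assumed_modis = visible_from_swir(rho_2100, slope=0.49)

# Difference between the measured and assumed relationships at this reflectance.
forward_excess = measured_forward - assumed_modis
```

Under these assumptions the measured forward-scattering relationship gives a 660 nm reflectance well above the assumed one, and most of the gap comes from the intercept rather than the slope.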
Abstract:
A wide range of models used in agriculture, ecology, carbon cycling, climate and other related studies require information on the amount of leaf material present in a given environment to correctly represent radiation, heat, momentum, water, and various gas exchanges with the overlying atmosphere or the underlying soil. Leaf area index (LAI) thus often features as a critical land surface variable in parameterisations of global and regional climate models, e.g., radiation uptake, precipitation interception, energy conversion, gas exchange and momentum, as all areas are substantially determined by the vegetation surface. Optical wavelengths of remote sensing are the common electromagnetic regions used for LAI estimations and generally for vegetation studies. The main purpose of this dissertation was to enhance the determination of LAI using close-range remote sensing (hemispherical photography), airborne remote sensing (high resolution colour and colour infrared imagery), and satellite remote sensing (high resolution SPOT 5 HRG imagery) optical observations. The commonly used light extinction models are applied at all levels of optical observations. For the sake of comparative analysis, LAI was further determined using statistical relationships between spectral vegetation index (SVI) and ground based LAI. The study areas of this dissertation focus on two regions, one located in Taita Hills, South-East Kenya characterised by tropical cloud forest and exotic plantations, and the other in Gatineau Park, Southern Quebec, Canada dominated by temperate hardwood forest. The sampling procedure of sky map of gap fraction and size from hemispherical photographs was proven to be one of the most crucial steps in the accurate determination of LAI. LAI and clumping index estimates were significantly affected by the variation of the size of sky segments for given zenith angle ranges. 
On sloping ground, gap fraction and size distributions present strong upslope/downslope asymmetry of foliage elements, and thus the correction and the sensitivity analysis for both LAI and clumping index computations were demonstrated. Several SVIs can be used for LAI mapping using empirical regression analysis, provided that the sensitivities of the SVIs at varying ranges of LAI are large enough. Large-scale LAI inversion algorithms were demonstrated and were proven to be a considerably efficient alternative approach for LAI mapping. LAI can be estimated nonparametrically from the information contained solely in the remotely sensed dataset, given that the upper-end (saturated SVI) value is accurately determined. However, further study is still required to devise a methodology as well as instrumentation to retrieve on-ground green leaf area index. The large-scale LAI inversion algorithms presented in this work could then be precisely validated. Finally, based on the literature review and this dissertation, potential future research prospects and directions were recommended.
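Light extinction models of the kind mentioned in the abstract are commonly written in Beer-Lambert form, P(θ) = exp(-G(θ) Ω L / cos θ), where P is the gap fraction at zenith angle θ, G the foliage projection function, Ω the clumping index, and L the leaf area index. The sketch below simply inverts that expression for L. The default G = 0.5 (a spherical leaf angle distribution) and the function name are assumptions for illustration, and the dissertation's own processing chain may differ.

```python
import math


def lai_from_gap_fraction(gap_fraction, zenith_deg, g=0.5, clumping=1.0):
    """Invert P = exp(-g * clumping * L / cos(theta)) for the leaf area index L."""
    if not 0.0 < gap_fraction <= 1.0:
        raise ValueError("gap fraction must lie in (0, 1]")
    cos_theta = math.cos(math.radians(zenith_deg))
    return -cos_theta * math.log(gap_fraction) / (g * clumping)


# Hypothetical measurement: 10% gap fraction at a 57.5 degree zenith angle.
lai_random = lai_from_gap_fraction(0.10, 57.5)
# A clumped canopy (clumping index below 1) needs more leaf area for the same gaps.
lai_clumped = lai_from_gap_fraction(0.10, 57.5, clumping=0.8)
```

Because L depends on the logarithm of the gap fraction, small errors in very low gap fractions translate into large errors in LAI, which is one reason the sampling of sky segments matters.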
Abstract:
The usual task in music information retrieval (MIR) is to find occurrences of a monophonic query pattern within a music database, which can contain both monophonic and polyphonic content. Query-by-humming systems are a well-known instance of content-based MIR. In such a system, the user's hummed query is converted into symbolic form to perform search operations in a similarly encoded database. The symbolic representation (e.g., textual, MIDI or vector data) is typically a quantized and simplified version of the sampled audio data, yielding faster search algorithms and space requirements that can be met in real-life situations. In this thesis, we investigate geometric approaches to MIR. We first study some musicological properties often needed in MIR algorithms, and then give a literature review on traditional (e.g., string-matching-based) MIR algorithms and novel techniques based on geometry. We also introduce some concepts from digital image processing, namely mathematical morphology, which we use to develop and implement four algorithms for geometric music retrieval. The symbolic representation in the case of our algorithms is a binary 2-D image. We use various morphological pre- and post-processing operations on the query and the database images to perform template matching / pattern recognition for the images. The algorithms are essentially extensions of the classic image correlation and hit-or-miss transformation techniques used widely in template matching applications. They aim to be a future extension to the retrieval engine of C-BRAHMS, which is a research project of the Department of Computer Science at the University of Helsinki.
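One way to picture morphological template matching on a binary piano-roll image is binary erosion, the "hit" half of a hit-or-miss transform: a translation of the query counts as an occurrence when every query pixel lands on a set pixel of the database image. The sketch below represents both images as lists of (time, pitch) points rather than as pixel arrays. It is an illustrative simplification with hypothetical names, not one of the four algorithms developed in the thesis.

```python
def find_occurrences(pattern, database):
    """Return all (dt, dp) shifts that place every pattern note on a database note.

    pattern, database: lists of (time, pitch) points, i.e. the set pixels of
    binary piano-roll images. This is binary erosion of the database image by
    the pattern, so extra notes in polyphonic content are ignored.
    """
    db = set(database)
    if not pattern:
        return []
    t0, p0 = pattern[0]
    hits = []
    # Try aligning the first pattern note with each database note in turn.
    for t, p in sorted(db):
        dt, dp = t - t0, p - p0
        if all((pt + dt, pp + dp) in db for pt, pp in pattern):
            hits.append((dt, dp))
    return hits


# Hypothetical query: three rising notes (MIDI pitches 60, 62, 64).
query_notes = [(0, 60), (1, 62), (2, 64)]
# Hypothetical polyphonic database with the motif at time 0 and, transposed, at time 4.
database_notes = [(0, 60), (1, 62), (2, 64), (2, 48),
                  (4, 65), (5, 67), (6, 69), (5, 50)]
occurrences = find_occurrences(query_notes, database_notes)
```

The second occurrence is shifted by five semitones, so this formulation finds transposed matches for free; tolerating timing or pitch errors would require the kind of morphological pre- and post-processing the abstract mentions.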
Abstract:
This thesis presents a comparative study of early childhood education (ECE) curriculum documents focused on education for sustainability (EfS) in South Korea and Australia. It examined how the national ECE curriculum documents in two culturally different contexts align with contemporary concepts of sustainability and activist early childhood education for sustainability (ECEfS) principles. Drawing on systems theory, Korean and Australian ECE curriculum documents were used as the primary sources for this study within the framework of critical document analysis (CDA). This study offers a step forward in developing culturally inclusive/holistic understandings of sustainability and more contextualised/localised approaches to ECEfS.
Abstract:
This research investigates techniques to analyse long duration acoustic recordings to help ecologists monitor birdcall activities. It designs a generalized algorithm to identify a broad range of bird species. It allows ecologists to search for arbitrary birdcalls of interest, rather than restricting them to just a very limited number of species on which the recogniser is trained. The algorithm can help ecologists find sounds of interest more efficiently by filtering out large volumes of unwanted sounds and only focusing on birdcalls.
Abstract:
The increased availability of image capturing devices has enabled collections of digital images to rapidly expand in both size and diversity. This has created a constantly growing need for efficient and effective image browsing, searching, and retrieval tools. Pseudo-relevance feedback (PRF) has proven to be an effective mechanism for improving retrieval accuracy. An original, simple yet effective rank-based PRF mechanism (RB-PRF) that takes into account the initial rank order of each image to improve retrieval accuracy is proposed. This RB-PRF mechanism innovates by making use of binary image signatures to improve retrieval precision by promoting images similar to highly ranked images and demoting images similar to lower ranked images. Empirical evaluations based on standard benchmarks, namely Wang, Oliva & Torralba, and Corel datasets demonstrate the effectiveness of the proposed RB-PRF mechanism in image retrieval.
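The promote-and-demote idea can be sketched with integer bit signatures and Hamming similarity. In the example below each image gains score for resembling the top-ranked images of the initial result list and loses score for resembling the bottom-ranked ones, with contributions weighted by how close the reference image is to the respective end of the list. The scoring formula, the 1/(r+1) weights, and the parameter `k` are assumptions made for illustration; the abstract does not specify the actual RB-PRF formula.

```python
def similarity(a, b, bits):
    """Fraction of matching bits between two integer signatures."""
    return 1.0 - bin(a ^ b).count("1") / bits


def rerank(initial_ranking, signatures, bits, k=2):
    """Re-order images using the top-k and bottom-k of the initial ranking."""
    top, bottom = initial_ranking[:k], initial_ranking[-k:]

    def feedback(img):
        # Reward similarity to highly ranked images, weighted by their rank.
        gain = sum(similarity(signatures[img], signatures[t], bits) / (r + 1)
                   for r, t in enumerate(top))
        # Penalise similarity to the lowest ranked images, last place first.
        loss = sum(similarity(signatures[img], signatures[b], bits) / (r + 1)
                   for r, b in enumerate(reversed(bottom)))
        return gain - loss

    # sorted() is stable, so ties keep their initial relative order.
    return sorted(initial_ranking, key=feedback, reverse=True)


# Hypothetical 8-bit signatures: "d" resembles "a", while "c" resembles "e".
signatures = {"a": 0b11110000, "c": 0b00001111, "d": 0b11110001, "e": 0b00000111}
initial = ["a", "c", "d", "e"]
reranked = rerank(initial, signatures, bits=8, k=1)
```

In this toy case image "d" moves above "c" because it resembles the top result and differs from the bottom one, which is the behaviour the abstract describes as promoting and demoting.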
Abstract:
A repetitive sequence collection is one where portions of a base sequence of length n are repeated many times with small variations, forming a collection of total length N. Examples of such collections are version control data and genome sequences of individuals, where the differences can be expressed by lists of basic edit operations. Flexible and efficient data analysis on such a typically huge collection is feasible using suffix trees. However, a suffix tree occupies O(N log N) bits, which very soon inhibits in-memory analyses. Recent advances in full-text self-indexing reduce the space of the suffix tree to O(N log σ) bits, where σ is the alphabet size. In practice, the space reduction is more than 10-fold, for example on the suffix tree of the human genome. However, this reduction factor remains constant when more sequences are added to the collection. We develop a new family of self-indexes suited for the repetitive sequence collection setting. Their expected space requirement depends only on the length n of the base sequence and the number s of variations in its repeated copies. That is, the space reduction factor is no longer constant, but depends on N / n. We believe the structures developed in this work will provide a fundamental basis for storage and retrieval of individual genomes as they become available due to rapid progress in sequencing technologies.
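For readers unfamiliar with suffix-based indexing, the sketch below builds a plain, uncompressed suffix array and counts pattern occurrences by binary search. It is meant only to show the kind of query that suffix trees and self-indexes answer. It does not implement the compressed self-indexes proposed in the paper, and the naive construction shown here is far too slow and large for genome-scale data.

```python
def build_suffix_array(text):
    """Return the start positions of all suffixes in lexicographic order (naive)."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def count_occurrences(text, sa, pattern):
    """Count occurrences of pattern using two binary searches over the suffix array."""
    m = len(pattern)

    def bound(strict):
        # First index whose length-m suffix prefix is >= pattern (strict=False)
        # or > pattern (strict=True).
        lo, hi = 0, len(sa)
        while lo < hi:
            mid = (lo + hi) // 2
            prefix = text[sa[mid]:sa[mid] + m]
            if prefix < pattern or (strict and prefix == pattern):
                lo = mid + 1
            else:
                hi = mid
        return lo

    return bound(True) - bound(False)


# Hypothetical repetitive text: the motif "ACG" recurs with small variations.
text = "ACGTACGAACGT"
sa = build_suffix_array(text)
n_acg = count_occurrences(text, sa, "ACG")
```

A plain suffix array stores one position per character of the collection, so its size grows with N regardless of how repetitive the data is. That growth is the limitation the abstract's self-indexes are designed to avoid.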