948 results for Semantic metrics
Abstract:
In this thesis, we propose to infer pixel-level labelling in video using only object category information, exploiting the intrinsic structure of video data. Our motivation is the observation that image-level labels are much easier to acquire than pixel-level labels, and that there is a natural link between image-level recognition and pixel-level classification in video data, which allows learned recognition models to be transferred from one domain to the other. To this end, this thesis proposes two domain adaptation approaches that adapt a deep convolutional neural network (CNN) image recognition model, trained on labelled image data, to the target domain by exploiting both the semantic evidence learned by the CNN and the intrinsic structure of unlabelled video data. Our proposed approaches explicitly model and compensate for the shift from the source domain to the target domain, which in turn underpins a robust semantic object segmentation method for natural videos. We demonstrate the superior performance of our methods through extensive evaluations on challenging datasets, in comparison with state-of-the-art methods.
Abstract:
The current study investigated the cognitive workload of sentence and clause wrap-up in younger and older readers. A large number of studies have demonstrated the presence of wrap-up effects, peaks in processing time at clause and sentence boundaries that some argue reflect attention to organizational and integrative semantic processes. However, the exact nature of these wrap-up effects is still not entirely clear, with some arguing that wrap-up is not related to processing difficulty, but rather is triggered by a low-level oculomotor response or the implicit monitoring of intonational contour. The notion that wrap-up effects are resource-demanding was directly tested by examining the degree to which sentence and clause wrap-up affects the parafoveal preview benefit. Older and younger adults read passages in which a target word N occurred in a sentence-internal, clause-final, or sentence-final position. A gaze-contingent boundary change paradigm was used in which, on some trials, a non-word preview of word N+1 was replaced by a target word once the eyes crossed an invisible boundary located between words N and N+1. All measures of reading time on word N were longer at clause and sentence boundaries than in the sentence-internal position. In the earliest measures of reading time, sentence and clause wrap-up showed evidence of reducing the magnitude of the preview benefit similarly for younger and older adults. However, this effect was moderated by age in gaze duration, such that older adults showed a complete reduction in the preview benefit in the sentence-final condition. Additionally, sentence and clause wrap-up were negatively associated with the preview benefit. Collectively, the findings from the current study suggest that wrap-up is cognitively demanding and may be less efficient with age, thus, resulting in a reduction of the parafoveal preview during normal reading.
Abstract:
Humans have a remarkable ability to extract information from visual data acquired by sight. Through a learning process that starts at birth and continues throughout life, image interpretation becomes almost instinctive. At a glance, one can easily describe a scene with reasonable precision, naming its main components. Usually, this is done by extracting low-level features such as edges, shapes and textures, and associating them with high-level meanings, producing a semantic description of the scene. An example of this is the human capacity to recognize and describe other people's physical and behavioural characteristics, or biometrics. Soft biometrics likewise represent inherent characteristics of the human body and behaviour, but do not allow unique identification of a person. The computer vision field aims to develop methods capable of performing visual interpretation with performance similar to humans. This thesis proposes computer vision methods that extract high-level information from images in the form of soft biometrics. The problem is approached in two ways, with unsupervised and supervised learning methods. The first seeks to group images via automatically learned feature extraction, combining convolution techniques, evolutionary computation and clustering; the images employed in this approach contain faces and people. The second approach employs convolutional neural networks, which can operate on raw images and learn both the feature extraction and the classification processes. Here, images are classified according to gender and to clothing, divided into upper and lower parts of the human body. The first approach, tested on different image datasets, obtained an accuracy of approximately 80% for faces versus non-faces and 70% for people versus non-people. The second, tested on images and videos, obtained an accuracy of about 70% for gender, 80% for upper-body clothing and 90% for lower-body clothing. The results of these case studies show that the proposed methods are promising, enabling automatic high-level image annotation. This opens possibilities for applications in diverse areas, such as content-based image and video search and automatic video surveillance, reducing the human effort required for manual annotation and monitoring.
Abstract:
Geographically isolated wetlands, those entirely surrounded by uplands, provide numerous ecological functions, some of which are dependent on the degree to which they are hydrologically connected to nearby waters. There is a growing need for field-validated, landscape-scale approaches for classifying wetlands based on their expected degree of connectivity with stream networks. During the 2015 water year, flow duration was recorded in non-perennial streams (n = 23) connecting forested wetlands and nearby perennial streams on the Delmarva Peninsula (Maryland, USA). Field and GIS-derived landscape metrics (indicators of catchment, wetland, non-perennial stream, and soil characteristics) were assessed as predictors of wetland-stream connectivity (duration, seasonal onset and offset dates). Connection duration was most strongly correlated with non-perennial stream geomorphology and wetland characteristics. A final GIS-based stepwise regression model (adj-R2 = 0.74, p < 0.0001) described wetland-stream connection duration as a function of catchment area, wetland area and number, and soil available water storage.
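As a rough illustration of the kind of final model described above, the sketch below fits an ordinary least squares regression of connection duration on the four GIS-derived predictors named in the abstract and reports the adjusted R². The variable names and the synthetic placeholder data are assumptions for illustration only; the study's own field and GIS measurements for the 23 monitored streams, and its stepwise variable selection, are not reproduced here.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

# Hypothetical placeholder data; the real study used 23 field-monitored
# non-perennial streams on the Delmarva Peninsula.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "catchment_area_ha": rng.uniform(5, 200, 23),
    "wetland_area_ha": rng.uniform(0.1, 20, 23),
    "wetland_count": rng.integers(1, 15, 23),
    "soil_avail_water_storage": rng.uniform(5, 30, 23),
})
# Placeholder response: wetland-stream connection duration (days per water year).
df["connection_days"] = rng.uniform(30, 365, 23)

X = sm.add_constant(df.drop(columns="connection_days"))
model = sm.OLS(df["connection_days"], X).fit()
print(model.summary())                     # coefficients and p-values
print("adjusted R^2:", model.rsquared_adj) # the statistic reported in the abstract
```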
Abstract:
Image (and video) retrieval is the problem of retrieving images (videos) similar to a query. Images (videos) are represented in an input (feature) space, and similar images (videos) are obtained by finding nearest neighbours in that representation space. Numerous input representations, in both real-valued and binary spaces, have been proposed for faster retrieval. In this thesis, we present techniques that obtain improved input representations for retrieval in both supervised and unsupervised settings for images and videos. Supervised retrieval is the well-known problem of retrieving images of the same class as the query. In the first part, we address the practical aspects of achieving faster retrieval with binary codes as input representations in the supervised setting, where binary codes are used as addresses into hash tables. In practice, using binary codes as addresses does not guarantee fast retrieval, since similar images are not mapped to the same binary code (address). We address this problem by presenting an efficient supervised hashing (binary encoding) method that aims to explicitly map all images of the same class to a unique binary code. We refer to the binary codes of the images as `Semantic Binary Codes' and to the unique code shared by all images of a class as the `Class Binary Code'. We also propose a new class-based Hamming metric that dramatically reduces retrieval times for larger databases, since Hamming distances are computed only to the class binary codes. We further propose a deep semantic binary code model, obtained by replacing the output layer of a popular convolutional neural network (AlexNet) with the class binary codes, and show that the hashing functions learned in this way outperform the state of the art while providing fast retrieval times. In the second part, we address supervised retrieval while taking into account the relationships between classes. For a given query image, we want to retrieve images that preserve the relative order, i.e. all same-class images first, followed by images of related classes before images of unrelated classes. We learn such relationship-aware binary codes by making the inner products of the binary codes match the similarities between the classes. The class similarities are computed from output embedding vectors, which are vector representations of the classes. Our method deviates from other supervised binary encoding schemes in that it is the first to use output embeddings for learning hashing functions. We also introduce new performance metrics that account for related-class retrieval results and show significant gains over the state of the art. High-dimensional descriptors such as Fisher Vectors or Vectors of Locally Aggregated Descriptors have been shown to improve the performance of many computer vision applications, including retrieval. In the third part, we discuss an unsupervised technique for compressing such high-dimensional vectors into high-dimensional binary codes to reduce storage complexity. In this approach, we deviate from traditional hyperplane hashing functions and instead learn hyperspherical hashing functions. The proposed method overcomes the computational challenges of directly applying the spherical hashing algorithm, which is intractable for compressing high-dimensional vectors.
A practical hierarchical model is presented that compresses such high-dimensional vectors with divide-and-conquer techniques using the Random Select and Adjust (RSA) procedure. We show that the resulting high-dimensional binary codes outperform binary codes obtained with traditional hyperplane methods at higher compression ratios. In the last part of the thesis, we propose a retrieval-based solution to the zero-shot event classification problem, a setting in which no training videos are available for the event. To do this, we learn a generic set of concept detectors and represent both videos and query events in the concept space. We then compute the similarity between the query event and each video in the concept space, and videos similar to the query event are classified as belonging to the event. We show that concept features from other modalities significantly boost performance.
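To make the class-based Hamming metric from the first part concrete, here is a minimal sketch of retrieval that computes Hamming distances only to the class binary codes rather than to every database item, which is what makes the lookup fast. The code length, class count, and randomly generated placeholder codes are assumptions for illustration; the learning of the semantic binary codes themselves is not implemented here.

```python
import numpy as np

def hamming_to_classes(query_code, class_codes):
    """Hamming distance from one binary query code to each class binary code.

    query_code:  (L,) array of 0/1 bits.
    class_codes: (C, L) array, one code per class.
    """
    return np.count_nonzero(class_codes != query_code, axis=1)

# Hypothetical setup: 64-bit codes, 10 classes, 1000 database images whose
# class labels were fixed when their semantic binary codes were learned.
rng = np.random.default_rng(1)
L, C, N = 64, 10, 1000
class_codes = rng.integers(0, 2, size=(C, L))
db_labels = rng.integers(0, C, size=N)

query_code = rng.integers(0, 2, size=L)
dists = hamming_to_classes(query_code, class_codes)   # C distances, not N
ranked_classes = np.argsort(dists)

# Retrieve database items class by class, nearest class first.
retrieved = np.concatenate([np.flatnonzero(db_labels == c) for c in ranked_classes])
print(retrieved[:10])
```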
Abstract:
Nearest neighbour collaborative filtering (NNCF) algorithms are commonly used in multimedia recommender systems to suggest media items based on the ratings of users with similar preferences. However, the prediction accuracy of NNCF algorithms is affected by the reduced number of items – the subset of items co-rated by both users – typically used to determine the similarity between pairs of users. In this paper, we propose a different approach, which substantially enhances the accuracy of the neighbour selection process – a user-based CF (UbCF) with semantic neighbour discovery (SND). Our neighbour discovery methodology assesses pairs of users by taking into account all items rated by at least one of the users, rather than just the set of co-rated items, semantically enriches this enlarged set of items using linked data and, finally, applies the Collinearity and Proximity Similarity (CPS) metric, which combines cosine similarity with the Chebyshev distance dissimilarity metric. We tested the proposed SND off-line against the Pearson correlation neighbour discovery algorithm using the HetRec data set, and the results show a clear improvement in terms of accuracy and execution time for the predicted recommendations.
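A minimal sketch of how a CPS-style similarity could combine the two ingredients named above is given below. The paper's exact weighting is not reproduced here, so the combination formula, the rating scale, and the treatment of unrated items are assumptions for illustration only.

```python
import numpy as np

def cps_similarity(u, v, eps=1e-12):
    """Illustrative Collinearity and Proximity Similarity between two user
    rating vectors (NaN marks unrated items).

    Collinearity is taken as the cosine similarity; proximity is derived from
    the Chebyshev distance (largest per-item rating gap). The simple product
    of the two terms below is an assumed combination, not the paper's formula.
    """
    # SND considers every item rated by at least one of the two users,
    # not only the co-rated subset; unrated items are filled with 0.
    rated = ~np.isnan(u) | ~np.isnan(v)
    u, v = np.nan_to_num(u[rated]), np.nan_to_num(v[rated])
    cosine = u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + eps)
    chebyshev = np.max(np.abs(u - v))        # worst-case disagreement
    proximity = 1.0 / (1.0 + chebyshev)      # distance turned into a similarity
    return cosine * proximity

# Toy example on a 1-5 rating scale.
alice = np.array([5.0, 4.0, np.nan, 1.0, np.nan])
bob   = np.array([4.0, np.nan, 2.0, 1.0, 5.0])
print(cps_similarity(alice, bob))
```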
Abstract:
The contribution of the left inferior prefrontal cortex to semantic processing has been widely investigated in the last decade. Converging evidence from functional imaging studies shows that this region is involved in the “executive” or “controlled” aspects of semantic processing. Here, we report a single case study of a patient, PW, with damage to the right prefrontal and temporal cortices following stroke. PW showed a deficit in the executive control of semantic processing: he could not easily override automatic but irrelevant semantic processing. This case thus demonstrates the necessary role of the right inferior prefrontal cortex in executive semantic processing. Compared with tasks previously used in the literature, our tasks placed higher demands on executive semantic processing. We suggest that the right inferior prefrontal cortex is recruited when the demands on executive semantic processing are particularly high.
Abstract:
Part 19: Knowledge Management in Networks
Abstract:
The dependence of some species on landscape structure has been demonstrated in numerous studies. So far, however, little progress has been made in integrating landscape metrics into the prediction of species associated with coastal features. Specific landscape metrics were tested as predictors of coastal shape using three coastal features of the Iberian Peninsula (beaches, capes and gulfs) at different scales. We used the landscape metrics in combination with environmental variables to model the niche and find suitable habitats for a seagrass species (Cymodocea nodosa) throughout its entire range of distribution. Landscape metrics able to capture variation in the coastline significantly enhanced the accuracy of the models, despite the limitations imposed by the scale of the study. We provide the first global model of the factors that may be shaping the environmental niche and distribution of C. nodosa throughout its range. Sea surface temperature and salinity were the most relevant variables. We identified areas that seem unsuitable for C. nodosa as well as suitable habitats not occupied by the species. We also present preliminary results of testing historical biogeographical hypotheses derived from distribution predictions under Last Glacial Maximum conditions and genetic diversity data.
Abstract:
Determination of combustion metrics for a diesel engine has the potential to provide feedback for closed-loop combustion phasing control to meet current and upcoming emission and fuel consumption regulations. This thesis focuses on the estimation of combustion metrics including start of combustion (SOC), crank angle location of 50% cumulative heat release (CA50), peak pressure crank angle location (PPCL), peak pressure amplitude (PPA), peak apparent heat release rate crank angle location (PACL), peak apparent heat release rate amplitude (PAA), and mean absolute pressure error (MAPE). In-cylinder pressure has long been used in the laboratory as the primary mechanism for characterizing combustion rates, and more recently it has been used in series production vehicles for feedback control. However, the intrusive in-cylinder pressure measurement is expensive and requires a special mounting process and engine structure modification. As an alternative, this work investigated block-mounted accelerometers to estimate combustion metrics in a 9 L I6 diesel engine, which requires the transfer path between the accelerometer signal and the in-cylinder pressure signal to be modeled. Given this transfer path, the in-cylinder pressure signal and the combustion metrics can be accurately estimated, i.e. recovered from accelerometer signals. The method for determining the transfer path, and its applicability, is therefore critical when using accelerometers for feedback. The single-input single-output (SISO) frequency response function (FRF) is the most common transfer path model; however, it is shown here to have low robustness across engine operating conditions. This thesis examines mechanisms to improve the robustness of the FRF for combustion metrics estimation. First, an adaptation process based on the particle swarm optimization algorithm was developed and added to the single-input single-output model. Second, a multiple-input single-output (MISO) FRF model coupled with principal component analysis and an offset compensation process was investigated and applied. Both approaches improved the robustness of the FRF. Furthermore, a neural network was investigated as a nonlinear model of the transfer path between the accelerometer signal and the apparent heat release rate. The transfer path between the acoustical emissions and the in-cylinder pressure signal was also investigated in this dissertation on a high pressure common rail (HPCR) 1.9 L TDI diesel engine. The acoustical emissions are an important factor in the powertrain development process. In this part of the research, a transfer path was developed between the two signals and then used to predict the engine noise level with the measured in-cylinder pressure as the input. Three transfer path modeling methods were applied, and the method based on the cepstral smoothing technique gave the most accurate results, with an average estimation error of 2 dBA and a root mean square error of 1.5 dBA. Finally, a linear model for engine noise level estimation was proposed with the in-cylinder pressure signal and the engine speed as components.
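For readers unfamiliar with frequency response function estimation, the sketch below computes a standard H1 estimate of a SISO transfer path between an in-cylinder pressure signal and a block-mounted accelerometer signal, together with the coherence that indicates where the estimate is trustworthy. The sampling rate and the synthetic signals are placeholders, and the thesis's adaptation, MISO, and neural-network extensions are not shown.

```python
import numpy as np
from scipy.signal import csd, welch, coherence

# Placeholder signals standing in for measured engine data: in the thesis,
# pressure is the in-cylinder trace and accel the block-mounted accelerometer.
fs = 50_000                                   # Hz (assumed sampling rate)
n = 10 * fs
rng = np.random.default_rng(2)
pressure = rng.standard_normal(n)             # input x(t)
accel = (np.convolve(pressure, [0.5, 0.3, 0.1], mode="same")
         + 0.1 * rng.standard_normal(n))      # output y(t) through a fake path

nperseg = 4096
f, Pxx = welch(pressure, fs=fs, nperseg=nperseg)       # input auto-spectrum
_, Pxy = csd(pressure, accel, fs=fs, nperseg=nperseg)  # cross-spectrum
H1 = Pxy / Pxx                                         # H1 FRF estimate, x -> y
_, gamma2 = coherence(pressure, accel, fs=fs, nperseg=nperseg)

# Recovering the pressure spectrum from a new accelerometer trace amounts to
# dividing its spectrum by H1 (the inverse transfer path), which is reliable
# only at frequencies where the coherence gamma2 is close to 1.
print(f[:5], np.abs(H1[:5]), gamma2[:5])
```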
Abstract:
As users continually request additional functionality, software systems will continue to grow in complexity, as well as in their susceptibility to failures. Particularly for sensitive systems requiring higher levels of reliability, faulty system modules may increase development and maintenance costs. Hence, identifying them early would support the development of reliable systems through improved scheduling and quality control. Research effort to predict software modules likely to contain faults has, as a consequence, been substantial. Although a wide range of fault prediction models have been proposed, we remain far from having reliable tools that can be widely applied to real industrial systems. For projects with known fault histories, numerous research studies show that statistical models can provide reasonable estimates when predicting faulty modules using software metrics. However, as context-specific metrics differ from project to project, prediction across projects is difficult to achieve. Prediction models built from one project's experience are ineffective at predicting fault-prone modules when applied to other projects. Hence, taking full advantage of the existing work in the software development community has been substantially limited. As a step towards solving this problem, in this dissertation we propose a fault prediction approach that exploits existing prediction models, adapting them to improve their ability to predict faulty system modules across different software projects.
Abstract:
The ontology engineering research community has focused for many years on supporting the creation, development and evolution of ontologies. Ontology forecasting, which aims at predicting semantic changes in an ontology, represents instead a new challenge. In this paper, we contribute to this novel endeavour by focusing on the task of forecasting semantic concepts in the research domain. Indeed, ontologies representing scientific disciplines contain only research topics that are already popular enough to be selected by human experts or automatic algorithms. They are thus unfit to support tasks that require the ability to describe and explore the forefront of research, such as trend detection and horizon scanning. We address this issue by introducing the Semantic Innovation Forecast (SIF) model, which predicts new concepts of an ontology at time t + 1 using only data available at time t. Our approach relies on lexical innovation and adoption information extracted from historical data. We evaluated the SIF model on a very large dataset consisting of over one million scientific papers from the Computer Science domain: the outcomes show that the proposed approach offers a competitive boost in mean average precision-at-ten compared to the baselines when forecasting over five years.
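As a reference for the evaluation measure mentioned above, a minimal sketch of mean average precision at ten (MAP@10) over yearly concept forecasts is shown below; the data structures and the toy example are assumptions for illustration, not the paper's evaluation code.

```python
def average_precision_at_k(predicted, relevant, k=10):
    """Average precision of the top-k predicted concepts against the set of
    concepts that actually entered the ontology in the following period."""
    hits, score = 0, 0.0
    for rank, concept in enumerate(predicted[:k], start=1):
        if concept in relevant:
            hits += 1
            score += hits / rank
    return score / min(len(relevant), k) if relevant else 0.0

def mean_ap_at_10(predictions, ground_truth):
    """predictions / ground_truth: dicts keyed by forecast year (hypothetical layout)."""
    aps = [average_precision_at_k(predictions[y], ground_truth[y]) for y in predictions]
    return sum(aps) / len(aps)

# Hypothetical toy example.
preds = {2015: ["deep learning", "cloud robotics", "semantic web"]}
truth = {2015: {"deep learning", "cloud robotics"}}
print(mean_ap_at_10(preds, truth))
```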
Abstract:
In this paper, the problem of semantic place categorization in mobile robotics is addressed by considering a time-based probabilistic approach called the dynamic Bayesian mixture model (DBMM), an improved variation of the dynamic Bayesian network. More specifically, multi-class semantic classification is performed by a DBMM composed of a mixture of heterogeneous base classifiers, using geometrical features computed from 2D laser-scanner data, where the sensor is mounted on board a moving robot operating indoors. Besides its capability to combine different probabilistic classifiers, the DBMM approach also incorporates time-based (dynamic) inferences in the form of previous class-conditional probabilities and priors. Extensive experiments were carried out on publicly available benchmark datasets, highlighting the influence of the number of time slices and the effect of additive smoothing on the classification performance of the proposed approach. Reported results, under different scenarios and conditions, show the effectiveness and competitive performance of the DBMM.
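A schematic sketch of a DBMM-style update is given below, assuming a weighted combination of the base classifiers' posteriors, multiplication by the previous time-slice posterior as the dynamic prior, and additive smoothing. This follows the abstract's description rather than the paper's exact formulation, and all numbers are placeholders.

```python
import numpy as np

def dbmm_update(base_posteriors, weights, prev_posterior, alpha=1e-3):
    """One schematic DBMM time-slice update over semantic place classes.

    base_posteriors: (M, C) class posteriors from M heterogeneous classifiers
                     evaluated on the current 2D laser-scan features.
    weights:         (M,) mixture weights for the base classifiers.
    prev_posterior:  (C,) posterior from the previous time slice, used here
                     as the dynamic prior.
    alpha:           additive (Laplace) smoothing constant.
    """
    mixture = weights @ base_posteriors          # weighted mixture, shape (C,)
    posterior = (mixture + alpha) * (prev_posterior + alpha)
    return posterior / posterior.sum()

# Toy run with three base classifiers and four place classes
# (e.g. corridor, office, kitchen, lab).
p = np.array([[0.7, 0.1, 0.1, 0.1],
              [0.5, 0.3, 0.1, 0.1],
              [0.6, 0.2, 0.1, 0.1]])
w = np.array([0.4, 0.3, 0.3])
prior = np.full(4, 0.25)
for _ in range(3):                               # three consecutive time slices
    prior = dbmm_update(p, w, prior)
print(prior)
```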