897 results for Optical pattern recognition -- Mathematical models
Abstract:
Automatic gender classification has many security and commercial applications. Various modalities have been investigated for gender classification, with face-based classification being the most popular. In some real-world scenarios the face may be partially occluded, and in these circumstances a classification based on individual parts of the face, known as local features, must be adopted. We investigate gender classification using lip movements. We show for the first time that important gender-specific information can be obtained from the way in which a person moves their lips during speech. Furthermore, our study indicates that the lip dynamics during speech provide greater gender-discriminative information than lip appearance alone. We also show that lip dynamics and appearance contain complementary gender information, such that a model which captures both traits gives the highest overall classification result. We use Discrete Cosine Transform based features and Gaussian Mixture Modelling to model lip appearance and dynamics, and employ the XM2VTS database for our experiments. Our experiments show that a model which captures lip dynamics along with appearance can improve gender classification rates by 16-21% compared to models of lip appearance only.
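The abstract names the building blocks (2-D DCT features of the lip region, delta features for the dynamics, and one Gaussian mixture model per gender) without implementation detail. The following is a minimal sketch of how such a pipeline could look; the coefficient count, number of mixture components, and the use of simple frame-to-frame deltas are illustrative assumptions, not the authors' exact configuration.

```python
# Minimal sketch: DCT lip features + per-gender GMMs (illustrative, not the authors' exact setup).
import numpy as np
from scipy.fft import dct
from sklearn.mixture import GaussianMixture

def lip_features(frames, n_coeffs=20):
    """2-D DCT appearance coefficients per frame plus frame-to-frame deltas (dynamics)."""
    feats = []
    for f in frames:                          # f: 2-D grayscale lip region
        c = dct(dct(f, axis=0, norm='ortho'), axis=1, norm='ortho')
        feats.append(c.flatten()[:n_coeffs])  # keep a subset of low-order coefficients
    feats = np.asarray(feats)
    deltas = np.diff(feats, axis=0, prepend=feats[:1])   # simple dynamic (delta) features
    return np.hstack([feats, deltas])         # appearance + dynamics per frame

# One GMM per gender, trained on pooled per-frame feature vectors.
gmm_male = GaussianMixture(n_components=16, covariance_type='diag')
gmm_female = GaussianMixture(n_components=16, covariance_type='diag')
# gmm_male.fit(train_feats_male); gmm_female.fit(train_feats_female)

def classify(frames):
    x = lip_features(frames)
    # Average frame log-likelihood under each model; the higher-scoring model wins.
    return 'male' if gmm_male.score(x) > gmm_female.score(x) else 'female'
```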
Abstract:
This paper presents a novel method that leverages reasoning capabilities in a computer vision system dedicated to human action recognition. The proposed methodology is decomposed into two stages. First, a machine-learning algorithm, known as bag of words, gives a first estimate of action classification from video sequences by performing an image feature analysis. Those results are afterwards passed to a common-sense reasoning system, which analyses, selects and corrects the initial estimate yielded by the machine-learning algorithm. This second stage draws on the knowledge implicit in the rationality that motivates human behaviour. Experiments are performed in realistic conditions, where poor recognition rates from the machine-learning techniques are significantly improved by the second stage, in which common-sense knowledge and reasoning capabilities are leveraged. This demonstrates the value of integrating common-sense capabilities into a computer vision pipeline. © 2012 Elsevier B.V. All rights reserved.
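As a rough illustration of the first stage only, the sketch below builds a bag-of-visual-words representation from local video descriptors and trains a linear classifier; the common-sense reasoning stage depends on an external knowledge base, so it appears only as a placeholder hook. The codebook size and classifier choice are assumptions.

```python
# Sketch of the first stage: a bag-of-visual-words classifier over local video descriptors.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import LinearSVC

def bow_histogram(descriptors, codebook):
    """Quantise local descriptors against a learned codebook and return a normalised histogram."""
    words = codebook.predict(descriptors)
    hist, _ = np.histogram(words, bins=np.arange(codebook.n_clusters + 1))
    return hist / max(hist.sum(), 1)

# codebook = KMeans(n_clusters=500).fit(np.vstack(all_training_descriptors))
# X = np.array([bow_histogram(d, codebook) for d in per_video_descriptors])
# clf = LinearSVC().fit(X, labels)

def refine_with_commonsense(scores, context):
    """Placeholder for the second stage: re-rank classifier scores using contextual knowledge."""
    return scores  # identity until a knowledge base is plugged in
```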
Abstract:
Distinct neural populations carry signals from short-wave (S) cones. We used individual differences to test whether two types of pathways, those that receive excitatory input (S+) and those that receive inhibitory input (S-), contribute independently to psychophysical performance. We also conducted a genome-wide association study (GWAS) to look for genetic correlates of the individual differences. Our psychophysical test was based on the Cambridge Color Test, but detection thresholds were measured separately for S-cone spatial increments and decrements. Our participants were 1060 healthy adults aged 16-40. Test-retest reliabilities for thresholds were good (ρ=0.64 for S-cone increments, 0.67 for decrements and 0.73 for the average of the two). "Regression scores," isolating variability unique to incremental or decremental sensitivity, were also reliable (ρ=0.53 for increments and ρ=0.51 for decrements). The correlation between incremental and decremental thresholds was ρ=0.65. No genetic markers reached genome-wide significance (p < 10⁻⁷). We identified 18 "suggestive" loci (p < 10⁻⁵). The significant test-retest reliabilities show stable individual differences in S-cone sensitivity in a normal adult population. Though a portion of the variance in sensitivity is shared between incremental and decremental sensitivity, over 26% of the variance is stable across individuals, but unique to increments or decrements, suggesting distinct neural substrates. Some of the variability in sensitivity is likely to be genetic. We note that four of the suggestive associations found in the GWAS are with genes that are involved in glucose metabolism or have been associated with diabetes.
Abstract:
The OSCAR test, a clinical device that uses counterphase flicker photometry, is believed to be sensitive to the relative numbers of long-wavelength and middle-wavelength cones in the retina, as well as to individual variations in the spectral positions of the photopigments. As part of a population study of individual variations in perception, we obtained OSCAR settings from 1058 participants. We report the distribution characteristics for this cohort. A randomly selected subset of participants was tested twice at an interval of at least one week: the test-retest reliability (Spearman's rho) was 0.80. In a whole-genome association analysis we found a provisional association with a single nucleotide polymorphism (rs16844995). This marker is close to the gene RXRG, which encodes a nuclear receptor, retinoid X receptor γ. This nuclear receptor is already known to have a role in the differentiation of cones during the development of the eye, and we suggest that polymorphisms in or close to RXRG influence the relative probability with which long-wave and middle-wave opsin genes are expressed in human cones.
Abstract:
Social signals, and the interpretation of the information they carry, are of high importance in Human-Computer Interaction. Often used for affect recognition, the cues within these signals are displayed in various modalities. Fusion of multi-modal signals is a natural and interesting way to improve automatic classification of the emotions conveyed in social signals. In most existing studies, of both uni-modal affect recognition and multi-modal fusion, decisions are forced onto fixed annotation segments across all modalities. In this paper, we investigate the less prevalent approach of event-driven fusion, which indirectly accumulates asynchronous events in all modalities for final predictions. We present a fusion approach that handles short-timed events in a vector space, which is of special interest for real-time applications. We compare results of segmentation-based uni-modal classification and fusion schemes to the event-driven fusion approach. The evaluation is carried out via detection of enjoyment episodes within the audiovisual Belfast Story-Telling Corpus.
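Since the paper describes accumulating asynchronous events from all modalities rather than classifying fixed segments, a minimal sketch of one plausible event-driven accumulator is given below; the exponential decay and the (timestamp, class, confidence) event format are assumptions, not the authors' specification.

```python
# Illustrative event-driven fusion: asynchronous events from each modality are accumulated
# into one evidence vector with exponential decay, so a decision can be read out at any time.
class EventFusion:
    def __init__(self, n_classes, half_life=2.0):
        self.scores = [0.0] * n_classes
        self.half_life = half_life           # seconds until an event's weight halves (assumed)
        self.last_t = 0.0

    def _decay(self, t):
        factor = 0.5 ** ((t - self.last_t) / self.half_life)
        self.scores = [s * factor for s in self.scores]
        self.last_t = t

    def add_event(self, t, class_id, confidence):
        """An event from any modality: (timestamp, predicted class, confidence)."""
        self._decay(t)
        self.scores[class_id] += confidence

    def decision(self, t):
        self._decay(t)
        return max(range(len(self.scores)), key=self.scores.__getitem__)
```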
Abstract:
Artificial neural network (ANN) methods are used to predict forest characteristics. The data source is the Southeast Alaska (SEAK) Grid Inventory, a ground survey compiled by the USDA Forest Service at several thousand sites. The main objective of this article is to predict characteristics at unsurveyed locations between grid sites. A secondary objective is to evaluate the relative performance of different ANNs. Data from the grid sites are used to train six ANNs: multilayer perceptron, fuzzy ARTMAP, probabilistic, generalized regression, radial basis function, and learning vector quantization. A classification and regression tree method is used for comparison. Topographic variables are used to construct models: latitude and longitude coordinates, elevation, slope, and aspect. The models classify three forest characteristics: crown closure, species land cover, and tree size/structure. Models are constructed using n-fold cross-validation. Predictive accuracy is calculated using a method that accounts for the influence of misclassification as well as measuring correct classifications. The probabilistic and generalized regression networks are found to be the most accurate. The predictions of the ANN models are compared with a classification of the Tongass National Forest in southeast Alaska based on the interpretation of satellite imagery, and are found to be of similar accuracy.
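A minimal sketch of the evaluation protocol described above, using n-fold cross-validation over the five topographic predictors; the scikit-learn models are stand-ins and do not reproduce the six networks evaluated in the article.

```python
# Sketch of the protocol: n-fold cross-validation of several classifiers on the topographic
# predictors (latitude, longitude, elevation, slope, aspect).
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

models = {
    'multilayer perceptron': MLPClassifier(hidden_layer_sizes=(32,), max_iter=2000),
    'RBF network (SVC stand-in)': SVC(kernel='rbf'),
    'classification tree': DecisionTreeClassifier(),
}

# X: rows of [latitude, longitude, elevation, slope, aspect]; y: e.g. crown-closure class labels
# for name, model in models.items():
#     scores = cross_val_score(model, X, y, cv=10)   # 10-fold cross-validation
#     print(f'{name}: mean accuracy {scores.mean():.3f}')
```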
Abstract:
Burkholderia cenocepacia causes opportunistic infections in plants, insects, animals, and humans, suggesting that “virulence” depends on the host and its innate susceptibility to infection. We hypothesized that modifications in key bacterial molecules recognized by the innate immune system modulate host responses to B. cenocepacia. Indeed, modification of lipopolysaccharide (LPS) with 4-amino-4-deoxy-L-arabinose and flagellin glycosylation attenuates B. cenocepacia infection in Arabidopsis thaliana and Galleria mellonella insect larvae. However, B. cenocepacia LPS and flagellin triggered rapid bursts of nitric oxide and reactive oxygen species in A. thaliana leading to activation of the PR-1 defense gene. These responses were drastically reduced in plants with fls2 (flagellin FLS2 host receptor kinase), Atnoa1 (nitric oxide-associated protein 1), and dnd1-1 (reduced production of nitric oxide) null mutations. Together, our results indicate that LPS modification and flagellin glycosylation do not affect recognition by plant receptors but are required for bacteria to establish overt infection.
Abstract:
PatchCity is a new approach to the procedural generation of city models. The algorithm uses texture synthesis to create a city layout in the visual style of one or more input examples. Data is provided in vector graphic form from either real or synthetic city definitions. The paper describes the PatchCity algorithm, illustrates its use, and identifies its strengths and limitations. The technique provides a greater range of features and styles of city layout than existing generative methods, thereby achieving results that are more realistic. An open source implementation of the algorithm is available.
Abstract:
In this paper we propose a novel recurrent neural network architecture for video-based person re-identification. Given the video sequence of a person, features are extracted from each frame using a convolutional neural network that incorporates a recurrent final layer, which allows information to flow between time-steps. The features from all time-steps are then combined using temporal pooling to give an overall appearance feature for the complete sequence. The convolutional network, recurrent layer, and temporal pooling layer are jointly trained to act as a feature extractor for video-based re-identification using a Siamese network architecture. Our approach makes use of colour and optical flow information in order to capture appearance and motion cues that are useful for video re-identification. Experiments are conducted on the iLIDS-VID and PRID-2011 datasets to show that this approach outperforms existing methods of video-based re-identification.
Project source code: https://github.com/niallmcl/Recurrent-Convolutional-Video-ReID
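The repository above is the authors' implementation. Purely as an illustration of the architecture described in the abstract (per-frame convolutional features, a recurrent layer across time-steps, and temporal pooling into a single sequence embedding), a small PyTorch sketch follows; the layer sizes and the five-channel input (colour plus optical flow) are placeholders, not the published configuration.

```python
# Illustrative sketch: per-frame CNN, recurrent layer over time-steps, temporal (mean) pooling.
import torch
import torch.nn as nn

class RecurrentReID(nn.Module):
    def __init__(self, feat_dim=128):
        super().__init__()
        self.cnn = nn.Sequential(                       # per-frame feature extractor
            nn.Conv2d(5, 16, 5, padding=2), nn.ReLU(), nn.MaxPool2d(2),  # 3 colour + 2 flow channels
            nn.Conv2d(16, 32, 5, padding=2), nn.ReLU(), nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.rnn = nn.RNN(32, feat_dim, batch_first=True)  # lets information flow between time-steps

    def forward(self, clip):                             # clip: (batch, time, channels, H, W)
        b, t = clip.shape[:2]
        frame_feats = self.cnn(clip.flatten(0, 1)).view(b, t, -1)
        out, _ = self.rnn(frame_feats)
        return out.mean(dim=1)                           # temporal pooling -> sequence embedding

# A Siamese setup would embed two clips and train with a verification/contrastive loss:
# emb_a, emb_b = model(clip_a), model(clip_b); loss = contrastive(emb_a, emb_b, same_identity)
```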
Abstract:
The relationship between epidemiology, mathematical modelling and computational tools makes it possible to build and test theories about the development and control of a disease. This thesis is motivated by the study of epidemiological models applied to infectious diseases from an Optimal Control perspective, with particular emphasis on Dengue. A tropical and subtropical disease transmitted by mosquitoes, Dengue affects about 100 million people per year and is considered by the World Health Organization a major public-health concern. The mathematical models developed and tested in this work are based on ordinary differential equations that describe the dynamics underlying the disease, namely the interaction between humans and mosquitoes. An analytical study of these models is carried out with respect to their equilibrium points, stability and basic reproduction number. The spread of Dengue can be attenuated through measures that control the transmitting vector, such as the use of specific insecticides and educational campaigns. Since the development of a potential vaccine has been a recent worldwide commitment, models based on the simulation of a hypothetical vaccination process in a population are proposed. Based on Optimal Control theory, the optimal strategies for the use of these controls, and their repercussions on the reduction or eradication of the disease during an outbreak in the population, are analysed under a bioeconomic approach. The formulated problems are solved numerically using direct and indirect methods. The former discretize the problem, reformulating it as a nonlinear optimization problem. Indirect methods use Pontryagin's Maximum Principle as a necessary condition to find the optimal curve for the respective control. Several numerical software packages are used in these two strategies. Throughout this work there was always a compromise between the realism of the epidemiological models and their mathematical tractability.
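A generic host-vector system of ordinary differential equations of the kind described, integrated numerically, is sketched below; the compartments, rates and initial values are illustrative placeholders rather than the models or parameter estimates developed in the thesis, and no control (insecticide or vaccination) term is included.

```python
# Generic host-vector ODE system (humans: S-I-R, mosquitoes: S-I), integrated with scipy.
# Parameter values are arbitrary placeholders, not the thesis' estimates.
import numpy as np
from scipy.integrate import odeint

def dengue_ode(y, t, beta_hv, beta_vh, gamma, mu_v):
    Sh, Ih, Rh, Sv, Iv = y
    Nh = Sh + Ih + Rh
    dSh = -beta_hv * Sh * Iv / Nh                    # humans infected by infectious mosquitoes
    dIh = beta_hv * Sh * Iv / Nh - gamma * Ih        # human recovery at rate gamma
    dRh = gamma * Ih
    dSv = mu_v * (Sv + Iv) - beta_vh * Sv * Ih / Nh - mu_v * Sv   # mosquito birth/death balance
    dIv = beta_vh * Sv * Ih / Nh - mu_v * Iv
    return [dSh, dIh, dRh, dSv, dIv]

t = np.linspace(0, 120, 1000)                        # days
y0 = [9990, 10, 0, 5000, 50]                         # initial humans (S, I, R) and mosquitoes (S, I)
sol = odeint(dengue_ode, y0, t, args=(0.30, 0.25, 1 / 7, 1 / 10))
```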
Abstract:
Empirical studies concerning face recognition suggest that faces may be stored in memory by a few canonical representations. Models of visual perception are based on image representations in cortical area V1 and beyond, which contain many cell layers for feature extraction. Simple, complex and end-stopped cells tuned to different spatial frequencies (scales) and/or orientations provide input for line, edge and keypoint detection. This yields a rich, multi-scale object representation that can be stored in memory in order to identify objects. The multi-scale, keypoint-based saliency maps for Focus-of-Attention can be exploited to obtain face detection and normalization, after which face recognition can be achieved using the line/edge representation. In this paper, we focus only on face normalization, showing that multi-scale keypoints can be used to construct canonical representations of faces in memory.
Abstract:
Object recognition requires that templates with canonical views are stored in memory. Such templates must somehow be normalised. In this paper we present a novel method for obtaining 2D translation, rotation and size invariance. Cortical simple, complex and end-stopped cells provide multi-scale maps of lines, edges and keypoints. These maps are combined such that objects are characterised. Dynamic routing in neighbouring neural layers allows feature maps of input objects and stored templates to converge. We illustrate the construction of group templates and the invariance method for object categorisation and recognition in the context of a cortical architecture, which can be applied in computer vision.
Abstract:
A biological disparity energy model can estimate local depth information by using a population of V1 complex cells. Instead of applying an analytical model which explicitly involves cell parameters like spatial frequency, orientation, binocular phase and position difference, we developed a model which only involves the cells’ responses, such that disparity can be extracted from a population code, using only a set of cells previously trained with random-dot stereograms of uniform disparity. Despite good results in smooth regions, the model needs complementary processing, notably at depth transitions. We therefore introduce a new model to extract disparity at keypoints such as edge junctions, line endings and points with large curvature. Responses of end-stopped cells serve to detect keypoints, and those of simple cells are used to detect the orientations of their underlying line and edge structures. Annotated keypoints are then used in the left-right matching process, with a hierarchical, multi-scale tree structure and a saliency map to segregate disparity. By combining both models we can (re)define depth transitions and regions where the disparity energy model is less accurate.
Abstract:
All systems found in nature exhibit, to different degrees, nonlinear behaviour. To emulate this behaviour, classical system identification techniques typically use linear models, for mathematical simplicity. Models inspired by biological principles (artificial neural networks) and linguistically motivated models (fuzzy systems), due to their universal approximation property, are becoming alternatives to classical mathematical models. In system identification, the design of this type of model is an iterative process, requiring, among other steps, the identification of the model structure as well as the estimation of the model parameters. This thesis addresses the applicability of gradient-based algorithms for the parameter estimation phase, and the use of evolutionary algorithms for model structure selection, in the design of neuro-fuzzy systems, i.e., models that offer the transparency property found in fuzzy systems but use, for their design, algorithms introduced in the context of neural networks. A new methodology, based on the minimization of the integral of the error and exploiting the parameter separability property typically found in neuro-fuzzy systems, is proposed for parameter estimation. A recent evolutionary technique (bacterial algorithms), based on the natural phenomenon of microbial evolution, is combined with genetic programming, and the resulting algorithm, bacterial programming, is advocated for structure determination. Different versions of this evolutionary technique are combined with gradient-based algorithms, solving problems found in fuzzy and neuro-fuzzy design, namely the incorporation of a priori knowledge, the initialization of gradient algorithms, and model complexity reduction.
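To illustrate the parameter-separability property mentioned above: in Takagi-Sugeno-type neuro-fuzzy models, once the antecedent membership functions are fixed, the consequent parameters enter the output linearly and can be obtained in a single linear least-squares step. The sketch below uses Gaussian memberships and a toy single-input target; both are illustrative choices, not the thesis' models.

```python
# Separable parameters: with fixed memberships, consequent parameters solve a linear LS problem.
import numpy as np

def gaussian_mf(x, c, s):
    return np.exp(-0.5 * ((x - c) / s) ** 2)

def firing_strengths(x, centres, sigma=0.5):
    w = np.array([gaussian_mf(x, c, sigma) for c in centres]).T   # (N, n_rules)
    return w / w.sum(axis=1, keepdims=True)                        # normalised rule activations

# Single-input example: y is approximated by sum_i w_i(x) * (a_i * x + b_i).
x = np.linspace(-3, 3, 200)
y = np.sin(x)                                   # toy target
centres = [-2.0, 0.0, 2.0]                      # fixed antecedent (membership) centres
w = firing_strengths(x, centres)
Phi = np.hstack([w * x[:, None], w])            # regressor columns: w_i * x and w_i
theta, *_ = np.linalg.lstsq(Phi, y, rcond=None) # consequent parameters in one LS step
y_hat = Phi @ theta
```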
Abstract:
The modelling of the experimental data of the extraction of the volatile oil from six aromatic plants (coriander, fennel, savoury, winter savoury, cotton lavender and thyme) was performed using five mathematical models based on differential mass balances. In all cases the extraction was internal-diffusion controlled, and the internal mass transfer coefficients (k_s) were found to change with pressure, temperature and particle size. For fennel, savoury and cotton lavender, the external mass transfer and the equilibrium phase also influenced the second extraction period, since k_s changed with the tested flow rates. In general, the axial dispersion coefficient could be neglected for the conditions studied, since Peclet numbers were high. On the other hand, the solute-matrix interaction had to be considered in order to ensure a satisfactory description of the experimental data.