998 resultados para picture recognition
Resumo:
Pictorial representations of three-dimensional objects are often used to investigate animal cognitive abilities; however, investigators rarely evaluate whether the animals conceptualize the two-dimensional image as the object it is intended to represent. We tested for picture recognition in lion-tailed macaques by presenting five monkeys with digitized images of familiar foods on a touch screen. Monkeys viewed images of two different foods and learned that they would receive a piece of the one they touched first. After demonstrating that they would reliably select images of their preferred foods on one set of foods, animals were transferred to images of a second set of familiar foods. We assumed that if the monkeys recognized the images, they would spontaneously select images of their preferred foods on the second set of foods. Three monkeys selected images of their preferred foods significantly more often than chance on their first transfer session. In an additional test of the monkeys' picture recognition abilities, animals were presented with pairs of food images containing a medium-preference food paired with either a high-preference food or a low-preference food. The same three monkeys selected the medium-preference foods significantly more often when they were paired with low-preference foods and significantly less often when those same foods were paired with high-preference foods. Our novel design provided convincing evidence that macaques recognized the content of two-dimensional images on a touch screen. Results also suggested that the animals understood the connection between the two-dimensional images and the three-dimensional objects they represented.
Resumo:
Here we use two filtered speech tasks to investigate children’s processing of slow (<4 Hz) versus faster (∼33 Hz) temporal modulations in speech. We compare groups of children with either developmental dyslexia (Experiment 1) or speech and language impairments (SLIs, Experiment 2) to groups of typically-developing (TD) children age-matched to each disorder group. Ten nursery rhymes were filtered so that their modulation frequencies were either low-pass filtered (<4 Hz) or band-pass filtered (22 – 40 Hz). Recognition of the filtered nursery rhymes was tested in a picture recognition multiple choice paradigm. Children with dyslexia aged 10 years showed equivalent recognition overall to TD controls for both the low-pass and band-pass filtered stimuli, but showed significantly impaired acoustic learning during the experiment from low-pass filtered targets. Children with oral SLIs aged 9 years showed significantly poorer recognition of band pass filtered targets compared to their TD controls, and showed comparable acoustic learning effects to TD children during the experiment. The SLI samples were also divided into children with and without phonological difficulties. The children with both SLI and phonological difficulties were impaired in recognizing both kinds of filtered speech. These data are suggestive of impaired temporal sampling of the speech signal at different modulation rates by children with different kinds of developmental language disorder. Both SLI and dyslexic samples showed impaired discrimination of amplitude rise times. Implications of these findings for a temporal sampling framework for understanding developmental language disorders are discussed.
Resumo:
While close talking microphones give the best signal quality and produce the highest accuracy from current Automatic Speech Recognition (ASR) systems, the speech signal enhanced by microphone array has been shown to be an effective alternative in a noisy environment. The use of microphone arrays in contrast to close talking microphones alleviates the feeling of discomfort and distraction to the user. For this reason, microphone arrays are popular and have been used in a wide range of applications such as teleconferencing, hearing aids, speaker tracking, and as the front-end to speech recognition systems. With advances in sensor and sensor network technology, there is considerable potential for applications that employ ad-hoc networks of microphone-equipped devices collaboratively as a virtual microphone array. By allowing such devices to be distributed throughout the users’ environment, the microphone positions are no longer constrained to traditional fixed geometrical arrangements. This flexibility in the means of data acquisition allows different audio scenes to be captured to give a complete picture of the working environment. In such ad-hoc deployment of microphone sensors, however, the lack of information about the location of devices and active speakers poses technical challenges for array signal processing algorithms which must be addressed to allow deployment in real-world applications. While not an ad-hoc sensor network, conditions approaching this have in effect been imposed in recent National Institute of Standards and Technology (NIST) ASR evaluations on distant microphone recordings of meetings. The NIST evaluation data comes from multiple sites, each with different and often loosely specified distant microphone configurations. This research investigates how microphone array methods can be applied for ad-hoc microphone arrays. A particular focus is on devising methods that are robust to unknown microphone placements in order to improve the overall speech quality and recognition performance provided by the beamforming algorithms. In ad-hoc situations, microphone positions and likely source locations are not known and beamforming must be achieved blindly. There are two general approaches that can be employed to blindly estimate the steering vector for beamforming. The first is direct estimation without regard to the microphone and source locations. An alternative approach is instead to first determine the unknown microphone positions through array calibration methods and then to use the traditional geometrical formulation for the steering vector. Following these two major approaches investigated in this thesis, a novel clustered approach which includes clustering the microphones and selecting the clusters based on their proximity to the speaker is proposed. Novel experiments are conducted to demonstrate that the proposed method to automatically select clusters of microphones (ie, a subarray), closely located both to each other and to the desired speech source, may in fact provide a more robust speech enhancement and recognition than the full array could.
Resumo:
Robust speaker verification on short utterances remains a key consideration when deploying automatic speaker recognition, as many real world applications often have access to only limited duration speech data. This paper explores how the recent technologies focused around total variability modeling behave when training and testing utterance lengths are reduced. Results are presented which provide a comparison of Joint Factor Analysis (JFA) and i-vector based systems including various compensation techniques; Within-Class Covariance Normalization (WCCN), LDA, Scatter Difference Nuisance Attribute Projection (SDNAP) and Gaussian Probabilistic Linear Discriminant Analysis (GPLDA). Speaker verification performance for utterances with as little as 2 sec of data taken from the NIST Speaker Recognition Evaluations are presented to provide a clearer picture of the current performance characteristics of these techniques in short utterance conditions.
Resumo:
In an earlier paper (Part I) we described the construction of Hermite code for multiple grey-level pictures using the concepts of vector spaces over Galois Fields. In this paper a new algebra is worked out for Hermite codes to devise algorithms for various transformations such as translation, reflection, rotation, expansion and replication of the original picture. Also other operations such as concatenation, complementation, superposition, Jordan-sum and selective segmentation are considered. It is shown that the Hermite code of a picture is very powerful and serves as a mathematical signature of the picture. The Hermite code will have extensive applications in picture processing, pattern recognition and artificial intelligence.
Resumo:
This study is part of an ongoing collaborative bipolar research project, the Jorvi Bipolar Study (JoBS). The JoBS is run by the Department of Mental Health and Alcohol Research of the National Public Health Institute, Helsinki, and the Department of Psychiatry, Jorvi Hospital, Helsinki University Central Hospital (HUCH), Espoo, Finland. It is a prospective, naturalistic cohort study of secondary level care psychiatric in- and outpatients with a new episode of bipolar disorder (BD). The second report also included 269 major depressive disorder (MDD) patients from the Vantaa Depression Study (VDS). The VDS was carried out in collaboration with the Department of Psychiatry of the Peijas Medical Care District. Using the Mood Disorder Questionnaire (MDQ), all in- and outpatients at the Department of Psychiatry at Jorvi Hospital who currently had a possible new phase of DSM-IV BD were sought. Altogether, 1630 psychiatric patients were screened, and 490 were interviewed using a semistructured interview (SCID-I/P). The patients included in the cohort (n=191) had at intake a current phase of BD. The patients were evaluated at intake and at 6- and 18-month interviews. Based on this study, BD is poorly recognized even in psychiatric settings. Of the BD patients with acute worsening of illness, 39% had never been correctly diagnosed. The classic presentations of BD with hospitalizations, manic episodes, and psychotic symptoms lead clinicians to correct diagnosis of BD I in psychiatric care. Time of follow-up elapsed in psychiatric care, but none of the clinical features, seemed to explain correct diagnosis of BD II, suggesting reliance on cross- sectional presentation of illness. Even though BD II was clearly less often correctly diagnosed than BD I, few other differences between the two types of BD were detected. BD I and II patients appeared to differ little in terms of clinical picture or comorbidity, and the prevalence of psychiatric comorbidity was strongly related to the current illness phase in both types. At the same time, the difference in outcome was clear. BD II patients spent about 40% more time depressed than BD I patients. Patterns of psychiatric comorbidity of BD and MDD differed somewhat qualitatively. Overall, MDD patients were likely to have more anxiety disorders and cluster A personality disorders, and bipolar patients to have more cluster B personality disorders. The adverse consequences of missing or delayed diagnosis are potentially serious. Thus, these findings strongly support the value of screening for BD in psychiatric settings, especially among the major depressive patients. Nevertheless, the diagnosis must be based on a clinical interview and follow-up of mood. Comorbidity, present in 59% of bipolar patients in a current phase, needs concomitant evaluation, follow-up, and treatment. To improve outcome in BD, treatment of bipolar depression is a major challenge for clinicians.
Resumo:
The “distractor-frequency effect” refers to the finding that high-frequency (HF) distractor words slow picture naming less than low-frequency distractors in the picture–word interference paradigm. Rival input and output accounts of this effect have been proposed. The former attributes the effect to attentional selection mechanisms operating during distractor recognition, whereas the latter attributes it to monitoring/decision mechanisms operating on distractor and target responses in an articulatory buffer. Using high-density (128-channel) EEG, we tested hypotheses from these rival accounts. In addition to conducting stimulus- and response-locked whole-brain corrected analyses, we investigated the correct-related negativity, an ERP observed on correct trials at fronto-central electrodes proposed to reflect the involvement of domain general monitoring. The wholebrain ERP analysis revealed a significant effect of distractor frequency at inferior right frontal and temporal sites between 100 and 300-msec post-stimulus onset, during which lexical access is thought to occur. Response-locked, region of interest (ROI) analyses of fronto-central electrodes revealed a correct-related negativity starting 121 msec before and peaking 125 msec after vocal onset on the grand averages. Slope analysis of this component revealed a significant difference between HF and lowfrequency distractor words, with the former associated with a steeper slope on the time windowspanning from100 msec before to 100 msec after vocal onset. The finding of ERP effects in time windows and components corresponding to both lexical processing and monitoring suggests the distractor frequency effect is most likely associated with more than one physiological mechanism.
Resumo:
A computer may gather a lot of information from its environment in an optical or graphical manner. A scene, as seen for instance from a TV camera or a picture, can be transformed into a symbolic description of points and lines or surfaces. This thesis describes several programs, written in the language CONVERT, for the analysis of such descriptions in order to recognize, differentiate and identify desired objects or classes of objects in the scene. Examples are given in each case. Although the recognition might be in terms of projections of 2-dim and 3-dim objects, we do not deal with stereoscopic information. One of our programs (Polybrick) identifies parallelepipeds in a scene which may contain partially hidden bodies and non-parallelepipedic objects. The program TD works mainly with 2-dimensional figures, although under certain conditions successfully identifies 3-dim objects. Overlapping objects are identified when they are transparent. A third program, DT, works with 3-dim and 2-dim objects, and does not identify objects which are not completely seen. Important restrictions and suppositions are: (a) the input is assumed perfect (noiseless), and in a symbolic format; (b) no perspective deformation is considered. A portion of this thesis is devoted to the study of models (symbolic representations) of the objects we want to identify; different schemes, some of them already in use, are discussed. Focusing our attention on the more general problem of identification of general objects when they substantially overlap, we propose some schemes for their recognition, and also analyze some problems that are met.
Resumo:
Methods are presented (1) to partition or decompose a visual scene into the bodies forming it; (2) to position these bodies in three-dimensional space, by combining two scenes that make a stereoscopic pair; (3) to find the regions or zones of a visual scene that belong to its background; (4) to carry out the isolation of objects in (1) when the input has inaccuracies. Running computer programs implement the methods, and many examples illustrate their behavior. The input is a two-dimensional line-drawing of the scene, assumed to contain three-dimensional bodies possessing flat faces (polyhedra); some of them may be partially occluded. Suggestions are made for extending the work to curved objects. Some comparisons are made with human visual perception. The main conclusion is that it is possible to separate a picture or scene into the constituent objects exclusively on the basis of monocular geometric properties (on the basis of pure form); in fact, successful methods are shown.
Resumo:
Social work has a central role in negotiating and supporting birth family contact following adoption from care. This paper argues that family display (Finch) offers a useful conceptual resource for understanding relationships in the adoptive kinship network as they are enacted through contact. It reports on an interpretative phenomenological analysis of adoptive parents' accounts of open adoption from care that revealed direct and indirect contact to be contexts in which they and birth relatives performed family display practices: communicating the meaning of their respective relationships with the adopted child and seeking recognition that this was a legitimate family relationship. The analysis explores how family display was performed, and the impact of validating or invalidating responses. It aims to illuminate these social and interpretive processes involved in adoptive kinship in order to inform social work support for contact. The findings suggest that successful contact may be promoted by helping adoptive and birth relatives validate the legitimacy of the other's kin connection with the child, and through arrangements that facilitate family-like interactions.
Resumo:
The present study examined individual differences in Absorption and fantasy, as well as in Achiievement and achievement striving as possible moderators of the perceptual closure effect found by Snodgrass and Feenan (1990). The study also examined whether different instructions (experiential versus instrumental) interact with the personality variables to moderate the relationship between priming and subsequent performance on a picture completion task. 1 28 participants completed two sessions, one to fill out the MPQ and NEO personality inventories and the other to complete the experimental task. The experimental task consisted of a priming phase and a test phase, with pictures presented on a computer screen for both phases. Participants were shown 30 pictures in the priming phase, and then shovm the 30 primed pictures along with 30 new pictures for the test phase. Participants were randomly assigned to receive one of the two different instruction sets for the task. Two measures of performance were calculated, most fragmented measure and threshold. Results of the present study confirm that a five-second exposure time is long enough to produce the perceptual closure effect. The analysis of the two-way interaction effects indicated a significant quadratic interaction of Absorption with priming level on threshold performance. The results were in the opposite direction of predictions. Possible explanations for the Absorption results include lack of optimal conditions, lack of intrinsic motivation and measurement problems. Primary analyses also revealed two significant between-subject effects of fantasy and achievement striving on performance collapsed across priming levels. These results suggest that fantasy has a beneficial effect on performance at test for pictures primed at all levels, whereas achievement striving seems to have an adverse effect on performance at test for pictures primed at all levels. Results of the secondary analyses with a revised threshold performance measure indicated a significant quadratic interaction of Absorption, condition and priming level. In the experiential condition, test performance, based on Absorption scores for pictures primed at level 4, showed a positive slope and performance for pictures primed at levels 1 and 7 based on Absorption showed a negative slope. The reverse effect was found in the instrumental condition. The results suggest that Absorption, in combination with experiential involvement, may affect implicit memory. A second significant result of the secondary analyses was a linear three-way interaction of Achievement, condition and priming level on performance. Results suggest that as Achievement scores increased, test performance improved for less fragmented primed pictures in the instrumental condition and test performance improved for more highly fragmented primes in the experiential condition. Results from the secondary analyses suggest that the revised threshold measure may be more sensitive to individual differences. Results of the exploratory analyses with Openness to Experience, Conscientiousness and agentic positive emotionality (PEM-A) measures indicated no significant effects of any of these personality variables. Results suggest that facets of the scales may be more useful with regard to perceptual research, and that future research should examine narrowly focused personality traits as opposed to broader constructs.
Resumo:
Name agreement is the extent to which different people agree on a name for a particular picture. Previous studies have found that it takes longer to name low name agreement pictures than high name agreement pictures. To examine the effect of name agreement in the online process of picture naming, we compared event-related potentials (ERPs) recorded whilst 19 healthy, native English speakers silently named pictures which had either high or low name agreement. A series of ERP components was examined: P1 approximately 120ms from picture onset, N1 around 170ms, P2 around 220ms, N2 around 290ms, and P3 around 400ms. Additionally, a late time window from 800 to 900ms was considered. Name agreement had an early effect, starting at P1 and possibly resulting from uncertainty of picture identity, and continuing into N2, possibly resulting from alternative names for pictures. These results support the idea that name agreement affects two consecutive processes: first, object recognition, and second, lexical selection and/or phonological encoding.
Resumo:
Sequence-specific binding is demonstrated between pyrene-based tweezer molecules and soluble, high molar mass copolyimides. The binding involves complementary pi - pi stacking interactions, polymer chain-folding, and hydrogen bonding and is extremely sensitive to the steric environment around the pyromellitimide binding-site. A detailed picture of the intermolecular interactions involved has been obtained through single-crystal X-ray studies of tweezer complexes with model diimides. Ring-current magnetic shielding of polyimide protons by the pyrene '' arms '' of the tweezer molecule induces large complexation shifts of the corresponding H-1 NMR resonances, enabling specific triplet sequences to be identified by their complexation shifts. Extended comonomer sequences (triplets of triplets in which the monomer residues differ only by the presence or absence of a methyl group) can be '' read '' by a mechanism which involves multiple binding of tweezer molecules to adjacent diimide residues within the copolymer chain. The adjacent-binding model for sequence recognition has been validated by two conceptually different sets of tweezer binding experiments. One approach compares sequence-recognition events for copolyimides having either restricted or unrestricted triple-triplet sequences, and the other makes use of copolymers containing both strongly binding and completely nonbinding diimide residues. In all cases the nature and relative proportions of triple-triplet sequences predicted by the adjacent-binding model are fully consistent with the observed H-1 NMR data.
Resumo:
The project introduces an application using computer vision for Hand gesture recognition. A camera records a live video stream, from which a snapshot is taken with the help of interface. The system is trained for each type of count hand gestures (one, two, three, four, and five) at least once. After that a test gesture is given to it and the system tries to recognize it.A research was carried out on a number of algorithms that could best differentiate a hand gesture. It was found that the diagonal sum algorithm gave the highest accuracy rate. In the preprocessing phase, a self-developed algorithm removes the background of each training gesture. After that the image is converted into a binary image and the sums of all diagonal elements of the picture are taken. This sum helps us in differentiating and classifying different hand gestures.Previous systems have used data gloves or markers for input in the system. I have no such constraints for using the system. The user can give hand gestures in view of the camera naturally. A completely robust hand gesture recognition system is still under heavy research and development; the implemented system serves as an extendible foundation for future work.
Resumo:
The purpose of this study was to verify discriminative control by segments of signs in adolescents with deafness who use Brazilian Sign Language (BSL). Four adolescent with bilateral deafness, with 3 years of BSL teaching, saw a video presenting a children's tale in BSL. After showing accurate understanding of the story, participants saw another video of the same story with 12 signs altered in one of their segments (hand configuration, place of articulation, or movement). They apparently did not detect the alterations. However, when the signs were presented in isolation in a matching-to-sample test, they virtually always selected the picture corresponding to the unaltered signs. Three participants selected an unfamiliar picture in 50% or more trials with an altered sign as a sample, showing that they could detect the majority of the altered signs.