967 results for Speaker Recognition, Text-constrained, Multilingual, Speaker Verification, HMMs
Abstract:
Presentations sponsored by the Patent and Trademark Depository Library Association (PTDLA) at the American Library Association Annual Conference, New Orleans, June 25, 2006.
Speaker #1: Nan Myers, Associate Professor; Government Documents, Patents and Trademarks Librarian, Wichita State University, Wichita, KS. Title: Intellectual Property Roundup: Copyright, Trademarks, Trade Secrets, and Patents. Abstract: This presentation provides a capsule overview of the distinctive coverage of the four types of intellectual property: what they are, why they are important, how to get them, what they cost, and how long they last. Emphasis will be on the questions patrons ask most, along with the answers! Includes coverage of the mission of Patent & Trademark Depository Libraries (PTDLs) and other sources of business information outside of libraries, such as Small Business Development Centers.
Speaker #2: Jan Comfort, Government Information Reference Librarian, Clemson University, Clemson, SC. Title: Patents as a Source of Competitive Intelligence Information. Abstract: Large corporations often have R&D departments, or large numbers of staff whose jobs are to monitor the activities of their competitors. This presentation will review strategies that small business owners can employ to do their own competitive intelligence analysis. The focus will be on features of the patent database that is available free of charge on the USPTO website, as well as commercial databases available at many public and academic libraries across the country.
Speaker #3: Virginia Baldwin, Professor; Engineering Librarian, University of Nebraska-Lincoln, Lincoln, NE. Title: Mining Online Patent Data for Business Information. Abstract: The United States Patent and Trademark Office (USPTO) website and websites of international databases contain information about granted patents and patent applications and the technologies they represent. Statistical information about patents, their technologies, geographical information, and patenting entities is compiled and available as reports on the USPTO website. Other valuable information from these websites can be obtained using data mining techniques. This presentation will provide the keys to opening these resources and obtaining valuable data.
Speaker #4: Donna Hopkins, Engineering Librarian, Rensselaer Polytechnic Institute, Troy, NY. Title: Searching the USPTO Trademark Database for Wordmarks and Logos. Abstract: This presentation provides an overview of wordmark searching in www.uspto.gov, followed by a review of the techniques of searching for non-word US trademarks using codes from the Design Search Code Manual. These codes are used in an electronic search, either on the USPTO website or on CASSIS DVDs. The search is sometimes supplemented by consulting the Official Gazette. A specific example of using a section of the codes for searching is included. Similar searches on the Madrid Express database of WIPO, using the Vienna Classification, will also be briefly described.
Abstract:
The aim of the Kinship Verification in the Wild Evaluation (held in conjunction with the 2015 IEEE International Conference on Automatic Face and Gesture Recognition, Ljubljana, Slovenia) was to evaluate different kinship verification algorithms. For this task, two datasets were made available and three possible experimental protocols (unsupervised, image-restricted, and image-unrestricted) were designed. Five institutions submitted their results to the evaluation: (i) Politecnico di Torino, Italy; (ii) LIRIS-University of Lyon, France; (iii) Universidad de Las Palmas de Gran Canaria, Spain; (iv) Nanjing University of Aeronautics and Astronautics, China; and (v) Bar Ilan University, Israel. Most of the participants tackled the image-restricted challenge and experimental results demonstrated better kinship verification performance than the baseline methods provided by the organizers.
Abstract:
Ontology design and population, core aspects of semantic technologies, have recently become fields of great interest due to the increasing need for domain-specific knowledge bases that can boost the use of the Semantic Web. For building such knowledge resources, state-of-the-art tools for ontology design require a lot of human work. Producing meaningful schemas and populating them with domain-specific data is in fact a very difficult and time-consuming task, even more so if the task consists of modelling knowledge at web scale. The primary aim of this work is to investigate a novel and flexible methodology for automatically learning ontologies from textual data, lightening the human workload required for conceptualizing domain-specific knowledge and populating an extracted schema with real data, and speeding up the whole ontology production process. Here computational linguistics plays a fundamental role, from automatically identifying facts in natural language and extracting frames of relations among recognized entities, to producing linked data with which to extend existing knowledge bases or create new ones. In the state of the art, automatic ontology learning systems are mainly based on plain pipelined linguistic classifiers performing tasks such as Named Entity recognition, Entity resolution, Taxonomy and Relation extraction [11]. These approaches present some weaknesses, especially in capturing the structures through which the meaning of complex concepts is expressed [24]. Humans, in fact, tend to organize knowledge in well-defined patterns, which include participant entities and meaningful relations linking entities with each other. In the literature, these structures have been called Semantic Frames by Fillmore [20], or more recently Knowledge Patterns [23]. Some NLP studies have recently shown the possibility of performing more accurate deep parsing with the ability to logically understand the structure of discourse [7].
In this work, some of these technologies have been investigated and employed to produce accurate ontology schemas. The long-term goal is to collect large amounts of semantically structured information from the web of crowds, through an automated process, in order to identify and investigate the cognitive patterns used by humans to organize their knowledge.
Abstract:
Automatically recognizing faces captured in uncontrolled environments has been a challenging topic for decades. In this work, we investigate cohort score normalization, which has been widely used in biometric verification as a means to improve the robustness of face recognition in challenging environments. In particular, we introduce cohort score normalization into the undersampled face recognition problem. Further, we develop an effective cohort normalization method specifically for the unconstrained face pair matching problem. Extensive experiments conducted on several well-known face databases demonstrate the effectiveness of cohort normalization in these challenging scenarios. In addition, to give a proper understanding of cohort behavior, we study the impact of the number and quality of cohort samples on the normalization performance. The experimental results show that a bigger cohort set gives more stable and often better results, up to a point at which the performance saturates, and that cohort samples of different quality do produce different cohort normalization performance. Recognizing faces that have undergone alterations is another challenging problem for current face recognition algorithms. Face image alterations can be roughly classified into two categories: unintentional (e.g., geometric transformations introduced by the acquisition device) and intentional alterations (e.g., plastic surgery). We study the impact of these alterations on face recognition accuracy. Our results show that state-of-the-art algorithms are able to overcome limited digital alterations but are sensitive to more relevant modifications. Further, we develop two useful descriptors for detecting those alterations which can significantly affect the recognition performance. In the end, we propose to use the Structural Similarity (SSIM) quality map to detect and model variations due to plastic surgeries.
Extensive experiments conducted on a plastic surgery face database demonstrate the potential of the SSIM map for matching face images after surgeries.
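As a rough illustration of the general idea behind cohort score normalization mentioned in the abstract above, the sketch below standardizes a raw match score against the scores the same probe obtains on a set of cohort (non-matching) samples. This is a generic z-score style normalization and not necessarily the method developed in the work; the function name `cohort_normalize` and the example scores are hypothetical.

```python
from statistics import mean, stdev


def cohort_normalize(raw_score, cohort_scores):
    """Standardize a raw match score against a set of cohort scores.

    Assumption: a generic z-score style normalization, used here only
    to illustrate the idea; the abstract does not specify the formula.
    """
    if len(cohort_scores) < 2:
        raise ValueError("need at least two cohort scores")
    mu = mean(cohort_scores)
    sigma = stdev(cohort_scores)  # sample standard deviation
    if sigma == 0:
        raise ValueError("cohort scores have zero spread")
    return (raw_score - mu) / sigma


# Hypothetical scores: the claimed match scores well above the cohort.
cohort = [0.2, 0.4, 0.6]
normalized = cohort_normalize(0.8, cohort)
print(round(normalized, 3))
```

A larger cohort makes the estimates of the mean and spread less noisy, which is consistent with the abstract's observation that bigger cohort sets give more stable results.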
Abstract:
Pictorial representations of three-dimensional objects are often used to investigate animal cognitive abilities; however, investigators rarely evaluate whether the animals conceptualize the two-dimensional image as the object it is intended to represent. We tested for picture recognition in lion-tailed macaques by presenting five monkeys with digitized images of familiar foods on a touch screen. Monkeys viewed images of two different foods and learned that they would receive a piece of the one they touched first. After demonstrating that they would reliably select images of their preferred foods on one set of foods, animals were transferred to images of a second set of familiar foods. We assumed that if the monkeys recognized the images, they would spontaneously select images of their preferred foods on the second set of foods. Three monkeys selected images of their preferred foods significantly more often than chance on their first transfer session. In an additional test of the monkeys' picture recognition abilities, animals were presented with pairs of food images containing a medium-preference food paired with either a high-preference food or a low-preference food. The same three monkeys selected the medium-preference foods significantly more often when they were paired with low-preference foods and significantly less often when those same foods were paired with high-preference foods. Our novel design provided convincing evidence that macaques recognized the content of two-dimensional images on a touch screen. Results also suggested that the animals understood the connection between the two-dimensional images and the three-dimensional objects they represented.
Abstract:
We present a new method for the enhancement of speech. The method is designed for scenarios in which targeted speaker enrollment as well as system training within the typical noise environment are feasible. The proposed procedure is fundamentally different from most conventional and state-of-the-art denoising approaches. Instead of filtering a distorted signal we are resynthesizing a new “clean” signal based on its likely characteristics. These characteristics are estimated from the distorted signal. A successful implementation of the proposed method is presented. Experiments were performed in a scenario with roughly one hour of clean speech training data. Our results show that the proposed method compares very favorably to other state-of-the-art systems in both objective and subjective speech quality assessments. Potential applications for the proposed method include jet cockpit communication systems and offline methods for the restoration of audio recordings.
Abstract:
Eighty-one listeners defined by three age ranges (18–30, 31–59, and over 60 years) and three levels of musical experience performed an immediate recognition task requiring the detection of alterations in melodies. On each trial, a brief melody was presented, followed 5 sec later by a test stimulus that either was identical to the target or had two pitches changed, for a same–different judgment. Each melody pair was presented at 0.6 note/sec, 3.0 notes/sec, or 6.0 notes/sec. Performance was better with familiar melodies than with unfamiliar melodies. Overall performance declined slightly with age and improved substantially with increasing experience, in agreement with earlier results in an identification task. Tempo affected performance on familiar tunes (moderate was best), but not on unfamiliar tunes. We discuss these results in terms of theories of dynamic attending, cognitive slowing, and working memory in aging.
Abstract:
We investigated the effect of level-of-processing manipulations on “remember” and “know” responses in episodic melody recognition (Experiments 1 and 2) and how this effect is modulated by item familiarity (Experiment 2). In Experiment 1, participants performed 2 conceptual and 2 perceptual orienting tasks while listening to familiar melodies: judging the mood, continuing the tune, tracing the pitch contour, and counting long notes. The conceptual mood task led to higher d' rates for “remember” but not “know” responses. In Experiment 2, participants either judged the mood or counted long notes of tunes with high and low familiarity. A level-of-processing effect emerged again in participants’ “remember” d' rates regardless of melody familiarity. Results are discussed within the distinctive processing framework.
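For readers unfamiliar with the d' measure reported in the abstract above, the following minimal sketch computes the standard signal-detection sensitivity index as the difference between the z-transformed hit rate and false-alarm rate. The rates used are made-up illustrative values, not data from the study, and the helper name `d_prime` is hypothetical.

```python
from statistics import NormalDist


def d_prime(hit_rate, false_alarm_rate):
    """Sensitivity index d' = z(hit rate) - z(false-alarm rate).

    Both rates must lie strictly between 0 and 1; inv_cdf raises
    StatisticsError otherwise, so extreme rates need a correction
    that is not shown here.
    """
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(false_alarm_rate)


# Illustrative values only: 80% hits and 20% false alarms.
print(round(d_prime(0.8, 0.2), 2))
```

Equal hit and false-alarm rates give a d' of zero, meaning old and new items are not discriminated at all.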
Abstract:
We tested normal young and elderly adults and elderly Alzheimer’s disease (AD) patients on recognition memory for tunes. In Experiment 1, AD patients and age-matched controls received a study list and an old/new recognition test of highly familiar, traditional tunes, followed by a study list and test of novel tunes. The controls performed better than did the AD patients. The controls showed the “mirror effect” of increased hits and reduced false alarms for traditional versus novel tunes, whereas the patients false-alarmed as often to traditional tunes as to novel tunes. Experiment 2 compared young adults and healthy elderly persons using a similar design. Performance was lower in the elderly group, but both younger and older subjects showed the mirror effect. Experiment 3 produced confusion between preexperimental familiarity and intraexperimental familiarity by mixing traditional and novel tunes in the study lists and tests. Here, the subjects in both age groups resembled the patients of Experiment 1 in failing to show the mirror effect. Older subjects again performed more poorly, and they differed qualitatively from younger subjects in setting stricter criteria for more nameable tunes. Distinguishing different sources of global familiarity is a factor in tune recognition, and the data suggest that this type of source monitoring is impaired in AD and involves different strategies in younger and older adults.
Abstract:
The authors examined the effects of age, musical experience, and characteristics of musical stimuli on a melodic short-term memory task in which participants had to recognize whether a tune was an exact transposition of another tune recently presented. Participants were musicians and nonmusicians between ages 18 and 30 or 60 and 80. In 4 experiments, the authors found that age and experience affected different aspects of the task, with experience becoming more influential when interference was provided during the task. Age and experience interacted only weakly, and neither age nor experience influenced the superiority of tonal over atonal materials. Recognition memory for the sequences did not reflect the same pattern of results as the transposition task. The implications of these results for theories of aging, experience, and music cognition are discussed.
Abstract:
Each year, the Research Committee of the Ohio Music Education Association sponsors a half-day Research Forum prior to the beginning of the state music education association conference. In 2004, Dr. Patricia J. Flowers, Professor of Music at the Ohio State University, was the guest speaker. This article summarizes her talk on the process of becoming a music education researcher.
Abstract:
Speech is often a multimodal process, presented audiovisually through a talking face. One area of speech perception influenced by visual speech is speech segmentation, or the process of breaking a stream of speech into individual words. Mitchel and Weiss (2013) demonstrated that a talking face contains specific cues to word boundaries and that subjects can correctly segment a speech stream when given a silent video of a speaker. The current study expanded upon these results, using an eye tracker to identify highly attended facial features of the audiovisual display used in Mitchel and Weiss (2013). In Experiment 1, subjects were found to spend the most time watching the eyes and mouth, with a trend suggesting that the mouth was viewed more than the eyes. Although subjects displayed significant learning of word boundaries, performance was not correlated with gaze duration on any individual feature, nor was performance correlated with a behavioral measure of autistic-like traits. However, trends suggested that as autistic-like traits increased, gaze duration on the mouth increased and gaze duration on the eyes decreased, similar to significant trends seen in autistic populations (Boraston & Blakemore, 2007). In Experiment 2, the same video was modified so that a black bar covered the eyes or mouth. Both videos elicited learning of word boundaries that was equivalent to that seen in the first experiment. Again, no correlations were found between segmentation performance and SRS scores in either condition. These results, taken with those in Experiment 1, suggest that neither the eyes nor the mouth is critical to speech segmentation and that perhaps more global head movements indicate word boundaries (see Graf, Cosatto, Strom, & Huang, 2002). Future work will elucidate the contribution of individual features relative to global head movements, as well as extend these results to additional types of speech tasks.
Abstract:
The Telephone Conference Network, sponsored by The Pennsylvania State University's Coordinating Council for Health Care, is designed as a cost-effective format for providing inservice training in geriatric mental health for individuals who serve the elderly. Institutions which subscribe to the Telephone Conference Network are equipped with a conference speaker and telephone hook-up providing a two-way line of communication, and may choose from a variety of inservice programs. Mailed evaluations were completed by participants (N=73) in the "Skills to Manage Moods" program, a series of four 1-hour sessions designed to teach participants the skills needed to help patients cope with depression and to deliver the program to others. The majority of respondents reported high levels of satisfaction with the Telephone Conference Network system and the specific program in which they participated. Although 85 percent reported that they would be able to use the skills learned in the program on the job, 50 percent reported that they would not be interested in teaching these skills to others. The convenience and efficiency of the Telephone Conference Network were the most frequently mentioned strengths of the system, while the physical facilities and the program delivery format adopted by the individual institutions were the most frequently mentioned weaknesses. These data suggested several recommendations for Network subscribers and for professionals offering telephone conference programs, including ensuring optimal class enrollment and adequate physical facilities, and participant involvement in program implementation.
Abstract:
A variety of research has documented high levels of depression among older adults in the health care setting. Additional research has shown that care providers in health care settings are not very effective at diagnosing comorbid depression. This is a troublesome finding since comorbid depression has been linked to a number of negative outcomes in older adults. Early results have indicated that comorbid depression may be associated with a number of unfavorable consequences ranging from impairments in physical functioning to increased mortality. The health care setting with arguably the highest rate of physical impairment is the nursing home, and it is in the nursing home that the effects of comorbid depression may be most costly. Therefore, the current analysis uses data from the Institutional Population Component of the National Medical Expenditure Survey (US Department of Health and Human Services, 1990) to explore rates of both recognized and unrecognized comorbid depression in the nursing home setting. Using a constructed proxy variable representative of the DSM-III-R diagnosis of depression, results indicate that approximately 8.1% of nursing home residents have an unrecognized potential comorbid depression.
Abstract:
We investigated how well structural features such as note density or the relative number of changes in the melodic contour could predict success in implicit and explicit memory for unfamiliar melodies. We also analyzed which features are more likely to elicit increasingly confident judgments of "old" in a recognition memory task. An automated analysis program computed structural aspects of melodies, both independent of any context and with reference to the other melodies in the test set and the parent corpus of pop music. A few features predicted success in both memory tasks, which points to a shared memory component. However, motivic complexity compared to a large corpus of pop music had different effects on explicit and implicit memory. We also found that just a few features are associated with different rates of "old" judgments, whether the items were old or new. Rarer motives relative to the test set predicted hits and rarer motives relative to the corpus predicted false alarms. This data-driven analysis provides further support for both shared and separable mechanisms in implicit and explicit memory retrieval, as well as the role of distinctiveness in true and false judgments of familiarity.