122 results for Speaker Recognition, Text-constrained, Multilingual, Speaker Verification, HMMs
in Digital Commons at Florida International University
Abstract:
This dissertation develops an innovative approach towards less-constrained iris biometrics. Two major contributions are made in this research endeavor: (1) Designed an award-winning segmentation algorithm in the less-constrained environment where image acquisition is made of subjects on the move and taken under visible lighting conditions, and (2) Developed a pioneering iris biometrics method coupling segmentation and recognition of the iris based on video of moving persons under different acquisitions scenarios. The first part of the dissertation introduces a robust and fast segmentation approach using still images contained in the UBIRIS (version 2) noisy iris database. The results show accuracy estimated at 98% when using 500 randomly selected images from the UBIRIS.v2 partial database, and estimated at 97% in a Noisy Iris Challenge Evaluation (NICE.I) in an international competition that involved 97 participants worldwide involving 35 countries, ranking this research group in sixth position. This accuracy is achieved with a processing speed nearing real time. The second part of this dissertation presents an innovative segmentation and recognition approach using video-based iris images. Following the segmentation stage which delineates the iris region through a novel segmentation strategy, some pioneering experiments on the recognition stage of the less-constrained video iris biometrics have been accomplished. In the video-based and less-constrained iris recognition, the test or subject iris videos/images and the enrolled iris images are acquired with different acquisition systems. In the matching step, the verification/identification result was accomplished by comparing the similarity distance of encoded signature from test images with each of the signature dataset from the enrolled iris images. With the improvements gained, the results proved to be highly accurate under the unconstrained environment which is more challenging. 
This has led to a false acceptance rate (FAR) of 0% and a false rejection rate (FRR) of 17.64% for 85 tested users with 305 test images from the video, which shows great promise and high practical implications for iris biometrics research and system design.
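The matching step and the two error rates reported above can be illustrated with a minimal sketch. It assumes the encoded signatures are equal-length binary codes compared by normalized Hamming distance and that a match is accepted when the distance falls at or below a threshold. The abstract only says "similarity distance", so the distance measure, the function names, and the sample scores are all hypothetical.

```python
def hamming_distance(code_a, code_b):
    """Fraction of differing bits between two equal-length binary codes."""
    if len(code_a) != len(code_b):
        raise ValueError("codes must have the same length")
    return sum(a != b for a, b in zip(code_a, code_b)) / len(code_a)


def far_frr(genuine_scores, impostor_scores, threshold):
    """Return (FAR, FRR) for distance scores; accept when score <= threshold.

    genuine_scores: distances between samples of the same person.
    impostor_scores: distances between samples of different people.
    """
    false_accepts = sum(s <= threshold for s in impostor_scores)
    false_rejects = sum(s > threshold for s in genuine_scores)
    far = false_accepts / len(impostor_scores)
    frr = false_rejects / len(genuine_scores)
    return far, frr


# Illustrative (made-up) scores, not data from the dissertation.
genuine = [0.10, 0.20, 0.50, 0.30]
impostor = [0.60, 0.70, 0.45, 0.90]
print(far_frr(genuine, impostor, threshold=0.40))
```

Lowering the threshold in such a scheme trades false acceptances for false rejections, which is one way a system can reach a FAR of 0% while keeping a non-zero FRR.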
Abstract:
The objective of this research is to develop nanoscale ultrasensitive transducers for detection of biological species at the molecular level using carbon nanotubes as nanoelectrodes. Rapid detection of ultra-low concentrations or even single DNA molecules is essential for medical diagnosis and treatment, pharmaceutical applications, gene sequencing, and forensic analysis. Here the use of functionalized single-walled carbon nanotubes (SWNT) as a nanoscale detection platform for rapid detection of single DNA molecules is demonstrated. The detection principle is based on obtaining an electrical signal from a single amine-terminated DNA molecule which is covalently bridged between two ends of an SWNT separated by a nanoscale gap. The synthesis, fabrication, and chemical functionalization of the nanoelectrodes and the DNA attachment were optimized to perform reliable electrical characterization of these molecules. Using this detection system, a fundamental study of charge transport in DNA molecules of both genomic and non-genomic sequences is performed. We measured an electrical signal of about 30 pA through a hybridized DNA molecule of 80 base pairs in length which encodes a portion of the sequence of the H5N1 gene of avian Influenza A virus. Due to the dynamic nature of DNA molecules, the local environment, such as ion concentration, pH, and temperature, significantly influences their physical properties. We observed a decrease in DNA conductance of about 33% in high vacuum conditions. The counterion variation was analyzed by changing the buffer from sodium acetate to tris(hydroxymethyl) aminomethane, which resulted in a two orders of magnitude increase in the conductivity of the DNA. The fabrication of a large array of identical SWNT nanoelectrodes was achieved by using ultralong SWNTs. Using this nanoelectrode array, we have investigated the sequence-dependent charge transport in DNA.
A systematic study performed on a PolyG - PolyC sequence with a varying number of intervening PolyA - PolyT pairs showed a decrease in electrical signal from 180 pA (PolyG - PolyC) to 30 pA with an increasing number of PolyA - PolyT pairs. This work also led to the development of ultrasensitive nanoelectrodes based on enzyme-functionalized, vertically aligned, high-density multiwalled CNTs for electrochemical detection of cholesterol. The nanoelectrodes exhibited selective detection of cholesterol in the presence of common interferents found in human blood.
Abstract:
This dissertation establishes a novel system for human face learning and recognition based on incremental multilinear Principal Component Analysis (PCA). Most of the existing face recognition systems need training data during the learning process. The system as proposed in this dissertation utilizes an unsupervised or weakly supervised learning approach, in which the learning phase requires a minimal amount of training data. It also overcomes the inability of traditional systems to adapt to the testing phase as the decision process for the newly acquired images continues to rely on that same old training data set. Consequently when a new training set is to be used, the traditional approach will require that the entire eigensystem will have to be generated again. However, as a means to speed up this computational process, the proposed method uses the eigensystem generated from the old training set together with the new images to generate more effectively the new eigensystem in a so-called incremental learning process. In the empirical evaluation phase, there are two key factors that are essential in evaluating the performance of the proposed method: (1) recognition accuracy and (2) computational complexity. In order to establish the most suitable algorithm for this research, a comparative analysis of the best performing methods has been carried out first. The results of the comparative analysis advocated for the initial utilization of the multilinear PCA in our research. As for the consideration of the issue of computational complexity for the subspace update procedure, a novel incremental algorithm, which combines the traditional sequential Karhunen-Loeve (SKL) algorithm with the newly developed incremental modified fast PCA algorithm, was established. In order to utilize the multilinear PCA in the incremental process, a new unfolding method was developed to affix the newly added data at the end of the previous data. 
The results of the incremental process based on these two methods were obtained to bear out these new theoretical improvements. Some object tracking results using video images are also provided as another challenging task to prove the soundness of this incremental multilinear learning method.
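The idea of reusing what was learned from the old training set, instead of recomputing everything when new images arrive, can be illustrated with a much simpler building block than the dissertation's method. The sketch below is not the sequential Karhunen-Loeve algorithm or the incremental modified fast PCA described above. It is a Welford-style running update of the mean and scatter matrix, the quantities from which a PCA eigensystem is derived, shown under the assumption that samples arrive one at a time as flat feature vectors. All names are hypothetical.

```python
def update_mean_and_scatter(n, mean, scatter, x):
    """Fold one new sample x into running statistics without revisiting old data.

    n:       number of samples seen so far
    mean:    running mean vector (list of floats)
    scatter: d x d sum of outer products of deviations from the mean
    Returns the updated (n, mean, scatter).
    """
    n_new = n + 1
    # Deviation of x from the old mean.
    delta_old = [xi - mi for xi, mi in zip(x, mean)]
    new_mean = [mi + di / n_new for mi, di in zip(mean, delta_old)]
    # Deviation of x from the updated mean.
    delta_new = [xi - mi for xi, mi in zip(x, new_mean)]
    d = len(x)
    new_scatter = [
        [scatter[i][j] + delta_old[i] * delta_new[j] for j in range(d)]
        for i in range(d)
    ]
    return n_new, new_mean, new_scatter


# Feed three 2-D samples one at a time.
n, mean, scatter = 0, [0.0, 0.0], [[0.0, 0.0], [0.0, 0.0]]
for sample in ([1.0, 2.0], [3.0, 4.0], [5.0, 9.0]):
    n, mean, scatter = update_mean_and_scatter(n, mean, scatter, sample)
print(n, mean, scatter)
```

Dividing the scatter matrix by n - 1 gives the sample covariance. A full incremental PCA would additionally update the eigenvectors themselves rather than re-decomposing this matrix, which is where algorithms such as SKL come in.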
Abstract:
The move from Standard Definition (SD) to High Definition (HD) represents a sixfold increase in the data that needs to be processed. With expanding resolutions and evolving compression, there is a need for high performance with flexible architectures to allow for quick upgradability. Technology is advancing in image display resolution, compression techniques, and video intelligence. Software implementations of these systems can attain accuracy with tradeoffs among processing performance (to achieve specified frame rates, working on large image data sets), power, and cost constraints. There is a need for new architectures that keep pace with the fast innovations in video and imaging. The work presented here contains a dedicated hardware implementation of the pixel and frame rate processes on a Field Programmable Gate Array (FPGA) to achieve real-time performance. The following outlines the contributions of the dissertation. (1) We develop a target detection system by applying a novel running average mean threshold (RAMT) approach to globalize the threshold required for background subtraction. This approach adapts the threshold automatically to different environments (indoor and outdoor) and different targets (humans and vehicles). For low power consumption and better performance, we design the complete system on FPGA. (2) We introduce a safe distance factor and develop an algorithm for occlusion occurrence detection during target tracking. A novel mean-threshold is calculated by motion-position analysis. (3) A new strategy for gesture recognition is developed using Combinational Neural Networks (CNN) based on a tree structure. Analysis of the method is done on American Sign Language (ASL) gestures. We introduce a novel points-of-interest approach to reduce the feature vector size and a gradient threshold approach for accurate classification.
(4) We design a gesture recognition system using a hardware/software co-simulation neural network for the high speed and low memory storage requirements provided by the FPGA. We develop an innovative maximum distant algorithm which uses only 0.39% of the image as the feature vector to train and test the system design. The gesture sets in the databases used by different applications may vary. Therefore, it is essential to keep the feature vector as small as possible while maintaining the same accuracy and performance.
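Background subtraction with a running-average background and a threshold derived from the frame itself can be sketched as follows. This is only an illustration of the general technique and not the RAMT algorithm from the dissertation, whose details are not given in the abstract. The learning rate `alpha`, the `scale` factor, and the rule "threshold = scale x mean absolute difference" are assumptions made for the example.

```python
def update_background(background, frame, alpha=0.05):
    """Blend the new frame into the running-average background model."""
    return [(1 - alpha) * b + alpha * f for b, f in zip(background, frame)]


def foreground_mask(background, frame, scale=2.0):
    """Mark pixels whose difference from the background exceeds a threshold.

    The threshold is recomputed for each frame from the mean absolute
    difference (a hypothetical rule), so it adapts to the scene instead
    of being fixed by hand.
    """
    diff = [abs(f - b) for f, b in zip(frame, background)]
    threshold = scale * sum(diff) / len(diff)
    mask = [d > threshold for d in diff]
    return mask, threshold


# Frames are flattened grayscale images; the values here are made up.
background = [10.0, 10.0, 10.0, 10.0]
frame = [10.0, 10.0, 10.0, 50.0]
mask, threshold = foreground_mask(background, frame)
print(mask, threshold)
background = update_background(background, frame, alpha=0.5)
print(background)
```

Both functions are per-pixel operations with no data dependencies between pixels, which is the kind of workload that maps naturally onto parallel FPGA logic.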
Abstract:
This dissertation introduces a new system for handwritten text recognition based on an improved neural network design. Most existing neural networks treat the mean square error function as the standard error function. The system proposed in this dissertation utilizes the mean quartic error function, where the third and fourth derivatives are non-zero. Consequently, many improvements on the training methods were achieved. The training results are carefully assessed before and after the update. To evaluate the performance of a training system, there are three essential factors to be considered, in decreasing order of importance: (1) the error rate on the testing set, (2) the processing time needed to recognize a segmented character, and (3) the total training time and subsequently the total testing time. It is observed that bounded training methods accelerate the training process, while semi-third order training methods, next-minimal training methods, and preprocessing operations reduce the error rate on the testing set. Empirical observations suggest that two combinations of training methods are needed for different case character recognition. Since character segmentation is required for word and sentence recognition, this dissertation also provides an effective rule-based segmentation method, which is different from the conventional adaptive segmentation methods. Dictionary-based correction is utilized to correct mistakes resulting from the recognition and segmentation phases. The integration of the segmentation methods with the handwritten character recognition algorithm yielded an accuracy of 92% for lower case characters and 97% for upper case characters. In the testing phase, the database consists of 20,000 handwritten characters, with 10,000 for each case. The testing phase on the recognition of 10,000 handwritten characters required 8.5 seconds in processing time.
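The difference between the two error functions mentioned above can be made concrete with a short sketch. It assumes the mean quartic error is the average of the fourth power of the output error, by analogy with mean square error. The abstract does not give the exact definition or normalization, so treat the formulas and names here as assumptions.

```python
def mean_square_error(outputs, targets):
    """Average of squared errors: derivatives beyond the second vanish."""
    return sum((o - t) ** 2 for o, t in zip(outputs, targets)) / len(outputs)


def mean_quartic_error(outputs, targets):
    """Average of fourth-power errors: the third and fourth derivatives
    with respect to each output are non-zero (24*e/N and 24/N)."""
    return sum((o - t) ** 4 for o, t in zip(outputs, targets)) / len(outputs)


def quartic_gradient(outputs, targets):
    """Gradient of the mean quartic error with respect to each output."""
    n = len(outputs)
    return [4 * (o - t) ** 3 / n for o, t in zip(outputs, targets)]


outputs = [1.0, 3.0]
targets = [0.0, 1.0]
print(mean_square_error(outputs, targets))
print(mean_quartic_error(outputs, targets))
print(quartic_gradient(outputs, targets))
```

Because the gradient grows with the cube of the error, large errors dominate the update far more strongly than under mean square error, while errors below 1 in magnitude are damped.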
Abstract:
Hardware/software (HW/SW) co-simulation integrates software simulation and hardware simulation simultaneously. Usually, an HW/SW co-simulation platform is used to ease debugging and verification for very large-scale integration (VLSI) design. To accelerate the computation of the gesture recognition technique, an HW/SW implementation using field programmable gate array (FPGA) technology is presented in this paper. The major contributions of this work are: (1) a novel design of a memory controller in the Verilog Hardware Description Language (Verilog HDL) to reduce memory consumption and load on the processor. (2) The testing part of the neural network algorithm is hardwired to improve speed and performance. American Sign Language gesture recognition is chosen to verify the performance of the approach. Several experiments were carried out on four databases of the gestures (alphabet signs A to Z). (3) The major benefit of this design is that it takes only a few milliseconds to recognize the hand gesture, which makes it computationally more efficient.
Abstract:
Biometrics is a field of study which pursues the association of a person's identity with his/her physiological or behavioral characteristics. As one aspect of biometrics, face recognition has attracted special attention because it is a natural and noninvasive means to identify individuals. Most of the previous studies in face recognition are based on two-dimensional (2D) intensity images. Face recognition based on 2D intensity images, however, is sensitive to environment illumination and subject orientation changes, affecting the recognition results. With the development of three-dimensional (3D) scanners, 3D face recognition is being explored as an alternative to the traditional 2D methods for face recognition. This dissertation proposes a method in which the expression and the identity of a face are determined in an integrated fashion from 3D scans. In this framework, there is a front end expression recognition module which sorts the incoming 3D face according to the expression detected in the 3D scans. Then, scans with neutral expressions are processed by a corresponding 3D neutral face recognition module. Alternatively, if a scan displays a non-neutral expression, e.g., a smiling expression, it will be routed to an appropriate specialized recognition module for smiling face recognition. The expression recognition method proposed in this dissertation is innovative in that it uses information from 3D scans to perform the classification task. A smiling face recognition module was developed, based on the statistical modeling of the variance between faces with neutral expression and faces with a smiling expression. The proposed expression and face recognition framework was tested with a database containing 120 3D scans from 30 subjects (half are neutral faces and half are smiling faces). It is shown that the proposed framework achieves a recognition rate 10% higher than attempting the identification with only the neutral face recognition module.
Abstract:
Ensuring the correctness of software has been the major motivation in software research, constituting a Grand Challenge. Due to its impact on the final implementation, one critical aspect of software is its architectural design. By guaranteeing a correct architectural design, major and costly flaws can be caught early in the development cycle. Software architecture design has received a lot of attention in the past years, with several methods, techniques and tools developed. However, there is still more to be done, such as providing adequate formal analysis of software architectures. In this regard, a framework to ensure system dependability from design to implementation has been developed at FIU (Florida International University). This framework is based on SAM (Software Architecture Model), an ADL (Architecture Description Language) that allows hierarchical compositions of components and connectors, defines an architectural modeling language for the behavior of components and connectors, and provides a specification language for the behavioral properties. The behavioral model of a SAM model is expressed in the form of Petri nets and the properties in first order linear temporal logic. This dissertation presents a formal verification and testing approach to guarantee the correctness of Software Architectures. The Software Architectures studied are expressed in SAM. For the formal verification approach, the technique applied was model checking and the model checker of choice was Spin. As part of the approach, a SAM model is formally translated to a model in the input language of Spin and verified for its correctness with respect to temporal properties. In terms of testing, a testing approach for SAM architectures was defined which includes the evaluation of test cases based on Petri net testing theory to be used in the testing process at the design level.
Additionally, the information at the design level is used to derive test cases for the implementation level. Finally, a modeling and analysis tool (SAM tool) was implemented to help support the design and analysis of SAM models. The results show the applicability of the approach to testing and verification of SAM models with the aid of the SAM tool.
Abstract:
The purpose of this study was to investigate the ontogeny of auditory learning via operant contingency in Northern bobwhite (Colinus virginianus ) hatchlings and possible interaction between attention, orienting and learning during early development. Chicks received individual 5 min training sessions in which they received a playback of a bobwhite maternal call at a single delay following each vocalization they emitted. Playback was either from a single randomly chosen speaker or switched back and forth semi-randomly between two speakers during training. Chicks were tested 24 hrs later in a simultaneous choice test between the familiar and an unfamiliar maternal call. It was found that day-old chicks showed a significant time-specific decrement in auditory learning when trained with delays in the range of 470–910 ms between their vocalizations and call playback only when training involved two speakers. Two-day-old birds showed an even more sustained disruption of learning than day-old chicks, whereas three-day-old chicks showed a pattern of intermittent interference with their learning when trained at such delays. A similar but less severe decrement in auditory learning was found when chicks were provided with motor training in which playback was contingent upon chicks entering and exiting one of two colored squares placed on the floor of the arena. Chicks provided with playback of the call at randomly chosen delays each time they vocalized exhibited large fluctuations in their responsivity to the auditory stimulus as a function of delay—fluctuations which were correlated significantly with measures of chick learning, particularly at two-days-of-age. When playback was limited to a single location chicks no longer showed a time-specific disruption of their learning of the auditory stimulus. 
Sequential analyses revealed several patterns suggesting that an attentional process similar or analogous to attentional blink may have contributed both to the observed fluctuations in chick responsivity to the auditory stimulus as a function of delay and to the time-specific learning deficit shown by chicks provided with two-speaker training. The study highlights that learning can be substantially modulated by processes of orienting and attention and has a number of important implications for research within cognitive neuroscience, animal behavior and learning.
Abstract:
This dissertation introduces a novel automated book reader as an assistive technology tool for persons with blindness. The literature shows extensive work in the area of optical character recognition, but the current methodologies available for the automated reading of books or bound volumes remain inadequate and are severely constrained during document scanning or image acquisition processes. The goal of the book reader design is to automate and simplify the task of reading a book while providing a user-friendly environment with a realistic but affordable system design. This design responds to the main concerns of (a) providing a method of image acquisition that maintains the integrity of the source, (b) overcoming optical character recognition errors created by inherent imaging issues such as curvature effects and barrel distortion, and (c) determining a suitable method for accurate recognition of characters that yields an interface with the ability to read from any open book with a high reading accuracy nearing 98%. The initial aim of this research endeavor is the development of an assistive technology tool to help persons with blindness in the reading of books and other bound volumes. Its secondary and broader aim is to find in this design a suitable platform for the digitization of bound documentation, in line with the mission of the Open Content Alliance (OCA), a nonprofit alliance aimed at making reading materials available in digital form. The theoretical perspective of this research relates to the mathematical developments that are made in order to resolve both the inherent distortions due to the properties of the camera lens and the anticipated distortions of the changing page curvature as one leafs through the book. This is evidenced by the significant increase in the recognition rate of characters and a high-accuracy read-out through text-to-speech processing.
This reasonably priced interface with its high performance results and its compatibility to any computer or laptop through universal serial bus connectors extends greatly the prospects for universal accessibility to documentation.
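Correcting lens distortion of the kind mentioned above is commonly modeled with a radial polynomial. The sketch below uses a single-coefficient radial model on normalized image coordinates and inverts it by fixed-point iteration. It is a generic illustration, not the mathematical development from the dissertation. The coefficient `k1`, the function names, and the sample point are assumptions; in this convention a negative `k1` pulls points toward the image center, which produces a barrel-like effect.

```python
def distort(x, y, k1):
    """Apply one-coefficient radial distortion to an ideal point (x, y).

    Coordinates are normalized so that (0, 0) is the optical center.
    """
    r2 = x * x + y * y
    factor = 1 + k1 * r2
    return x * factor, y * factor


def undistort(xd, yd, k1, iterations=20):
    """Recover the ideal point from a distorted one by fixed-point iteration.

    Starts from the distorted point and repeatedly divides by the
    distortion factor evaluated at the current estimate.
    """
    x, y = xd, yd
    for _ in range(iterations):
        r2 = x * x + y * y
        factor = 1 + k1 * r2
        x, y = xd / factor, yd / factor
    return x, y


k1 = -0.1  # hypothetical coefficient
xd, yd = distort(0.5, 0.3, k1)
print(undistort(xd, yd, k1))
```

The iteration converges quickly when the distortion is mild. Page-curvature correction, the other distortion the abstract names, depends on the shape of the open book and would need a separate geometric model.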
Abstract:
Perception and recognition of faces are fundamental cognitive abilities that form a basis for our social interactions. Research has investigated face perception using a variety of methodologies across the lifespan. Habituation, novelty preference, and visual paired comparison paradigms are typically used to investigate face perception in young infants. Storybook recognition tasks and eyewitness lineup paradigms are generally used to investigate face perception in young children. These methodologies have introduced systematic differences including the use of linguistic information for children but not infants, greater memory load for children than infants, and longer exposure times to faces for infants than for older children, making comparisons across age difficult. Thus, research investigating infant and child perception of faces using common methods, measures, and stimuli is needed to better understand how face perception develops. According to predictions of the Intersensory Redundancy Hypothesis (IRH; Bahrick & Lickliter, 2000, 2002), in early development, perception of faces is enhanced in unimodal visual (i.e., silent dynamic face) rather than bimodal audiovisual (i.e., dynamic face with synchronous speech) stimulation. The current study investigated the development of face recognition across children of three ages: 5 – 6 months, 18 – 24 months, and 3.5 – 4 years, using the novelty preference paradigm and the same stimuli for all age groups. It also assessed the role of modality (unimodal visual versus bimodal audiovisual) and memory load (low versus high) on face recognition. It was hypothesized that face recognition would improve across age and would be enhanced in unimodal visual stimulation with a low memory load. Results demonstrated a developmental trend (F(2, 90) = 5.00, p = 0.009) with older children showing significantly better recognition of faces than younger children. 
In contrast to predictions, no differences were found as a function of modality of presentation (bimodal audiovisual versus unimodal visual) or memory load (low versus high). This study was the first to demonstrate a developmental improvement in face recognition from infancy through childhood using common methods, measures and stimuli consistent across age.
Abstract:
Race in Argentina played a significant role as a highly durable construct by identifying and advancing subjects (1776–1810) and citizens (1811–1853). My dissertation explores the intricacies of power relations by focusing on the ways in which race informed the legal process during the transition from a colonial to a national State. It argues that the State’s development in both the colonial and national periods depended upon defining and classifying African descendants. In response, people of African descent used the State’s assigned definitions and classifications to advance their legal identities. It employs race and culture as operative concepts, and law as a representation of the sometimes tense relationship between social practices and the State’s concern for social peace. This dissertation examines the dynamic nature of the court. It utilizes the theoretical concepts of multicentric legal orders, analyzed through weak and strong legal pluralisms, and of jurisdictional politics, from the late eighteenth to early nineteenth centuries. This dissertation juxtaposes various levels of jurisdiction (canon/state law and colonial/national law) to illuminate how people of color used the legal system to ameliorate their social condition. In each chapter the primary source materials are state-generated documents, which include criminal, ecclesiastical, civil, and marriage dissent court cases along with notarial and census records. Though it would appear that these documents would provide a superficial understanding of people of color, my analysis provides both a top-down and a bottom-up approach that reflects a continuous negotiation toward African descendants’ goal of State recognition. These approaches allow for implicit or explicit negotiation of a legal identity that transformed slaves and free African descendants into active agents of their own destinies.
Abstract:
Novel predator introductions are thought to have a high impact on native prey, especially in freshwater systems. Prey may fail to recognize predators as a threat, or show inappropriate or ineffective responses. The ability of prey to recognize and respond appropriately to novel predators may depend on the prey’s use of general or specific cues to detect predation threats. We used laboratory experiments to examine the ability of three native Everglades prey species (Eastern mosquitofish, flagfish and riverine grass shrimp) to respond to the presence, as well as to the chemical and visual cues, of a native predator (warmouth) and a recently introduced non-native predator (African jewelfish). We used prey from populations that had not previously encountered jewelfish. Despite this novelty, the native warmouth and non-native jewelfish had overall similar predatory effects, except on mosquitofish, which suffered higher warmouth predation. When predators were present, the three prey taxa showed consistent and strong responses to the non-native jewelfish, which were similar in magnitude to the responses exhibited to the native warmouth. When cues were presented, fish prey responded largely to chemical cues, while shrimp showed no response to either chemical or visual cues. Overall, responses by mosquitofish and flagfish to chemical cues indicated low differentiation among cue types, with similar responses to general and specific cues. The fact that antipredator behaviours were similar toward native and non-native predators suggests that the susceptibility to a novel fish predator may be similar to that of native fishes, and prey may overcome predator novelty, at least when predators are confamilial to other common and longer-established non-native threats.