Biblioteca Digital

957 results for machine recognition

Uncalibrated hand-eye co-ordination and man-machine interfaces

Relevance: 20.00%


Towards improved speech recognition using a speech production model

Relevance: 20.00%


Advanced Monte Carlo Simulation and Machine Learning for Frequency Domain Optical Coherence Tomography

Relevance: 20.00%

Abstract:

Optical Coherence Tomography (OCT) is a popular, rapidly growing imaging technique with an increasing number of biomedical applications due to its noninvasive nature. However, there are three major challenges in understanding and improving an OCT system: (1) Obtaining an OCT image is not easy. It either takes a real medical experiment or requires days of computer simulation. Without much data, it is difficult to study the physical processes underlying OCT imaging of different objects, simply because there aren't many imaged objects. (2) Interpreting an OCT image is also hard, and this challenge is more profound than it appears. For instance, it takes a trained expert to tell from an OCT image of human skin whether there is a lesion or not. This is expensive in its own right, but even the expert cannot be sure about the exact size of the lesion or the width of the various skin layers. The take-away message is that analyzing an OCT image, even at a high level, usually requires a trained expert, and pixel-level interpretation is simply unrealistic. The reason is simple: we have OCT images but not their underlying ground-truth structure, so there is nothing to learn from. (3) The imaging depth of OCT is very limited (millimeter or sub-millimeter in human tissue). OCT uses infrared light for illumination to stay noninvasive, but the downside is that photons at such long wavelengths can penetrate only a limited depth into the tissue before being back-scattered. To image a particular region of a tissue, photons first need to reach that region. As a result, OCT signals from deeper regions of the tissue are both weak (since few photons reach them) and distorted (due to multiple scattering of the contributing photons). This fact alone makes OCT images very hard to interpret.

This thesis addresses the above challenges by developing an advanced Monte Carlo simulation platform which is 10,000 times faster than the state-of-the-art simulator in the literature, bringing the simulation time down from 360 hours to a single minute. This simulation tool not only enables us to efficiently generate as many OCT images of objects with arbitrary structure and shape as we want on a common desktop computer, but it also provides us with the underlying ground truth of the simulated images at the same time, because we dictate it at the beginning of the simulation. This is one of the key contributions of this thesis. What allows us to build such a simulation tool includes a thorough understanding of the signal formation process, a careful implementation of the importance sampling/photon splitting procedure, efficient use of a voxel-based mesh system in determining photon-mesh interception, and parallel computation of the different A-scans that constitute a full OCT image, among other programming and mathematical tricks, which will be explained in detail later in the thesis.

Next we aim at the inverse problem: given an OCT image, predict/reconstruct its ground-truth structure at the pixel level. By solving this problem we would be able to interpret an OCT image completely and precisely without the help of a trained expert. It turns out that we can do much better. For simple structures we are able to reconstruct the ground truth of an OCT image more than 98% correctly, and for more complicated structures (e.g., a multi-layered brain structure) we are looking at 93%. We achieved this through extensive use of Machine Learning. The success of the Monte Carlo simulation already puts us in a great position by providing us with a great deal of data (effectively unlimited), in the form of (image, truth) pairs. Through a transformation of the high-dimensional response variable, we convert the learning task into a multi-output multi-class classification problem and a multi-output regression problem. We then build a hierarchical architecture of machine learning models (a committee of experts) and train different parts of the architecture with specifically designed data sets. In prediction, an unseen OCT image first goes through a classification model to determine its structure (e.g., the number and the types of layers present in the image); then the image is handed to a regression model that is trained specifically for that particular structure to predict the length of the different layers and, by doing so, reconstruct the ground truth of the image. We also demonstrate that ideas from Deep Learning can be useful to further improve the performance.
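The two-stage prediction described above (classify the structure, then hand the image to a regressor trained for that structure) can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions, not the models used in the thesis: `make_scan` is a hypothetical stand-in for the simulator that produces a 1-D intensity profile, the classifier is a plain nearest-centroid rule, and each per-structure regressor is a least-squares line on the summed intensity. All names and parameters here are invented for the example.

```python
from statistics import mean

N_PIXELS = 20  # hypothetical depth resolution of a toy 1-D scan


def make_scan(structure, thickness):
    """Toy stand-in for a simulator: top layer at 1.0, then a background level."""
    background = 0.5 if structure == "two_layer" else 0.0
    return [1.0 if i < thickness else background for i in range(N_PIXELS)]


def fit_line(xs, ys):
    """Ordinary least squares for y = a * x + b."""
    mx, my = mean(xs), mean(ys)
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return a, my - a * mx


def train(pairs):
    """pairs: list of (scan, (structure, thickness)), i.e. (image, truth) pairs."""
    centroids, regressors = {}, {}
    for label in {truth[0] for _, truth in pairs}:
        scans = [scan for scan, truth in pairs if truth[0] == label]
        targets = [truth[1] for _, truth in pairs if truth[0] == label]
        # Stage 1 model: mean profile of each structure class.
        centroids[label] = [mean(column) for column in zip(*scans)]
        # Stage 2 model: a regressor trained only on scans of this structure.
        regressors[label] = fit_line([sum(scan) for scan in scans], targets)
    return centroids, regressors


def predict(model, scan):
    """Classify the structure first, then dispatch to that structure's regressor."""
    centroids, regressors = model

    def sq_dist(label):
        return sum((a - b) ** 2 for a, b in zip(scan, centroids[label]))

    structure = min(centroids, key=sq_dist)
    a, b = regressors[structure]
    return structure, a * sum(scan) + b


# Effectively unlimited training data: generate as many pairs as needed.
training_pairs = [
    (make_scan(structure, t), (structure, t))
    for structure in ("one_layer", "two_layer")
    for t in range(4, 17, 2)
]
model = train(training_pairs)
```

Calling `predict(model, make_scan("two_layer", 9))` on a thickness that was not in the training set returns the predicted structure label together with the estimated thickness. The point of the dispatch step is that each regressor only ever sees one kind of structure, so it can stay simple.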

It is worth pointing out that solving the inverse problem automatically improves the imaging depth, since the lower half of an OCT image (i.e., greater depth), which previously could hardly be seen, now becomes fully resolved. Interestingly, although the OCT signals constituting the lower half of the image are weak, messy, and uninterpretable to human eyes, they still carry enough information that, when fed into a well-trained machine learning model, yields precisely the true structure of the object being imaged. This is just another case where Artificial Intelligence (AI) outperforms humans. To the best of the author's knowledge, this thesis is not only a success but also the first attempt to reconstruct an OCT image at the pixel level. Even attempting this kind of task requires fully annotated OCT images, and a lot of them (hundreds or even thousands). This is clearly impossible without a powerful simulation tool like the one developed in this thesis.


Robust speech recognition in noise --- performance of the IBM continuous speech recogniser on the ARPA noise spoke task

Relevance: 20.00%


The development of the 1994 HTK large vocabulary speech recognition system

Relevance: 20.00%


The oculomotor system: (1) Vertical-horizontal interaction and signal recognition, (2) Time delays and power spectra

Relevance: 20.00%

Abstract:

In the first section of this thesis, two-dimensional properties of the human eye movement control system were studied. The vertical-horizontal interaction was investigated by using a two-dimensional target motion consisting of a sinusoid in one direction (vertical or horizontal) and low-pass filtered Gaussian random motion of variable bandwidth (and hence information content) in the orthogonal direction. It was found that the random motion reduced the efficiency of the sinusoidal tracking. However, the sinusoidal tracking was only slightly dependent on the bandwidth of the random motion. Thus the system should be thought of as consisting of two independent channels with a small amount of mutual cross-talk.

These target motions were then rotated to discover whether or not the system is capable of recognizing the two-component nature of the target motion. That is, the sinusoid was presented along an oblique line (neither vertical nor horizontal) with the random motion orthogonal to it. The system did not simply track the vertical and horizontal components of motion, but rotated its frame of reference so that its two tracking channels coincided with the directions of the two target motion components. This recognition occurred even when the two orthogonal motions were both random, but with different bandwidths.
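The kind of stimulus described above can be sketched as follows: a sinusoid along one axis, low-pass filtered Gaussian noise along the orthogonal axis, and the whole pattern rotated onto an oblique line. This is a minimal sketch under assumptions that the abstract does not state: the filter here is a first-order (exponential) low-pass, and the function names, sample interval, frequencies, and seed are all hypothetical.

```python
import math
import random


def low_pass_noise(n, dt, cutoff_hz, rng):
    """Gaussian white noise passed through a first-order low-pass filter (assumed filter type)."""
    alpha = dt / (dt + 1.0 / (2.0 * math.pi * cutoff_hz))
    y, out = 0.0, []
    for _ in range(n):
        y += alpha * (rng.gauss(0.0, 1.0) - y)
        out.append(y)
    return out


def target_motion(n, dt, sine_hz, cutoff_hz, angle_deg, seed=0):
    """Return n (x, y) target positions.

    The sinusoid lies along an axis rotated by angle_deg from horizontal,
    and the random motion lies along the orthogonal axis.
    """
    rng = random.Random(seed)
    theta = math.radians(angle_deg)
    c, s = math.cos(theta), math.sin(theta)
    sine = [math.sin(2.0 * math.pi * sine_hz * k * dt) for k in range(n)]
    noise = low_pass_noise(n, dt, cutoff_hz, rng)
    # Standard 2-D rotation of the (sinusoid, noise) components into screen axes.
    return [(c * u - s * v, s * u + c * v) for u, v in zip(sine, noise)]


# Oblique presentation: sinusoid along the 45-degree line, noise orthogonal to it.
path = target_motion(n=200, dt=0.01, sine_hz=0.5, cutoff_hz=2.0, angle_deg=45.0)
```

With `angle_deg=0.0` the sinusoid is purely horizontal and the noise purely vertical; projecting any rotated path back onto the rotated axis recovers the original sinusoid, which is the two-component structure the tracking system was found to recognize.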

In the second section, time delays, prediction and power spectra were examined. Time delays were calculated in response to various periodic signals, various bandwidths of narrow-band Gaussian random motions and sinusoids. It was demonstrated that prediction occurred only when the target motion was periodic, and only if the harmonic content was such that the signal was sufficiently narrow-band. It appears as if general periodic motions are split into predictive and non-predictive components.

For unpredictable motions, the relationship between the time delay and the average speed of the retinal image was linear. Based on this I proposed a model explaining the time delays for both random and periodic motions. My experiments did not prove that the system is a sampled-data system, or that it is continuous. However, the model can be interpreted as representative of a sampled-data system whose sample interval is a function of the target motion.

It was shown that increasing the bandwidth of the low-pass filtered Gaussian random motion resulted in an increase of the eye movement bandwidth. Some properties of the eyeball-muscle dynamics and the extraocular muscle "active state tension" were derived.


The 1994 HTK large vocabulary speech recognition system

Relevance: 20.00%


Proton magnetic resonance studies of ribonucleic acid complexes. I. Complexes of biological bases and oligonucleotides with RNA. II. Template recognition and the degeneracy of the genetic code

Relevance: 20.00%

Abstract:

Part I. Complexes of Biological Bases and Oligonucleotides with RNA

The physical nature of complexes of several biological bases and oligonucleotides with single-stranded ribonucleic acids has been studied by high resolution proton magnetic resonance spectroscopy. The importance of various forces in the stabilization of these complexes is also discussed.

Previous work has shown that purine forms an intercalated complex with single-stranded nucleic acids. This complex formation led to severe and stereospecific broadening of the purine resonances. From the field dependence of the linewidths, T₁ measurements of the purine protons and nuclear Overhauser enhancement experiments, the mechanism for the line broadening was ascertained to be dipole-dipole interactions between the purine protons and the ribose protons of the nucleic acid.

The interactions of ethidium bromide (EB) with several RNA residues have been studied. EB forms vertically stacked aggregates with itself as well as with uridine, 3'-uridine monophosphate and 5'-uridine monophosphate and forms an intercalated complex with uridylyl (3' → 5') uridine and polyuridylic acid (poly U). The geometry of EB in the intercalated complex has also been determined.

The effect of the chain length of oligo-A-nucleotides on their mode of interaction with poly U in D₂O at neutral pD has also been studied. Below room temperature, ApA and ApApA form a rigid triple-stranded complex involving a stoichiometry of one adenine to two uracil bases, presumably via specific adenine-uracil base pairing and cooperative base stacking of the adenine bases. While no evidence was obtained for the interaction of ApA with poly U above room temperature, ApApA exhibited complex formation of a 1:1 nature with poly U by forming Watson-Crick base pairs. The thermodynamics of these systems are discussed.

Part II. Template Recognition and the Degeneracy of the Genetic Code

The interaction of ApApG and poly U was studied as a model system for the codon-anticodon interaction of tRNA and mRNA in vivo. ApApG was shown to interact with poly U below ~20°C. The interaction was of a 1:1 nature which exhibited the Hoogsteen bonding scheme. The three bases of ApApG are in an anti conformation and the guanosine base appears to be in the lactim tautomeric form in the complex.

Due to the inadequacies of previous models for the degeneracy of the genetic code in explaining the observed interactions of ApApG with poly U, the "tautomeric doublet" model is proposed as a possible explanation of the degenerate interactions of tRNA with mRNA during protein synthesis in vivo.
