859 results for Semi-supervised learning
Abstract:
Multi-label classification (MLC) is the supervised learning problem where an instance may be associated with multiple labels. Modeling dependencies between labels allows MLC methods to improve their performance at the expense of an increased computational cost. In this paper we focus on the classifier chains (CC) approach for modeling dependencies. On the one hand, the original CC algorithm makes a greedy approximation, and is fast but tends to propagate errors down the chain. On the other hand, a recent Bayes-optimal method improves the performance, but is computationally intractable in practice. Here we present a novel double-Monte Carlo scheme (M2CC), both for finding a good chain sequence and performing efficient inference. The M2CC algorithm remains tractable for high-dimensional data sets and obtains the best overall accuracy, as shown on several real data sets with input dimension as high as 1449 and up to 103 labels.
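The contrast between greedy chain inference and sampling-based inference can be illustrated with a toy chain. This is a minimal sketch, not the M2CC algorithm: `link_probability` is a hypothetical stand-in for trained chain links with hand-picked probabilities, and the search over chain orderings is not shown.

```python
import random

def link_probability(j, x, prefix):
    """Hypothetical stand-in for trained link j: P(y_j = 1 | x, y_1..y_{j-1}).
    The numbers are hand-picked for illustration, not learned from data."""
    if j == 0:
        return 0.4
    return 1.0 if prefix[0] == 1 else 0.45

def joint_probability(x, labels):
    """Chain-rule probability of a full label vector."""
    prob = 1.0
    for j, y in enumerate(labels):
        p1 = link_probability(j, x, labels[:j])
        prob *= p1 if y == 1 else 1.0 - p1
    return prob

def greedy_inference(x, n_labels):
    """Original CC style: commit to the locally most likely label at each link."""
    labels = []
    for j in range(n_labels):
        labels.append(1 if link_probability(j, x, labels) >= 0.5 else 0)
    return tuple(labels)

def monte_carlo_inference(x, n_labels, n_samples=200, seed=0):
    """Sample label vectors along the chain and keep the most probable one seen."""
    rng = random.Random(seed)
    best, best_prob = None, -1.0
    for _ in range(n_samples):
        labels = []
        for j in range(n_labels):
            labels.append(1 if rng.random() < link_probability(j, x, labels) else 0)
        prob = joint_probability(x, labels)
        if prob > best_prob:
            best, best_prob = tuple(labels), prob
    return best
```

In this toy setting the greedy pass commits to `y_1 = 0` and ends on a label vector with joint probability 0.33, while sampling recovers the vector with joint probability 0.4, which is the error-propagation effect the abstract describes.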
Abstract:
This work proposes an optimization of a semi-supervised change detection methodology based on a combination of Change Indices (CI) derived from a multitemporal image data set. For this purpose, SPOT 5 panchromatic images with 2.5 m spatial resolution were used, from which three Change Indices were calculated. Two of them are well-known indices; the third was derived from the Kullback-Leibler divergence. These three indices were then combined into a multiband image that served as input to a Support Vector Machine (SVM) classifier, in which four different discriminant functions were tested to differentiate between the change and no_change categories. The performance of the suggested procedure was assessed with different quality measures, reaching highly satisfactory values in each case. These results demonstrate that combining basic change indices with more sophisticated ones, such as the Kullback-Leibler divergence, and applying non-parametric discriminant functions such as those employed in the SVM method, allows a change detection problem to be solved efficiently.
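A change index based on the Kullback-Leibler divergence can be sketched as follows. The abstract does not say how the index is computed, so the symmetrized form, the local-histogram estimate, the bin count and the function names here are all assumptions made for illustration.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """Discrete KL divergence D(p || q) in nats; eps guards against empty bins."""
    return sum(pi * math.log((pi + eps) / (qi + eps))
               for pi, qi in zip(p, q) if pi > 0)

def normalized_histogram(pixels, bins=4, lo=0, hi=256):
    """Histogram of grey levels in [lo, hi), normalized to sum to 1."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in pixels:
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    total = sum(counts)
    return [c / total for c in counts]

def kl_change_index(window_t1, window_t2, bins=4):
    """Assumed symmetrized KL index between the same window at two dates."""
    p = normalized_histogram(window_t1, bins)
    q = normalized_histogram(window_t2, bins)
    return kl_divergence(p, q) + kl_divergence(q, p)
```

Evaluating `kl_change_index` over a sliding window would give one band of the multiband image; identical windows yield 0 and the value grows as the local grey-level distributions diverge.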
Abstract:
The assessment of different alternatives in road-corridor planning and layout design must be based on a number of well-defined territorial variables that serve as decision-making criteria, and this requires a high-quality preliminary environmental analysis of those variables. In Spain, feasibility studies for new roads and motorways are associated with a phase of the decision procedure known as the Informative Study, which establishes the physical, environmental, land-use and cultural constraints to be considered in the early stages of defining road corridor layouts. The most common methodology is to establish different levels of Territorial Carrying Capacity (TCC) in the study area in order to summarize the territorial variables on thematic maps and facilitate the tracing process of road-corridor layout alternatives. Landscape is a constraint factor that must be considered in road planning and design, and the most sustainable layouts should be sought based on aesthetic and ecological criteria.
However, this factor is not often analyzed in Informative Studies, and even when it is, baseline studies on landscape quality (aesthetic and ecological) and landforms do not usually include the recommendations of road tracing guides designed to avoid or reduce impacts on the landscape. The resolution of the landscape maps produced in this type of study does not comply with the road design scale (1:5,000) recommended in the regulations for the Informative Study procedure. Another common shortcoming in road planning is that landscape ecological connectivity is not considered during road design in order to avoid affecting wildlife corridors in the landscape. In the prior road planning stage, this issue could lead to a major barrier effect for fauna dispersal movements and to the fragmentation of their habitat due to the partial or total occupation of habitat patches of biological importance for the fauna (or focal habitats), and the interruption of wildlife corridors that concentrate fauna dispersal movements between patches. The main goal of this dissertation is to improve the study of the landscape and prevent negative effects during the road tracing process, and to facilitate the preservation of wildlife corridors (or greenways) and the location of preventive and corrective measures by selecting and quantifying suitability factors to reduce visual and ecological landscape impacts at a local scale. Specifically, the incorporation of quantitative and well-supported values in the decision-making process provides increased transparency in the design of road corridors and layouts. Four specific questions were raised in this research: (1) How are territorial constraints selected and evaluated in terms of landscape by Spanish land-planning practitioners before locating a new road? (2) How can wildlife corridors be defined based on the landscape factors influencing the dispersal movements of fauna?
(3) How can wildlife corridors be delimited and assessed to include the partially erratic movements of fauna and the barrier effect of the anthropic elements at a local scale? (4) How can road design recommendations related to landscape and landforms be included in a Geographic Information System (GIS) model to aid civil engineers during the road layout design process and support sustainable development? This doctoral thesis proposes new methodologies that improve the assessment of the visual and ecological landscape character using indicators and GIS models to obtain road layout alternatives with a lower impact on the landscape. These methodologies were tested on a case study of a heterogeneous landscape with a high density of roe deer (Capreolus capreolus L.), one of the large mammals most commonly hit by vehicles on the Spanish road network, and where a new motorway is planned to pass through the middle of their distribution area. We explored the variables used in 22 road-corridor planning projects sponsored by the Ministry of Public Works between 2006 and 2008. These variables were grouped into physical, environmental, land-use and cultural constraints for the purpose of comparing the TCC values assigned to each variable in the various studies reviewed. As a prior stage in a connectivity analysis, a map of resistance to roe deer dispersal movements was created based on the literature and expert judgment. Using this research as a base, each factor selected to build the matrix was assigned a resistance value and weighted and combined with the rest of the factors using the analytic hierarchy process (AHP) and fuzzy logic operators as multicriteria assessment (MCA) methods. A GIS methodology was designed to clearly delimit the physical area of wildlife corridors according to a geometric threshold width value, and the multiple potential connections between each pair of habitat patches in the landscape.
Finally, a Light Detection and Ranging (LiDAR) Digital Surface Model dataset was processed and a GIS model was built to determine landscape quality (aesthetic and ecological), landforms with similar characteristics for the road layout, and the cumulative viewshed of potential drivers and observers in the area surrounding the new motorway. This research makes four main contributions to current scientific knowledge in the field of environmental impact assessment for road corridor and layout design. First, the analysis of 22 Informative Studies on road planning revealed that the methods applied by practitioners for assessing the TCC were not sufficiently standardized due to the lack of uniformity in the cartographic information sources and the TCC valuation methodologies, especially in the analysis of the aesthetic and ecological quality of the landscape. Second, the analysis in this dissertation highlights the importance of multicriteria methods to structure, combine and validate factors that constrain wildlife dispersal movements in the connectivity analysis. Third, the “Generator of Alternative Corridors (GAC)” and “Narrow Corridor Eraser (NCE)” GIS models developed can be applied systematically and on a scientific basis in connectivity analyses to improve existing tools and understand landscape as a network composed of interconnected nodes and links. Thus, alternative corridors with similar probability of use by fauna and without bottlenecks can be obtained by iteratively running the GAC and NCE models. Fourth, our case study of corridor and layout design for a new motorway innovatively included semi-supervised classification of landforms, filtering of LiDAR point clouds and new 3D road geometry on the Digital Surface Model (DSM).
The combined use of LiDAR data processing and geomorphological indices and classifications can help decision-makers assess which road layouts produce lower impacts on the landscape, provides an overall insight into the most commonly applied value judgments, and, in conclusion, defines which corrective landscaping measures should be applied, and where.
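The AHP weighting step mentioned in this abstract can be sketched as below. This is a minimal illustration under stated assumptions: the three factors and their pairwise judgments are hypothetical, the row geometric-mean method is used as a common approximation to the principal-eigenvector weights, and the fuzzy logic operators and the thesis's actual resistance values are not reproduced.

```python
import math

def ahp_weights(pairwise):
    """Approximate AHP priority weights by the row geometric-mean method."""
    n = len(pairwise)
    geo_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]

def combined_resistance(factor_values, weights):
    """Weighted linear combination of per-factor resistance values for one cell."""
    return sum(w * v for w, v in zip(weights, factor_values))

# Hypothetical pairwise judgments for three illustrative factors:
# land cover vs. distance to roads vs. slope (values are assumptions).
pairwise = [
    [1.0, 2.0, 4.0],
    [0.5, 1.0, 2.0],
    [0.25, 0.5, 1.0],
]
weights = ahp_weights(pairwise)
cell_resistance = combined_resistance([10.0, 50.0, 5.0], weights)
```

Applying `combined_resistance` to every raster cell would yield a resistance surface of the kind used as input to a connectivity analysis.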
Abstract:
This work aims to identify and analyze the marketing and communication tools used by the Universidade Metodista de São Paulo, the Universidade Anhembi Morumbi and the Universidade de São Paulo in their distance education courses, as well as in the blended (semipresencial) courses offered by these institutions, through research on their websites and interviews with those responsible for developing distance education (EaD) at the institutions. It also intends to compare the strategies these institutions created in developing their courses and disciplines, and to identify and analyze the communication they produce to publicize their offerings and persuade their target audiences. A further objective is to gather the knowledge and perceptions that their students have of these courses and disciplines, and of distance education as a whole, through individual interviews using a structured questionnaire. (AU)
Abstract:
We introduce a method of functionally classifying genes by using gene expression data from DNA microarray hybridization experiments. The method is based on the theory of support vector machines (SVMs). SVMs are considered a supervised computer learning method because they exploit prior knowledge of gene function to identify unknown genes of similar function from expression data. SVMs avoid several problems associated with unsupervised clustering methods, such as hierarchical clustering and self-organizing maps. SVMs have many mathematical features that make them attractive for gene expression analysis, including their flexibility in choosing a similarity function, sparseness of solution when dealing with large data sets, the ability to handle large feature spaces, and the ability to identify outliers. We test several SVMs that use different similarity metrics, as well as some other supervised learning methods, and find that the SVMs best identify sets of genes with a common function using expression data. Finally, we use SVMs to predict functional roles for uncharacterized yeast ORFs based on their expression data.
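The "flexibility in choosing a similarity function" refers to the kernel an SVM uses to compare two expression profiles. The sketch below shows two widely used kernels; the abstract does not name the metrics that were tested, so these choices, the parameter defaults and the function names are assumptions for illustration only.

```python
import math

def dot(u, v):
    """Plain dot product of two expression vectors."""
    return sum(a * b for a, b in zip(u, v))

def polynomial_kernel(u, v, degree=2):
    """(u . v + 1)^degree: similarity that includes feature interactions."""
    return (dot(u, v) + 1.0) ** degree

def rbf_kernel(u, v, sigma=1.0):
    """Gaussian radial basis kernel: 1 for identical vectors, decays with distance."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))
```

Swapping one kernel for another changes the notion of gene similarity without changing the rest of the SVM training procedure.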
Abstract:
The field of natural language processing (NLP) has grown considerably in recent years; its research areas include information retrieval and extraction, data mining, machine translation, question answering systems, automatic summarization, and sentiment analysis, among others. This article presents concepts and some tools intended to contribute to the understanding of text processing with NLP techniques, with the aim of extracting relevant information that can be used in a wide range of applications. Automatic classifiers can be developed to categorize documents and recommend tags; these classifiers should be platform-independent, easily customizable so that they can be integrated into different projects, and able to learn from examples. This article introduces these classification algorithms, analyzes some currently available open-source tools for carrying out these tasks, and compares several implementations using the F-measure to evaluate the classifiers.
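The F-measure used to compare the classifiers combines precision and recall for one category. A minimal sketch, assuming single-label predictions and a chosen positive category; the function name and the example labels are illustrative, not taken from the article.

```python
def f_measure(true_labels, predicted_labels, positive, beta=1.0):
    """Return (precision, recall, F_beta) for the `positive` category."""
    tp = fp = fn = 0
    for truth, guess in zip(true_labels, predicted_labels):
        if guess == positive and truth == positive:
            tp += 1
        elif guess == positive:
            fp += 1
        elif truth == positive:
            fn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision == 0.0 and recall == 0.0:
        return precision, recall, 0.0
    b2 = beta ** 2
    f_beta = (1 + b2) * precision * recall / (b2 * precision + recall)
    return precision, recall, f_beta

# Illustrative labels only.
truth = ["spam", "spam", "ham", "ham", "spam"]
guess = ["spam", "ham", "ham", "spam", "spam"]
precision, recall, f1 = f_measure(truth, guess, positive="spam")
```

With `beta=1.0` this is the balanced F1 score; larger `beta` values weight recall more heavily.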
Abstract:
This thesis contributes to research toward artificial intelligence using connectionist methods. Recurrent neural networks are an increasingly popular family of sequential models capable, in principle, of learning arbitrary algorithms. These models perform deep learning, a type of machine learning. Their generality and empirical success make them an interesting subject for research and a promising tool for creating more general artificial intelligence. The first chapter of this thesis gives a brief overview of the background topics: artificial intelligence, machine learning, deep learning and recurrent neural networks. The next three chapters cover these topics in increasingly specific detail. Finally, we present some contributions to recurrent neural networks. Chapter \ref{arxiv1} presents our work on regularizing recurrent neural networks. Regularization aims to improve the generalization capacity of the model and plays a key role in the performance of several applications of recurrent neural networks, particularly speech recognition. Our approach achieves state-of-the-art results on TIMIT, a standard benchmark for this task. Chapter \ref{cpgp} presents a second line of work, still in progress, which explores a new architecture for recurrent neural networks. Recurrent neural networks maintain a hidden state that represents their previous observations. The idea of this work is to encode certain abstract dynamics in the hidden state, giving the network a natural way to encode consistent trends in the state of its environment. Our work builds on an existing model; we describe this work and our contributions, including a preliminary experiment.
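How a recurrent network carries a hidden state from one observation to the next can be sketched with a plain Elman-style update, h_t = tanh(W_xh x_t + W_hh h_{t-1} + b). This generic cell is an assumption chosen for illustration; it is not the architecture or the regularizer proposed in the thesis.

```python
import math

def rnn_step(x, h_prev, w_xh, w_hh, b):
    """One Elman-style update: h_t = tanh(W_xh x_t + W_hh h_{t-1} + b)."""
    h_new = []
    for i in range(len(h_prev)):
        total = b[i]
        total += sum(w_xh[i][k] * x[k] for k in range(len(x)))
        total += sum(w_hh[i][k] * h_prev[k] for k in range(len(h_prev)))
        h_new.append(math.tanh(total))
    return h_new

def run_rnn(inputs, h0, w_xh, w_hh, b):
    """Feed a sequence through the cell and return every hidden state."""
    h = h0
    states = []
    for x in inputs:
        h = rnn_step(x, h, w_xh, w_hh, b)
        states.append(h)
    return states
```

Because `h` is fed back at every step, the final state depends on the whole input sequence, which is what lets the hidden state summarize earlier observations.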
Abstract:
Thesis (Ph.D.)--University of Washington, 2016-04
Abstract:
An unsupervised learning procedure based on maximizing the mutual information between the outputs of two networks receiving different but statistically dependent inputs is analyzed (Becker S. and Hinton G., Nature, 355 (1992) 161). By exploiting a formal analogy to supervised learning in parity machines, the theory of zero-temperature Gibbs learning for the unsupervised procedure is presented for the case that the networks are perceptrons and for the case of fully connected committees.
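The quantity being maximized, the mutual information between the two networks' outputs, can be estimated empirically from paired outputs. A minimal sketch, assuming discrete (for example ±1) outputs and a plug-in estimate from sample frequencies; the function name is illustrative and the statistical-mechanics analysis of the paper is not reproduced.

```python
import math
from collections import Counter

def mutual_information(outputs_a, outputs_b):
    """Plug-in estimate of I(A; B) in bits from paired discrete outputs."""
    n = len(outputs_a)
    joint = Counter(zip(outputs_a, outputs_b))
    marginal_a = Counter(outputs_a)
    marginal_b = Counter(outputs_b)
    mi = 0.0
    for (a, b), count in joint.items():
        p_ab = count / n
        p_a = marginal_a[a] / n
        p_b = marginal_b[b] / n
        mi += p_ab * math.log2(p_ab / (p_a * p_b))
    return mi
```

Two networks whose outputs always agree on balanced ±1 data share one full bit, while statistically independent outputs share none.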
Abstract:
This thesis describes a novel connectionist machine utilizing induction by a Hilbert hypercube representation. This representation offers a number of distinct advantages, which are described. We construct a theoretical and practical learning machine which lies in an area of overlap between three disciplines (neural nets, machine learning and knowledge acquisition); hence it is referred to as a "coalesced" machine. To this unifying aspect are added the various advantages of its orthogonal lattice structure over less structured nets. We discuss the case for such a fundamental and low-level empirical learning tool, and the assumptions behind the machine are clearly outlined. Our theory of an orthogonal lattice structure, the Hilbert hypercube of an n-dimensional space, using a complemented distributive lattice as a basis for supervised learning, is derived from first principles. The resulting "subhypercube theory" was implemented in a development machine, which was then used to test the theoretical predictions, again under strict scientific guidelines. The scope, advantages and limitations of this machine were tested in a series of experiments. Novel and seminal properties of the machine include: the "metrical", deterministic and global nature of its search; complete convergence, invariably producing minimum polynomial solutions for both disjuncts and conjuncts even with moderate levels of noise present; a learning engine which is mathematically analysable in depth based upon the "complexity range" of the function concerned; a strong bias towards the simplest possible globally (rather than locally) derived "balanced" explanation of the data; the ability to cope with variables in the network; and new ways of reducing the exponential explosion. Performance issues were addressed, and comparative studies with other learning machines indicate that our novel approach has definite value and should be further researched.
Abstract:
Neural networks have been successfully employed in different biomedical settings. They have been useful for feature extraction from images and biomedical data in a variety of diagnostic applications. In this paper, they are applied as a diagnostic tool for classifying different levels of gastric electrical uncoupling in controlled acute experiments on dogs. Data were collected from 16 dogs using six bipolar electrodes inserted into the serosa of the antral wall. Each dog underwent three recordings under different conditions: (1) basal state, (2) mild surgically-induced uncoupling, and (3) severe surgically-induced uncoupling. For each condition, half-hour recordings were made. The neural network was implemented according to the Learning Vector Quantization model, a supervised counterpart of the Kohonen Self-Organizing Map. The majority of the recordings collected from the dogs were used for network training. The remaining recordings served as a test set to examine the validity of the training procedure. Approximately 90% of the dogs from the neural network training set were classified properly. However, only 31% of the dogs not included in the training process were accurately diagnosed. The poor neural-network-based diagnosis of recordings that did not participate in the training process might have been caused by inappropriate representation of input data. Previous research has suggested characterizing signals according to certain features of the recorded data. This method, if employed, would reduce the noise and possibly improve the diagnostic abilities of the neural network.
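The basic Learning Vector Quantization rule (LVQ1) can be sketched as follows: the prototype nearest to a training sample moves toward it when their class labels match and away from it otherwise. The abstract does not state which LVQ variant, learning rate or input features were used, so the names and values below are assumptions for illustration.

```python
def nearest_prototype(x, prototypes):
    """Index of the prototype with the smallest squared Euclidean distance to x."""
    distances = [sum((xi - pi) ** 2 for xi, pi in zip(x, proto))
                 for proto in prototypes]
    return distances.index(min(distances))

def lvq1_update(x, label, prototypes, proto_labels, lr=0.1):
    """Apply one LVQ1 step in place and return the index of the updated prototype."""
    i = nearest_prototype(x, prototypes)
    sign = 1.0 if proto_labels[i] == label else -1.0
    prototypes[i] = [p + sign * lr * (xi - p) for xi, p in zip(x, prototypes[i])]
    return i

def classify(x, prototypes, proto_labels):
    """Predict the label of the nearest prototype."""
    return proto_labels[nearest_prototype(x, prototypes)]
```

Training repeats `lvq1_update` over the labelled recordings, usually with a decaying learning rate, and `classify` is then applied to held-out recordings.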
Abstract:
As one of the most popular deep learning models, the convolutional neural network (CNN) has achieved huge success in image information extraction. Traditionally, a CNN is trained by supervised learning with labeled data and used as a classifier by adding a classification layer at the end. Its capability for extracting image features is largely limited by the difficulty of setting up a large training dataset. In this paper, we propose a new unsupervised learning CNN model, which uses a so-called convolutional sparse auto-encoder (CSAE) algorithm to pre-train the CNN. Instead of using labeled natural images for CNN training, the CSAE algorithm can be used to train the CNN with unlabeled artificial images, which enables easy expansion of training data and unsupervised learning. The CSAE algorithm is specially designed for extracting complex features from specific objects such as Chinese characters. After the features of artificial images are extracted by the CSAE algorithm, the learned parameters are used to initialize the first CNN convolutional layer, and then the CNN model is fine-tuned on scene image patches with a linear classifier. The new CNN model is applied to Chinese scene text detection and is evaluated on a multilingual image dataset, which labels Chinese, English and numeral texts separately. A detection precision gain of more than 10% is observed over two CNN models.