919 results for Semi-supervised classification


Relevance:

100.00% 100.00%

Publisher:

Abstract:

Semi-supervised learning techniques have gained increasing attention in the machine learning community as a result of two main factors: (1) the amount of available data is increasing exponentially; (2) data labeling is cumbersome and expensive, as it involves human experts. In this paper, we propose a network-based semi-supervised learning method inspired by the modularity greedy algorithm, which was originally applied to unsupervised learning. The modularity maximization process has been changed so that the model propagates labels throughout the network. Furthermore, a network reduction technique is introduced, together with an extensive analysis of its impact on the network. Computer simulations are performed on artificial and real-world databases, providing a quantitative basis for assessing the performance of the proposed method.
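The idea of adapting greedy modularity maximization to label propagation can be illustrated with a minimal sketch. The merge rule used here (two communities carrying different labels are never merged, and merging continues until no admissible pair is left), the function names and the toy graph are assumptions made for this example; they are not the authors' exact algorithm, and the network reduction step is not covered.

```python
def modularity(edges, community):
    """Newman modularity Q of a partition of an undirected, unweighted graph."""
    m = len(edges)
    inside = sum(1 for u, v in edges if community[u] == community[v]) / m
    ends = {}  # number of edge ends attached to each community
    for u, v in edges:
        for node in (u, v):
            c = community[node]
            ends[c] = ends.get(c, 0) + 1
    return inside - sum((k / (2 * m)) ** 2 for k in ends.values())


def propagate_labels(edges, seeds):
    """Greedy merging in which differently labeled communities never merge."""
    nodes = sorted({n for edge in edges for n in edge})
    community = {n: n for n in nodes}  # every node starts alone
    label = dict(seeds)                # community id -> class label
    while True:
        best = None
        for u, v in edges:
            cu, cv = community[u], community[v]
            if cu == cv:
                continue
            lu, lv = label.get(cu), label.get(cv)
            if lu is not None and lv is not None and lu != lv:
                continue  # assumed constraint: classes stay separate
            # Tentatively merge community cv into cu and score the result.
            trial = {n: (cu if c == cv else c) for n, c in community.items()}
            q = modularity(edges, trial)
            if best is None or q > best[0]:
                best = (q, cu, cv, trial)
        if best is None:
            break  # no admissible merge is left
        _, cu, cv, community = best
        if cu not in label and cv in label:
            label[cu] = label[cv]  # the merged community inherits the label
    return {n: label.get(community[n]) for n in nodes}


# Toy graph: two triangles joined by one edge, one labeled node in each.
edges = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (4, 5), (3, 5)]
labels = propagate_labels(edges, {0: "A", 5: "B"})
print(labels)
```

On this toy graph each triangle ends up with the label of its seed node, because merging across the bridge edge would join two differently labeled communities.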

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The strength and durability of materials produced from aggregates (e.g., concrete bricks, concrete, and ballast) are critically affected by the weathering of the particles, which is closely related to their mineral composition. It is possible to infer the degree of weathering from visual features derived from the surface of the aggregates. By using sound pattern recognition methods, this study shows that the characterization of the visual texture of particles, performed by using texture-related features of gray scale images, allows the effective differentiation between weathered and nonweathered aggregates. The selection of the most discriminative features is also performed by taking into account a feature ranking method. The evaluation of the methodology in the presence of noise suggests that it can be used in stone quarries for automatic detection of weathered materials.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Training a system to recognize handwritten words is a task that requires a large amount of data with their correct transcription. However, the creation of such a training set, including the generation of the ground truth, is tedious and costly. One way of reducing the high cost of labeled training data acquisition is to exploit unlabeled data, which can be gathered easily. Making use of both labeled and unlabeled data is known as semi-supervised learning. One of the most general versions of semi-supervised learning is self-training, where a recognizer iteratively retrains itself on its own output on new, unlabeled data. In this paper we propose to apply semi-supervised learning, and in particular self-training, to the problem of cursive, handwritten word recognition. The special focus of the paper is on retraining rules that define what data are actually being used in the retraining phase. In a series of experiments it is shown that the performance of a neural network based recognizer can be significantly improved through the use of unlabeled data and self-training if appropriate retraining rules are applied.
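The self-training loop and the role of a retraining rule can be sketched as follows. This is only an illustration under stated assumptions: a one-dimensional nearest-centroid classifier stands in for the neural-network-based recognizer, and the retraining rule (accept a prediction only when its distance margin reaches `threshold`) is a hypothetical example, not one of the rules studied in the paper.

```python
def fit_centroids(samples):
    """samples: list of (x, label) pairs; returns {label: mean of x}."""
    sums, counts = {}, {}
    for x, y in samples:
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}


def predict(centroids, x):
    """Return (label, confidence); confidence is the margin to the runner-up."""
    ranked = sorted(centroids, key=lambda y: abs(x - centroids[y]))
    best = ranked[0]
    if len(ranked) == 1:
        return best, float("inf")
    margin = abs(x - centroids[ranked[1]]) - abs(x - centroids[best])
    return best, margin


def self_train(labeled, unlabeled, threshold, max_rounds=10):
    """Retrain repeatedly on the model's own confident predictions."""
    labeled, pool = list(labeled), list(unlabeled)
    for _ in range(max_rounds):
        centroids = fit_centroids(labeled)
        accepted, rest = [], []
        for x in pool:
            y, confidence = predict(centroids, x)
            # Retraining rule (assumed): keep only confident predictions.
            (accepted if confidence >= threshold else rest).append((x, y))
        if not accepted:
            break
        labeled.extend(accepted)
        pool = [x for x, _ in rest]
    return fit_centroids(labeled), pool


model, still_unlabeled = self_train(
    labeled=[(0.0, "a"), (10.0, "b")],
    unlabeled=[1.0, 2.0, 9.0, 8.0, 5.0],
    threshold=2.0,
)
print(model, still_unlabeled)
```

The ambiguous sample at 5.0 is never added to the training set, which is the kind of behavior a retraining rule is meant to control.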

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The aim of this research was to implement a methodology based on a supervised classifier that uses the Mahalanobis distance to characterize the grapevine canopy and assess leaf area and yield from RGB images. The method automatically processes sets of images and calculates the areas (number of pixels) corresponding to seven different classes (Grapes, Wood, Background, and four classes of Leaf, of increasing leaf age). Each class is initialized by the user, who selects a set of representative pixels for it in order to induce the clustering around them. The proposed methodology was evaluated with 70 grapevine (V. vinifera L. cv. Tempranillo) images, acquired in a commercial vineyard located in La Rioja (Spain), after several defoliation and de-fruiting events on 10 vines, with a conventional RGB camera and no artificial illumination. The segmentation results showed a performance of 92% for leaves and 98% for clusters, and allowed the grapevine’s leaf area and yield to be assessed with R² values of 0.81 (p < 0.001) and 0.73 (p = 0.002), respectively. This methodology, which operates with a simple image acquisition setup and guarantees the right number and kind of pixel classes, has proven suitable and robust enough to provide valuable information for vineyard management.
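A Mahalanobis-distance pixel classifier of the kind described can be sketched in a few lines. Everything specific here is an assumption for illustration: the seed pixel values are invented, only two of the seven classes are shown, and the helper names are hypothetical. Each class is summarized by the mean and covariance of its user-selected seed pixels, and a pixel is assigned to the class with the smallest squared distance d² = (x − μ)ᵀ Σ⁻¹ (x − μ).

```python
def mean_and_cov(pixels):
    """Sample mean and covariance of a list of equal-length tuples."""
    n, d = len(pixels), len(pixels[0])
    mu = [sum(p[i] for p in pixels) / n for i in range(d)]
    cov = [[sum((p[i] - mu[i]) * (p[j] - mu[j]) for p in pixels) / (n - 1)
            for j in range(d)] for i in range(d)]
    return mu, cov


def solve(a, b):
    """Solve a z = b by Gaussian elimination (a is assumed non-singular)."""
    n = len(b)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    z = [0.0] * n
    for r in reversed(range(n)):
        z[r] = (m[r][n] - sum(m[r][c] * z[c] for c in range(r + 1, n))) / m[r][r]
    return z


def mahalanobis_sq(x, mu, cov):
    """Squared Mahalanobis distance, computed without inverting cov."""
    diff = [xi - mi for xi, mi in zip(x, mu)]
    return sum(d * zi for d, zi in zip(diff, solve(cov, diff)))


def classify(pixel, classes):
    """Return the class whose seed distribution is closest to the pixel."""
    return min(classes, key=lambda name: mahalanobis_sq(pixel, *classes[name]))


# Hypothetical seed pixels (R, G, B) picked by the user for two classes.
seeds = {
    "Leaf": [(40, 120, 30), (50, 140, 35), (45, 110, 50), (60, 130, 25), (35, 125, 40)],
    "Wood": [(120, 80, 50), (130, 90, 60), (110, 85, 45), (125, 75, 65), (115, 95, 55)],
}
classes = {name: mean_and_cov(pixels) for name, pixels in seeds.items()}
print(classify((46, 125, 36), classes))
```

Counting the pixels assigned to each class over an image then gives the per-class areas mentioned in the abstract.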

Relevance:

100.00% 100.00%

Publisher:

Abstract:

INTRODUCTION: Objective assessment of motor skills has become an important challenge in minimally invasive surgery (MIS) training. Currently, there is no gold standard defining and determining the residents' surgical competence. To aid in the decision process, we analyze the validity of a supervised classifier to determine the degree of MIS competence based on assessment of psychomotor skills. METHODOLOGY: An adaptive neuro-fuzzy inference system (ANFIS) is trained to classify performance in a box trainer peg transfer task performed by two groups (expert/non-expert). There were 42 participants included in the study: the non-expert group consisted of 16 medical students and 8 residents (< 10 MIS procedures performed), whereas the expert group consisted of 14 residents (> 10 MIS procedures performed) and 4 experienced surgeons. Instrument movements were captured by means of the Endoscopic Video Analysis (EVA) tracking system. Nine motion analysis parameters (MAPs) were analyzed, including time, path length, depth, average speed, average acceleration, economy of area, economy of volume, idle time and motion smoothness. Data reduction was performed by means of principal component analysis, and the reduced data were then used to train the ANFIS net. Performance was measured by leave-one-out cross-validation. RESULTS: The ANFIS presented an accuracy of 80.95%, where 13 experts and 21 non-experts were correctly classified. Total root mean square error was 0.88, while the area under the classifier's ROC curve (AUC) was measured at 0.81. DISCUSSION: We have shown the usefulness of ANFIS for classification of MIS competence in a simple box trainer exercise. The main advantage of using ANFIS resides in its continuous output, which allows fine discrimination of surgical competence. There are, however, challenges that must be taken into account when considering use of ANFIS (e.g. training time, architecture modeling).
Despite this, we have shown the discriminative power of ANFIS for a low-difficulty box trainer task, regardless of the individual significances between MAPs. Future studies including new tasks, conditions and sample populations are required to confirm the findings.
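The reported accuracy can be checked directly from the group sizes and the counts of correctly classified participants given in the abstract. The per-group rates printed at the end are derived here from those same counts and are not figures reported by the authors.

```python
# Group sizes as stated in the abstract.
experts_total = 14 + 4        # residents with > 10 procedures + surgeons
non_experts_total = 16 + 8    # medical students + residents with < 10 procedures

# Correct classifications as stated in the abstract.
experts_correct = 13
non_experts_correct = 21

accuracy = (experts_correct + non_experts_correct) / (experts_total + non_experts_total)
print(f"accuracy: {accuracy:.2%}")

# Derived per-group rates (not reported in the abstract).
print(f"experts correctly classified: {experts_correct / experts_total:.2%}")
print(f"non-experts correctly classified: {non_experts_correct / non_experts_total:.2%}")
```

The first line printed is 80.95% (34 of 42 participants), which matches the reported value.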

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Background: Objective assessment of psychomotor skills has become an important challenge in the training of minimally invasive surgical (MIS) techniques. Currently, no gold standard defining surgical competence exists for classifying residents according to their surgical skills. Supervised classification has been proposed as a means of objectively establishing competence thresholds in psychomotor skills evaluation. This report presents a study comparing three classification methods to establish their validity in a set of tasks for basic skills assessment. Methods: Linear discriminant analysis (LDA), support vector machines (SVM), and adaptive neuro-fuzzy inference systems (ANFIS) were used. A total of 42 participants, divided into an experienced group (4 expert surgeons and 14 residents with >10 laparoscopic surgeries performed) and a nonexperienced group (16 students and 8 residents with <10 laparoscopic surgeries performed), performed three box trainer tasks validated for assessment of MIS psychomotor skills. Instrument movements were captured using the TrEndo tracking system, and nine motion analysis parameters (MAPs) were analyzed. The performance of the classifiers was measured by leave-one-out cross-validation using the scores obtained by the participants. Results: The mean accuracies of the classifiers were 71% (LDA), 78.2% (SVM), and 71.7% (ANFIS). No statistically significant differences in performance were identified between the classifiers. Conclusions: The three proposed classifiers showed good performance in the discrimination of skills, especially when information from all MAPs and tasks combined was considered. A correlation between the surgeons’ previous experience and their execution of the tasks could be ascertained from the results. However, misclassifications across all the classifiers could imply the existence of other factors influencing psychomotor competence.
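Leave-one-out cross-validation, the evaluation scheme used here, holds out each participant in turn, trains on the rest and tests on the held-out one. The sketch below shows the procedure only: a one-dimensional nearest-centroid classifier is a stand-in for LDA, SVM and ANFIS, and the scores and labels are invented for the example.

```python
def fit_centroids(samples):
    """samples: list of (score, label) pairs; returns {label: mean score}."""
    sums, counts = {}, {}
    for x, y in samples:
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}


def predict_label(centroids, x):
    """Assign the label of the nearest class centroid."""
    return min(centroids, key=lambda y: abs(x - centroids[y]))


def leave_one_out_accuracy(samples, fit, predict):
    """Train on all samples but one, test on the held-out one, and repeat."""
    correct = 0
    for i, (x, y) in enumerate(samples):
        training = samples[:i] + samples[i + 1:]
        model = fit(training)
        correct += predict(model, x) == y
    return correct / len(samples)


# Hypothetical task scores for six participants.
samples = [
    (1.0, "nonexperienced"), (1.2, "nonexperienced"), (0.9, "nonexperienced"),
    (3.0, "experienced"), (3.2, "experienced"), (1.1, "experienced"),
]
loo_accuracy = leave_one_out_accuracy(samples, fit_centroids, predict_label)
print(f"{loo_accuracy:.3f}")
```

Because `fit` and `predict` are passed in as functions, the same loop can wrap any classifier, which is what makes a like-for-like comparison of several methods straightforward.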

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Thesis (Ph.D.)--University of Washington, 2016-06

Relevance:

100.00% 100.00%

Publisher:

Abstract:

We have recently developed a principled approach to interactive non-linear hierarchical visualization [8] based on the Generative Topographic Mapping (GTM). Hierarchical plots are needed when a single visualization plot is not sufficient (e.g. when dealing with large quantities of data). In this paper we extend our system by giving the user a choice of initializing the child plots of the current plot in either interactive or automatic mode. In the interactive mode the user interactively selects “regions of interest” as in [8], whereas in the automatic mode an unsupervised minimum message length (MML)-driven construction of a mixture of GTMs is used. The latter is particularly useful when the plots are covered with dense clusters of highly overlapping data projections, making it difficult to use the interactive mode. Such a situation often arises when visualizing large data sets. We illustrate our approach on a data set of 2300 18-dimensional points and mention an extension of our system to accommodate discrete data types.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

An interactive hierarchical Generative Topographic Mapping (HGTM) \cite{HGTM} has been developed to visualise complex data sets. In this paper, we build a more general visualisation system by extending the HGTM visualisation system in 3 directions: (1) We generalize HGTM to noise models from the exponential family of distributions. The basic building block is the Latent Trait Model (LTM) developed in \cite{Kabanpami}. (2) We give the user a choice of initializing the child plots of the current plot in either interactive or automatic mode. In the interactive mode the user interactively selects “regions of interest” as in \cite{HGTM}, whereas in the automatic mode an unsupervised minimum message length (MML)-driven construction of a mixture of LTMs is employed. (3) We derive general formulas for magnification factors in latent trait models. Magnification factors are a useful tool to improve our understanding of the visualisation plots, since they can highlight the boundaries between data clusters. The unsupervised construction is particularly useful when high-level plots are covered with dense clusters of highly overlapping data projections, making it difficult to use the interactive mode. Such a situation often arises when visualizing large data sets. We illustrate our approach on a toy example and apply our system to three more complex real data sets.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Natural language understanding aims to specify a computational model that maps sentences to their semantic meaning representations. In this paper, we propose a novel framework to train statistical models without using expensive fully annotated data. In particular, the input of our framework is a set of sentences labeled with abstract semantic annotations. These annotations encode the underlying embedded semantic structural relations without explicit word/semantic tag alignment. The proposed framework can automatically induce derivation rules that map sentences to their semantic meaning representations. The learning framework is applied to two statistical models, conditional random fields (CRFs) and hidden Markov support vector machines (HM-SVMs). Our experimental results on the DARPA Communicator data show that both CRFs and HM-SVMs outperform the baseline approach, the previously proposed hidden vector state (HVS) model, which is also trained on abstract semantic annotations. In addition, the proposed framework outperforms two other baseline approaches, a hybrid framework combining HVS and HM-SVMs and discriminative training of HVS, achieving relative error reductions of about 25% and 15%, respectively, in F-measure.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

In machine learning, the Gaussian process latent variable model (GP-LVM) has been extensively applied to unsupervised dimensionality reduction. When some supervised information, e.g., pairwise constraints or labels of the data, is available, the traditional GP-LVM cannot directly use it to improve the performance of dimensionality reduction. In this case, the traditional GP-LVM must be modified to make it capable of handling supervised or semi-supervised learning tasks. For this purpose, we propose a new semi-supervised GP-LVM framework under pairwise constraints. By transferring the pairwise constraints from the observed space to the latent space, constrained prior information on the latent variables can be obtained. Under this constrained prior, the latent variables are optimized by the maximum a posteriori (MAP) algorithm. The effectiveness of the proposed algorithm is demonstrated with experiments on a variety of data sets. © 2010 Elsevier B.V.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Permafrost landscapes experience different disturbances and store large amounts of organic matter, which may become a source of greenhouse gases upon permafrost degradation. We analysed the influence of terrain and geomorphic disturbances (e.g. soil creep, active-layer detachment, gullying, thaw slumping, accumulation of fluvial deposits) on soil organic carbon (SOC) and total nitrogen (TN) storage using 11 permafrost cores from Herschel Island, western Canadian Arctic. Our results indicate a strong correlation between SOC storage and the topographic wetness index. Undisturbed sites stored the majority of SOC and TN in the upper 70 cm of soil. Sites characterised by mass wasting showed significant SOC depletion and soil compaction, whereas sites characterised by the accumulation of peat and fluvial deposits store SOC and TN along the whole core. We upscaled SOC and TN to estimate total stocks using the ecological units determined from vegetation composition, slope angle and the geomorphic disturbance regime. The ecological units were delineated with a supervised classification based on RapidEye multispectral satellite imagery and slope angle. Mean SOC and TN storage for the uppermost 1 m of soil on Herschel Island are 34.8 kg C/m² and 3.4 kg N/m², respectively.
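Two of the quantities mentioned above lend themselves to a short worked sketch. The topographic wetness index is shown in its standard form, TWI = ln(a / tan β), with a the specific catchment area and β the local slope; whether the study used exactly this variant is an assumption. The upscaling step is shown as an area-weighted sum over ecological units, with unit names, areas and storage values invented for the example.

```python
import math


def topographic_wetness_index(specific_catchment_area_m, slope_rad):
    """Standard TWI: natural log of catchment area over the slope tangent."""
    return math.log(specific_catchment_area_m / math.tan(slope_rad))


def upscale_stock(units):
    """units: {name: (area_m2, mean_storage_kg_per_m2)}; returns total kg."""
    return sum(area * storage for area, storage in units.values())


# Hypothetical ecological units: (area in m², mean SOC storage in kg C/m²).
units = {"unit_a": (2.0e6, 40.0), "unit_b": (1.0e6, 20.0)}

total_kg = upscale_stock(units)
total_area = sum(area for area, _ in units.values())
mean_storage = total_kg / total_area  # area-weighted mean, kg C/m²
print(total_kg, round(mean_storage, 2))
```

An island-wide mean such as the one reported in the abstract corresponds to `mean_storage` in this sketch: the total stock divided by the total mapped area.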

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Resources created at the University of Southampton for the module Remote Sensing for Earth Observation

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The assessment of different alternatives in road-corridor planning and layout design must be based on a number of well-defined territorial variables that serve as decision-making criteria, and this requires a high-quality preliminary environmental analysis of those variables. In Spain, feasibility studies for new roads and motorways are associated with a phase of the decision procedure known as the Informative Study, which establishes the physical, environmental, land-use and cultural constraints to be considered in the early stages of defining road corridor layouts. The most common methodology is to establish different levels of Territorial Carrying Capacity (TCC) in the study area in order to summarize the territorial variables on thematic maps and facilitate the tracing process of road-corridor layout alternatives. Landscape is a constraint factor that must be considered in road planning and design, and the most sustainable layouts should be sought based on aesthetic and ecological criteria.
However, this factor is not often analyzed in Informative Studies and, even if it is, baseline studies on landscape quality (aesthetic and ecological) and landforms do not usually include the recommendations of road tracing guides designed to avoid or reduce impacts on the landscape. The resolution of the landscape maps produced in this type of study does not comply with the recommended road design scale (1:5,000) in the regulations for the Informative Study procedure. Another common shortcoming in road planning is that landscape ecological connectivity is not considered during road design in order to avoid affecting wildlife corridors in the landscape. In the prior road planning stage, this issue could lead to a major barrier effect for fauna dispersal movements and to the fragmentation of their habitat due to the partial or total occupation of habitat patches of biological importance for the fauna (or focal habitats), and the interruption of wildlife corridors that concentrate fauna dispersal movements between patches. The main goal of this dissertation is to improve the study of the landscape and prevent negative effects during the road tracing process, and to facilitate the preservation of wildlife corridors (or green ways) and the location of preventive and corrective measures by selecting and quantifying suitability factors to reduce visual and ecological landscape impacts at a local scale. Specifically, the incorporation of quantitative and well-supported values in the decision-making process provides increased transparency in the road corridor and layout design process. Four specific questions were raised in this research: (1) How are territorial constraints selected and evaluated in terms of landscape by Spanish land-planning practitioners before locating a new road? (2) How can wildlife corridors be defined based on the landscape factors influencing the dispersal movements of fauna?
(3) How can wildlife corridors be delimited and assessed to include the partially erratic movements of fauna and the barrier effect of the anthropic elements at a local scale? (4) How can road design recommendations related to landscape and landforms be included in a Geographic Information System (GIS) model to aid civil engineers during the road layout design process and support sustainable development? This doctoral thesis proposes new methodologies that improve the assessment of the visual and ecological landscape character using indicators and GIS models to obtain road layout alternatives with a lower impact on the landscape. These methodologies were tested on a case study of a heterogeneous landscape with a high density of roe deer (Capreolus capreolus L.), one of the large mammals most commonly hit by vehicles on the Spanish road network, and where a new motorway is planned to pass through the middle of their distribution area. We explored the variables used in 22 road-corridor planning projects sponsored by the Ministry of Public Works between 2006 and 2008. These variables were grouped into physical, environmental, land-use and cultural constraints for the purpose of comparing the TCC values assigned to each variable in the various studies reviewed. As a prior stage in a connectivity analysis, a map of resistance to roe deer dispersal movements was created based on the literature and expert judgment. Using this research as a base, each factor selected to build the matrix was assigned a resistance value and weighted and combined with the rest of the factors using the analytic hierarchy process (AHP) and fuzzy logic operators as multicriteria assessment (MCA) methods. A GIS methodology was designed to clearly delimit the physical area of wildlife corridors according to a geometric threshold width value, and the multiple potential connections between each pair of habitat patches in the landscape.
Light Detection and Ranging (LiDAR) data were processed and a GIS model was built to determine landscape quality (aesthetic and ecological), landforms with similar characteristics for the road layout, and the cumulative viewshed of potential drivers and observers in the area surrounding the new motorway. This research makes four main contributions to current scientific knowledge in the field of environmental impact assessment for road corridor and layout design. First, the analysis of 22 Informative Studies on road planning revealed that the methods applied by practitioners for assessing the TCC were not sufficiently standardized due to the lack of uniformity in the cartographic information sources and the TCC valuation methodologies, especially in the analysis of the aesthetic and ecological quality of the landscape. Second, the analysis in this dissertation highlights the importance of multicriteria methods to structure, combine and validate factors that constrain wildlife dispersal movements in the connectivity analysis. Third, the “Generator of Alternative Corridors (GAC)” and “Narrow Corridor Eraser (NCE)” GIS models developed can be applied systematically and on a scientific basis in connectivity analyses to improve existing tools and understand landscape as a network composed of interconnected nodes and links. Thus, alternative corridors with similar probability of use by fauna and without bottlenecks can be obtained by iteratively running the GAC and NCE models. Fourth, our case study of new motorway corridor and layout design innovatively included semi-supervised classification of landforms, filtering of LiDAR point clouds and new 3D road geometry on the Digital Surface Model (DSM).
The combined use of LiDAR data processing and geomorphological indices and classifications can help decision-makers assess which road layouts produce lower impacts on the landscape, provide an overall insight into the most commonly applied value judgments and, in conclusion, define which corrective measures should be applied in terms of landscaping, and where.
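The weighting of resistance factors with the analytic hierarchy process can be sketched as follows. This is a simplified illustration under stated assumptions: the factor names, the pairwise comparison values and the cell values are hypothetical, the weights are approximated with the row geometric-mean method rather than the principal eigenvector, and the factors are combined with a plain weighted sum (the fuzzy logic operators mentioned in the abstract are not covered).

```python
import math


def ahp_weights(pairwise):
    """Approximate AHP priority weights with the row geometric-mean method."""
    n = len(pairwise)
    geometric_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geometric_means)
    return [g / total for g in geometric_means]


def combined_resistance(values, weights):
    """Weighted sum of the per-factor resistance values of one map cell."""
    return sum(v * w for v, w in zip(values, weights))


# Hypothetical factors and pairwise judgments: entry [i][j] states how much
# more important factor i is than factor j (reciprocal matrix).
factors = ["land_cover", "distance_to_roads", "slope"]
pairwise = [
    [1, 2, 4],
    [1 / 2, 1, 2],
    [1 / 4, 1 / 2, 1],
]

weights = ahp_weights(pairwise)
cell_resistance = combined_resistance([10, 50, 5], weights)
print(dict(zip(factors, weights)), cell_resistance)
```

For a perfectly consistent comparison matrix such as this one, the geometric-mean weights coincide with the eigenvector weights (here 4/7, 2/7 and 1/7); with real expert judgments the two can differ slightly and a consistency check is normally added.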