9 results for Semi-supervised clustering

in Archivo Digital para la Docencia y la Investigación - Repositorio Institucional de la Universidad del País Vasco


Relevance:

100.00%

Abstract:

In recent years, the performance of semi-supervised learning has been theoretically investigated. However, most of this theoretical development has focussed on binary classification problems. In this paper, we take it a step further by extending the work of Castelli and Cover [1] [2] to the multi-class paradigm. Particularly, we consider the key problem in semi-supervised learning of classifying an unseen instance x into one of K different classes, using a training dataset sampled from a mixture density distribution and composed of l labelled records and u unlabelled examples. Even under the assumption of identifiability of the mixture and having infinite unlabelled examples, labelled records are needed to determine the K decision regions. Therefore, in this paper, we first investigate the minimum number of labelled examples needed to accomplish that task. Then, we propose an optimal multi-class learning algorithm which is a generalisation of the optimal procedure proposed in the literature for binary problems. Finally, we make use of this generalisation to study the probability of error when the binary class constraint is relaxed.
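
The labelling step described in this abstract can be illustrated with a small sketch. It assumes the mixture components are already known (the idealised "infinite unlabelled examples" case) and uses a few labelled records only to decide which component belongs to which class, by maximising the likelihood of the labelled data over all class-to-component assignments. The one-dimensional Gaussian components, the function names and the brute-force search over permutations are illustrative assumptions, not the procedure proposed in the paper.

```python
import itertools
import numpy as np


def gaussian_pdf(x, mean, std):
    """Density of a 1-D Gaussian (assumed component family, for illustration)."""
    z = (x - mean) / std
    return np.exp(-0.5 * z * z) / (std * np.sqrt(2.0 * np.pi))


def label_components(means, stds, xs, ys):
    """Pick the class-to-component assignment that best explains the labelled data.

    Returns a tuple `perm` where perm[k] is the index of the mixture
    component assigned to class k. All K! assignments are tried, so this
    is only practical for small K.
    """
    k = len(means)
    best_perm, best_ll = None, -np.inf
    for perm in itertools.permutations(range(k)):
        # Log-likelihood of the labelled records under this assignment.
        ll = sum(
            np.log(gaussian_pdf(x, means[perm[y]], stds[perm[y]]))
            for x, y in zip(xs, ys)
        )
        if ll > best_ll:
            best_perm, best_ll = perm, ll
    return best_perm


def classify(x, means, stds, weights, perm):
    """Assign x to the class whose component has the highest posterior weight."""
    scores = [w * gaussian_pdf(x, m, s) for w, m, s in zip(weights, means, stds)]
    component = int(np.argmax(scores))
    return perm.index(component)


# Hypothetical, well-separated 3-component mixture.
means, stds, weights = [-4.0, 0.0, 4.0], [1.0, 1.0, 1.0], [1 / 3, 1 / 3, 1 / 3]
# Two labelled records: class 0 lies near +4, class 1 lies near -4.
perm = label_components(means, stds, xs=[3.8, -4.1], ys=[0, 1])
```

In this toy setting two labelled records from distinct classes are enough to fix all three decision regions, because the remaining class is assigned to the remaining component by elimination.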

Relevance:

20.00%

Abstract:

At a time when Technology Supported Learning Systems are widely used, there is a lack of tools that allow their development in an automatic or semi-automatic way. Technology Supported Learning Systems require an appropriate Domain Module, i.e. the pedagogical representation of the domain to be mastered, in order to be effective. However, content authoring is a time- and effort-consuming task; therefore, efforts to automatise Domain Module acquisition are necessary. Traditionally, textbooks have been used as the main mechanism to maintain and transmit the knowledge of a certain subject or domain. Textbooks are authored by domain experts who organise the contents in a way that facilitates understanding and learning, taking pedagogical issues into account. Given that textbooks are appropriate sources of information, they can be used to facilitate the development of the Domain Module, allowing the identification of the topics to be mastered and the pedagogical relationships among them, as well as the extraction of Learning Objects, i.e. meaningful fragments of the textbook with an educational purpose. Consequently, in this work DOM-Sortze, a framework for the semi-automatic construction of Domain Modules from electronic textbooks, has been developed. DOM-Sortze uses NLP techniques, heuristic reasoning and ontologies to accomplish this. DOM-Sortze has been designed and developed with the aim of automatising the development of the Domain Module regardless of the subject, promoting knowledge reuse and facilitating collaboration among users during the process.

Relevance:

20.00%

Abstract:

In this work we attempt to find out the extent to which realistic prebiotic compartments, such as fatty acid vesicles, would constrain the chemical network dynamics that could have sustained a minimal form of metabolism. We combine experimental and simulation results to establish the conditions under which a reaction network with a catalytically closed organization (more specifically, an (M, R)-system) would overcome the potential problem of self-suffocation that arises from the limited accessibility of nutrients to its internal reaction domain. The relationship between the permeability of the membrane, the lifetime of the key catalysts and their efficiency (reaction rate enhancement) turns out to be critical. In particular, we show how permeability values constrain the characteristic time scale of the bounded protometabolic processes. From this concrete and illustrative example we finally extend the discussion to a wider evolutionary context.

Relevance:

20.00%

Abstract:

A new supervised burned area mapping software named BAMS (Burned Area Mapping Software) is presented in this paper. The tool was built from standard ArcGIS (TM) libraries. It computes several of the spectral indexes most commonly used in burned area detection and implements a two-phase supervised strategy to map areas burned between two Landsat multitemporal images. The only input required from the user is the visual delimitation of a few burned areas, from which burned perimeters are extracted. After the discrimination of burned patches, the user can visually assess the results, and iteratively select additional sampling burned areas to improve the extent of the burned patches. The final result of the BAMS program is a polygon vector layer containing three categories: (a) burned perimeters, (b) unburned areas, and (c) non-observed areas. The latter refer to clouds or sensor observation errors. Outputs of the BAMS code meet the requirements of file formats and structure of standard validation protocols. This paper presents the tool's structure and technical basis. The program has been tested in six areas located in the United States, for various ecosystems and land covers, and then compared against the National Monitoring Trends in Burn Severity (MTBS) Burned Area Boundaries Dataset.
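
As a rough illustration of the kind of spectral index used in burned area detection, the sketch below computes the Normalized Burn Ratio (NBR) for a pre-fire and a post-fire image and takes their difference (dNBR). The abstract does not say which indexes BAMS computes, so the choice of NBR, the band arrays, the function names and the 0.27 cutoff are all assumptions made for illustration. The sketch does not reproduce the tool's two-phase supervised strategy.

```python
import numpy as np


def nbr(nir, swir):
    """Normalized Burn Ratio: (NIR - SWIR) / (NIR + SWIR), NaN where undefined."""
    nir = np.asarray(nir, dtype=float)
    swir = np.asarray(swir, dtype=float)
    denom = nir + swir
    out = np.full(denom.shape, np.nan)
    # Divide only where the denominator is non-zero; other pixels stay NaN.
    np.divide(nir - swir, denom, out=out, where=denom != 0)
    return out


def dnbr(nir_pre, swir_pre, nir_post, swir_post):
    """Difference of pre-fire and post-fire NBR; larger values suggest burning."""
    return nbr(nir_pre, swir_pre) - nbr(nir_post, swir_post)


# Hypothetical reflectance values for two pixels (burned, unchanged).
nir_pre, swir_pre = np.array([0.4, 0.4]), np.array([0.1, 0.1])
nir_post, swir_post = np.array([0.1, 0.4]), np.array([0.3, 0.1])

change = dnbr(nir_pre, swir_pre, nir_post, swir_post)
# Illustrative cutoff only; a supervised tool would derive it from user samples.
burned_mask = change > 0.27
```

In a supervised workflow such as the one described, a fixed cutoff like this would be replaced by statistics drawn from the user-delimited burned samples.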

Relevance:

20.00%

Abstract:

The stone marten is a widely distributed mustelid in the Palaearctic region that exhibits variable habitat preferences in different parts of its range. The species is a Holocene immigrant from southwest Asia which, according to fossil remains, followed the expansion of the Neolithic farming cultures into Europe and possibly colonized the Iberian Peninsula during the Early Neolithic (ca. 7,000 years BP). However, the population genetic structure and historical biogeography of this generalist carnivore remain essentially unknown. In this study we combined mitochondrial DNA (mtDNA) sequencing (621 bp) and microsatellite genotyping (23 polymorphic markers) to infer the population genetic structure of the stone marten within the Iberian Peninsula. The mtDNA data revealed low haplotype and nucleotide diversities and a lack of phylogeographic structure, most likely due to a recent colonization of the Iberian Peninsula by a few mtDNA lineages during the Early Neolithic. The microsatellite data set was analysed with (a) spatial and non-spatial Bayesian individual-based clustering (IBC) approaches (STRUCTURE, TESS, BAPS and GENELAND), and (b) multivariate methods [discriminant analysis of principal components (DAPC) and spatial principal component analysis (sPCA)]. Additionally, because isolation by distance (IBD) is a common spatial genetic pattern in mobile and continuously distributed species and may represent a challenge to the performance of the above methods, the microsatellite data set was tested for its presence. Overall, the genetic structure of the stone marten in the Iberian Peninsula was characterized by a NE-SW spatial pattern of IBD, and this may explain the observed disagreement between the clustering solutions obtained by the different IBC methods. However, there was a significant, albeit weak, indication of contemporary genetic structuring into at least three different subpopulations. The detected subdivision could be attributed to the influence of the rivers Ebro, Tagus and Guadiana, suggesting that the main watercourses in the Iberian Peninsula may act as semi-permeable barriers to gene flow in stone martens. To our knowledge, this is the first phylogeographic and population genetic study of the species at a broad regional scale. We also make the case for the importance and benefits of using and comparing multiple clustering and multivariate methods in spatial genetic analyses of mobile and continuously distributed species.

Relevance:

20.00%

Abstract:

4th International Workshop on Transverse Polarization Phenomena in Hard Processes (TRANSVERSITY 2014)