808 resultados para Semi-supervised clustering
Resumo:
Vegeu el resum a l'inici del document del fitxer adjunt
Resumo:
The long term goal of this research is to develop a program able to produce an automatic segmentation and categorization of textual sequences into discourse types. In this preliminary contribution, we present the construction of an algorithm which takes a segmented text as input and attempts to produce a categorization of sequences, such as narrative, argumentative, descriptive and so on. Also, this work aims at investigating a possible convergence between the typological approach developed in particular in the field of text and discourse analysis in French by Adam (2008) and Bronckart (1997) and unsupervised statistical learning.
Resumo:
Memòria elaborada a partir d’una estada al projecte Proteus de la New York University entre abril i juny del 2007. Les tècniques de clustering poden ajudar a reduir la supervisió en processos d’obtenció de patrons per a Extracció d’Informació. Tanmateix, és necessari disposar d’algorismes adequats a documents, i aquests algorismes requereixen mesures adequades de similitud entre patrons. Els kernels poden oferir una solució a aquests problemes, però l’aprenentatge no supervisat requereix d’estrat`egies m´es astutes que l’aprenentatge supervisat per a incorporar major quantitat d’informació. En aquesta memòria, fruit de la meva estada de mes d’Abril al de Juny de 2007 al projecte. Proteus de la New York University, es proposen i avaluen diversos kernels sobre patrons. Ini- cialment s’estudien kernels amb una família de patrons restringits, i a continuació s’apliquen kernels ja usats en tasques supervisades d’Extracció d’Informació. Degut a la degradació del rendiment que experimenta el clustering a l’afegir informació irrellevant, els kernels se simpli- fiquen i es busquen estratègies per a incorporar-hi semàntica de forma selectiva. Finalment, s’estudia quin efecte té aplicar clustering sobre el coneixement semàntic com a pas previ al clustering de patrons. Les diverses estratègies s’avaluen en tasques de clustering de documents i patrons usant dades reals.
Resumo:
For the development of studies on snail interspecific competition special in-door laboratory channels were built. In the all five channels seeded with adult specimens of Biomphalaria glabrata mass migration of juvenile snails outside the water was observed. Most of the migrant snails presented apertural lamellae. Data collected during the period of two years, showed the regression of the migration phenomenon and the disappearance of the lamellate snails.
Resumo:
There is a vast literature that specifies Bayesian shrinkage priors for vector autoregressions (VARs) of possibly large dimensions. In this paper I argue that many of these priors are not appropriate for multi-country settings, which motivates me to develop priors for panel VARs (PVARs). The parametric and semi-parametric priors I suggest not only perform valuable shrinkage in large dimensions, but also allow for soft clustering of variables or countries which are homogeneous. I discuss the implications of these new priors for modelling interdependencies and heterogeneities among different countries in a panel VAR setting. Monte Carlo evidence and an empirical forecasting exercise show clear and important gains of the new priors compared to existing popular priors for VARs and PVARs.
Resumo:
Defining an efficient training set is one of the most delicate phases for the success of remote sensing image classification routines. The complexity of the problem, the limited temporal and financial resources, as well as the high intraclass variance can make an algorithm fail if it is trained with a suboptimal dataset. Active learning aims at building efficient training sets by iteratively improving the model performance through sampling. A user-defined heuristic ranks the unlabeled pixels according to a function of the uncertainty of their class membership and then the user is asked to provide labels for the most uncertain pixels. This paper reviews and tests the main families of active learning algorithms: committee, large margin, and posterior probability-based. For each of them, the most recent advances in the remote sensing community are discussed and some heuristics are detailed and tested. Several challenging remote sensing scenarios are considered, including very high spatial resolution and hyperspectral image classification. Finally, guidelines for choosing the good architecture are provided for new and/or unexperienced user.
Resumo:
Creative industries tend to concentrate mainly around large- and medium-sized cities, forming creative local production systems. The text analyses the forces behind clustering of creative industries to provide the first empirical explanation of the determinants of creative employment clustering following a multidisciplinary approach based on cultural and creative economics, evolutionary geography and urban economics. A comparative analysis has been performed for Italy and Spain. The results show different patterns of creative employment clustering in both countries. The small role of historical and cultural endowments, the size of the place, the average size of creative industries, the productive diversity and the concentration of human capital and creative class have been found as common factors of clustering in both countries.
Resumo:
Concerns on the clustering of retail industries and professional services in main streets had traditionally been the public interest rationale for supporting distance regulations. Although many geographic restrictions have been suppressed, deregulation has hinged mostly upon the theory results on the natural tendency of outlets to differentiate spatially. Empirical evidence has so far offered mixed results. Using the case of deregulation of pharmacy establishment in a region of Spain, we empirically show how pharmacy locations scatter, and that there is not rationale for distance regulation apart from the underlying private interest of very few incumbents.
Resumo:
We consider linear optimization over a nonempty convex semi-algebraic feasible region F. Semidefinite programming is an example. If F is compact, then for almost every linear objective there is a unique optimal solution, lying on a unique \active" manifold, around which F is \partly smooth", and the second-order sufficient conditions hold. Perturbing the objective results in smooth variation of the optimal solution. The active manifold consists, locally, of these perturbed optimal solutions; it is independent of the representation of F, and is eventually identified by a variety of iterative algorithms such as proximal and projected gradient schemes. These results extend to unbounded sets F.
Resumo:
Els canvi recents en els plans d’estudis de la UPC i la UOC tenen en compte el nou espai europeu d’educació superior (EEES). Una de les conseqüències directes a aquests canvis es la necessitat d'aprofitar i optimitzar el temps dedicat a les activitats d'aprenentatge que requereixen la participació activa de l’estudiant i que es realitzen de manera continuada durant el semestre. A més, I'EEES destaca la importància de les pràctiques, les relacions interpersonals i la capacitat per treballar en equip, suggerint la reducció de classes magistrals i l’augment d’activitats que fomentin tant el treball personal de l’estudiant com el cooperatiu. En l’àmbit de la docència informàtica d’assignatures de bases de dades el problema és especialment complex degut a que els enunciats de les proves no acostumen a tenir una solució única. Nosaltres hem desenvolupat una eina anomenada LEARN-SQL, l’objectiu de la qual és corregir automàticament qualsevol tipus de sentència SQL (consultes, actualitzacions, procediments emmagatzemats, disparadors, etc.) i discernir si la resposta aportada per l’estudiant és o no és correcta amb independència de la solució concreta que aquest proposi. D’aquesta manera potenciem l’autoaprenentatge i l’autoavaluació, fent possible la semi-presencialitat supervisada i facilitant l’aprenentatge individualitzat segons les necessitats de cada estudiant. Addicionalment, aquesta eina ajuda als professors a dissenyar les proves d’avaluació, permetent també la opció de revisar qualitativament les solucions aportades pels estudiants. Per últim, el sistema proporciona ajuda als estudiants per a que aprenguin dels seus propis errors, proporcionant retroalimentació de qualitat.
Resumo:
"Vegeu el resum a l'inici del document del fitxer adjunt."
Resumo:
Specific properties emerge from the structure of large networks, such as that of worldwide air traffic, including a highly hierarchical node structure and multi-level small world sub-groups that strongly influence future dynamics. We have developed clustering methods to understand the form of these structures, to identify structural properties, and to evaluate the effects of these properties. Graph clustering methods are often constructed from different components: a metric, a clustering index, and a modularity measure to assess the quality of a clustering method. To understand the impact of each of these components on the clustering method, we explore and compare different combinations. These different combinations are used to compare multilevel clustering methods to delineate the effects of geographical distance, hubs, network densities, and bridges on worldwide air passenger traffic. The ultimate goal of this methodological research is to demonstrate evidence of combined effects in the development of an air traffic network. In fact, the network can be divided into different levels of âeurooecohesionâeuro, which can be qualified and measured by comparative studies (Newman, 2002; Guimera et al., 2005; Sales-Pardo et al., 2007).