166 results for clustering techniques
at Université de Lausanne, Switzerland
Abstract:
A methodology of exploratory data analysis investigating the phenomenon of orographic precipitation enhancement is proposed. The precipitation observations obtained from three Swiss Doppler weather radars are analysed for the major precipitation event of August 2005 in the Alps. Image processing techniques are used to detect significant precipitation cells/pixels from radar images while filtering out spurious effects due to ground clutter. The contribution of topography to precipitation patterns is described by an extensive set of topographical descriptors computed from the digital elevation model at multiple spatial scales. Additionally, the motion vector field is derived from subsequent radar images and integrated into a set of topographic features to highlight the slopes exposed to main flows. Following the exploratory data analysis with a recent algorithm of spectral clustering, it is shown that orographic precipitation cells are generated under specific flow and topographic conditions. Repeatability of precipitation patterns in particular spatial locations is found to be linked to specific local terrain shapes, e.g. at the top of hills and on the upwind side of the mountains. This methodology and our empirical findings for the Alpine region provide a basis for building computational data-driven models of orographic enhancement and triggering of precipitation. Copyright (C) 2011 Royal Meteorological Society.
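One way to picture the "slopes exposed to main flows" feature is as the directional derivative of elevation along the flow vector. The sketch below is only an illustration under that assumption; the abstract does not say how the authors combine the motion field with the terrain descriptors, and the function name `flow_exposure`, its parameters, and the grid layout are hypothetical.

```python
def flow_exposure(dem, flow, cell_size=1.0):
    """Directional derivative of elevation along a flow vector.

    dem:  list of rows of elevations (assumed at least 2 x 2 cells).
    flow: (u, v), with u along columns (x) and v along rows (y).
    Positive values mark terrain that rises in the flow direction,
    i.e. slopes facing the incoming flow. This is an illustrative
    assumption, not the descriptor used in the paper.
    """
    rows, cols = len(dem), len(dem[0])
    u, v = flow
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            # Central differences inside the grid, one-sided at the borders.
            j0, j1 = max(j - 1, 0), min(j + 1, cols - 1)
            i0, i1 = max(i - 1, 0), min(i + 1, rows - 1)
            dzdx = (dem[i][j1] - dem[i][j0]) / ((j1 - j0) * cell_size)
            dzdy = (dem[i1][j] - dem[i0][j]) / ((i1 - i0) * cell_size)
            out[i][j] = u * dzdx + v * dzdy
    return out


# A ridge running along the rows, with flow arriving from the left.
ridge = [[0.0, 1.0, 2.0, 1.0, 0.0] for _ in range(3)]
exposure = flow_exposure(ridge, (1.0, 0.0))
```

In this toy grid the cells left of the crest get positive exposure and the cells to its right get negative exposure, which matches the intuition of upwind and lee sides.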
Abstract:
The coverage and volume of geo-referenced datasets are extensive and incessantly growing. The systematic capture of geo-referenced information generates large volumes of spatio-temporal data to be analyzed. Clustering and visualization play a key role in the exploratory data analysis and the extraction of knowledge embedded in these data. However, new challenges in visualization and clustering are posed by the special characteristics of these data: complex structures, large numbers of samples, variables involved in a temporal context, high dimensionality and large variability in cluster shapes.
The central aim of my thesis is to propose new algorithms and methodologies for clustering and visualization, in order to assist the knowledge extraction from spatio-temporal geo-referenced data, thus improving decision-making processes.
I present two original algorithms, one for clustering: the Fuzzy Growing Hierarchical Self-Organizing Networks (FGHSON), and the second for exploratory visual data analysis: the Tree-structured Self-organizing Maps Component Planes. In addition, I present methodologies that, combined with FGHSON and the Tree-structured SOM Component Planes, allow the integration of space and time seamlessly and simultaneously in order to extract knowledge embedded in a temporal context.
The originality of the FGHSON lies in its capability to reflect the underlying structure of a dataset in a hierarchical fuzzy way. A hierarchical fuzzy representation of clusters is crucial when data include complex structures with large variability of cluster shapes, variances, densities and number of clusters. The most important characteristics of the FGHSON include: (1) It does not require an a priori setup of the number of clusters. (2) The algorithm executes several self-organizing processes in parallel. Hence, when dealing with large datasets, the processes can be distributed, reducing the computational cost. (3) Only three parameters are necessary to set up the algorithm.
In the case of the Tree-structured SOM Component Planes, the novelty of this algorithm lies in its ability to create a structure that allows the visual exploratory data analysis of large high-dimensional datasets. This algorithm creates a hierarchical structure of Self-Organizing Map Component Planes, arranging similar variables' projections in the same branches of the tree. Hence, similarities in variables' behavior can be easily detected (e.g. local correlations, maximal and minimal values and outliers).
Both FGHSON and the Tree-structured SOM Component Planes were applied to several agroecological problems, proving to be very efficient in the exploratory analysis and clustering of spatio-temporal datasets.
In this thesis I also tested three soft competitive learning algorithms. Two of them are well-known unsupervised soft competitive algorithms, namely the Self-Organizing Maps (SOMs) and the Growing Hierarchical Self-Organizing Maps (GHSOMs); the third is our original contribution, the FGHSON. Although the algorithms presented here have been used in several areas, to my knowledge no work has applied and compared the performance of those techniques on spatio-temporal geospatial data, as is done in this thesis.
I propose original methodologies to explore spatio-temporal geo-referenced datasets through time. Our approach uses time windows to capture temporal similarities and variations by using the FGHSON clustering algorithm. The developed methodologies are used in two case studies. In the first, the objective was to find similar agroecozones through time, and in the second it was to find similar environmental patterns shifted in time.
Several results presented in this thesis have led to new contributions to agroecological knowledge, for instance in sugar cane and blackberry production.
Finally, in the framework of this thesis we developed several software tools: (1) a Matlab toolbox that implements the FGHSON algorithm, and (2) a program called BIS (Bio-inspired Identification of Similar agroecozones), an interactive graphical user interface tool which integrates the FGHSON algorithm with Google Earth in order to show zones with similar agroecological characteristics.
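The time-window idea mentioned in this abstract can be pictured as cutting each location's time series into fixed-width segments that a clustering algorithm then compares. The sketch below is a generic sliding-window helper written under that assumption; the abstract does not describe how the thesis builds its windows, and `time_windows`, `width` and `step` are hypothetical names.

```python
def time_windows(series, width, step=1):
    """Split one location's time series into fixed-width windows.

    Each returned window can serve as a feature vector for clustering.
    With step < width the windows overlap; with step == width they do not.
    Illustrative only; not the windowing scheme of the thesis.
    """
    if width <= 0 or step <= 0:
        raise ValueError("width and step must be positive")
    last_start = len(series) - width
    return [series[s:s + width] for s in range(0, last_start + 1, step)]


# Monthly values for one site, cut into overlapping three-month windows.
monthly = [12.1, 13.4, 15.0, 17.2, 16.8]
windows = time_windows(monthly, width=3, step=1)
```

Windows taken at the same offset across sites could be compared to find similar zones at a given time, while windows at different offsets could be compared to look for patterns shifted in time.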
Abstract:
Thy-1 is a membrane glycoprotein suggested to stabilize or inhibit growth of neuronal processes. However, its precise function has remained obscure, because its endogenous ligand is unknown. We previously showed that Thy-1 binds directly to α(V)β(3) integrin in trans eliciting responses in astrocytes. Nonetheless, whether α(V)β(3) integrin might also serve as a Thy-1-ligand triggering a neuronal response has not been explored. Thus, utilizing primary neurons and a neuron-derived cell line CAD, Thy-1-mediated effects of α(V)β(3) integrin on growth and retraction of neuronal processes were tested. In astrocyte-neuron co-cultures, endogenous α(V)β(3) integrin restricted neurite outgrowth. Likewise, α(V)β(3)-Fc was sufficient to suppress neurite extension in Thy-1(+), but not in Thy-1(-) CAD cells. In differentiating primary neurons exposed to α(V)β(3)-Fc, fewer and shorter dendrites were detected. This effect was abolished by cleavage of Thy-1 from the neuronal surface using phosphoinositide-specific phospholipase C (PI-PLC). Moreover, α(V)β(3)-Fc also induced retraction of already extended Thy-1(+)-axon-like neurites in differentiated CAD cells as well as of axonal terminals in differentiated primary neurons. Axonal retraction occurred when redistribution and clustering of Thy-1 molecules in the plasma membrane was induced by α(V)β(3) integrin. Binding of α(V)β(3)-Fc was detected in Thy-1 clusters during axon retraction of primary neurons. Moreover, α(V)β(3)-Fc-induced Thy-1 clustering correlated in time and space with redistribution and inactivation of Src kinase. Thus, our data indicates that α(V)β(3) integrin is a ligand for Thy-1 that upon binding not only restricts the growth of neurites, but also induces retraction of already existing processes by inducing Thy-1 clustering. We propose that these events participate in bi-directional astrocyte-neuron communication relevant to axonal repair after neuronal damage.
Abstract:
We present a new framework for large-scale data clustering. The main idea is to modify functional dimensionality reduction techniques to directly optimize over discrete labels using stochastic gradient descent. Compared to methods like spectral clustering, our approach solves a single optimization problem rather than an ad hoc two-stage one, does not require a matrix inversion, can easily encode prior knowledge in the set of implementable functions, and does not have an "out-of-sample" problem. Experimental results on both artificial and real-world datasets show the usefulness of our approach.
Abstract:
The long term goal of this research is to develop a program able to produce an automatic segmentation and categorization of textual sequences into discourse types. In this preliminary contribution, we present the construction of an algorithm which takes a segmented text as input and attempts to produce a categorization of sequences, such as narrative, argumentative, descriptive and so on. Also, this work aims at investigating a possible convergence between the typological approach developed in particular in the field of text and discourse analysis in French by Adam (2008) and Bronckart (1997) and unsupervised statistical learning.
Abstract:
The use of the Internet now has a specific purpose: to find information. Unfortunately, the amount of data available on the Internet is growing exponentially, creating what can be considered a nearly infinite and ever-evolving network with no discernable structure. This rapid growth has raised the question of how to find the most relevant information. Many different techniques have been introduced to address the information overload, including search engines, Semantic Web, and recommender systems, among others. Recommender systems are computer-based techniques that are used to reduce information overload and recommend products likely to interest a user when given some information about the user's profile. This technique is mainly used in e-Commerce to suggest items that fit a customer's purchasing tendencies. The use of recommender systems for e-Government is a research topic that is intended to improve the interaction among public administrations, citizens, and the private sector through reducing information overload on e-Government services. More specifically, e-Democracy aims to increase citizens' participation in democratic processes through the use of information and communication technologies. In this chapter, an architecture of a recommender system that uses fuzzy clustering methods for e-Elections is introduced. In addition, a comparison with the smartvote system, a Web-based Voting Assistance Application (VAA) used to aid voters in finding the party or candidate that is most in line with their preferences, is presented.
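To illustrate what fuzzy clustering contributes in a setting like this, the sketch below computes fuzzy c-means style memberships of one voter profile in several cluster centers, so that a voter can belong partly to more than one group. It assumes the standard fuzzy c-means membership formula with Euclidean distance; the abstract does not state which fuzzy clustering method the chapter uses, and the names `fuzzy_memberships`, `point`, `centers` and `m` are hypothetical.

```python
import math


def fuzzy_memberships(point, centers, m=2.0):
    """Fuzzy c-means style membership of one point in each cluster.

    point:   numeric profile, e.g. a voter's answers (hypothetical data).
    centers: list of cluster centers with the same length as point.
    m:       fuzzifier (> 1); larger values give softer memberships.
    The returned memberships sum to 1.
    """
    dists = [math.dist(point, c) for c in centers]
    # A point sitting exactly on a center belongs fully to that center
    # (shared equally if several centers coincide with the point).
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_k) ** power for d_k in dists) for d_i in dists]


# One voter profile against two cluster centers on a single axis.
memberships = fuzzy_memberships((0.0, 0.0), [(1.0, 0.0), (3.0, 0.0)])
```

Unlike a hard assignment, the result ranks every cluster by degree of closeness, which is the kind of graded output a recommender could present to a voter.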
Abstract:
Reconstructive surgery plays an important role in breast cancer treatment. Immediate breast reconstruction is performed during the same operation as mastectomy. It is contraindicated following radiotherapy. Reconstruction performed after mastectomy is called delayed breast reconstruction. It is completed 6 months after chemotherapy and 1 year after radiotherapy. Prosthetic breast reconstruction is indicated when the tissues are of good quality and the breasts are small. Autologous reconstruction is performed in cases of radiotherapy or large breasts. After breast reconstruction, imperfections can be corrected with autologous fat injection.
Abstract:
Forensic scientists have long detected the presence of drugs and their metabolites in biological materials using body fluids such as urine, blood and/or other biological liquids or tissues. For doping analysis, only urine has so far been collected. In recent years, remarkable advances in sensitive analytical techniques have encouraged the analysis of drugs in unconventional biological samples such as hair, saliva and sweat. These samples are easily collected, although drug levels are often lower than the corresponding levels in urine or blood. This chapter reviews recent studies in the detection of doping agents in hair, saliva and sweat. Sampling, analytical procedures and interpretation of the results are discussed in comparison with those obtained from urine and blood samples.
Abstract:
Neuroimaging techniques provide valuable tools for diagnosing Alzheimer's disease (AD), monitoring disease progression and evaluating responses to treatment. There is currently a wide array of techniques available including computed tomography (CT), magnetic resonance imaging (MRI), positron emission tomography (PET), and, for recording electrical brain activity, electroencephalography (EEG). The choice of technique depends on the contrast between tissues of interest, spatial resolution, temporal resolution, requirements for functional data and the probable number of scans required. For example, while PET, CT and MRI can be used to differentiate between AD and other dementias, MRI is safer and provides better contrast of soft tissues. Neuroimaging is a technique spanning many disciplines and requires effective communication between doctors requesting a scan of a patient or group of patients and those with technical expertise. Consideration and discussion of the most suitable type of scan and the necessary settings to achieve the best results will help ensure appropriate techniques are chosen and used effectively. Neuroimaging techniques are currently expanding understanding of the structural and functional changes that occur in dementia. Further research may allow identification of early neurological signs of AD, before clinical symptoms are evident, providing the opportunity to test preventative therapies. Combining MRI and machine learning techniques may be a powerful approach to improve diagnosis of AD and to predict clinical outcomes.