23 results for Data structures
at Université de Lausanne, Switzerland
Abstract:
When dealing with multi-angular image sequences, reflectance changes naturally arise, due either to illumination and acquisition geometry or to interactions with the atmosphere. These phenomena interplay with the scene and modify the measured radiance: for example, depending on the acquisition angle, tall objects may be seen from the top or from the side, and different light scattering may affect the surfaces. This results in shifts in the acquired radiance, which make multi-angular classification harder and might lead to catastrophic results, since surfaces with the same reflectance return significantly different signals. In this paper, rather than performing atmospheric or bidirectional reflectance distribution function (BRDF) correction, a non-linear manifold learning approach is used to align data structures. This method maximizes the similarity between the different acquisitions by deforming their manifolds, thus enhancing the transferability of classification models among the images of the sequence.
Abstract:
Statistics has become an indispensable tool in biomedical research. Thanks, in particular, to computer science, the researcher has easy access to elementary "classical" procedures. These are often of a "confirmatory" nature: their aim is to test hypotheses (for example the efficacy of a treatment) prior to experimentation. However, doctors often use them in situations more complex than foreseen, to discover interesting data structures and formulate hypotheses. This inverse process may lead to misuse which increases the number of "statistically proven" results in medical publications. The help of a professional statistician thus becomes necessary. Moreover, good, simple "exploratory" techniques are now available. In addition, medical data contain quite a high percentage of outliers (data that deviate from the majority). With classical methods it is often very difficult (even for a statistician!) to detect them and the reliability of results becomes questionable. New, reliable ("robust") procedures have been the subject of research for the past two decades. Their practical introduction is one of the activities of the Statistics and Data Processing Department of the University of Social and Preventive Medicine, Lausanne.
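The point about outliers and robust procedures can be illustrated with a minimal sketch. It compares a classical z-score rule with a median/MAD ("modified z-score") rule; the function names, the thresholds 3.0 and 3.5, and the `readings` values are illustrative assumptions, not taken from the abstract or from the department's software.

```python
import statistics


def zscore_outliers(values, threshold=3.0):
    """Classical rule: flag values far from the mean, in standard deviations."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [v for v in values if abs(v - mean) / sd > threshold]


def mad_outliers(values, threshold=3.5):
    """Robust rule: flag values far from the median, in rescaled MAD units."""
    med = statistics.median(values)
    mad = statistics.median(abs(v - med) for v in values)
    if mad == 0:
        return []  # more than half of the values are identical; rule undefined
    # 0.6745 makes the MAD comparable to a standard deviation for normal data.
    return [v for v in values if 0.6745 * abs(v - med) / mad > threshold]


# Hypothetical measurements with one gross error (12.0).
readings = [4.9, 5.1, 5.0, 5.2, 4.8, 5.0, 5.1, 12.0]

print("classical:", zscore_outliers(readings))
print("robust:   ", mad_outliers(readings))
```

In this small sample the gross error inflates both the mean and the standard deviation, so the classical rule does not flag it, whereas the median-based rule does. This masking effect is one reason robust procedures are attractive for data with outliers.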
Abstract:
Data mining can be defined as the extraction of previously unknown and potentially useful information from large datasets. The main principle is to devise computer programs that run through databases and automatically seek deterministic patterns. It is applied in various fields, e.g., remote sensing, biometry and speech recognition, but has seldom been applied to forensic case data. The intrinsic difficulty related to the use of such data lies in its heterogeneity, which comes from the many different sources of information. The aim of this study is to highlight potential uses of pattern recognition that would provide relevant results from a criminal intelligence point of view. The role of data mining within a global crime analysis methodology is to detect all types of structures in a dataset. Once filtered and interpreted, those structures can point to previously unseen criminal activities. The interpretation of patterns for intelligence purposes is the final stage of the process. It allows the researcher to validate the whole methodology and to refine each step if necessary. An application to cutting agents found in illicit drug seizures was performed. A combinatorial approach was taken, using the presence and the absence of products. Methods from graph theory were used to extract patterns in data constituted by links between products and the place and date of seizure. A data mining process completed using graphing techniques is called "graph mining". Patterns were detected that had to be interpreted and compared with preliminary knowledge to establish their relevancy. The illicit drug profiling process is actually an intelligence process that uses preliminary illicit drug classes to classify new samples. Methods proposed in this study could be used a priori to compare structures from preliminary and post-detection patterns.
This new knowledge of a repeated structure may provide valuable complementary information to profiling and become a source of intelligence.
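As a rough illustration of the presence/absence and graph-based idea described in this abstract, the sketch below builds a co-occurrence graph of products and extracts groups that recur together. It is not the authors' method: the seizure records, the function names and the `min_support` threshold are all hypothetical, and the grouping step is a plain connected-components search.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence_graph(seizures):
    """Count, for every pair of products, the seizures in which both are present."""
    weights = defaultdict(int)
    for agents in seizures:
        for a, b in combinations(sorted(agents), 2):
            weights[(a, b)] += 1
    return dict(weights)


def recurring_groups(weights, min_support=2):
    """Connected components of the graph restricted to edges seen at least min_support times."""
    adjacency = defaultdict(set)
    for (a, b), count in weights.items():
        if count >= min_support:
            adjacency[a].add(b)
            adjacency[b].add(a)
    seen, groups = set(), []
    for start in sorted(adjacency):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:  # depth-first traversal from `start`
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        groups.append(sorted(component))
    return groups


# Hypothetical records: the set of cutting agents detected in each seizure.
seizures = [
    {"caffeine", "paracetamol"},
    {"caffeine", "paracetamol", "lidocaine"},
    {"lidocaine", "phenacetin"},
    {"lidocaine", "phenacetin"},
    {"caffeine"},
]

weights = cooccurrence_graph(seizures)
print(recurring_groups(weights))
```

In a real setting, place and date of seizure would be added as further node or edge attributes, and any group found this way would still need the interpretation step the abstract describes before it counts as intelligence.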
Abstract:
Brain deformations induced by space-occupying lesions may result in unpredictable position and shape of functionally important brain structures. The aim of this study is to propose a method for segmentation of brain structures by deformation of a segmented brain atlas in the presence of a space-occupying lesion. Our approach is based on an a priori model of lesion growth (MLG) that assumes radial expansion from a seeding point and involves three steps: first, an affine registration bringing the atlas and the patient into global correspondence; then, the seeding of a synthetic tumor into the brain atlas, providing a template for the lesion; finally, the deformation of the seeded atlas, combining a method derived from optical flow principles and a model of lesion growth. The method was applied to two meningiomas inducing a pure displacement of the underlying brain structures, and segmentation accuracy of ventricles and basal ganglia was assessed. Results show that the segmented structures were consistent with the patient's anatomy and that the deformation accuracy of surrounding brain structures was highly dependent on the accurate placement of the tumor seeding point. Further improvements of the method will optimize the segmentation accuracy. Visualization of brain structures provides useful information for therapeutic consideration of space-occupying lesions, including surgical, radiosurgical, and radiotherapeutic planning, in order to increase treatment efficiency and prevent neurological damage.
Abstract:
This paper presents general problems and approaches for spatial data analysis using machine learning algorithms. Machine learning is a very powerful approach to adaptive data analysis, modelling and visualisation. The key feature of machine learning algorithms is that they learn from empirical data and can be used in cases when the modelled environmental phenomena are hidden, nonlinear, noisy and highly variable in space and in time. Most machine learning algorithms are universal and adaptive modelling tools developed to solve basic problems of learning from data: classification/pattern recognition, regression/mapping and probability density modelling. In the present report some of the widely used machine learning algorithms, namely artificial neural networks (ANN) of different architectures and Support Vector Machines (SVM), are adapted to the problems of the analysis and modelling of geo-spatial data. Machine learning algorithms have an important advantage over traditional models of spatial statistics when problems are considered in high-dimensional geo-feature spaces, where the dimension of the space exceeds 5. Such features are usually generated, for example, from digital elevation models, remote sensing images, etc. An important extension of the models concerns taking into account real-space constraints such as geomorphology, networks, and other natural structures. Recent developments in semi-supervised learning can improve the modelling of environmental phenomena by taking geo-manifolds into account. An important part of the study deals with the analysis of relevant variables and models' inputs. This problem is approached by using different nonlinear feature selection/feature extraction tools.
To demonstrate the application of machine learning algorithms several interesting case studies are considered: digital soil mapping using SVM, automatic mapping of soil and water system pollution using ANN; natural hazards risk analysis (avalanches, landslides), assessments of renewable resources (wind fields) with SVM and ANN models, etc. The dimensionality of spaces considered varies from 2 to more than 30. Figures 1, 2, 3 demonstrate some results of the studies and their outputs. Finally, the results of environmental mapping are discussed and compared with traditional models of geostatistics.
Abstract:
In colonies of social Hymenoptera (which include all ants, as well as some wasp and bee species), only queens reproduce whereas workers generally perform other tasks. The evolution of workers' reproductive altruism can be explained by kin selection, which states that workers can indirectly transmit copies of their genes by helping the reproduction of relatives. The relatedness between queens and workers may however be low, particularly when there are multiple queens per colony, which limits the transmission of copies of workers' genes and increases potential conflicts between colony members. In this thesis, we investigated the link between social structure variations and conflicts, and explored the mechanisms involved in variation of colony queen number in ants. According to kin selection, workers should rear the brood they are most related to. In social Hymenoptera, males are haploid whereas females (workers and queens) are diploid. As a result, workers can be up to three times more related to females than males in some colonies, where they should consequently favour the production of females. In contrast, queens are equally related to daughters and sons in all types of colonies and therefore should favour a balanced sex ratio. In a meta-analysis across all studies of social Hymenoptera, we showed that colony sex ratio is generally largely influenced by workers. Hence, the evolution of social structures where queens and workers are equally related to males and females may contribute to decrease the conflict between the two castes over colony sex ratio. Another conflict between queens and workers can occur over male production. Many species contain workers that still have the ability to lay haploid eggs. In some social structures, workers are on average more related to sons of queens than to sons of other workers. As a result, workers should eliminate worker-laid eggs to favour queen-laid eggs.
We showed that in the ant Formica selysi, workers eliminate more worker-laid than queen-laid eggs, independently of colony social structure. These results therefore suggest that worker policing can evolve independently from relatedness, potentially because of costs of worker reproduction at the colony-level. Colony queen number is a key parameter that influences relatedness between group members. Queen body size is generally linked to the success of independent colony foundation by single queens and may influence the number of queens in the new colony. In the ant F. selysi, single-queen colonies produce larger queens than multiple-queen colonies. We showed that this association results from genes or maternal effects transmitted to the eggs. However, we also found that queens produced in colonies of the two social forms did not differ in their general ability to found new colonies independently. Queen body size may also influence queen dispersal ability and constrain small queens to be re-adopted in their original nest after mating at proximity. We tested the acceptance of new queens in another ant species, Formica paralugubris, which has numerous queens per colony. Our results show that workers do not discriminate between nestmate and foreign queens, and more generally accept new queens at a limited rate. To conclude, this thesis shows that mechanisms influencing variation in colony queen number and the influence of these changes on conflict resolution are complex. Data gathered in this thesis therefore constitute a solid background for further research on the evolution and the maintenance of complex organisations in insect societies.
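The "up to three times more related to females than males" figure in this abstract follows from haplodiploid inheritance. The short calculation below reproduces it under a textbook assumption that is not stated in the abstract: a colony headed by a single queen mated with a single male, so that all workers are full sisters. The function names are illustrative.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def worker_to_sister():
    """Share of a worker's genome expected to be identical in a full sister."""
    # Paternal half: the father is haploid, so all his daughters receive the same genome.
    paternal = HALF * 1
    # Maternal half: the diploid queen transmits one of two alleles, shared with probability 1/2.
    maternal = HALF * HALF
    return paternal + maternal


def worker_to_brother():
    """Share of a worker's genome expected to be identical in a brother."""
    # Males develop from unfertilised eggs, so only the maternal half can be shared.
    return HALF * HALF


def queen_to_offspring():
    """The queen passes half of her genome to a daughter and half to a son."""
    return HALF


print("worker -> sister: ", worker_to_sister())
print("worker -> brother:", worker_to_brother())
print("asymmetry:        ", worker_to_sister() / worker_to_brother())
print("queen  -> either: ", queen_to_offspring())
```

With multiple queens or multiple matings, the worker-to-sister value drops and the asymmetry shrinks, which is why the conflict over sex ratio depends on social structure.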
Abstract:
The transpressional boundary between the Australian and Pacific plates in the central South Island of New Zealand comprises the Alpine Fault and a broad region of distributed strain concentrated in the Southern Alps but encompassing regions further to the east, including the northwest Canterbury Plains. Low to moderate levels of seismicity (e.g., 2 > M 5 events since 1974 and 2 > M 4.0 in 2009) and Holocene sediments offset or disrupted along rare exposed active fault segments are evidence for ongoing tectonism in the northwest plains, the surface topography of which is remarkably flat and even. Because the geology underlying the late Quaternary alluvial fan deposits that carpet most of the plains is not established, the detailed tectonic evolution of this region and the potential for larger earthquakes are only poorly understood. To address these issues, we have processed and interpreted high-resolution (2.5 m subsurface sampling interval) seismic data acquired along lines strategically located relative to extensive rock exposures to the north, west, and southwest and rare exposures to the east. Geological information provided by these rock exposures offers important constraints on the interpretation of the seismic data. The processed seismic reflection sections image a variably thick layer of generally undisturbed younger (i.e., < 24 ka) Quaternary alluvial sediments unconformably overlying an older (> 59 ka) Quaternary sedimentary sequence that shows evidence of moderate faulting and folding during and subsequent to deposition. These Quaternary units are in unconformable contact with Late Cretaceous-Tertiary interbedded sedimentary and volcanic rocks that are highly faulted, folded, and tilted. The lowest imaged unit is largely reflection-free Permian-Triassic basement rocks. Quaternary-age deformation has affected all the rocks underlying the younger alluvial sediments, and there is evidence for ongoing deformation.
Eight primary and numerous secondary faults as well as a major anticlinal fold are revealed on the seismic sections. Folded sedimentary and volcanic units are observed in the hanging walls and footwalls of most faults. Five of the primary faults represent plausible extensions of mapped faults, three of which are active. The major anticlinal fold is the probable continuation of a known active structure. A magnitude 7.1 earthquake occurred on 4 September 2010 near the southeastern edge of our study area. This predominantly right-lateral strike-slip event and numerous aftershocks (ten with magnitudes >= 5 within one week of the main event) highlight the primary message of our paper: that the generally flat and topographically featureless Canterbury Plains are underlain by a network of active faults that have the potential to generate significant earthquakes.
Abstract:
The shrews of the Sorex araneus group, characterized by the sex chromosome complex XY1Y2, have been intensively studied by morphological, karyotypical, and biochemical analyses. Nevertheless, the phylogenetic relationships among the species belonging to the araneus complex are still under debate, as different approaches have often given contradictory results. In this paper, partial nucleotide sequences of the mitochondrial DNA cytochrome b gene (1011 bp) were determined for 6 species of the araneus group from Eurasia and North America. We also included in the data set the sequences of Sorex samniticus, whose relationships with the araneus group remain controversial. Three other species representing two major karyological groups were also examined. Both parsimony and distance trees strongly support the monophyly of the araneus group. Sorex samniticus is significantly more closely related to the araneus complex than to the other species included in the analysis. Based on the branching pattern within the araneus group, an attempt has been made to reconstruct the colonization history of the Holarctic region.
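Distance trees such as those mentioned in this abstract start from a matrix of pairwise sequence distances. The sketch below computes the simplest such measure, the uncorrected p-distance (proportion of differing sites), on a toy alignment. The abstract does not say which distance measure the authors used, and the taxon labels and sequences here are invented placeholders, not cytochrome b data.

```python
from itertools import combinations

VALID_BASES = set("ACGT")


def p_distance(seq_a, seq_b):
    """Proportion of differing sites among sites where both sequences have A, C, G or T."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in VALID_BASES or b not in VALID_BASES:
            continue  # skip gaps and ambiguous bases
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def distance_matrix(alignment):
    """Pairwise p-distances keyed by alphabetically ordered name pairs."""
    return {
        (name_a, name_b): p_distance(alignment[name_a], alignment[name_b])
        for name_a, name_b in combinations(sorted(alignment), 2)
    }


# Hypothetical aligned fragments (10 sites each).
toy_alignment = {
    "taxon_A": "ACGTACGTAC",
    "taxon_B": "ACGTACGTAA",
    "taxon_C": "ACGT-CGTTA",
}

matrix = distance_matrix(toy_alignment)
for pair, distance in matrix.items():
    print(pair, round(distance, 3))
```

A tree-building method such as neighbour joining would then take this matrix as input; in practice a substitution model is usually applied to correct p-distances for multiple hits.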
Abstract:
This study presents a review of published geological data, combined with original observations on the tectonics of the Simplon massif and the Lepontine gneiss dome in the Western Alps. New observations concern the geometry of the Oligocene Vanzone back fold, formed under amphibolite facies conditions, and of its root between Domodossola and Locarno, which is cut at an acute angle by the Miocene, epi- to anchizonal, dextral Centovalli strike-slip fault. The structures of the Simplon massif result from collision over 50 Ma between two plate boundaries with a different geometry: the underthrusted European plate and the Adriatic indenter. Detailed mapping and analysis of a complex structural interference pattern, combined with observations on the metamorphic grade of the superimposed structures and radiometric data, allow a kinematic model to be developed for this zone of oblique continental collision. The following main Alpine tectonic phases and structures may be distinguished: 1. NW-directed nappe emplacement, starting in the Early Eocene (~50 Ma); 2. W-, SW- and S-verging transverse folds; 3. transpressional movements on the dextral Simplon ductile shear zone since ~32 Ma; 4. formation of the Bergell-Vanzone backfolds and of the southern steep belt during the Oligocene, emplacement of the mantle-derived 31-29 Ma Bergell and Biella granodiorites and porphyritic andesites as well as intrusions of 29-25 Ma crustal aplites and pegmatites; 5. formation of the dextral discrete Rhone-Simplon line and the Centovalli line during the Miocene, accompanied by the pull-apart development of the Lepontine gneiss dome - Dent Blanche (Valpelline) depression. It is suggested that movements of shortening in fan-shaped NW, W and SW directions accompanied the more regular NW- to WNW-directed displacement of the Adriatic indenter during continental collision.
Abstract:
The 20 amino acid residue peptides derived from RecA loop L2 have been shown to be the pairing domain of RecA. The peptides bind to ss- and dsDNA, unstack ssDNA, and pair the ssDNA to its homologous target in a duplex DNA. As shown by circular dichroism, upon binding to DNA the disordered peptides adopt a beta-structure conformation. Here we show that the conformational change of the peptide from random coil to beta-structure is important in binding ss- and dsDNA. The beta-structure in the DNA pairing peptides can be induced by many environmental conditions such as high pH, high concentration, and non-micellar sodium dodecyl sulfate (6 mM). This behavior indicates an intrinsic property of these peptides to form a beta-structure. A beta-structure model for the loop L2 of RecA protein when bound to DNA is thus proposed. The fact that aromatic residues at the central position 203 strongly modulate the peptide binding to DNA and subsequent biochemical activities can be accounted for by the direct effect of the aromatic amino acids on the peptide conformational change. The DNA-pairing domain of RecA visualized by electron microscopy self-assembles into a filamentous structure like RecA. The relevance of such a peptide filamentous structure to the structure of RecA when bound to DNA is discussed.
Abstract:
The Crystalline Nappe of the High Himalayan Crystalline has been examined along the Kulu Valley and its vicinity (Mandi-Khoksar transect). This nappe was believed to have undergone deformation related only to its transport towards the SW, essentially during the "Main Central Thrust event". New data have led to the conclusion that during the Himalayan orogeny, two distinctive phases, related to two opposite transport directions, characterize the evolution of this part of the chain, before the creation of the late NE-vergent backfolding. The first phase corresponds to an early NE-vergent folding and thrusting, creating the Tandi Syncline and the NE-oriented Shikar Beh Nappe stack, with a displacement amplitude of about 50 km. Two schistosities, together with a strong stretching lineation, are developed at a deep tectonic level under amphibolite facies conditions (kyanite-staurolite-garnet-two mica schists). At a higher tectonic level and in the southern part of the section (Tandi Syncline and southern Kulu Valley between Kulu and Mandi), one or two schistosities are developed in the greenschist facies grade rocks (garnet-biotite and biotite schists). These structures and the associated Barrovian type metamorphism are all related to the NE-verging Shikar Beh Nappe. The creation of the NE-verging Shikar Beh Nappe may be explained by the reactivation of a SW-dipping listric normal fault of the N Indian flexural passive margin, during the early stages of the Himalayan orogeny. In the second phase, the still hot metamorphic rocks of the Shikar Beh Nappe were folded and thrust towards the SW (mainly along the MBT and the MCT, with a displacement in excess of 100 km) onto the cold, low-grade metamorphic rocks of the Larji-Kulu-Rampur Window or, near Mandi, onto the non-metamorphic sandstones of the Ganges Molasse (Siwaliks). Sense of shear criteria and a strong NE-SW stretching lineation indicate that the Crystalline Nappe has been overthrust towards the SW.
Thermometry on synkinematically crystallised garnet-biotite and garnet-hornblende pairs reveals the lower amphibolite facies temperature conditions related to the Crystalline Nappe formation. From the muscovite and biotite Rb-Sr cooling ages, the Shikar Beh Nappe emplacement occurred before 32 Ma and the southwestward thrusting of the Crystalline Nappe began before 21 Ma. Our model involving two opposite directions of thrusting goes against the conventional idea of only one main SW-oriented transport direction in the High Himalayan Crystalline Nappes.
Abstract:
Aim. To predict the fate of alpine interactions involving specialized species, using a monophagous beetle and its host-plant as a case study. Location. The Alps. Methods. We investigated genetic structuring of the herbivorous beetle Oreina gloriosa and its specific host-plant Peucedanum ostruthium. We used genome fingerprinting (in the insect and the plant) and sequence data (in the insect) to compare the distribution of the main gene pools in the two associated species and to estimate divergence time in the insect, a proxy for the temporal origin of the interaction. We quantified the similarity in spatial genetic structures by performing a Procrustes analysis, a tool from shape theory. Finally, we simulated recolonization of an empty space analogous to the deglaciated Alps just after ice retreat by two lineages from two species showing unbalanced dependence, to examine how timing of the recolonization process, as well as dispersal capacities of associated species, could explain the observed pattern. Results. Contrasting with expectations based on their asymmetrical dependence, patterns in the beetle and plant were congruent at a large scale. Exceptions occurred at a regional scale in areas of admixture, matching known suture zones in Alpine plants. Simulations using a lattice-based model suggested these empirical patterns arose during or soon after recolonization, long after the estimated origin of the interaction c. 0.5 million years ago. Main conclusions. Species-specific interactions are scarce in alpine habitats because glacial cycles have limited opportunities for coevolution. Their fate, however, remains uncertain under climate change. Here we show that whereas most dispersal routes are paralleled at large scale, regional incongruence implies that the destinies of the species might differ under changing climate. This may be a consequence of the host-dependence of the beetle that locally limits the establishment of dispersing insects.
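Procrustes analysis, mentioned in the Methods of this abstract, measures how well one configuration of points can be superimposed on another after removing differences in position, size and orientation. The sketch below is a simplified two-dimensional version that allows translation, scaling and rotation but no reflection. It is an illustration only: the abstract does not describe the authors' implementation, and the function name and coordinates are hypothetical.

```python
import math


def procrustes_2d(reference, target):
    """Disparity in [0, 1] between two matched 2-D point sets (0 = same shape)."""
    if len(reference) != len(target) or len(reference) < 2:
        raise ValueError("need two point sets of equal size (at least 2 points)")

    def standardise(points):
        # Treat each (x, y) as a complex number, centre on the centroid, scale to unit size.
        z = [complex(x, y) for x, y in points]
        centroid = sum(z) / len(z)
        z = [p - centroid for p in z]
        size = math.sqrt(sum(abs(p) ** 2 for p in z))
        if size == 0:
            raise ValueError("all points coincide")
        return [p / size for p in z]

    ref = standardise(reference)
    tgt = standardise(target)
    # The complex inner product captures the best rotation (its angle) and fit (its modulus).
    inner = sum(r * t.conjugate() for r, t in zip(ref, tgt))
    return 1.0 - abs(inner) ** 2


# Hypothetical configurations: `moved` is `reference` rotated by 90 degrees,
# doubled in size and shifted; `shuffled` swaps two of the matched points.
reference = [(0, 0), (1, 0), (1, 1), (0, 2)]
moved = [(5, -3), (5, -1), (3, -1), (1, -3)]
shuffled = [(0, 0), (1, 1), (1, 0), (0, 2)]

print(procrustes_2d(reference, moved))
print(procrustes_2d(reference, shuffled))
```

The first disparity is zero up to rounding because the two configurations have the same shape; the second is clearly positive. In a comparison of spatial genetic structures, the matched points would be sampling sites placed in each species' ordination space.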
Abstract:
The coverage and volume of geo-referenced datasets are extensive and incessantly growing. The systematic capture of geo-referenced information generates large volumes of spatio-temporal data to be analyzed. Clustering and visualization play a key role in the exploratory data analysis and the extraction of knowledge embedded in these data. However, new challenges in visualization and clustering are posed when dealing with the special characteristics of this data: for instance, its complex structures, large quantity of samples, variables involved in a temporal context, high dimensionality and large variability in cluster shapes.
The central aim of my thesis is to propose new algorithms and methodologies for clustering and visualization, in order to assist the knowledge extraction from spatio-temporal geo-referenced data, thus improving decision-making processes.
I present two original algorithms, one for clustering, the Fuzzy Growing Hierarchical Self-Organizing Networks (FGHSON), and the second for exploratory visual data analysis, the Tree-structured Self-organizing Maps Component Planes. In addition, I present methodologies that, combined with FGHSON and the Tree-structured SOM Component Planes, allow the integration of space and time seamlessly and simultaneously in order to extract knowledge embedded in a temporal context.
The originality of the FGHSON lies in its capability to reflect the underlying structure of a dataset in a hierarchical fuzzy way. A hierarchical fuzzy representation of clusters is crucial when data include complex structures with large variability of cluster shapes, variances, densities and number of clusters. The most important characteristics of the FGHSON include: (1) It does not require an a priori setup of the number of clusters. (2) The algorithm executes several self-organizing processes in parallel. Hence, when dealing with large datasets the processes can be distributed, reducing the computational cost. (3) Only three parameters are necessary to set up the algorithm.
In the case of the Tree-structured SOM Component Planes, the novelty of this algorithm lies in its ability to create a structure that allows the visual exploratory data analysis of large high-dimensional datasets. This algorithm creates a hierarchical structure of Self-Organizing Map Component Planes, arranging similar variables' projections in the same branches of the tree. Hence, similarities in variables' behavior can be easily detected (e.g. local correlations, maximal and minimal values and outliers).
Both FGHSON and the Tree-structured SOM Component Planes were applied to several agroecological problems, proving to be very efficient in the exploratory analysis and clustering of spatio-temporal datasets.
In this thesis I also tested three soft competitive learning algorithms. Two of them are well-known unsupervised soft competitive algorithms, namely the Self-Organizing Maps (SOMs) and the Growing Hierarchical Self-Organizing Maps (GHSOMs); the third was our original contribution, the FGHSON. Although the algorithms presented here have been used in several areas, to my knowledge no work has applied and compared the performance of those techniques when dealing with spatio-temporal geospatial data, as is presented in this thesis.
I propose original methodologies to explore spatio-temporal geo-referenced datasets through time. Our approach uses time windows to capture temporal similarities and variations by using the FGHSON clustering algorithm. The developed methodologies are used in two case studies. In the first, the objective was to find similar agroecozones through time, and in the second it was to find similar environmental patterns shifted in time.
Several results presented in this thesis have led to new contributions to agroecological knowledge, for instance in sugar cane and blackberry production.
Finally, in the framework of this thesis we developed several software tools: (1) a Matlab toolbox that implements the FGHSON algorithm, and (2) a program called BIS (Bio-inspired Identification of Similar agroecozones), an interactive graphical user interface tool which integrates the FGHSON algorithm with Google Earth in order to show zones with similar agroecological characteristics.
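To make the idea of a "fuzzy" cluster representation concrete, the sketch below computes soft memberships of one sample to several cluster prototypes using the standard fuzzy c-means membership rule. This is an illustration of soft assignment in general and is not the FGHSON algorithm; the abstract does not state which membership rule FGHSON uses, and the function name, the `fuzzifier` value and the coordinates are assumptions.

```python
import math


def fuzzy_memberships(sample, prototypes, fuzzifier=2.0):
    """Fuzzy c-means style memberships of `sample` to each prototype; they sum to 1."""
    if fuzzifier <= 1.0:
        raise ValueError("fuzzifier must be greater than 1")
    distances = [math.dist(sample, p) for p in prototypes]
    # A sample lying exactly on one or more prototypes belongs fully to them.
    if any(d == 0 for d in distances):
        hits = [1.0 if d == 0 else 0.0 for d in distances]
        return [h / sum(hits) for h in hits]
    exponent = 2.0 / (fuzzifier - 1.0)
    return [
        1.0 / sum((d_i / d_k) ** exponent for d_k in distances)
        for d_i in distances
    ]


# Hypothetical two-variable sample and two cluster prototypes.
sample = (1.0, 0.0)
prototypes = [(0.0, 0.0), (4.0, 0.0)]

memberships = fuzzy_memberships(sample, prototypes)
print(memberships)
```

Instead of assigning the sample to the nearest prototype only, each prototype receives a degree of membership, which is what allows overlapping clusters of different shapes and densities to be represented. As the fuzzifier approaches 1 the assignment becomes nearly crisp.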
Abstract:
Basal ganglia and brain stem nuclei are involved in the pathophysiology of various neurological and neuropsychiatric disorders. Currently available structural T1-weighted (T1w) magnetic resonance images do not provide sufficient contrast for reliable automated segmentation of various subcortical grey matter structures. We use a novel, semi-quantitative magnetization transfer (MT) imaging protocol that overcomes limitations in T1w images, which are mainly due to their sensitivity to the high iron content in subcortical grey matter. We demonstrate improved automated segmentation of putamen, pallidum, pulvinar and substantia nigra using MT images. A comparison with segmentation of high-quality T1w images was performed in 49 healthy subjects. Our results show that MT maps are highly suitable for automated segmentation, and so for multi-subject morphometric studies with a focus on subcortical structures.