756 results for Grid-based clustering approach
Abstract:
Earthquakes are associated with negative events, such as large numbers of casualties, destruction of buildings and infrastructure, or the emergence of tsunamis. In this paper, we apply Multidimensional Scaling (MDS) analysis to earthquake data. MDS is a set of techniques that produce spatial or geometric representations of complex objects, such that objects perceived as similar/distinct in some sense are placed nearby/distant on the MDS maps. The interpretation of the charts is based on the resulting clusters, since MDS produces a different locus for each similarity measure. In this study, over three million seismic occurrences, covering the period from January 1, 1904 up to March 14, 2012, are analyzed. The events, characterized by their magnitude and spatiotemporal distributions, are divided into groups, either according to the Flinn–Engdahl seismic regions of Earth or using a rectangular grid based on latitude and longitude coordinates. Space–time and space–frequency correlation indices are proposed to quantify the similarities among events. MDS has the advantage of avoiding sensitivity to the non-uniform spatial distribution of seismic data resulting from poorly instrumented areas, and is well suited for assessing the dynamics of complex systems. MDS maps prove to be an intuitive and useful visual representation of the complex relationships present among seismic events, which may not be perceived on traditional geographic maps. Therefore, MDS constitutes a valid alternative to classic visualization tools for understanding the global behavior of earthquakes.
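The MDS step described above can be sketched as follows. This is a minimal illustration using scikit-learn on a synthetic dissimilarity matrix, not the paper's seismic correlation indices or data:

```python
# Minimal MDS sketch: embed objects from a precomputed dissimilarity
# matrix so similar objects land nearby on a 2-D map.
# Synthetic dissimilarities only -- not the paper's seismic indices.
import numpy as np
from sklearn.manifold import MDS

rng = np.random.default_rng(0)
# Hypothetical pairwise dissimilarities between 6 groups of events.
d = rng.random((6, 6))
d = (d + d.T) / 2          # symmetrize the matrix
np.fill_diagonal(d, 0.0)   # zero self-dissimilarity

mds = MDS(n_components=2, dissimilarity="precomputed", random_state=0)
coords = mds.fit_transform(d)  # one 2-D point per group
print(coords.shape)
```

Each similarity measure (e.g. a space–time versus a space–frequency index) would yield a different dissimilarity matrix and hence a different MDS locus, which is what the abstract's cluster-based interpretation relies on.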
Abstract:
We present a novel approach to Stereo Visual Odometry for vehicles equipped with calibrated stereo cameras. We combine a dense probabilistic 5D egomotion estimation method with a sparse keypoint-based stereo approach to provide high-quality estimates of the vehicle's angular and linear velocities. To validate our approach, we perform two sets of experiments with a well-known benchmarking dataset. First, we assess the quality of the raw velocity estimates in comparison to classical pose estimation algorithms. Second, we augment our method's instantaneous velocity estimates with a Kalman Filter and compare its performance with a well-known open-source stereo Visual Odometry library. The presented results compare favorably with state-of-the-art approaches, mainly in the estimation of the angular velocities, where significant improvements are achieved.
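The Kalman filtering of instantaneous velocity estimates can be illustrated with a scalar sketch. The noise levels and constant-velocity model below are assumptions for the example, not the authors' actual filter design:

```python
# Sketch: smoothing noisy per-frame velocity estimates with a scalar
# Kalman filter (hypothetical process/measurement noise values).
import numpy as np

def kalman_smooth(z, q=1e-3, r=0.1):
    """Filter measurements z with process variance q, measurement variance r."""
    x, p = z[0], 1.0           # initial state estimate and covariance
    out = []
    for meas in z:
        p = p + q              # predict: (near-)constant velocity model
        k = p / (p + r)        # Kalman gain
        x = x + k * (meas - x) # update with the new measurement
        p = (1 - k) * p
        out.append(x)
    return np.array(out)

rng = np.random.default_rng(1)
true_v = np.full(50, 2.0)                  # constant 2 m/s ground truth
noisy = true_v + rng.normal(0, 0.3, 50)    # raw per-frame estimates
smoothed = kalman_smooth(noisy)
```

In the odometry setting this would run per velocity component; the raw estimates from the dense 5D egomotion stage play the role of `noisy`.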
Abstract:
The present paper reports the precipitation process of Al3Sc structures in an aluminum–scandium alloy, which has been simulated with a synchronous parallel kinetic Monte Carlo (spkMC) algorithm. The spkMC implementation is based on the vacancy diffusion mechanism. To filter the raw data generated by the spkMC simulations, the density-based spatial clustering of applications with noise (DBSCAN) method has been employed. The spkMC and DBSCAN algorithms were implemented in the C language using the MPI library. The simulations were conducted on the SeARCH cluster located at the University of Minho. The Al3Sc precipitation was successfully simulated at the atomistic scale with the spkMC. DBSCAN proved to be a valuable aid in identifying the precipitates by performing a cluster analysis of the simulation results. The achieved simulation results are in good agreement with those reported in the literature for sequential kinetic Monte Carlo (kMC) simulations. The parallel implementation of kMC provided a 4x speedup over the sequential version.
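The precipitate-identification step can be sketched with scikit-learn's DBSCAN on synthetic atom positions (the paper's implementation is in C with MPI; the positions, `eps`, and `min_samples` below are illustrative assumptions):

```python
# Sketch: identifying precipitate-like dense clusters of atom positions
# with DBSCAN. Synthetic 3-D coordinates, not actual spkMC output.
import numpy as np
from sklearn.cluster import DBSCAN

rng = np.random.default_rng(0)
# Two tight "precipitates" plus sparse background atoms (all hypothetical).
cluster_a = rng.normal([0, 0, 0], 0.1, (40, 3))
cluster_b = rng.normal([5, 5, 5], 0.1, (40, 3))
background = rng.uniform(-2, 7, (20, 3))
positions = np.vstack([cluster_a, cluster_b, background])

labels = DBSCAN(eps=0.5, min_samples=5).fit_predict(positions)
n_precipitates = len(set(labels) - {-1})   # label -1 marks noise points
print(n_precipitates)
```

DBSCAN's noise label (-1) is what makes it suitable here: isolated solute atoms are filtered out rather than forced into a cluster.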
Abstract:
Quantitative evaluations of species distributional congruence allow the evaluation of previously proposed biogeographic regionalizations and even the identification of undetected areas of endemism. The geographic scenery of Northwestern Argentina offers ideal conditions for the study of distributional patterns of species, since the boundaries of a diverse group of biomes converge in a relatively small region, which also includes a diverse mammal fauna. In this paper we applied a grid-based explicit method in order to recognize Patterns of Distributional Congruence (PDCs) and Areas of Endemism (AEs), and the species (native but non-endemic, and endemic, respectively) that determine them. We also relate these distributional patterns to traditional biogeographic divisions of the study region and to a very recent phytogeographic study, and we reconsider areas previously rejected as 'spurious'. Finally, we assessed the generality of the patterns found. The analysis resulted in 165 consensus areas, characterized by seven species of marsupials, 28 species of bats, and 63 species of rodents, which represents a large percentage of the total species (10, 41, and 73, respectively). Twenty-five percent of the species that characterize consensus areas are endemic to the study region and define six AEs in the strict sense, while 12 PDCs are mainly defined by widely distributed species. While detailed quantitative analyses of plant species distribution data made by other authors do not result in units that correspond to Cabrera's phytogeographic divisions at this spatial scale, analyses of animal species distribution data do. We were able to identify previously unknown meaningful faunal patterns and to define more accurately those already identified.
We identify PDCs and AEs that form Eastern Andean Slopes Patterns, Western High Andes Patterns, and Merged Eastern and Western Andean Slopes Patterns, some of which are re-interpreted in the light of known patterns of the endemic vascular flora. Endemism does not decline towards the south, but does decline towards the west of the study region. Peaks of endemism are found in the eastern Andean slopes in Jujuy and Tucumán/Catamarca, and in the western Andean biomes in Tucumán/Catamarca. The principal habitat types for endemic small mammal species are the eastern humid Andean slopes. Nevertheless, arid/semi-arid biomes and humid landscapes are represented by the same number of AEs. Rodent species define 15 of the 18 General Patterns, and in only one do they have no participation at all. Clearly, at this spatial scale, non-flying mammals, particularly rodents, are biogeographically more valuable species than flying mammals (bats).
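The grid-based part of such an analysis starts by binning occurrence records into latitude/longitude cells to build species-by-cell presence data. A toy sketch, with invented records and an arbitrary 1-degree cell size (the study's actual grid size and downstream consensus-area analysis are not reproduced here):

```python
# Sketch: binning (species, lat, lon) occurrence records into grid
# cells to obtain species presence per cell. Toy records, 1-degree
# cells -- illustrative assumptions, not the study's parameters.
import numpy as np

records = [("rodent_a", -24.3, -65.4), ("rodent_a", -24.8, -65.1),
           ("bat_b", -26.1, -65.9), ("rodent_a", -26.2, -65.2)]

cell_size = 1.0
presence = {}                        # (species, cell) -> present
for sp, lat, lon in records:
    cell = (int(np.floor(lat / cell_size)),
            int(np.floor(lon / cell_size)))
    presence[(sp, cell)] = True      # duplicates within a cell collapse

# Number of distinct grid cells occupied by each species.
cells_per_species = {}
for sp, cell in presence:
    cells_per_species[sp] = cells_per_species.get(sp, 0) + 1
print(cells_per_species)             # -> {'rodent_a': 2, 'bat_b': 1}
```

Overlap among the resulting per-species cell sets is what methods for detecting distributional congruence and endemism then score.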
Abstract:
A 0.125-degree raster or grid-based Geographic Information System with data on tsetse, trypanosomosis, animal production, agriculture and land use has recently been developed in Togo. This paper addresses the problem of generating tsetse distribution and abundance maps from remotely sensed data, using a restricted amount of field data. A discriminant analysis model is tested using contemporary tsetse data and remotely sensed, low-resolution data acquired from the National Oceanographic and Atmospheric Administration and Meteosat platforms. A split-sample technique is adopted, where a randomly selected part of the field-measured data (training set) serves to predict the other part (predicted set). The obtained results are then compared with the field-measured data per corresponding grid square. Depending on the size of the training set, the percentage of concordant predictions varies from 80 to 95 for distribution figures and from 63 to 74 for abundance. These results confirm the potential of satellite data and multivariate analysis for the prediction not only of tsetse distribution but, more importantly, of their abundance. This opens up new avenues, because satellite predictions and field data may be combined to strengthen or substitute one another and thus reduce the costs of field surveys.
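The split-sample discriminant analysis can be sketched as follows, with synthetic predictors standing in for the NOAA/Meteosat channels (all variable meanings below are assumptions for illustration):

```python
# Sketch: split-sample discriminant analysis -- a random training half
# of the grid squares predicts presence/absence in the held-out half.
# Synthetic predictors, not actual NOAA/Meteosat-derived variables.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 200                                    # hypothetical grid squares
x = rng.normal(size=(n, 3))                # e.g. vegetation, temperature
present = (x[:, 0] + 0.5 * x[:, 1] > 0).astype(int)  # tsetse presence

x_tr, x_te, y_tr, y_te = train_test_split(
    x, present, test_size=0.5, random_state=0)
model = LinearDiscriminantAnalysis().fit(x_tr, y_tr)
accuracy = model.score(x_te, y_te)         # share of concordant predictions
print(round(accuracy, 2))
```

The held-out accuracy plays the role of the "percentage of concordant predictions" reported per grid square in the abstract.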
Abstract:
Significant progress has been made with regard to the quantitative integration of geophysical and hydrological data at the local scale. However, extending corresponding approaches beyond the local scale still represents a major challenge, yet is critically important for the development of reliable groundwater flow and contaminant transport models. To address this issue, I have developed a hydrogeophysical data integration technique based on a two-step Bayesian sequential simulation procedure that is specifically targeted towards larger-scale problems. The objective is to simulate the distribution of a target hydraulic parameter based on spatially exhaustive, but poorly resolved, measurements of a pertinent geophysical parameter and locally highly resolved, but spatially sparse, measurements of the considered geophysical and hydraulic parameters. To this end, my algorithm links the low- and high-resolution geophysical data via a downscaling procedure before relating the downscaled regional-scale geophysical data to the high-resolution hydraulic parameter field. I first illustrate the application of this novel data integration approach to a realistic synthetic database consisting of collocated high-resolution borehole measurements of the hydraulic and electrical conductivities and spatially exhaustive, low-resolution electrical conductivity estimates obtained from electrical resistivity tomography (ERT). The overall viability of this method is tested and verified by performing and comparing flow and transport simulations through the original and simulated hydraulic conductivity fields.
The corresponding results indicate that the proposed data integration procedure does indeed allow for obtaining faithful estimates of the larger-scale hydraulic conductivity structure and reliable predictions of the transport characteristics over medium- to regional-scale distances. The approach is then applied to a corresponding field scenario consisting of collocated high-resolution measurements of the electrical conductivity, as measured using a cone penetrometer testing (CPT) system, and the hydraulic conductivity, as estimated from electromagnetic flowmeter and slug test measurements, in combination with spatially exhaustive low-resolution electrical conductivity estimates obtained from surface-based electrical resistivity tomography (ERT). The corresponding results indicate that the newly developed data integration approach is indeed capable of adequately capturing both the small-scale heterogeneity as well as the larger-scale trend of the prevailing hydraulic conductivity field. The results also indicate that this novel data integration approach is remarkably flexible and robust and hence can be expected to be applicable to a wide range of geophysical and hydrological data at all scale ranges. In the second part of my thesis, I evaluate in detail the viability of sequential geostatistical resampling as a proposal mechanism for Markov Chain Monte Carlo (MCMC) methods applied to high-dimensional geophysical and hydrological inverse problems in order to allow for a more accurate and realistic quantification of the uncertainty associated with the thus inferred models. Focusing on a series of pertinent crosshole georadar tomographic examples, I investigate two classes of geostatistical resampling strategies with regard to their ability to efficiently and accurately generate independent realizations from the Bayesian posterior distribution.
The corresponding results indicate that, despite its popularity, sequential resampling is rather inefficient at drawing independent posterior samples for realistic synthetic case studies, notably for the practically common and important scenario of pronounced spatial correlation between model parameters. To address this issue, I have developed a new gradual-deformation-based perturbation approach, which is flexible with regard to the number of model parameters as well as the perturbation strength. Compared to sequential resampling, this newly proposed approach was proven to be highly effective in decreasing the number of iterations required for drawing independent samples from the Bayesian posterior distribution.
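The gradual-deformation idea behind the proposed perturbation approach can be illustrated generically: two independent Gaussian realizations are blended as z(t) = z1·cos(t) + z2·sin(t), so the result remains Gaussian with the same mean and covariance for any deformation parameter t. This is a textbook-style sketch of the principle, not the thesis implementation:

```python
# Sketch of gradual deformation: blend two independent standard-normal
# realizations so first- and second-order statistics are preserved.
# Generic illustration only -- not the thesis code or its model fields.
import numpy as np

rng = np.random.default_rng(0)
z1 = rng.standard_normal(10_000)   # current model realization
z2 = rng.standard_normal(10_000)   # independent complementary realization

t = 0.2                            # small t => gentle perturbation
z_new = z1 * np.cos(t) + z2 * np.sin(t)

# cos(t)**2 + sin(t)**2 = 1, so mean ~0 and variance ~1 are preserved.
print(round(float(z_new.var()), 2))
```

Because t tunes the perturbation strength continuously, an MCMC proposal built on this blend can trade off acceptance rate against step size, which is the flexibility the abstract highlights relative to sequential resampling.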
Abstract:
Research project carried out during a stay at the Snider Entrepreneurial Research Center of the Wharton School, University of Pennsylvania, USA, between July and December 2007. The objective of this project is to study the relationship between knowledge management strategies and information and communication technologies (ICT) in the evolution of populations of organizations, and their effects on industrial patterns of spatial agglomeration. To this end, an approach based on an agent-based model is adopted in order to obtain significant, testable hypotheses about the evolution of populations of organizations within geographic clusters. The simulation model incorporates the perspectives and assumptions of a conceptual framework, the Information Space or I-Space. This allows an information-based conceptualization of the economic environment that takes into account its spatial and temporal dimensions. The model's parameters make it possible to assign specific knowledge management strategies to the various agents and to place them at a given position in physical space. The simulation shows how the adoption of different knowledge management strategies influences the evolution of organizations and their spatial location, and that this evolution is modified by the development of ICT. By modelling two well-known cases of high-technology geographic clusters, Silicon Valley in California and Route 128 around Boston, we study the interrelation between the knowledge management strategies adopted by firms and their choice of spatial location, and how this is affected by the evolution of information and communication technologies (ICT). The results obtained generate a series of rich, potentially fruitful hypotheses about the impact of ICT development on the dynamics of these geographic clusters.
Specifically, we find that the structuring of knowledge and spatial agglomeration co-evolve, and that this co-evolution is significantly altered by the development of ICT.
Abstract:
Proteomics has changed the way proteins are analyzed in living systems. This approach has been applied to blood products and protein profiling has evolved in parallel with the development of techniques. The identification of proteins belonging to red blood cell, platelets or plasma was achieved at the end of the last century. Then, the questions on the applications emerged. Hence, several studies have focused on problems related to blood banking and products, such as the aging of blood products, identification of biomarkers, related diseases and the protein-protein interactions. More recently, a mass spectrometry-based proteomics approach to quality control has been applied in order to offer solutions and improve the quality of blood products. The current challenge we face is developing a closer relationship between transfusion medicine and proteomics. In this article, these issues will be approached by focusing first on the proteome identification of blood products and then on the applications and future developments within the field of proteomics and blood products.
Abstract:
The aim of the present study was to investigate the genetic structure of the Valais shrew (Sorex antinorii) by a combined phylogeographical and landscape genetic approach, and thereby to infer the locations of glacial refugia and establish the influence of geographical barriers. We sequenced part of the mitochondrial cytochrome b (cyt b) gene of 179 individuals of S. antinorii sampled across the entire species' range. Six specimens attributed to S. arunchi were included in the analysis. The phylogeographical pattern was assessed by Bayesian molecular phylogenetic reconstruction, population genetic analyses, and a species distribution modelling (SDM)-based hindcasting approach. We also used landscape genetics (including isolation-by-resistance) to infer the determinants of current intra-specific genetic structure. The phylogeographical analysis revealed shallow divergence among haplotypes and no clear substructure within S. antinorii. The starlike structure of the median-joining network is consistent with population expansion from a single refugium, probably located in the Apennines. Long branches observed on the same network also suggest that another refugium may have existed in the north-eastern part of Italy. This result is consistent with SDM, which also suggests several habitable areas for S. antinorii in the Italian peninsula during the LGM. Therefore S. antinorii appears to have occupied disconnected glacial refugia in the Italian peninsula, supporting previous data for other species showing multiple refugia within southern refugial areas. By coupling genetic analyses and SDM, we were able to infer how past climatic suitability contributed to genetic divergence of populations. The genetic differentiation shown in the present study does not support the specific status of S. arunchi.
Abstract:
BACKGROUND. Bioinformatics is commonly presented as a well-assorted list of available web resources. Although diversity of services is positive in general, the proliferation of tools, their dispersion and their heterogeneity complicate the integrated exploitation of such data processing capacity. RESULTS. To facilitate the construction of software clients and make integrated use of this variety of tools, we present a modular programmatic application interface (MAPI) that provides the necessary functionality for a uniform representation of Web Service metadata descriptors, including the management and invocation protocols of the services they represent. This document describes the main functionality of the framework and how it can be used to facilitate the deployment of new software under a unified structure of bioinformatics Web Services. A notable feature of MAPI is the modular organization of the functionality into different modules associated with specific tasks. This means that only the modules needed by the client have to be installed, and that module functionality can be extended without the need to rewrite the software client. CONCLUSIONS. The potential utility and versatility of the software library have been demonstrated by the implementation of several currently available clients that cover different aspects of integrated data processing, ranging from service discovery to service invocation, with advanced features such as workflow composition and asynchronous service calls to multiple types of Web Services, including those registered in repositories (e.g. GRID-based, SOAP, BioMOBY, R-bioconductor, and others).
Abstract:
INTRODUCTION Evidence-based recommendations are needed to guide the acute management of the bleeding trauma patient, which, when implemented, may improve patient outcomes. METHODS The multidisciplinary Task Force for Advanced Bleeding Care in Trauma was formed in 2005 with the aim of developing a guideline for the management of bleeding following severe injury. This document presents an updated version of the guideline published by the group in 2007. Recommendations were formulated using a nominal group process, the Grading of Recommendations Assessment, Development and Evaluation (GRADE) hierarchy of evidence, and a systematic review of the published literature. RESULTS Key changes encompassed in this version of the guideline include new recommendations on coagulation support and monitoring and on the appropriate use of local haemostatic measures, tourniquets, calcium and desmopressin in the bleeding trauma patient. The remaining recommendations have been re-evaluated and graded based on literature published since the last edition of the guideline. Consideration was also given to changes in clinical practice that have taken place during this time period as a result of both new evidence and changes in the general availability of relevant agents and technologies. CONCLUSIONS This guideline provides an evidence-based multidisciplinary approach to the management of critically injured bleeding trauma patients.
Abstract:
BACKGROUND: Solexa/Illumina short-read ultra-high-throughput DNA sequencing technology produces millions of short tags (up to 36 bases) by parallel sequencing-by-synthesis of DNA colonies. The processing and statistical analysis of such high-throughput data poses new challenges; currently, a fair proportion of the tags are routinely discarded due to an inability to match them to a reference sequence, thereby reducing the effective throughput of the technology. RESULTS: We propose a novel base calling algorithm using model-based clustering and probability theory to identify ambiguous bases and code them with IUPAC symbols. We also select optimal sub-tags using a score based on information content to remove uncertain bases towards the ends of the reads. CONCLUSION: We show that the method improves genome coverage and the number of usable tags by an average of 15% compared with Solexa's data processing pipeline. An R package is provided that allows fast and accurate base calling of Solexa's fluorescence intensity files and the production of informative diagnostic plots.
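The IUPAC ambiguity coding idea can be sketched simply: when the posterior probabilities of several bases are too close to call, report the ambiguity symbol for that set of bases. The probabilities and the closeness threshold below are illustrative assumptions, not the paper's model-based clustering:

```python
# Sketch: mapping uncertain base calls to IUPAC ambiguity symbols.
# Probabilities and threshold are illustrative, not the paper's model.
IUPAC = {frozenset("A"): "A", frozenset("C"): "C",
         frozenset("G"): "G", frozenset("T"): "T",
         frozenset("AG"): "R", frozenset("CT"): "Y",
         frozenset("AC"): "M", frozenset("GT"): "K",
         frozenset("AT"): "W", frozenset("CG"): "S",
         frozenset("ACGT"): "N"}

def call_base(probs, threshold=0.25):
    """Keep every base within `threshold` of the best; map the set to IUPAC."""
    best = max(probs.values())
    kept = frozenset(b for b, p in probs.items() if best - p <= threshold)
    return IUPAC.get(kept, "N")   # fall back to N for unlisted sets

clear = call_base({"A": 0.90, "C": 0.05, "G": 0.03, "T": 0.02})
mixed = call_base({"A": 0.50, "C": 0.02, "G": 0.46, "T": 0.02})
print(clear, mixed)   # unambiguous call, then an A/G ambiguity
```

Coding the ambiguity instead of discarding the tag is what lets more reads be matched (with wildcards) against the reference, raising the effective throughput.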
Abstract:
This paper reviews the methods for the inventory of below-ground biotas in the humid tropics, to document the (hypothesized) loss of soil biodiversity associated with deforestation and agricultural intensification at forest margins. The biotas were grouped into eight categories, each of which corresponded to a major functional group considered important or essential to soil function. An accurate inventory of soil organisms can assist in ecosystem management and help sustain agricultural production. The advantages and disadvantages of transect-based and grid-based sampling methods are discussed, illustrated by published protocols ranging from the original "TSBF transect", through versions developed for the alternatives to Slash-and-Burn Project (ASB) to the final schemes (with variants) adopted by the Conservation and Sustainable Management of Below-ground Biodiversity Project (CSM-BGBD). Consideration is given to the place and importance of replication in below-ground biological sampling and it is argued that the new sampling protocols are inclusive, i.e. designed to sample all eight biotic groups in the same field exercise; spatially scaled, i.e. provide biodiversity data at site, locality, landscape and regional levels, and link the data to land use and land cover; and statistically robust, as shown by a partial randomization of plot locations for sampling.
Abstract:
Access to new biological sources is a key element of natural product research. A particularly large number of biologically active molecules have been found to originate from microorganisms. Very recently, the use of fungal co-culture to activate the silent genes involved in metabolite biosynthesis was found to be a successful method for the induction of new compounds. However, the detection and identification of the induced metabolites in the confrontation zone where fungi interact remain very challenging. To tackle this issue, a high-throughput UHPLC-TOF-MS-based metabolomic approach has been developed for the screening of fungal co-cultures in solid media at the petri dish level. The metabolites that were overexpressed because of fungal interactions were highlighted by comparing the LC-MS data obtained from the co-cultures and their corresponding mono-cultures. This comparison was achieved by subjecting automatically generated peak lists to statistical treatments. This strategy has been applied to more than 600 co-culture experiments that mainly involved fungal strains from the Fusarium genera, although experiments were also completed with a selection of several other filamentous fungi. This strategy was found to provide satisfactory repeatability and was used to detect the biomarkers of fungal induction in a large panel of filamentous fungi. This study demonstrates that co-culture results in consistent induction of potentially new metabolites.
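The core comparison step, flagging co-culture peaks absent from both mono-cultures, can be sketched with plain peak lists. The m/z values and tolerance are invented for illustration; real LC-MS peak lists would also carry retention times and intensities and go through statistical treatment:

```python
# Sketch: highlight "induced" metabolites as m/z peaks present in the
# co-culture peak list but unmatched (within a tolerance) in either
# mono-culture list. Illustrative values, not real LC-MS data.
def induced_peaks(co, mono_a, mono_b, tol=0.01):
    """Return co-culture m/z peaks not matched in either mono-culture."""
    background = mono_a + mono_b
    return [mz for mz in co
            if all(abs(mz - b) > tol for b in background)]

co_culture = [120.05, 233.10, 415.21]   # hypothetical co-culture peaks
mono_1 = [120.05, 233.10]               # fungus A alone
mono_2 = [120.06]                       # fungus B alone
print(induced_peaks(co_culture, mono_1, mono_2))   # -> [415.21]
```

Peaks surviving this subtraction are the candidate biomarkers of induction that would then be prioritized for identification.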
Abstract:
The objective of this work was to assess the genetic diversity and population structure of wheat genotypes, to detect significant and stable genetic associations, and to evaluate the efficiency of statistical models in identifying chromosome regions responsible for the expression of spike-related traits. Eight important spike characteristics were measured during five growing seasons in Serbia. A set of 30 microsatellite markers positioned near important agronomic loci was used to evaluate genetic diversity, resulting in a total of 349 alleles. The marker-trait associations were analyzed using the general linear and mixed linear models. The results obtained for the number of allelic variants per locus (11.5), the average polymorphic information content value (0.68), and the average gene diversity (0.722) showed an exceptional level of polymorphism in the genotypes, the main requirement for association studies. The population structure estimated by model-based clustering distributed the genotypes into six subpopulations according to the log probability of the data. Significant and stable associations were detected on chromosomes 1B, 2A, 2B, 2D, and 6D, which explained from 4.7 to 40.7% of the total phenotypic variation. The general linear model identified a significantly larger number of marker-trait associations (192) than the mixed linear model (76). The mixed linear model identified nine markers associated with six traits.
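A single marker-trait test under a general linear model amounts to regressing the trait on marker allele dosage and inspecting the slope's p-value. The sketch below uses synthetic genotypes and phenotypes and omits the population-structure covariates that distinguish the mixed linear model:

```python
# Sketch: one marker-trait association test as a simple linear
# regression of trait value on allele dosage. Synthetic data; the
# mixed-model correction for population structure is omitted here.
import numpy as np
from scipy.stats import linregress

rng = np.random.default_rng(0)
n = 120
dosage = rng.integers(0, 3, n)                       # 0/1/2 allele copies
trait = 5.0 + 0.8 * dosage + rng.normal(0, 0.5, n)   # e.g. spike length

res = linregress(dosage, trait)       # slope, r, and p-value of the fit
significant = res.pvalue < 0.001      # strong simulated association
print(significant)
```

Repeating this per marker and per trait (with multiple-testing control, and with structure covariates for the mixed model) yields the counts of significant associations the abstract compares.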