853 results for heterogeneous regressions algorithms
Abstract:
Forecasting coal resources and reserves is critical for coal mine development. Thickness maps are commonly used for assessing coal resources and reserves; however, they are of limited use for capturing coal-splitting effects in thick and heterogeneous coal zones. As an alternative, three-dimensional geostatistical methods are used to populate the facies distribution within a densely drilled heterogeneous coal zone in the As Pontes Basin (NW Spain). Coal distribution in this zone is mainly characterized by coal-dominated areas in the central parts of the basin interfingering with terrigenous-dominated alluvial fan zones at the margins. The three-dimensional models obtained are applied to forecast coal resources and reserves. Predictions using subsets of the entire dataset are also generated to understand the performance of the methods under limited data constraints. Three-dimensional facies interpolation methods tend to overestimate coal resources and reserves due to interpolation smoothing. Facies simulation methods yield resource predictions similar to those of conventional thickness-map approximations. Reserves predicted by facies simulation methods are mainly influenced by: a) the specific coal-proportion threshold used to determine whether a block can be recovered, and b) the capability of the modelling strategy to reproduce areal trends in coal proportions and splitting between coal-dominated and terrigenous-dominated areas of the basin. Reserves predictions differ between the simulation methods, even with dense conditioning datasets. Simulation methods can be ranked according to the correlation of their outputs with predictions from the directly interpolated coal-proportion maps: a) with low-density datasets, sequential indicator simulation with trends yields the best correlation; b) with high-density datasets, sequential indicator simulation with post-processing yields the best correlation, because the areal trends are provided implicitly by the dense conditioning data.
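The reserve estimate described above hinges on a block-recoverability rule driven by a coal-proportion threshold. A minimal sketch of that rule, with purely illustrative cell sizes, density and threshold (none of these values come from the paper):

```python
# Hypothetical sketch: estimating coal resources and reserves from a 3-D
# facies model stored as blocks of cell indicators (1 = coal, 0 = terrigenous).
# Cell volume, coal density and the recoverability threshold are illustrative.

def resources_and_reserves(facies, cell_volume_m3=25.0 * 25.0 * 0.5,
                           coal_density_t_per_m3=1.3, threshold=0.5):
    """facies: list of blocks, each block a list of 0/1 cell indicators.
    A block counts toward reserves if its coal proportion >= threshold."""
    resources_t = 0.0
    reserves_t = 0.0
    for block in facies:
        coal_cells = sum(block)
        tonnage = coal_cells * cell_volume_m3 * coal_density_t_per_m3
        resources_t += tonnage          # all coal counts as a resource
        if coal_cells / len(block) >= threshold:   # recoverability rule
            reserves_t += tonnage       # only recoverable blocks count
    return resources_t, reserves_t

# Two blocks: one coal-dominated (recoverable), one split by terrigenous beds.
blocks = [[1, 1, 1, 0], [1, 0, 0, 0]]
res, rsv = resources_and_reserves(blocks)
```

Raising the threshold shrinks reserves while leaving resources unchanged, which is why the threshold choice dominates the simulation-based reserve predictions.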
Abstract:
Dissolved organic matter (DOM) is a complex mixture of organic compounds, ubiquitous in marine and freshwater systems. Fluorescence spectroscopy, by means of Excitation-Emission Matrices (EEMs), has become an indispensable tool to study DOM sources, transport and fate in aquatic ecosystems. However, the statistical treatment of large and heterogeneous EEM data sets still represents an important challenge for biogeochemists. Recently, Self-Organising Maps (SOMs) have been proposed as a tool to explore patterns in large EEM data sets. The SOM is a pattern recognition method that clusters and reduces the dimensionality of input EEMs without relying on any assumption about the data structure. In this paper, we show how SOMs, coupled with a correlation analysis of the component planes, can be used both to explore patterns among samples and to identify individual fluorescence components. We analysed a large and heterogeneous EEM data set, including samples from a river catchment collected under a range of hydrological conditions, along a 60-km downstream gradient, and under the influence of different degrees of anthropogenic impact. According to our results, chemical industry effluents appeared to have unique and distinctive spectral characteristics. On the other hand, river samples collected under flash-flood conditions showed homogeneous EEM shapes. The correlation analysis of the component planes suggested the presence of four fluorescence components, consistent with DOM components previously described in the literature. A remarkable strength of this methodology was that outlier samples appeared naturally integrated in the analysis. We conclude that the SOM coupled with a correlation analysis procedure is a promising tool for studying large and heterogeneous EEM data sets.
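As a hedged illustration of the SOM algorithm described above (a toy pure-Python sketch on made-up 2-D data, not the authors' implementation or their EEM pipeline):

```python
import math, random

def train_som(data, grid_w=3, grid_h=3, epochs=300, lr0=0.5, seed=0):
    """Minimal Self-Organising Map: a 2-D grid of prototype vectors is fitted
    to the data so that nearby grid nodes hold similar prototypes."""
    rng = random.Random(seed)
    dim = len(data[0])
    nodes = {(i, j): [rng.random() for _ in range(dim)]
             for i in range(grid_w) for j in range(grid_h)}
    sigma0 = max(grid_w, grid_h) / 2.0
    for t in range(epochs):
        frac = t / epochs
        lr = lr0 * (1.0 - frac)               # decaying learning rate
        sigma = sigma0 * (1.0 - frac) + 0.5   # decaying neighbourhood radius
        x = rng.choice(data)
        # best-matching unit (BMU): grid node closest to the sample
        bmu = min(nodes, key=lambda k: sum((a - b) ** 2
                                           for a, b in zip(nodes[k], x)))
        for k, w in nodes.items():
            d2 = (k[0] - bmu[0]) ** 2 + (k[1] - bmu[1]) ** 2
            h = math.exp(-d2 / (2 * sigma * sigma))   # neighbourhood kernel
            for i in range(dim):
                w[i] += lr * h * (x[i] - w[i])        # pull node toward sample
    return nodes

def bmu_of(nodes, x):
    return min(nodes, key=lambda k: sum((a - b) ** 2
                                        for a, b in zip(nodes[k], x)))

# Two well-separated clusters should land on different grid nodes.
data = [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.9, 1.0]]
nodes = train_som(data)
```

In the EEM setting each input vector would be a flattened excitation-emission matrix, and the per-wavelength "component planes" of the trained grid are what the correlation analysis operates on.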
Abstract:
The gold mineralization of the Hutti Mine is hosted by nine parallel, N-S trending, steeply dipping, 2-10 m wide shear zones that transect Archaean amphibolites. The shear zones were formed after peak metamorphism during retrograde ductile D-2 shearing in the lower amphibolite facies. They were reactivated in the lower to mid greenschist facies by brittle-ductile D-3 shearing and intense quartz veining. The development of an S-2-S-3 crenulation cleavage facilitates the discrimination between the two deformation events and contemporaneous alteration and gold mineralization. Ductile D-2 shearing is associated with a pervasively developed distal chlorite-sericite alteration assemblage in the outer parts of the shear zones and the proximal biotite-plagioclase alteration in the center of the shear zones. D-3 is characterized by development of the inner chlorite-K-feldspar alteration, which forms a centimeter-scale alteration halo surrounding the laminated quartz veins and replaces earlier biotite along S-3. The average size of the laminated vein systems is 30-50 m along strike as well as down-dip and 2-6 m in width. Mass balance calculations suggest strong metasomatic changes for the proximal biotite-plagioclase alteration, yielding mass and volume increases of ca. 16% and 12%, respectively. The calculated mass and volume changes of the distal chlorite-sericite alteration (ca. 11%, ca. 8%) are lower. The decrease in δ18O values of the whole rock from around 7.5 parts per thousand for the host rocks to 6-7 parts per thousand for the distal chlorite-sericite and the proximal biotite-plagioclase alteration and around 5 parts per thousand for the inner chlorite-K-feldspar alteration suggests hydrothermal alteration during two-stage deformation and fluid flow. The ductile D-2 deformation in the lower amphibolite facies provided grain-scale porosity by microfracturing.
The pervasive, steady-state fluid flow resulted in a disseminated style of gold-sulfide mineralization and a penetrative alteration of the host rocks. Alternating ductile and brittle D-3 deformation during lower to mid greenschist facies conditions followed the fault-valve process. Ductile creep in the shear zones resulted in a low-permeability environment leading to fluid-pressure build-up. Strongly episodic fluid advection and mass transfer were controlled by repeated seismic fracturing during the formation of laminated quartz(-gold) veins. The limitation of quartz veins to the extent of earlier shear zones indicates the importance of preexisting anisotropies for fault-valve action and economic gold mineralization. (C) 2003 Elsevier B.V. All rights reserved.
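The mass-balance figures quoted above are typically derived from an immobile-element (Gresens/Grant-style) calculation. A hedged sketch of that arithmetic, with hypothetical concentrations chosen only to illustrate the method (not the Hutti data):

```python
# Hypothetical Gresens-style mass-balance sketch. Assumes a reference element
# (e.g. TiO2) is immobile during alteration; the concentrations below are
# illustrative, not measured values from the paper.

def mass_change_percent(c_immobile_host, c_immobile_altered):
    """Overall mass change of the altered rock relative to its host,
    inferred from apparent dilution of an immobile element."""
    mass_factor = c_immobile_host / c_immobile_altered
    return (mass_factor - 1.0) * 100.0

# If an immobile oxide falls from 1.00 wt% (host) to 0.86 wt% (altered rock),
# the altered rock gained roughly 16% mass overall, comparable in magnitude
# to the proximal biotite-plagioclase alteration quoted above.
dm = mass_change_percent(1.00, 0.86)
```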
Abstract:
Abstract: Automatic genome sequencing and annotation, as well as large-scale gene expression measurement methods, generate a massive amount of data for model organisms. Searching for gene-specific or organism-specific information throughout all the different databases has become a very difficult task, and often results in fragmented and unrelated answers. A database that federates and integrates genomic and transcriptomic data will greatly improve both search speed and the quality of the results by allowing a direct comparison of expression results obtained by different techniques. The main goal of this project, called the CleanEx database, is thus to provide access to public gene expression data via unique gene names and to represent heterogeneous expression data produced by different technologies in a way that facilitates joint analysis and cross-dataset comparisons. A consistent and up-to-date gene nomenclature is achieved by associating each single gene expression experiment with a permanent target identifier consisting of a physical description of the targeted RNA population or the hybridization reagent used. These targets are then mapped at regular intervals to the growing and evolving catalogues of genes from model organisms, such as human and mouse. The completely automatic mapping procedure relies partly on external genome information resources such as UniGene and RefSeq. The central part of CleanEx is a weekly built gene index containing cross-references to all public expression data already incorporated into the system. In addition, the expression target database of CleanEx provides gene mapping and quality control information for various types of experimental resources, such as cDNA clones or Affymetrix probe sets. The Affymetrix mapping files are accessible as text files, for further use in external applications, and as individual entries, via the web-based interfaces. The CleanEx web-based query interfaces offer access to individual entries via text string searches or quantitative expression criteria, as well as cross-dataset analysis tools and cross-chip gene comparison. These tools have proven to be very efficient in expression data comparison and even, to a certain extent, in detection of differentially expressed splice variants. The CleanEx flat files and tools are available online at http://www.cleanex.isb-sib.ch/.
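The index-building step described above (experiments keyed by permanent target identifiers, re-mapped periodically to the current gene catalogue) can be sketched with a plain dictionary; all names below are hypothetical, not CleanEx's actual schema:

```python
# Minimal sketch in the spirit of the CleanEx gene index: experiments refer
# to permanent target identifiers, and a mapping from targets to official
# gene names is refreshed periodically. Identifiers here are made up.

def build_gene_index(target_to_gene, experiments):
    """Group experiment accessions under official gene names via their
    permanent target identifiers."""
    index = {}
    for exp_id, target_id in experiments:
        gene = target_to_gene.get(target_id)
        if gene is None:
            continue                      # target not yet mapped to a gene
        index.setdefault(gene, []).append(exp_id)
    return index

# Rebuilding after a nomenclature update only requires a fresh mapping;
# the experiment records themselves never change.
mapping = {"T001": "TP53", "T002": "BRCA1", "T003": "TP53"}
exps = [("exp1", "T001"), ("exp2", "T003"), ("exp3", "T002"), ("exp4", "T999")]
index = build_gene_index(mapping, exps)
```

Keeping the permanent target identifier as the join key is what makes the weekly re-mapping safe: gene names may change, but the physical description of the assayed RNA population does not.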
Abstract:
The past few decades have seen a considerable increase in the number of parallel and distributed systems. With the development of more complex applications, the need for more powerful systems has emerged and various parallel and distributed environments have been designed and implemented. Each of these environments, including hardware and software, has unique strengths and weaknesses. There is no single parallel environment that can be identified as the best for all applications with respect to hardware and software properties. The main goal of this thesis is to provide a novel way of performing data-parallel computation in parallel and distributed environments by utilizing the best characteristics of different aspects of parallel computing. For the purposes of this thesis, three aspects of parallel computing were identified and studied. First, three parallel environments (shared memory, distributed memory, and a network of workstations) are evaluated to quantify their suitability for different parallel applications. Due to the parallel and distributed nature of the environments, the networks connecting the processors in these environments were investigated with respect to their performance characteristics. Second, scheduling algorithms are studied in order to make them more efficient and effective. A concept of application-specific information scheduling is introduced. The application-specific information is data about the workload extracted from an application, which is provided to a scheduling algorithm. Three scheduling algorithms are enhanced to utilize the application-specific information to further refine their scheduling properties. A more accurate description of the workload is especially important in cases where the work units are heterogeneous and the parallel environment is heterogeneous and/or non-dedicated. The results obtained show that the additional information regarding the workload has a positive impact on the performance of applications.
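The idea of feeding workload information to the scheduler can be illustrated with a toy greedy scheduler; the cost estimates, speeds and the earliest-finish-time rule below are illustrative stand-ins, not the three algorithms studied in the thesis:

```python
# Illustrative sketch of application-specific-information scheduling: work
# units with known (estimated) costs are assigned to heterogeneous processors
# greedily by earliest predicted finish time. All numbers are hypothetical.

def schedule(unit_costs, proc_speeds):
    """Assign each work unit (cost in abstract work units) to the processor
    that would finish it earliest, given per-processor speeds."""
    finish = [0.0] * len(proc_speeds)
    assignment = []
    # scheduling the largest units first usually improves load balance
    for cost in sorted(unit_costs, reverse=True):
        times = [finish[p] + cost / proc_speeds[p]
                 for p in range(len(proc_speeds))]
        p = times.index(min(times))
        finish[p] = times[p]
        assignment.append((cost, p))
    return assignment, max(finish)

# Heterogeneous workload on a fast (2x) and a slow (1x) processor.
assign, makespan = schedule([4.0, 3.0, 2.0, 1.0], [2.0, 1.0])
```

Without the per-unit cost estimates (the "application-specific information"), a scheduler must treat all units as equal, which is exactly where heterogeneous workloads and non-dedicated environments hurt performance.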
Third, a programming paradigm for networks of symmetric multiprocessor (SMP) workstations is introduced. The MPIT programming paradigm combines the Message Passing Interface (MPI) with threads to provide a methodology for writing parallel applications that efficiently utilize the available resources and minimize overhead. MPIT allows communication and computation to overlap by deploying a dedicated thread for communication. Furthermore, the programming paradigm implements an application-specific scheduling algorithm. The scheduling algorithm is executed by the communication thread; thus, the scheduling does not affect the execution of the parallel application. Performance results show that MPIT achieves considerable improvements over conventional MPI applications.
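The dedicated-communication-thread pattern can be sketched without MPI at all; below, a plain queue stands in for the message-passing layer, so this is a conceptual illustration of the overlap, not an MPIT or MPI implementation:

```python
# Conceptual sketch of the MPIT idea: a dedicated communication thread drains
# outgoing messages while the main thread keeps computing. A queue stands in
# for MPI send calls; this is not an MPI implementation.
import threading, queue

def run(n_items):
    outbox = queue.Queue()
    delivered = []

    def comm_thread():
        # dedicated thread: handles "communication" concurrently with compute
        while True:
            msg = outbox.get()
            if msg is None:          # sentinel: shut down
                break
            delivered.append(msg)

    t = threading.Thread(target=comm_thread)
    t.start()
    total = 0
    for i in range(n_items):
        total += i * i               # "computation"
        outbox.put(("partial", i))   # hand off to the communication thread
    outbox.put(None)
    t.join()                         # all messages flushed before returning
    return total, delivered

total, delivered = run(5)
```

Because the main loop never blocks on delivery, communication latency is hidden behind computation, which is the overlap the paradigm exploits.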
Abstract:
In recent years, elliptic curve cryptography has gained increasing importance, to the point of now forming part of several industrial standards. Although elliptic-curve variants of classical cryptosystems such as RSA have been designed, their greatest interest lies in their application to cryptosystems based on the Discrete Logarithm Problem, such as those of ElGamal type. In this case, elliptic cryptosystems guarantee the same security as those built over the multiplicative group of a prime finite field, but with much shorter key lengths. We therefore present the good properties of these cryptosystems, as well as the basic requirements for a curve to be cryptographically useful, which are closely related to its cardinality. We review some methods for discarding curves that are not cryptographically useful, as well as others for obtaining good curves from a given one. Finally, we describe some applications, such as their use in Smart Cards and RFID systems, and conclude with some recent advances in this field.
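The group structure that makes elliptic-curve discrete logarithms possible can be shown on a textbook-sized curve; the curve below (y² = x³ + 2x + 2 over F₁₇, far too small for real use) is a standard teaching example, not one of the standards mentioned above:

```python
# Toy elliptic-curve arithmetic over a prime field, illustrating the group
# law behind elliptic-curve discrete-log cryptosystems. Educational size only.

P_MOD, A = 17, 2        # field modulus and curve coefficient a (b = 2 unused here)
INF = None              # point at infinity: the group identity

def add(p, q):
    if p is INF:
        return q
    if q is INF:
        return p
    (x1, y1), (x2, y2) = p, q
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return INF                         # p + (-p) = identity
    if p == q:                             # tangent rule (doubling)
        lam = (3 * x1 * x1 + A) * pow(2 * y1, -1, P_MOD) % P_MOD
    else:                                  # chord rule
        lam = (y2 - y1) * pow(x2 - x1, -1, P_MOD) % P_MOD
    x3 = (lam * lam - x1 - x2) % P_MOD
    return (x3, (lam * (x1 - x3) - y1) % P_MOD)

def scalar_mult(k, p):
    # double-and-add: the group-law analogue of square-and-multiply
    r = INF
    while k:
        if k & 1:
            r = add(r, p)
        p = add(p, p)
        k >>= 1
    return r

G = (5, 1)   # generates a subgroup of order 19 on this curve
```

The security claim in the abstract comes from the fact that recovering k from scalar_mult(k, G) is believed to be much harder, bit for bit, than the analogous problem in the multiplicative group of a prime field.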
Abstract:
The concept of a conditional stability constant is extended to the competitive binding of small molecules to heterogeneous surfaces or macromolecules via the introduction of the conditional affinity spectrum (CAS). The CAS describes the distribution of effective binding energies experienced by one complexing agent at a fixed concentration of the rest. We show that, when the multicomponent system can be described in terms of an underlying affinity spectrum [the integral equation (IE) approach], the system can always be characterized by means of a CAS. The thermodynamic properties of the CAS and its dependence on the concentrations of the remaining components are discussed. In the context of metal/proton competition, analytical expressions for the mean (conditional average affinity) and the variance (conditional heterogeneity) of the CAS as functions of pH are reported and their physical interpretation is discussed. Furthermore, we show that the dependence of the CAS variance on pH allows for the analytical determination of the correlation coefficient between the binding energies of the metal and the proton. The nonideal competitive adsorption isotherm and Frumkin isotherms are used to illustrate the results of this work. Finally, the possibility of using the CAS when the IE approach does not apply (for instance, when multidentate binding is present) is explored. © 2006 American Institute of Physics.
Abstract:
An analytical approach for the interpretation of multicomponent heterogeneous adsorption or complexation isotherms in terms of multidimensional affinity spectra is presented. A Fourier transform, applied to analyze the corresponding integral equation, leads to an inversion formula which allows the computation of the multicomponent affinity spectrum underlying a given competitive isotherm. Although a different mathematical methodology is used, this procedure can be seen as the extension to multicomponent systems of the classical work of Sips devoted to monocomponent systems. Furthermore, a methodology which yields analytical expressions for the main statistical properties (mean free energies of binding and covariance matrix) of multidimensional affinity spectra is reported. Thus, the level of binding correlation between the different components can be quantified. Notably, the reported methodology does not require knowledge of the affinity spectrum to calculate the means, variances, and covariances of the binding energies of the different components. The nonideal competitive consistent adsorption (NICCA) isotherm, widely used for metal/proton competitive complexation to environmental macromolecules, and the competitive Frumkin isotherm are selected to illustrate the application of the reported results. Explicit analytical expressions for the affinity spectrum as well as for the correlation matrix are obtained for the NICCA case. © 2004 American Institute of Physics.
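The integral-equation formalism underlying both of the abstracts above can be written compactly. As a hedged illustration in standard notation (not transcribed from either paper), the monocomponent isotherm and its two-component competitive generalization take the forms:

```latex
% Monocomponent isotherm as a superposition of Langmuir sites over the
% affinity spectrum p(\log K):
\theta(c) \;=\; \int_{-\infty}^{\infty} \frac{K c}{1 + K c}\; p(\log K)\,\mathrm{d}\log K

% Two-component competitive generalization with a bidimensional spectrum:
\theta_1(c_1, c_2) \;=\; \iint \frac{K_1 c_1}{1 + K_1 c_1 + K_2 c_2}\;
  p(\log K_1, \log K_2)\,\mathrm{d}\log K_1\,\mathrm{d}\log K_2
```

The inversion problem discussed in the abstract is the recovery of p from a measured θ, and the off-diagonal terms of the covariance of p are what quantify the binding correlation between components.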
Abstract:
The goal of this work is to create a statistical model, based only on easily computable parameters of a CSP instance, to predict the runtime behaviour of the solving algorithms and let us choose the best algorithm for the problem. Although it seems that the obvious choice should be MAC, experimental results obtained so far show that, with large numbers of variables, other algorithms perform much better, especially for hard problems in the transition phase.
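The pipeline described above (cheap structural features in, algorithm choice out) can be sketched as follows; the feature set, the decision rule and the alternative solver name are illustrative placeholders, not the fitted model from this work:

```python
# Hypothetical sketch of feature-based algorithm selection for CSPs. The
# features are cheap to compute; the rule below is a toy stand-in for a
# statistical runtime model, and "FC-CBJ" is a placeholder alternative.

def csp_features(n_vars, domain_size, n_constraints, tightness):
    # constraint density relative to all possible binary constraints
    density = n_constraints / (n_vars * (n_vars - 1) / 2)
    return {"n_vars": n_vars, "domain": domain_size,
            "density": density, "tightness": tightness}

def choose_solver(f):
    # prefer MAC by default, but switch for large instances whose tightness
    # places them near the phase transition (where MAC was seen to lag)
    if f["n_vars"] > 100 and 0.3 <= f["tightness"] <= 0.7:
        return "FC-CBJ"
    return "MAC"

small = csp_features(20, 10, 60, 0.5)
large = csp_features(500, 10, 5000, 0.5)
```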
Abstract:
In this paper we design and develop several filtering strategies for the analysis of data generated by a resonant-bar gravitational wave (GW) antenna, with the goal of assessing the presence (or absence) therein of long-duration monochromatic GW signals within the sensitivity band of the detector, as well as the amplitude and frequency of any such signals. Such signals are most likely generated by the fast rotation of slightly asymmetric spinning stars. We develop practical procedures, together with a study of their statistical properties, which provide useful information on the performance of each technique. The selection of candidate events is then established according to threshold-crossing probabilities, based on the Neyman-Pearson criterion. In particular, we show that our approach, based on phase estimation, yields a better signal-to-noise ratio than pure spectral analysis, the most common approach.
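The Neyman-Pearson step described above fixes a false-alarm probability and derives the detection threshold from the noise-only distribution. A minimal sketch, assuming a Gaussian detection statistic under the noise hypothesis (an assumption for illustration, not the paper's actual statistic):

```python
# Sketch of Neyman-Pearson threshold selection for a detection statistic that
# is Gaussian under the noise-only hypothesis. Parameters are illustrative.
from statistics import NormalDist

def np_threshold(noise_mean, noise_std, false_alarm_prob):
    """Smallest threshold whose noise-only crossing probability does not
    exceed the prescribed false-alarm rate."""
    return NormalDist(noise_mean, noise_std).inv_cdf(1.0 - false_alarm_prob)

def detect(statistic, threshold):
    # candidate event selected only when the statistic crosses the threshold
    return statistic > threshold

thr = np_threshold(0.0, 1.0, 0.01)   # 1% false-alarm rate
```

Fixing the false-alarm rate first, and only then comparing detection probabilities, is what allows a fair ranking of the phase-estimation approach against pure spectral analysis.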
Abstract:
Many ants forage in complex environments and use a combination of trail pheromone information and route memory to navigate between food sources and the nest. Previous research has shown that foraging routes differ in how easily they are learned. In particular, it is easier to learn feeding locations that are reached by repeating (e.g. left-left or right-right) than alternating choices (left-right or right-left) along a route with two T-bifurcations. This raises the hypothesis that the learnability of the feeding sites may influence overall colony foraging patterns. We studied this in the mass-recruiting ant Lasius niger. We used mazes with two T-bifurcations, and allowed colonies to exploit two equidistant food sources that differed in how easily their locations were learned. In experiment 1, learnability was manipulated by using repeating versus alternating routes from nest to feeder. In experiment 2, we added visual landmarks along the route to one food source. Our results suggest that colonies preferentially exploited the feeding site that was easier to learn. This was the case even if the more difficult to learn feeding site was discovered first. Furthermore, we show that these preferences were at least partly caused by lower error rates (experiment 1) and greater foraging speeds (experiment 2) of foragers visiting the more easily learned feeder locations. Our results indicate that the learnability of feeding sites is an important factor influencing collective foraging patterns of ant colonies under more natural conditions, given that in natural environments foragers often face multiple bifurcations on their way to food sources.
Abstract:
This work analyses the effect of introducing time-inconsistent preferences on optimal consumption, investment and life-insurance purchase decisions. In particular, it aims to capture the growing importance that an individual attaches to the bequest they leave and to the wealth available for retirement over the course of their working life. To this end, we start from a continuous-time stochastic model with random terminal time and introduce heterogeneous discounting, considering an agent with a known residual-life distribution. In order to obtain time-consistent solutions, a non-standard dynamic programming equation is solved. Explicit solutions are found for utility functions of CRRA and CARA type. Finally, the results obtained are illustrated numerically.
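For reference, the two utility families for which explicit solutions are reported take their standard textbook forms (these are the usual definitions, not expressions taken from the paper):

```latex
% Constant relative risk aversion (CRRA):
u(c) \;=\; \frac{c^{1-\gamma}}{1-\gamma}, \qquad \gamma > 0,\ \gamma \neq 1

% Constant absolute risk aversion (CARA):
u(c) \;=\; -\frac{1}{\alpha}\, e^{-\alpha c}, \qquad \alpha > 0
```

Both families admit closed-form optimal policies in many stochastic control settings, which is why explicit time-consistent solutions are attainable for them here.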
Abstract:
Background: The MPER region of the HIV-1 envelope glycoprotein gp41 is targeted by broadly neutralizing antibodies. However, the localization of this epitope in a hydrophobic environment seems to hamper the elicitation of these antibodies in HIV-infected individuals. We have quantified and characterized anti-MPER antibodies by ELISA and by flow cytometry using a collection of mini gp41-derived proteins expressed on the surface of 293T cells. Longitudinal plasma samples from 35 HIV-1-infected individuals were assayed for MPER recognition and MPER-dependent neutralizing capacity using HIV-2 viruses engrafted with HIV-1 MPER sequences. Results: Miniproteins devoid of the cysteine loop of gp41 exposed the MPER on the 293T cell membrane. Anti-MPER antibodies were identified in most individuals and were stable when analyzed in longitudinal samples. The magnitude of the responses was strongly correlated with the global response to the HIV-1 envelope glycoprotein, suggesting no specific limitation for anti-MPER antibodies. Peptide mapping showed poor recognition of the C-terminal MPER moiety and a wide presence of antibodies against the 2F5 epitope. However, antibody titers failed to correlate with 2F5-blocking activity and, more importantly, with the specific neutralization of HIV-2 chimeric viruses bearing the HIV-1 MPER sequence, suggesting a strong functional heterogeneity in anti-MPER humoral responses. Conclusions: Anti-MPER antibodies can be detected in the vast majority of HIV-1-infected individuals and are generated in the context of the global anti-Env response. However, the neutralizing capacity is heterogeneous, suggesting that eliciting neutralizing anti-MPER antibodies by immunization might require refinement of immunogens to skip non-neutralizing responses.