836 results for Data fusion applications


Relevance: 30.00%

Abstract:

Despite their importance in the evaluation of petroleum and gas reservoirs, measurements of self-potential data under borehole conditions (well-logging) have found only minor applications in aquifer and waste-site characterization. This can be attributed to lower signals from the diffusion fronts in near-surface environments, because measurements are made long after the drilling of the well, when concentration fronts are already disappearing. Proportionally higher signals arise from streaming potentials, which prevents the use of simple interpretation models that assume signals from diffusion only. Our laboratory experiments found that dual-source self-potential signals can be described by a simple linear model, and that the contributions (from diffusion and streaming potentials) can be isolated by slightly perturbing the borehole conditions. Perturbations are applied either by changing the concentration of the borehole-filling solution or its column height. Parameters useful for formation evaluation can be estimated from data measured during the perturbations, namely pore-water resistivity, the pressure drop across the borehole wall, and the electrokinetic coupling parameter. These parameters are important to assess, respectively, water quality, aquifer lateral continuity, and the interfacial properties of permeable formations.
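
Separating the two contributions described above amounts to a small linear least-squares problem. Below is a minimal sketch (not the authors' workflow) of how such a fit could look on synthetic perturbation data; all variable names, values and units are illustrative placeholders, not quantities from the paper.

```python
# Minimal sketch (not the authors' workflow): separating the streaming and
# diffusion contributions to a self-potential signal by linear least squares,
# assuming the two-source linear model described above. Values are synthetic.
import numpy as np

# Hypothetical measurements taken under several borehole perturbations
delta_p = np.array([0.0, 2.0, 4.0, 6.0])      # pressure drop across the borehole wall (kPa)
log_ratio = np.array([0.0, 0.1, 0.2, 0.4])    # log concentration ratio, borehole vs. formation
v_sp = np.array([0.1, 1.4, 2.7, 4.2])         # measured self-potential (mV), synthetic

# Design matrix for V = c_ek * delta_p + c_d * log_ratio
A = np.column_stack([delta_p, log_ratio])
(c_ek, c_d), *_ = np.linalg.lstsq(A, v_sp, rcond=None)
print(f"electrokinetic coupling ~ {c_ek:.3f} mV/kPa, diffusion coupling ~ {c_d:.3f} mV")
```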

Relevance: 30.00%

Abstract:

Background: A current challenge in gene annotation is to define gene function in the context of a network of relationships rather than for single genes. The inference of gene networks (GNs) has emerged as an approach to better understand the biology of the system and to study how the components of this network interact with each other and keep their functions stable. In general, however, there are not enough data to accurately recover GNs from expression levels alone, leading to the curse of dimensionality, in which the number of variables is higher than the number of samples. One way to mitigate this problem is to integrate biological data instead of using only expression profiles in the inference process. The use of several types of biological information in inference methods has increased significantly in recent years, in order to better recover the connections between genes and to reduce false positives. What makes this strategy so interesting is the possibility of confirming known connections through the included biological data, and of discovering new relationships between genes when the expression data are observed. Although several works on data integration have increased the performance of network inference methods, the real contribution of each type of biological information to the obtained improvement is not clear. Methods: We propose a methodology to include biological information in an inference algorithm in order to assess the prediction gain of using biological information and expression profiles together. We also evaluated and compared the gain of adding four types of biological information: (a) protein-protein interaction, (b) Rosetta stone fusion proteins, (c) KEGG and (d) KEGG+GO. Results and conclusions: This work presents a first comparison of the gain from using prior biological information in the inference of GNs for a eukaryote organism (P. falciparum). Our results indicate that information based on direct interaction can produce a higher improvement in the gain than data about a less specific relationship such as GO or KEGG. Also, as expected, the results show that the use of biological information is a very important approach for improving the inference. We also compared the gain in the inference of the global network and of the hubs only. The results indicate that the use of biological information can improve the identification of the most connected proteins.
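
As a rough illustration of the data-integration idea (not the inference algorithm evaluated in the paper), the sketch below combines an expression-based similarity score with a binary prior such as protein-protein interaction; the weighting scheme, the toy data and the number of retained pairs are assumptions.

```python
# Illustrative sketch only (not the paper's method): re-rank candidate gene-gene
# edges by combining an expression-based similarity with a binary prior matrix.
import numpy as np

rng = np.random.default_rng(0)
n_genes, n_samples = 20, 6
expr = rng.normal(size=(n_genes, n_samples))             # toy expression profiles
prior = np.triu(rng.integers(0, 2, size=(n_genes, n_genes)), k=1)
prior = prior + prior.T                                  # hypothetical symmetric prior (e.g. PPI)

score_expr = np.abs(np.corrcoef(expr))                   # expression-based similarity
alpha = 0.7                                              # weight of expression vs. prior evidence
combined = alpha * score_expr + (1 - alpha) * prior
np.fill_diagonal(combined, 0.0)

# Keep the top-scoring pairs as predicted interactions
pairs = [(i, j) for i in range(n_genes) for j in range(i + 1, n_genes)]
pairs.sort(key=lambda p: combined[p], reverse=True)
print(pairs[:10])
```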

Relevance: 30.00%

Abstract:

Each plasma physics laboratory has a proprietary control and data acquisition system, usually different from one laboratory to another. This means that each laboratory has its own way of controlling the experiment and retrieving data from the database. Fusion research relies to a great extent on international collaboration, and such private systems make it difficult to follow the work remotely. The TCABR data analysis and acquisition system has been upgraded to support a joint research programme using remote participation technologies. The choice of MDSplus (Model Driven System plus) is justified by the fact that it is widely utilized, so that scientists from different institutions may use the same system in different experiments on different tokamaks without the need to know how each system handles its data acquisition and analysis. Another important point is that MDSplus has a library system that allows communication between different programming languages (Java, Fortran, C, C++, Python) and programs such as MATLAB, IDL and OCTAVE. In the case of the TCABR tokamak, interfaces (the object of this paper) were developed between the system already in use and MDSplus, instead of using MDSplus at all stages, from control and data acquisition to data analysis. This was done in order to preserve a complex system already in operation, which would otherwise take a long time to migrate. This implementation also allows new components to be added that use MDSplus fully at all stages. (c) 2012 Elsevier B.V. All rights reserved.
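
The multi-language access that motivates the choice of MDSplus looks roughly as follows from Python, assuming the standard MDSplus Python bindings and a reachable thin-client server; the server address, tree name, shot number and node path below are placeholders, not actual TCABR values.

```python
# Minimal sketch, assuming the standard MDSplus Python bindings are installed
# and a thin-client data server is reachable. All names below are placeholders.
from MDSplus import Connection

conn = Connection("mdsplus.example.org")   # hypothetical data server
conn.openTree("tcabr", 12345)              # placeholder tree name / shot number
signal = conn.get("\\TOP:DENSITY")         # placeholder node path
print(signal.data())                       # convert the stored signal to a NumPy array
```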

Relevance: 30.00%

Abstract:

This work proposes a method for data clustering based on complex networks theory. A data set is represented as a network by considering different metrics to establish the connection between each pair of objects. The clusters are obtained by applying five community detection algorithms. The network-based clustering approach is applied to two real-world databases and two sets of artificially generated data. The obtained results suggest that the exponential of the Minkowski distance is the most suitable metric to quantify the similarities between pairs of objects. In addition, the community identification method based on greedy optimization provides the best cluster solution. We compare the network-based clustering approach with some traditional clustering algorithms and verify that it provides the lowest classification error rate. (C) 2012 Elsevier B.V. All rights reserved.
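
The pipeline summarized above (exponential of the Minkowski distance as similarity, greedy modularity optimization for community detection) can be reproduced in spirit with off-the-shelf tools. The sketch below is illustrative, not the authors' implementation; the Minkowski order, the edge threshold and the toy data are arbitrary choices.

```python
# Sketch of the pipeline above with off-the-shelf tools (not the authors' code):
# similarity = exp(-Minkowski distance), thresholded similarity graph, and
# greedy modularity optimization for community detection.
import numpy as np
import networkx as nx
from scipy.spatial.distance import pdist, squareform
from networkx.algorithms.community import greedy_modularity_communities

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.3, (30, 2)),
               rng.normal(3, 0.3, (30, 2))])      # two well-separated toy clusters

dist = squareform(pdist(X, metric="minkowski", p=3))
sim = np.exp(-dist)                               # exponential of the Minkowski distance
np.fill_diagonal(sim, 0.0)

G = nx.from_numpy_array(sim * (sim > 0.3))        # keep only sufficiently similar pairs
communities = greedy_modularity_communities(G, weight="weight")
print([sorted(c)[:5] for c in communities])       # first members of each detected cluster
```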

Relevance: 30.00%

Abstract:

Background: This paper addresses the prediction of the free energy of binding of a drug candidate with the enzyme InhA associated with Mycobacterium tuberculosis. This problem arises within rational drug design, where interactions between drug candidates and target proteins are verified through molecular docking simulations. In this application, it is important not only to correctly predict the free energy of binding, but also to provide a comprehensible model that could be validated by a domain specialist. Decision-tree induction algorithms have been successfully used in drug-design related applications, especially considering that decision trees are simple to understand, interpret, and validate. There are several decision-tree induction algorithms available for general use, but each one has a bias that makes it more suitable for a particular data distribution. In this article, we propose and investigate the automatic design of decision-tree induction algorithms tailored to particular drug-enzyme binding data sets. We investigate the performance of our new method for evaluating binding conformations of different drug candidates to InhA, and we analyze our findings with respect to decision-tree accuracy, comprehensibility, and biological relevance. Results: The empirical analysis indicates that our method is capable of automatically generating decision-tree induction algorithms that significantly outperform the traditional C4.5 algorithm with respect to both accuracy and comprehensibility. In addition, we provide the biological interpretation of the rules generated by our approach, reinforcing the importance of comprehensible predictive models in this particular bioinformatics application. Conclusions: We conclude that automatically designing a decision-tree algorithm tailored to molecular docking data is a promising alternative for the prediction of the free energy of binding of a drug candidate with a flexible receptor.
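
A heavily simplified way to convey the idea of automatically designing a decision-tree inducer is a random search over algorithm building blocks, scored on both accuracy and tree size (a proxy for comprehensibility). The sketch below uses scikit-learn components as a stand-in for the much richer design space explored in the paper, and synthetic data instead of the InhA docking data.

```python
# Highly simplified sketch of "designing" a decision-tree inducer automatically:
# randomly search over building blocks and score each candidate on accuracy and
# tree size. Not the authors' system; scikit-learn parameters stand in for the
# design components, and the data set is synthetic.
import random
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=300, n_features=10, random_state=0)
design_space = {
    "criterion": ["gini", "entropy"],
    "min_samples_leaf": [1, 5, 10, 20],
    "max_depth": [3, 5, 8, None],
}

random.seed(0)
best = None
for _ in range(20):                           # random search over candidate designs
    design = {k: random.choice(v) for k, v in design_space.items()}
    clf = DecisionTreeClassifier(random_state=0, **design)
    acc = cross_val_score(clf, X, y, cv=5).mean()
    size = clf.fit(X, y).tree_.node_count
    score = acc - 0.001 * size                # trade accuracy off against tree size
    if best is None or score > best[0]:
        best = (score, design, acc, size)

print("best design:", best[1], "accuracy ~%.3f" % best[2], "nodes:", best[3])
```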

Relevance: 30.00%

Abstract:

Turbulence is one of the key problems of classical physics, and it has been the object of intense research in recent decades in a large spectrum of problems involving fluids, plasmas, and waves. In order to review some advances in theoretical and experimental investigations on turbulence, a mini-symposium on this subject was organized at the Dynamics Days South America 2010 Conference. The main goal of this mini-symposium was to present recent developments in both fundamental aspects and dynamical analysis of turbulence in nonlinear waves and fusion plasmas. In this paper we present a summary of the works presented at this mini-symposium. Among the questions addressed were the onset and control of turbulence and spatio-temporal chaos. (C) 2011 Elsevier B.V. All rights reserved.

Relevance: 30.00%

Abstract:

The objective of this study was to estimate the prevalence of inadequate micronutrient intake and of excess sodium intake among adults aged 19 years and older in the city of Sao Paulo, Brazil. Twenty-four-hour dietary recall and sociodemographic data were collected from each participant (n=1,663) in a cross-sectional study, the Inquiry of Health of Sao Paulo, of a representative sample of the adult population of the city of Sao Paulo in 2003 (ISA-2003). The variability in intake was measured through two replications of the 24-hour recall in a subsample of this population in 2007 (ISA-2007). Usual intake was estimated by the PC-SIDE program (version 1.0, 2003, Department of Statistics, Iowa State University), which uses an approach developed by Iowa State University. The prevalence of nutrient inadequacy was calculated using the Estimated Average Requirement cut-point method for vitamins A and C, thiamin, riboflavin, niacin, copper, phosphorus, and selenium. For vitamin D, pantothenic acid, manganese, and sodium, the proportion of individuals with usual intake equal to or more than the Adequate Intake value was calculated. The percentage of individuals with intake equal to or more than the Tolerable Upper Intake Level was calculated for sodium. The highest prevalences of inadequacy for males and females, respectively, occurred for vitamin A (67% and 58%), vitamin C (52% and 62%), thiamin (41% and 50%), and riboflavin (29% and 19%). Adjusting for within-person variation yielded lower prevalences of inadequacy because the within-person variability was removed. All adult residents of Sao Paulo had excess sodium intake, and the rates of nutrient inadequacy were high for certain key micronutrients. J Acad Nutr Diet. 2012;112:1614-1618.
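
The Estimated Average Requirement cut-point method mentioned above reduces to the proportion of individuals whose usual intake falls below the EAR. The sketch below shows only that final step on synthetic intakes; the PC-SIDE adjustment for within-person variation is not reproduced, and the EAR value is illustrative.

```python
# Sketch of the EAR cut-point calculation only: prevalence of inadequacy is the
# proportion of individuals whose usual intake falls below the EAR. Intakes are
# synthetic and the EAR value is illustrative.
import numpy as np

rng = np.random.default_rng(2)
usual_intake_vit_c = rng.lognormal(mean=4.0, sigma=0.5, size=1663)  # mg/day, synthetic
ear_vit_c = 75.0                                                    # illustrative EAR (mg/day)

prevalence = np.mean(usual_intake_vit_c < ear_vit_c)
print(f"prevalence of inadequate vitamin C intake ~ {100 * prevalence:.1f}%")
```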

Relevance: 30.00%

Abstract:

For the first time, we introduce a generalized form of the exponentiated generalized gamma distribution [Cordeiro et al. The exponentiated generalized gamma distribution with application to lifetime data, J. Statist. Comput. Simul. 81 (2011), pp. 827-842.] that is the baseline for the log-exponentiated generalized gamma regression model. The new distribution can accommodate increasing, decreasing, bathtub- and unimodal-shaped hazard functions. A second advantage is that it includes classical distributions reported in the lifetime literature as special cases. We obtain explicit expressions for the moments of the baseline distribution of the new regression model. The proposed model can be applied to censored data since it includes as sub-models several widely known regression models. It therefore can be used more effectively in the analysis of survival data. We obtain maximum likelihood estimates for the model parameters by considering censored data. We show that our extended regression model is very useful by means of two applications to real data.
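
For reference, one common parameterization of the exponentiated generalized gamma baseline, obtained by raising the Stacy generalized-gamma CDF to a power, is sketched below; the parameter names may differ from those used in the cited paper.

```latex
% One common parameterization of the exponentiated generalized gamma density
% (Stacy generalized-gamma CDF raised to the power \lambda > 0); naming
% conventions may differ from the cited paper.
\[
f(x) = \frac{\lambda\,\tau}{\alpha\,\Gamma(k)}
       \left(\frac{x}{\alpha}\right)^{\tau k - 1}
       \exp\!\left\{-\left(\frac{x}{\alpha}\right)^{\tau}\right\}
       \left[\gamma_{1}\!\left(k,\left(\frac{x}{\alpha}\right)^{\tau}\right)\right]^{\lambda-1},
       \qquad x > 0,
\]
% where \alpha > 0 is a scale parameter, \tau, k, \lambda > 0 are shape
% parameters and \gamma_{1}(k,z)=\gamma(k,z)/\Gamma(k) is the regularized
% lower incomplete gamma function.
```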

Relevance: 30.00%

Abstract:

We study a five-parameter lifetime distribution called the McDonald extended exponential model to generalize the exponential, generalized exponential, Kumaraswamy exponential and beta exponential distributions, among others. We obtain explicit expressions for the moments and incomplete moments, quantile and generating functions, mean deviations, Bonferroni and Lorenz curves and Gini concentration index. The method of maximum likelihood and a Bayesian procedure are adopted for estimating the model parameters. The applicability of the new model is illustrated by means of a real data set.
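
For reference, the McDonald-G (Mc-G) construction that underlies the model is sketched below in one common parameterization (naming conventions vary across papers); here G denotes the baseline CDF, which in this model is the two-parameter extended exponential distribution.

```latex
% Generic Mc-G construction under one common parameterization; G is the
% baseline CDF (here the extended exponential) and a, b, c > 0 are the extra
% shape parameters. Parameter naming differs across papers.
\[
f(x) = \frac{c}{B(a,b)}\, g(x)\, G(x)^{ac-1}\,\bigl[1 - G(x)^{c}\bigr]^{b-1},
\qquad
F(x) = I_{G(x)^{c}}(a,b),
\]
% where I_{z}(a,b) denotes the regularized incomplete beta function.
```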

Relevance: 30.00%

Abstract:

This paper reports experiments on the use of a recently introduced advection bounded upwinding scheme, namely TOPUS (Computers & Fluids 57 (2012) 208-224), for flows of practical interest. The numerical results are compared against analytical, numerical and experimental data and show good agreement with them. It is concluded that the TOPUS scheme is a competent, powerful and generic scheme for complex flow phenomena.

Relevance: 30.00%

Abstract:

The beta-Birnbaum-Saunders (Cordeiro and Lemonte, 2011) and Birnbaum-Saunders (Birnbaum and Saunders, 1969a) distributions have been used quite effectively to model failure times of materials subject to fatigue and lifetime data. We define the log-beta-Birnbaum-Saunders distribution as the distribution of the logarithm of a beta-Birnbaum-Saunders random variable. Explicit expressions for its generating function and moments are derived. We propose a new log-beta-Birnbaum-Saunders regression model that can be applied to censored data and be used more effectively in survival analysis. We obtain the maximum likelihood estimates of the model parameters for censored data and investigate influence diagnostics. The new location-scale regression model is modified to allow for the possibility that long-term survivors may be present in the data. Its usefulness is illustrated by means of two real data sets. (C) 2011 Elsevier B.V. All rights reserved.
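
For reference, the construction behind the model can be sketched as follows (standard notation, possibly differing slightly from the cited papers): the Birnbaum-Saunders CDF is inserted into the beta-G generator, and the log-beta-Birnbaum-Saunders variable is the logarithm of the resulting lifetime.

```latex
% Birnbaum-Saunders CDF inserted into the beta-G generator with shape
% parameters a, b > 0; the log-beta-Birnbaum-Saunders variable is Y = log(T).
\[
G(t) = \Phi\!\left[\frac{1}{\alpha}\left(\sqrt{\frac{t}{\beta}} - \sqrt{\frac{\beta}{t}}\right)\right],
\qquad t > 0,\ \alpha, \beta > 0,
\]
\[
f_{\mathrm{BBS}}(t) = \frac{g(t)}{B(a,b)}\, G(t)^{a-1}\,\bigl[1 - G(t)\bigr]^{b-1},
\qquad Y = \log T .
\]
```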

Relevance: 30.00%

Abstract:

The low efficiency of gene transfer is a recurrent problem in DNA vaccine development and gene therapy studies using non-viral vectors such as plasmid DNA (pDNA). This is mainly due to the fact that, during their traffic to the target cell's nucleus, plasmid vectors must overcome a series of physical, enzymatic and diffusional barriers. The main objective of this work is the development of recombinant proteins specifically designed for pDNA delivery, which take advantage of molecular motors like dynein for the transport of cargos from the periphery to the centrosome of mammalian cells. A DNA-binding sequence was fused to the N-terminus of the recombinant human dynein light chain LC8. Expression studies indicated that the fusion protein was correctly expressed in soluble form using the E. coli BL21(DE3) strain. As expected, gel permeation assays found the purified protein mainly present as dimers, the functional oligomeric state of LC8. Gel retardation assays and atomic force microscopy proved the ability of the fusion protein to interact with and condense pDNA. Zeta potential measurements indicated that LC8 with the DNA-binding domain (LD4) has an enhanced capacity to interact with and condense pDNA, generating positively charged complexes. Transfection of cultured HeLa cells confirmed the ability of LD4 to facilitate pDNA uptake and indicated the involvement of retrograde transport in the intracellular trafficking of pDNA:LD4 complexes. Finally, cytotoxicity studies demonstrated a very low toxicity of the fusion protein vector, indicating its potential for in vivo applications. The study presented here is part of an effort to develop new modular shuttle proteins able to take advantage of strategies used by viruses to infect mammalian cells, aiming to provide new tools for gene therapy and DNA vaccination studies. (C) 2012 Elsevier B.V. All rights reserved.

Relevance: 30.00%

Abstract:

A specific separated-local-field NMR experiment, dubbed Dipolar-Chemical-Shift Correlation (DIPSHIFT), is frequently used to study molecular motions by probing reorientations through the changes in the X-H dipolar coupling and T2. In systems where the coupling is weak or the reorientation angle is small, a recoupled variant of the DIPSHIFT experiment is applied, in which the effective dipolar coupling is amplified by a REDOR-like π-pulse train. However, a previously described constant-time variant of this experiment is not sensitive to the motion-induced T2 effect, which precludes the observation of motions over a large range of rates, from hundreds of Hz to around a MHz. We present a DIPSHIFT implementation which amplifies the dipolar couplings and is still sensitive to T2 effects. Spin dynamics simulations, analytical calculations and experiments demonstrate the sensitivity of the technique to molecular motions and suggest the best experimental conditions to avoid imperfections. Furthermore, an in-depth theoretical analysis of the interplay of REDOR-like recoupling and proton decoupling based on Average-Hamiltonian Theory was performed, which allowed explaining the origin of many artifacts found in literature data. (C) 2012 Elsevier Inc. All rights reserved.

Relevance: 30.00%

Abstract:

Statistical methods have been widely employed to assess the capabilities of credit scoring classification models in order to reduce the risk of wrong decisions when granting credit facilities to clients. The predictive quality of a classification model can be evaluated based on measures such as sensitivity, specificity, predictive values, accuracy, correlation coefficients and information-theoretical measures, such as relative entropy and mutual information. In this paper we analyze the performance of a naive logistic regression model (Hosmer & Lemeshow, 1989) and a logistic regression with state-dependent sample selection model (Cramer, 2004) applied to simulated data. In addition, as a case study, the methodology is illustrated on a data set extracted from a Brazilian bank portfolio. Our simulation results so far revealed that there is no statistically significant difference in terms of predictive capacity between the naive logistic regression models and the logistic regression with state-dependent sample selection models. However, there is a strong difference between the distributions of the estimated default probabilities from these two statistical modeling techniques, with the naive logistic regression models always underestimating such probabilities, particularly in the presence of balanced samples. (C) 2012 Elsevier Ltd. All rights reserved.
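
The "naive" side of the comparison and some of the evaluation measures listed above can be illustrated as follows on synthetic data; the state-dependent sample-selection correction (Cramer, 2004) is not implemented here, and the feature construction and cut-off are assumptions.

```python
# Sketch of the "naive" side of the comparison only: fit an ordinary logistic
# regression on synthetic credit data and report sensitivity, specificity and
# accuracy. The sample-selection correction is not implemented; data are toy.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(3)
X = rng.normal(size=(2000, 4))                     # toy application features
logit = 0.8 * X[:, 0] - 1.2 * X[:, 1] - 2.0        # imbalanced default rate
y = rng.binomial(1, 1 / (1 + np.exp(-logit)))      # 1 = default

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
model = LogisticRegression().fit(X_tr, y_tr)
pred = (model.predict_proba(X_te)[:, 1] >= 0.5).astype(int)

tn, fp, fn, tp = confusion_matrix(y_te, pred).ravel()
print(f"sensitivity={tp/(tp+fn):.3f}  specificity={tn/(tn+fp):.3f}  "
      f"accuracy={(tp+tn)/len(y_te):.3f}")
```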

Relevance: 30.00%

Abstract:

Background: A popular model for gene regulatory networks is the Boolean network model. In this paper, we propose an algorithm to perform an analysis of gene regulatory interactions using the Boolean network model and time-series data. The Boolean network considered here is restricted in the sense that only a subset of all possible Boolean functions is considered. We explore some mathematical properties of these restricted Boolean networks in order to avoid the full search approach. The problem is modeled as a Constraint Satisfaction Problem (CSP) and CSP techniques are used to solve it. Results: We applied the proposed algorithm to two data sets. First, we used an artificial data set obtained from a model of the budding yeast cell cycle. The second data set is derived from experiments performed using HeLa cells. The results show that some interactions can be fully or, at least, partially determined under the Boolean model considered. Conclusions: The proposed algorithm can be used as a first step for the detection of gene/protein interactions. It is able to infer gene relationships from time-series gene expression data, and this inference process can be aided by available a priori knowledge.
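
A toy stand-in for the inference task is sketched below; it deliberately uses full search over a very small restricted function class, which is exactly what the paper's CSP formulation avoids at realistic sizes. The time series and the AND/OR function class are assumptions.

```python
# Toy stand-in (not the paper's CSP formulation): for each target gene,
# exhaustively test which predictor pairs and which functions from a small
# restricted class (AND, OR) reproduce every transition of a synthetic Boolean
# time series.
from itertools import combinations

# Synthetic Boolean time series: rows are time points, columns are genes g0..g2
series = [(1, 0, 1), (0, 1, 0), (0, 0, 0), (0, 0, 0)]
restricted = {"AND": lambda a, b: a & b, "OR": lambda a, b: a | b}

n_genes = len(series[0])
for target in range(n_genes):
    for i, j in combinations(range(n_genes), 2):
        for name, f in restricted.items():
            # consistent with every observed transition s(t) -> s(t+1)?
            if all(f(s[i], s[j]) == t[target] for s, t in zip(series, series[1:])):
                print(f"g{target}(t+1) = g{i}(t) {name} g{j}(t)")
```

With such a short series more than one rule per gene remains consistent, mirroring the remark above that interactions may be only partially determined from limited time-series data.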