948 resultados para Data structures (Computer science)


Relevância:

100.00% 100.00%

Publicador:

Resumo:

The discovery of new materials and their functions has always been a fundamental component of technological progress. Nowadays, the quest for new materials is stronger than ever: sustainability, medicine, robotics and electronics are all key assets which depend on the ability to create specifically tailored materials. However, designing materials with desired properties is a difficult task, and the complexity of the discipline makes it difficult to identify general criteria. While scientists developed a set of best practices (often based on experience and expertise), this is still a trial-and-error process. This becomes even more complex when dealing with advanced functional materials. Their properties depend on structural and morphological features, which in turn depend on fabrication procedures and environment, and subtle alterations leads to dramatically different results. Because of this, materials modeling and design is one of the most prolific research fields. Many techniques and instruments are continuously developed to enable new possibilities, both in the experimental and computational realms. Scientists strive to enforce cutting-edge technologies in order to make progress. However, the field is strongly affected by unorganized file management, proliferation of custom data formats and storage procedures, both in experimental and computational research. Results are difficult to find, interpret and re-use, and a huge amount of time is spent interpreting and re-organizing data. This also strongly limit the application of data-driven and machine learning techniques. This work introduces possible solutions to the problems described above. Specifically, it talks about developing features for specific classes of advanced materials and use them to train machine learning models and accelerate computational predictions for molecular compounds; developing method for organizing non homogeneous materials data; automate the process of using devices simulations to train machine learning models; dealing with scattered experimental data and use them to discover new patterns.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Thanks to recent advances in molecular biology, allied to an ever increasing amount of experimental data, the functional state of thousands of genes can now be extracted simultaneously by using methods such as cDNA microarrays and RNA-Seq. Particularly important related investigations are the modeling and identification of gene regulatory networks from expression data sets. Such a knowledge is fundamental for many applications, such as disease treatment, therapeutic intervention strategies and drugs design, as well as for planning high-throughput new experiments. Methods have been developed for gene networks modeling and identification from expression profiles. However, an important open problem regards how to validate such approaches and its results. This work presents an objective approach for validation of gene network modeling and identification which comprises the following three main aspects: (1) Artificial Gene Networks (AGNs) model generation through theoretical models of complex networks, which is used to simulate temporal expression data; (2) a computational method for gene network identification from the simulated data, which is founded on a feature selection approach where a target gene is fixed and the expression profile is observed for all other genes in order to identify a relevant subset of predictors; and (3) validation of the identified AGN-based network through comparison with the original network. The proposed framework allows several types of AGNs to be generated and used in order to simulate temporal expression data. The results of the network identification method can then be compared to the original network in order to estimate its properties and accuracy. Some of the most important theoretical models of complex networks have been assessed: the uniformly-random Erdos-Renyi (ER), the small-world Watts-Strogatz (WS), the scale-free Barabasi-Albert (BA), and geographical networks (GG). The experimental results indicate that the inference method was sensitive to average degree k variation, decreasing its network recovery rate with the increase of k. The signal size was important for the inference method to get better accuracy in the network identification rate, presenting very good results with small expression profiles. However, the adopted inference method was not sensible to recognize distinct structures of interaction among genes, presenting a similar behavior when applied to different network topologies. In summary, the proposed framework, though simple, was adequate for the validation of the inferred networks by identifying some properties of the evaluated method, which can be extended to other inference methods.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Motivation: Understanding the patterns of association between polymorphisms at different loci in a population ( linkage disequilibrium, LD) is of fundamental importance in various genetic studies. Many coefficients were proposed for measuring the degree of LD, but they provide only a static view of the current LD structure. Generative models (GMs) were proposed to go beyond these measures, giving not only a description of the actual LD structure but also a tool to help understanding the process that generated such structure. GMs based in coalescent theory have been the most appealing because they link LD to evolutionary factors. Nevertheless, the inference and parameter estimation of such models is still computationally challenging. Results: We present a more practical method to build GM that describe LD. The method is based on learning weighted Bayesian network structures from haplotype data, extracting equivalence structure classes and using them to model LD. The results obtained in public data from the HapMap database showed that the method is a promising tool for modeling LD. The associations represented by the learned models are correlated with the traditional measure of LD D`. The method was able to represent LD blocks found by standard tools. The granularity of the association blocks and the readability of the models can be controlled in the method. The results suggest that the causality information gained by our method can be useful to tell about the conservability of the genetic markers and to guide the selection of subset of representative markers.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper describes the modeling of a weed infestation risk inference system that implements a collaborative inference scheme based on rules extracted from two Bayesian network classifiers. The first Bayesian classifier infers a categorical variable value for the weed-crop competitiveness using as input categorical variables for the total density of weeds and corresponding proportions of narrow and broad-leaved weeds. The inferred categorical variable values for the weed-crop competitiveness along with three other categorical variables extracted from estimated maps for the weed seed production and weed coverage are then used as input for a second Bayesian network classifier to infer categorical variables values for the risk of infestation. Weed biomass and yield loss data samples are used to learn the probability relationship among the nodes of the first and second Bayesian classifiers in a supervised fashion, respectively. For comparison purposes, two types of Bayesian network structures are considered, namely an expert-based Bayesian classifier and a naive Bayes classifier. The inference system focused on the knowledge interpretation by translating a Bayesian classifier into a set of classification rules. The results obtained for the risk inference in a corn-crop field are presented and discussed. (C) 2009 Elsevier Ltd. All rights reserved.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This work proposes a method based on both preprocessing and data mining with the objective of identify harmonic current sources in residential consumers. In addition, this methodology can also be applied to identify linear and nonlinear loads. It should be emphasized that the entire database was obtained through laboratory essays, i.e., real data were acquired from residential loads. Thus, the residential system created in laboratory was fed by a configurable power source and in its output were placed the loads and the power quality analyzers (all measurements were stored in a microcomputer). So, the data were submitted to pre-processing, which was based on attribute selection techniques in order to minimize the complexity in identifying the loads. A newer database was generated maintaining only the attributes selected, thus, Artificial Neural Networks were trained to realized the identification of loads. In order to validate the methodology proposed, the loads were fed both under ideal conditions (without harmonics), but also by harmonic voltages within limits pre-established. These limits are in accordance with IEEE Std. 519-1992 and PRODIST (procedures to delivery energy employed by Brazilian`s utilities). The results obtained seek to validate the methodology proposed and furnish a method that can serve as alternative to conventional methods.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Inverse analysis is currently an important subject of study in several fields of science and engineering. The identification of physical and geometric parameters using experimental measurements is required in many applications. In this work a boundary element formulation to identify boundary and interface values as well as material properties is proposed. In particular the proposed formulation is dedicated to identifying material parameters when a cohesive crack model is assumed for 2D problems. A computer code is developed and implemented using the BEM multi-region technique and regularisation methods to perform the inverse analysis. Several examples are shown to demonstrate the efficiency of the proposed model. (C) 2010 Elsevier Ltd. All rights reserved,

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper compares the behaviour of two different control structures of automatic voltage regulators of synchronous machines equipped with static excitation systems. These systems have a fully controlled thyristor bridge that supplies DC current to the rotor winding. The rectifier bridge is fed by the stator terminals through a step-down transformer. The first control structure, named ""Direct Control"", has a single proportional-integral (PI) regulator that compares stator voltage setpoint with measured voltage and acts directly on the thyristor bridge`s firing angle. This control structure is usually employed in commercial excitation systems for hydrogenerators. The second structure, named ""Cascade Control"", was inspired on control loops of commercial DC motor drives. Such drives employ two PIs in a cascade arrangement, the external PI deals with the motor speed while the internal one regulates the armature current. In the adaptation proposed, the external PI compares setpoint with the actual stator voltage and produces the setpoint to the internal PI-loop which controls the field current.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

By means of continuous topology optimization, this paper discusses the influence of material gradation and layout in the overall stiffness behavior of functionally graded structures. The formulation is associated to symmetry and pattern repetition constraints, including material gradation effects at both global and local levels. For instance, constraints associated with pattern repetition are applied by considering material gradation either on the global structure or locally over the specific pattern. By means of pattern repetition, we recover previous results in the literature which were obtained using homogenization and optimization of cellular materials.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Due to the several kinds of services that use the Internet and data networks infra-structures, the present networks are characterized by the diversity of types of traffic that have statistical properties as complex temporal correlation and non-gaussian distribution. The networks complex temporal correlation may be characterized by the Short Range Dependence (SRD) and the Long Range Dependence - (LRD). Models as the fGN (Fractional Gaussian Noise) may capture the LRD but not the SRD. This work presents two methods for traffic generation that synthesize approximate realizations of the self-similar fGN with SRD random process. The first one employs the IDWT (Inverse Discrete Wavelet Transform) and the second the IDWPT (Inverse Discrete Wavelet Packet Transform). It has been developed the variance map concept that allows to associate the LRD and SRD behaviors directly to the wavelet transform coefficients. The developed methods are extremely flexible and allow the generation of Gaussian time series with complex statistical behaviors.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Computer viruses are an important risk to computational systems endangering either corporations of all sizes or personal computers used for domestic applications. Here, classical epidemiological models for disease propagation are adapted to computer networks and, by using simple systems identification techniques a model called SAIC (Susceptible, Antidotal, Infectious, Contaminated) is developed. Real data about computer viruses are used to validate the model. (c) 2008 Elsevier Ltd. All rights reserved.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper proposes a regression model considering the modified Weibull distribution. This distribution can be used to model bathtub-shaped failure rate functions. Assuming censored data, we consider maximum likelihood and Jackknife estimators for the parameters of the model. We derive the appropriate matrices for assessing local influence on the parameter estimates under different perturbation schemes and we also present some ways to perform global influence. Besides, for different parameter settings, sample sizes and censoring percentages, various simulations are performed and the empirical distribution of the modified deviance residual is displayed and compared with the standard normal distribution. These studies suggest that the residual analysis usually performed in normal linear regression models can be straightforwardly extended for a martingale-type residual in log-modified Weibull regression models with censored data. Finally, we analyze a real data set under log-modified Weibull regression models. A diagnostic analysis and a model checking based on the modified deviance residual are performed to select appropriate models. (c) 2008 Elsevier B.V. All rights reserved.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In this study, regression models are evaluated for grouped survival data when the effect of censoring time is considered in the model and the regression structure is modeled through four link functions. The methodology for grouped survival data is based on life tables, and the times are grouped in k intervals so that ties are eliminated. Thus, the data modeling is performed by considering the discrete models of lifetime regression. The model parameters are estimated by using the maximum likelihood and jackknife methods. To detect influential observations in the proposed models, diagnostic measures based on case deletion, which are denominated global influence, and influence measures based on small perturbations in the data or in the model, referred to as local influence, are used. In addition to those measures, the local influence and the total influential estimate are also employed. Various simulation studies are performed and compared to the performance of the four link functions of the regression models for grouped survival data for different parameter settings, sample sizes and numbers of intervals. Finally, a data set is analyzed by using the proposed regression models. (C) 2010 Elsevier B.V. All rights reserved.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

A four-parameter extension of the generalized gamma distribution capable of modelling a bathtub-shaped hazard rate function is defined and studied. The beauty and importance of this distribution lies in its ability to model monotone and non-monotone failure rate functions, which are quite common in lifetime data analysis and reliability. The new distribution has a number of well-known lifetime special sub-models, such as the exponentiated Weibull, exponentiated generalized half-normal, exponentiated gamma and generalized Rayleigh, among others. We derive two infinite sum representations for its moments. We calculate the density of the order statistics and two expansions for their moments. The method of maximum likelihood is used for estimating the model parameters and the observed information matrix is obtained. Finally, a real data set from the medical area is analysed.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

We have used various computational methodologies including molecular dynamics, density functional theory, virtual screening, ADMET predictions and molecular interaction field studies to design and analyze four novel potential inhibitors of farnesyltransferase (FTase). Evaluation of two proposals regarding their drug potential as well as lead compounds have indicated them as novel promising FTase inhibitors, with theoretically interesting pharmacotherapeutic profiles, when Compared to the very active and most cited FTase inhibitors that have activity data reported, which are launched drugs or compounds in clinical tests. One of our two proposals appears to be a more promising drug candidate and FTase inhibitor, but both derivative molecules indicate potentially very good pharmacotherapeutic profiles in comparison with Tipifarnib and Lonafarnib, two reference pharmaceuticals. Two other proposals have been selected with virtual screening approaches and investigated by LIS, which suggest novel and alternatives scaffolds to design future potential FTase inhibitors. Such compounds can be explored as promising molecules to initiate a research protocol in order to discover novel anticancer drug candidates targeting farnesyltransferase, in the fight against cancer. (C) 2009 Elsevier Inc. All rights reserved.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The cost of spatial join processing can be very high because of the large sizes of spatial objects and the computation-intensive spatial operations. While parallel processing seems a natural solution to this problem, it is not clear how spatial data can be partitioned for this purpose. Various spatial data partitioning methods are examined in this paper. A framework combining the data-partitioning techniques used by most parallel join algorithms in relational databases and the filter-and-refine strategy for spatial operation processing is proposed for parallel spatial join processing. Object duplication caused by multi-assignment in spatial data partitioning can result in extra CPU cost as well as extra communication cost. We find that the key to overcome this problem is to preserve spatial locality in task decomposition. We show in this paper that a near-optimal speedup can be achieved for parallel spatial join processing using our new algorithms.