Digital Library

10 results for Sign Data LMS algorithm.

in the Biblioteca Digital da Produção Intelectual da Universidade de São Paulo

MR-Radix: a multi-relational data mining algorithm

Relevance: 100.00%

Abstract:

Background: The multi-relational approach has emerged as an alternative for analyzing structured data such as relational databases, since it allows data mining to be applied to multiple tables directly, avoiding expensive join operations and semantic losses. This work proposes an algorithm that follows the multi-relational approach. Methods: To compare the performance of the traditional and multi-relational approaches to mining association rules, this paper presents an empirical study of PatriciaMine, a traditional algorithm, and its proposed multi-relational counterpart, MR-Radix. Results: The multi-relational approach showed performance advantages over several tables, because it avoids the high cost of join operations across multiple tables and the associated semantic losses. MR-Radix ran faster than PatriciaMine, despite handling complex multi-relational patterns. Memory usage grew more conservatively for MR-Radix than for PatriciaMine, which indicates that an increase in the demand for frequent items does not cause memory usage to grow significantly in MR-Radix as it does in PatriciaMine. Conclusion: The comparative study of PatriciaMine and MR-Radix confirmed the efficacy of the multi-relational approach in the data mining process, in terms of both execution time and memory usage. In addition, the proposed multi-relational algorithm, unlike other algorithms of this approach, is efficient for use in large relational databases.

A system for classification of time-series data from industrial non-destructive device

Relevance: 100.00%

Abstract:

This work proposes a system for classifying industrial steel pieces by means of a magnetic nondestructive device. The proposed classification system has two main stages: an online stage and an off-line stage. In the online stage, the system classifies inputs and saves misclassification information for later analysis. In the off-line optimization stage, the topology of a Probabilistic Neural Network is optimized by a Feature Selection algorithm combined with the Probabilistic Neural Network to increase the classification rate. The proposed Feature Selection algorithm searches the signal spectrogram by combining three basic elements: a Sequential Forward Selection algorithm, a Feature Cluster Grow algorithm with classification rate gradient analysis, and a Sequential Backward Selection algorithm. In addition, a trash-data recycling algorithm is proposed to obtain the optimal feedback samples selected from the misclassified ones.
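
As a rough illustration of the first of the three elements named in this abstract, the following is a minimal sketch of generic Sequential Forward Selection. It is not the authors' implementation: the function name, the `score` callback (standing in for a classification rate, for example that of a Probabilistic Neural Network), and the stopping rule are all assumptions made for the example.

```python
from typing import Callable, List, Tuple


def sequential_forward_selection(
    n_features: int,
    score: Callable[[List[int]], float],  # hypothetical: higher is better
    max_features: int,
) -> Tuple[List[int], float]:
    """Greedily add the single feature that most improves the score."""
    selected: List[int] = []
    best_score = float("-inf")
    remaining = set(range(n_features))

    while remaining and len(selected) < max_features:
        candidate, candidate_score = None, best_score
        # Try each unused feature and keep the best strict improvement.
        for feature in sorted(remaining):
            trial_score = score(selected + [feature])
            if trial_score > candidate_score:
                candidate, candidate_score = feature, trial_score
        if candidate is None:
            break  # no remaining feature improves the score
        selected.append(candidate)
        remaining.remove(candidate)
        best_score = candidate_score

    return selected, best_score


def toy_score(subset: List[int]) -> float:
    """Toy stand-in for a classification rate: features 1 and 3 are
    informative, and every feature carries a small cost."""
    return len(set(subset) & {1, 3}) - 0.1 * len(subset)
```

Calling `sequential_forward_selection(5, toy_score, 5)` on the toy score selects the two informative features and then stops, because adding any further feature lowers the score. Sequential Backward Selection works the same way in reverse, starting from the full set and removing one feature at a time.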

Robust statistical modeling using the Birnbaum-Saunders-t distribution applied to insurance

Relevance: 100.00%

Abstract:

In this paper, we carry out robust modeling and influence diagnostics in Birnbaum-Saunders (BS) regression models. Specifically, we present some aspects related to BS and log-BS distributions and their generalizations from the Student-t distribution, and develop BS-t regression models, including maximum likelihood estimation based on the EM algorithm and diagnostic tools. In addition, we apply the obtained results to real insurance data, which illustrates the use of the proposed model. Copyright (c) 2011 John Wiley & Sons, Ltd.

Assessing the gain of biological data integration in gene networks inference

Relevance: 30.00%

Abstract:

Background: A current challenge in gene annotation is to define gene function in the context of the network of relationships instead of using single genes. The inference of gene networks (GNs) has emerged as an approach to better understand the biology of the system and to study how several components of this network interact with each other and keep their functions stable. However, in general there is not sufficient data to accurately recover the GNs from their expression levels, leading to the curse of dimensionality, in which the number of variables is higher than the number of samples. One way to mitigate this problem is to integrate biological data instead of using only the expression profiles in the inference process. The use of several types of biological information in inference methods has increased significantly, in order to better recover the connections between genes and reduce false positives. What makes this strategy so interesting is the possibility of confirming known connections through the included biological data, and the possibility of discovering new relationships between genes when the expression data are observed. Although several works in data integration have increased the performance of network inference methods, the real contribution of each type of biological information to the obtained improvement is not clear. Methods: We propose a methodology to include biological information in an inference algorithm in order to assess the prediction gain obtained by using biological information and expression profiles together. We also evaluated and compared the gain of adding four types of biological information: (a) protein-protein interaction, (b) Rosetta stone fusion proteins, (c) KEGG and (d) KEGG+GO. Results and conclusions: This work presents a first comparison of the gain from using prior biological information in the inference of GNs, considering a eukaryotic organism (P. falciparum). Our results indicate that information based on direct interaction can produce a higher improvement in the gain than data about a less specific relationship such as GO or KEGG. Also, as expected, the results show that the use of biological information is a very important approach for improving the inference. We also compared the gain in the inference of the global network and of the hubs only. The results indicate that the use of biological information can improve the identification of the most connected proteins.

Automatic design of decision-tree induction algorithms tailored to flexible-receptor docking data

Relevance: 30.00%

Abstract:

Background: This paper addresses the prediction of the free energy of binding of a drug candidate with the enzyme InhA associated with Mycobacterium tuberculosis. This problem is found within rational drug design, where interactions between drug candidates and target proteins are verified through molecular docking simulations. In this application, it is important not only to correctly predict the free energy of binding, but also to provide a comprehensible model that could be validated by a domain specialist. Decision-tree induction algorithms have been successfully used in drug-design related applications, especially considering that decision trees are simple to understand, interpret, and validate. There are several decision-tree induction algorithms available for general use, but each one has a bias that makes it more suitable for a particular data distribution. In this article, we propose and investigate the automatic design of decision-tree induction algorithms tailored to particular drug-enzyme binding data sets. We investigate the performance of our new method for evaluating binding conformations of different drug candidates to InhA, and we analyze our findings with respect to decision tree accuracy, comprehensibility, and biological relevance. Results: The empirical analysis indicates that our method is capable of automatically generating decision-tree induction algorithms that significantly outperform the traditional C4.5 algorithm with respect to both accuracy and comprehensibility. In addition, we provide the biological interpretation of the rules generated by our approach, reinforcing the importance of comprehensible predictive models in this particular bioinformatics application. Conclusions: We conclude that automatically designing a decision-tree algorithm tailored to molecular docking data is a promising alternative for predicting the free energy of binding of a drug candidate with a flexible receptor.

Management of gallstone disease in children: a new protocol based on the experience of a single center

Relevance: 30.00%

Abstract:

Background/purpose: Gallstones and cholelithiasis are being increasingly diagnosed in children owing to the widespread use of ultrasonography. The treatment of choice is cholecystectomy, and routine intraoperative cholangiography is recommended to explore the common bile duct. The objectives of this study were to describe our experience with the management of gallstone disease in childhood over the last 18 years and to propose an algorithm to guide the approach to cholelithiasis in children based on clinical and ultrasonographic findings. Methods: The data for this study were obtained by reviewing the records of all patients with gallstone disease treated between January 1994 and October 2011. The patients were divided into the following 5 groups based on their symptoms: group 1, asymptomatic; group 2, nonbiliary obstructive symptoms; group 3, acute cholecystitis symptoms; group 4, a history of biliary obstructive symptoms that were completely resolved by the time of surgery; and group 5, ongoing biliary obstructive symptoms. Patients were treated according to an algorithm based on their clinical, ultrasonographic, and endoscopic retrograde cholangiopancreatography (ERCP) findings. Results: A total of 223 patients were diagnosed with cholelithiasis, and comorbidities were present in 177 patients (79.3%). The most common comorbidities were hemolytic disorders in 139 patients (62.3%) and previous bariatric surgery in 16 (7.1%). Although symptoms were present in 134 patients (60.0%), cholecystectomy was performed for all patients with cholelithiasis, even if they were asymptomatic; the surgery was laparoscopic in 204 patients and open in 19. Fifty-six patients (25.1%) presented with complications as the first sign of cholelithiasis (e.g., pancreatitis, choledocholithiasis, or acute calculous cholecystitis). Intraoperative cholangiography was indicated in 15 children, and it was positive in only 1 (0.4%) for whom ERCP was necessary to extract the stone after a laparoscopic cholecystectomy (LC). Preoperative ERCP was performed in 11 patients to extract the stones, and a hepaticojejunostomy was indicated in 2 patients. There were no injuries to the hepatic artery or common bile duct in our series. Conclusions: Based on our experience, we can propose an algorithm to guide the approach to cholelithiasis in the pediatric population. The final conclusion is that LC results in limited postoperative complications in children with gallstones. When a diagnosis of choledocholithiasis or dilation of the choledochus is made, ERCP is necessary if obstructive symptoms persist either before or after an LC. Intraoperative cholangiography and laparoscopic common bile duct exploration are not mandatory. Published by Elsevier Inc.

Automatic aspect discrimination in data clustering

Relevance: 30.00%

Abstract:

The attributes describing a data set may often be arranged in meaningful subsets, each of which corresponds to a different aspect of the data. An unsupervised algorithm (SCAD) that simultaneously performs fuzzy clustering and aspect weighting was proposed in the literature. However, SCAD may fail and halt under certain conditions. To fix this problem, its steps are modified and then reordered to reduce the number of parameters that the user must set. In this paper we prove that each step of the resulting algorithm, named ASCAD, globally minimizes its cost function with respect to the argument being optimized. The asymptotic analysis of ASCAD leads to a time complexity which is the same as that of fuzzy c-means. A hard version of the algorithm and a novel validity criterion that considers aspect weights in order to estimate the number of clusters are also described. The proposed method is assessed over several artificial and real data sets.

Influence diagnostics in heteroscedastic and/or autoregressive nonlinear elliptical models for correlated data

Relevance: 30.00%

Abstract:

In this paper, we propose nonlinear elliptical models for correlated data with heteroscedastic and/or autoregressive structures. Our aim is to extend the models proposed by Russo et al. [22] by considering a more sophisticated scale structure to deal with variations in data dispersion and/or a possible autocorrelation among measurements taken throughout the same experimental unit. Moreover, to avoid the possible influence of outlying observations or to take into account the non-normal symmetric tails of the data, we assume elliptical contours for the joint distribution of random effects and errors, which allows us to attribute different weights to the observations. We propose an iterative algorithm to obtain the maximum-likelihood estimates for the parameters and derive the local influence curvatures for some specific perturbation schemes. The motivation for this work comes from a pharmacokinetic indomethacin data set, which was analysed previously by Bocheng and Xuping [1] under normality.

Constraint-based analysis of gene interactions using restricted boolean networks and time-series data

Relevance: 30.00%

Abstract:

Background: A popular model for gene regulatory networks is the Boolean network model. In this paper, we propose an algorithm to analyze gene regulatory interactions using the Boolean network model and time-series data. The Boolean network is restricted in the sense that only a subset of all possible Boolean functions is considered. We explore some mathematical properties of the restricted Boolean networks in order to avoid a full search. The problem is modeled as a Constraint Satisfaction Problem (CSP) and CSP techniques are used to solve it. Results: We applied the proposed algorithm to two data sets. The first is an artificial data set obtained from a model of the budding yeast cell cycle. The second is derived from experiments performed using HeLa cells. The results show that some interactions can be fully or, at least, partially determined under the Boolean model considered. Conclusions: The proposed algorithm can be used as a first step in the detection of gene/protein interactions. It is able to infer gene relationships from time-series gene expression data, and this inference process can be aided by available a priori knowledge.
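
To make the idea of inferring interactions from time-series data under a Boolean model more concrete, here is a small sketch of a plain consistency check. It is an assumption-laden simplification: it uses brute-force enumeration over unrestricted Boolean functions, not the restricted function class or the CSP techniques the abstract describes, and the function name and data layout are hypothetical.

```python
from itertools import combinations
from typing import List, Sequence, Tuple


def consistent_regulator_sets(
    series: Sequence[Sequence[int]],  # series[t][g] is the 0/1 state of gene g at time t
    target: int,
    max_regulators: int,
) -> List[Tuple[int, ...]]:
    """Return regulator sets for which SOME Boolean function explains the
    target gene's next state at every observed transition."""
    n_genes = len(series[0])
    consistent = []
    for k in range(1, max_regulators + 1):
        for regulators in combinations(range(n_genes), k):
            seen = {}  # regulator input pattern -> observed next target value
            ok = True
            for now, nxt in zip(series, series[1:]):
                pattern = tuple(now[g] for g in regulators)
                output = nxt[target]
                # The same inputs must never lead to two different outputs.
                if seen.setdefault(pattern, output) != output:
                    ok = False
                    break
            if ok:
                consistent.append(regulators)
    return consistent


# Toy series in which gene 2 at time t+1 equals (gene 0 AND gene 1) at time t.
toy_series = [
    (0, 0, 0),
    (0, 1, 0),
    (1, 0, 0),
    (1, 1, 0),
    (0, 0, 1),
]
```

On the toy series, `consistent_regulator_sets(toy_series, target=2, max_regulators=2)` keeps only the pair of genes 0 and 1. Restricting the admissible Boolean functions, as the abstract describes, would prune this search further.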

Programmable logic design of a compact Genetic Algorithm for phasor estimation in real-time

Relevance: 30.00%

Abstract:

The main objective of this work is to present an efficient method for phasor estimation based on a compact Genetic Algorithm (cGA) implemented in a Field Programmable Gate Array (FPGA). To validate the proposed method, an Electrical Power System (EPS) simulated by the Alternative Transients Program (ATP) provides data to be used by the cGA. This data is as close as possible to the actual data provided by the EPS. Real-life situations such as islanding, sudden load increase and permanent faults were considered. The implementation aims to take advantage of the inherent parallelism in Genetic Algorithms in a compact and optimized way, making them an attractive option for practical applications in real-time estimations concerning Phasor Measurement Units (PMUs).
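
For readers unfamiliar with the compact Genetic Algorithm, the sketch below shows the textbook software form of the cGA: a probability vector replaces the population and is nudged toward the winner of each pairwise competition. It says nothing about the FPGA design or the phasor-estimation fitness used in this work; the function name, parameters, and bit-string encoding are assumptions for illustration.

```python
import random
from typing import Callable, List, Tuple


def compact_ga(
    n_bits: int,
    fitness: Callable[[List[int]], float],  # hypothetical: higher is better
    pop_size: int = 50,     # simulated population size; update step is 1/pop_size
    max_iters: int = 20000,
    seed: int = 0,
) -> Tuple[List[int], List[float]]:
    """Evolve a probability vector instead of an explicit population."""
    rng = random.Random(seed)
    probs = [0.5] * n_bits
    step = 1.0 / pop_size

    for _ in range(max_iters):
        # Sample two individuals from the current probability vector.
        a = [1 if rng.random() < p else 0 for p in probs]
        b = [1 if rng.random() < p else 0 for p in probs]
        winner, loser = (a, b) if fitness(a) >= fitness(b) else (b, a)

        # Shift each probability toward the winner where the two differ.
        for i in range(n_bits):
            if winner[i] != loser[i]:
                delta = step if winner[i] == 1 else -step
                probs[i] = min(1.0, max(0.0, probs[i] + delta))

        # Stop once every probability has saturated at 0 or 1.
        if all(p in (0.0, 1.0) for p in probs):
            break

    best = [1 if p >= 0.5 else 0 for p in probs]
    return best, probs
```

A quick way to try it is `compact_ga(16, sum)`, which uses the number of ones in the bit string as a toy fitness. Because only the probability vector is stored, the memory footprint is small, which is one reason the cGA is often considered for hardware implementations.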
