992 resultados para 0802 Computation Theory and Mathematics


Relevância:

100.00% 100.00%

Publicador:

Resumo:

H. Simon and B. Szörényi have found an error in the proof of Theorem 52 of “Shifting: One-inclusion mistake bounds and sample compression”, Rubinstein et al. (2009). In this note we provide a corrected proof of a slightly weakened version of this theorem. Our new bound on the density of one-inclusion hypergraphs is again in terms of the capacity of the multilabel concept class. Simon and Szörényi have recently proved an alternate result in Simon and Szörényi (2009).

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Background The majority of peptide bonds in proteins are found to occur in the trans conformation. However, for proline residues, a considerable fraction of Prolyl peptide bonds adopt the cis form. Proline cis/trans isomerization is known to play a critical role in protein folding, splicing, cell signaling and transmembrane active transport. Accurate prediction of proline cis/trans isomerization in proteins would have many important applications towards the understanding of protein structure and function. Results In this paper, we propose a new approach to predict the proline cis/trans isomerization in proteins using support vector machine (SVM). The preliminary results indicated that using Radial Basis Function (RBF) kernels could lead to better prediction performance than that of polynomial and linear kernel functions. We used single sequence information of different local window sizes, amino acid compositions of different local sequences, multiple sequence alignment obtained from PSI-BLAST and the secondary structure information predicted by PSIPRED. We explored these different sequence encoding schemes in order to investigate their effects on the prediction performance. The training and testing of this approach was performed on a newly enlarged dataset of 2424 non-homologous proteins determined by X-Ray diffraction method using 5-fold cross-validation. Selecting the window size 11 provided the best performance for determining the proline cis/trans isomerization based on the single amino acid sequence. It was found that using multiple sequence alignments in the form of PSI-BLAST profiles could significantly improve the prediction performance, the prediction accuracy increased from 62.8% with single sequence to 69.8% and Matthews Correlation Coefficient (MCC) improved from 0.26 with single local sequence to 0.40. Furthermore, if coupled with the predicted secondary structure information by PSIPRED, our method yielded a prediction accuracy of 71.5% and MCC of 0.43, 9% and 0.17 higher than the accuracy achieved based on the singe sequence information, respectively. Conclusion A new method has been developed to predict the proline cis/trans isomerization in proteins based on support vector machine, which used the single amino acid sequence with different local window sizes, the amino acid compositions of local sequence flanking centered proline residues, the position-specific scoring matrices (PSSMs) extracted by PSI-BLAST and the predicted secondary structures generated by PSIPRED. The successful application of SVM approach in this study reinforced that SVM is a powerful tool in predicting proline cis/trans isomerization in proteins and biological sequence analysis.

Relevância:

100.00% 100.00%

Publicador:

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Premature convergence to local optimal solutions is one of the main difficulties when using evolutionary algorithms in real-world optimization problems. To prevent premature convergence and degeneration phenomenon, this paper proposes a new optimization computation approach, human-simulated immune evolutionary algorithm (HSIEA). Considering that the premature convergence problem is due to the lack of diversity in the population, the HSIEA employs the clonal selection principle of artificial immune system theory to preserve the diversity of solutions for the search process. Mathematical descriptions and procedures of the HSIEA are given, and four new evolutionary operators are formulated which are clone, variation, recombination, and selection. Two benchmark optimization functions are investigated to demonstrate the effectiveness of the proposed HSIEA.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper presents a combined structure for using real, complex, and binary valued vectors for semantic representation. The theory, implementation, and application of this structure are all significant. For the theory underlying quantum interaction, it is important to develop a core set of mathematical operators that describe systems of information, just as core mathematical operators in quantum mechanics are used to describe the behavior of physical systems. The system described in this paper enables us to compare more traditional quantum mechanical models (which use complex state vectors), alongside more generalized quantum models that use real and binary vectors. The implementation of such a system presents fundamental computational challenges. For large and sometimes sparse datasets, the demands on time and space are different for real, complex, and binary vectors. To accommodate these demands, the Semantic Vectors package has been carefully adapted and can now switch between different number types comparatively seamlessly. This paper describes the key abstract operations in our semantic vector models, and describes the implementations for real, complex, and binary vectors. We also discuss some of the key questions that arise in the field of quantum interaction and informatics, explaining how the wide availability of modelling options for different number fields will help to investigate some of these questions.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This work identifies the limitations of n-way data analysis techniques in multidimensional stream data, such as Internet chat room communications data, and establishes a link between data collection and performance of these techniques. Its contributions are twofold. First, it extends data analysis to multiple dimensions by constructing n-way data arrays known as high order tensors. Chat room tensors are generated by a simulator which collects and models actual communication data. The accuracy of the model is determined by the Kolmogorov-Smirnov goodness-of-fit test which compares the simulation data with the observed (real) data. Second, a detailed computational comparison is performed to test several data analysis techniques including svd [1], and multi-way techniques including Tucker1, Tucker3 [2], and Parafac [3].

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This work investigates the accuracy and efficiency tradeoffs between centralized and collective (distributed) algorithms for (i) sampling, and (ii) n-way data analysis techniques in multidimensional stream data, such as Internet chatroom communications. Its contributions are threefold. First, we use the Kolmogorov-Smirnov goodness-of-fit test to show that statistical differences between real data obtained by collective sampling in time dimension from multiple servers and that of obtained from a single server are insignificant. Second, we show using the real data that collective data analysis of 3-way data arrays (users x keywords x time) known as high order tensors is more efficient than centralized algorithms with respect to both space and computational cost. Furthermore, we show that this gain is obtained without loss of accuracy. Third, we examine the sensitivity of collective constructions and analysis of high order data tensors to the choice of server selection and sampling window size. We construct 4-way tensors (users x keywords x time x servers) and analyze them to show the impact of server and window size selections on the results.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The generation of a correlation matrix from a large set of long gene sequences is a common requirement in many bioinformatics problems such as phylogenetic analysis. The generation is not only computationally intensive but also requires significant memory resources as, typically, few gene sequences can be simultaneously stored in primary memory. The standard practice in such computation is to use frequent input/output (I/O) operations. Therefore, minimizing the number of these operations will yield much faster run-times. This paper develops an approach for the faster and scalable computing of large-size correlation matrices through the full use of available memory and a reduced number of I/O operations. The approach is scalable in the sense that the same algorithms can be executed on different computing platforms with different amounts of memory and can be applied to different problems with different correlation matrix sizes. The significant performance improvement of the approach over the existing approaches is demonstrated through benchmark examples.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Numeric sets can be used to store and distribute important information such as currency exchange rates and stock forecasts. It is useful to watermark such data for proving ownership in case of illegal distribution by someone. This paper analyzes the numerical set watermarking model presented by Sion et. al in “On watermarking numeric sets”, identifies it’s weaknesses, and proposes a novel scheme that overcomes these problems. One of the weaknesses of Sion’s watermarking scheme is the requirement to have a normally-distributed set, which is not true for many numeric sets such as forecast figures. Experiments indicate that the scheme is also susceptible to subset addition and secondary watermarking attacks. The watermarking model we propose can be used for numeric sets with arbitrary distribution. Theoretical analysis and experimental results show that the scheme is strongly resilient against sorting, subset selection, subset addition, distortion, and secondary watermarking attacks.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Diffusion equations that use time fractional derivatives are attractive because they describe a wealth of problems involving non-Markovian Random walks. The time fractional diffusion equation (TFDE) is obtained from the standard diffusion equation by replacing the first-order time derivative with a fractional derivative of order α ∈ (0, 1). Developing numerical methods for solving fractional partial differential equations is a new research field and the theoretical analysis of the numerical methods associated with them is not fully developed. In this paper an explicit conservative difference approximation (ECDA) for TFDE is proposed. We give a detailed analysis for this ECDA and generate discrete models of random walk suitable for simulating random variables whose spatial probability density evolves in time according to this fractional diffusion equation. The stability and convergence of the ECDA for TFDE in a bounded domain are discussed. Finally, some numerical examples are presented to show the application of the present technique.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

A planar polynomial differential system has a finite number of limit cycles. However, finding the upper bound of the number of limit cycles is an open problem for the general nonlinear dynamical systems. In this paper, we investigated a class of Liénard systems of the form x'=y, y'=f(x)+y g(x) with deg f=5 and deg g=4. We proved that the related elliptic integrals of the Liénard systems have at most three zeros including multiple zeros, which implies that the number of limit cycles bifurcated from the periodic orbits of the unperturbed system is less than or equal to 3.