Biblioteca Digital

883 resultados para non separable data

Calculation of fractional derivatives of noisy data with genetic algorithms

Relevância:

90.00% 90.00%

Publicador:

Resumo:

This paper addresses the calculation of derivatives of fractional order for non-smooth data. The noise is avoided by adopting an optimization formulation using genetic algorithms (GA). Given the flexibility of the evolutionary schemes, a hierarchical GA composed by a series of two GAs, each one with a distinct fitness function, is established.

Statistical learning theory for geospatial data. Case study: Aral sea

Relevância:

90.00% 90.00%

Publicador:

Resumo:

In recent years there has been an explosive growth in the development of adaptive and data driven methods. One of the efficient and data-driven approaches is based on statistical learning theory (Vapnik 1998). The theory is based on Structural Risk Minimisation (SRM) principle and has a solid statistical background. When applying SRM we are trying not only to reduce training error ? to fit the available data with a model, but also to reduce the complexity of the model and to reduce generalisation error. Many nonlinear learning procedures recently developed in neural networks and statistics can be understood and interpreted in terms of the structural risk minimisation inductive principle. A recent methodology based on SRM is called Support Vector Machines (SVM). At present SLT is still under intensive development and SVM find new areas of application (www.kernel-machines.org). SVM develop robust and non linear data models with excellent generalisation abilities that is very important both for monitoring and forecasting. SVM are extremely good when input space is high dimensional and training data set i not big enough to develop corresponding nonlinear model. Moreover, SVM use only support vectors to derive decision boundaries. It opens a way to sampling optimization, estimation of noise in data, quantification of data redundancy etc. Presentation of SVM for spatially distributed data is given in (Kanevski and Maignan 2004).

Webdatanet : Innovation and quality in web-based data collection

Relevância:

90.00% 90.00%

Publicador:

Resumo:

The article discusses the development of WEBDATANET established in 2011 which aims to create a multidisciplinary network of web-based data collection experts in Europe. Topics include the presence of 190 experts in 30 European countries and abroad, the establishment of web-based teaching and discussion platforms and working groups and task forces. Also discussed is the scope of the research carried by WEBDATANET. In light of the growing importance of web-based data in the social and behavioral sciences, WEBDATANET was established in 2011 as a COST Action (IS 1004) to create a multidisciplinary network of web-based data collection experts: (web) survey methodologists, psychologists, sociologists, linguists, economists, Internet scientists, media and public opinion researchers. The aim was to accumulate and synthesize knowledge regarding methodological issues of web-based data collection (surveys, experiments, tests, non-reactive data, and mobile Internet research), and foster its scientific usage in a broader community.

Asymmetric Smiles, Leverage Effects and Structural Parameters.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

In this paper, we characterize the asymmetries of the smile through multiple leverage effects in a stochastic dynamic asset pricing framework. The dependence between price movements and future volatility is introduced through a set of latent state variables. These latent variables can capture not only the volatility risk and the interest rate risk which potentially affect option prices, but also any kind of correlation risk and jump risk. The standard financial leverage effect is produced by a cross-correlation effect between the state variables which enter into the stochastic volatility process of the stock price and the stock price process itself. However, we provide a more general framework where asymmetric implied volatility curves result from any source of instantaneous correlation between the state variables and either the return on the stock or the stochastic discount factor. In order to draw the shapes of the implied volatility curves generated by a model with latent variables, we specify an equilibrium-based stochastic discount factor with time non-separable preferences. When we calibrate this model to empirically reasonable values of the parameters, we are able to reproduce the various types of implied volatility curves inferred from option market data.

Optimal control of a class of discrete–continuous non-linear systems — decomposition and hierarchical structure

Relevância:

90.00% 90.00%

Publicador:

Resumo:

A technique is derived for solving a non-linear optimal control problem by iterating on a sequence of simplified problems in linear quadratic form. The technique is designed to achieve the correct solution of the original non-linear optimal control problem in spite of these simplifications. A mixed approach with a discrete performance index and continuous state variable system description is used as the basis of the design, and it is shown how the global problem can be decomposed into local sub-system problems and a co-ordinator within a hierarchical framework. An analysis of the optimality and convergence properties of the algorithm is presented and the effectiveness of the technique is demonstrated using a simulation example with a non-separable performance index.

Dynamic group communication for large-scale parallel data mining

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Exascale systems are the next frontier in high-performance computing and are expected to deliver a performance of the order of 10^18 operations per second using massive multicore processors. Very large- and extreme-scale parallel systems pose critical algorithmic challenges, especially related to concurrency, locality and the need to avoid global communication patterns. This work investigates a novel protocol for dynamic group communication that can be used to remove the global communication requirement and to reduce the communication cost in parallel formulations of iterative data mining algorithms. The protocol is used to provide a communication-efficient parallel formulation of the k-means algorithm for cluster analysis. The approach is based on a collective communication operation for dynamic groups of processes and exploits non-uniform data distributions. Non-uniform data distributions can be either found in real-world distributed applications or induced by means of multidimensional binary search trees. The analysis of the proposed dynamic group communication protocol has shown that it does not introduce significant communication overhead. The parallel clustering algorithm has also been extended to accommodate an approximation error, which allows a further reduction of the communication costs. The effectiveness of the exact and approximate methods has been tested in a parallel computing system with 64 processors and in simulations with 1024 processing elements.

A Framework for Exploring Multidimensional Data with 3D Projections

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Visualization of high-dimensional data requires a mapping to a visual space. Whenever the goal is to preserve similarity relations a frequent strategy is to use 2D projections, which afford intuitive interactive exploration, e. g., by users locating and selecting groups and gradually drilling down to individual objects. In this paper, we propose a framework for projecting high-dimensional data to 3D visual spaces, based on a generalization of the Least-Square Projection (LSP). We compare projections to 2D and 3D visual spaces both quantitatively and through a user study considering certain exploration tasks. The quantitative analysis confirms that 3D projections outperform 2D projections in terms of precision. The user study indicates that certain tasks can be more reliably and confidently answered with 3D projections. Nonetheless, as 3D projections are displayed on 2D screens, interaction is more difficult. Therefore, we incorporate suitable interaction functionalities into a framework that supports 3D transformations, predefined optimal 2D views, coordinated 2D and 3D views, and hierarchical 3D cluster definition and exploration. For visually encoding data clusters in a 3D setup, we employ color coding of projected data points as well as four types of surface renderings. A second user study evaluates the suitability of these visual encodings. Several examples illustrate the framework`s applicability for both visual exploration of multidimensional abstract (non-spatial) data as well as the feature space of multi-variate spatial data.

Estimation methods of non-additive effects for characteristics of weight and scrotal circumference in crossbred beef cattle

Relevância:

90.00% 90.00%

Publicador:

Resumo:

The objective of this study was to investigate, in a population of crossbred cattle, the obtainment of the non-additive genetic effects for the characteristics weight at 205 and 390 days and scrotal circumference, and to evaluate the consideration of these effects in the prediction of breeding values of sires using different estimation methodologies. In method 1, the data were pre-adjusted for the non-additive effects obtained by least squares means method in a model that considered the direct additive, maternal and non-additive fixed genetic effects, the direct and total maternal heterozygosities, and epistasis. In method 2, the non-additive effects were considered covariates in genetic model. Genetic values for adjusted and non-adjusted data were predicted considering additive direct and maternal effects, and for weight at 205 days, also the permanent environmental effect, as random effects in the model. The breeding values of the categories of sires considered for the weight characteristic at 205 days were organized in files, in order to verify alterations in the magnitude of the predictions and ranking of animals in the two methods of correction data for the non-additives effects. The non-additive effects were not similar in magnitude and direction in the two estimation methods used, nor for the characteristics evaluated. Pearson and Spearman correlations between breeding values were higher than 0.94, and the use of different methods does not imply changes in the selection of animals.

On the Existence and Characterization of Markovian Equilibrium in Models with Simple Non-Paternalistic Altruism

Relevância:

90.00% 90.00%

Publicador:

Resumo:

This paper provides sufficient conditions for existence of Markovian equilibrium in models with non-paternalistic altruism extending to one generation ahead. When utility is non-separable, we show that each equilibrium savings policy correspondence is increasing everywhere and single-valued, except perhaps on a countable number of points. It is also upper hemi-continuous where it is single valued. When utility is separable, we show that the equilibrium is unique, increasing, and continuous, and we provide an algorithm converging uniformly to the equilibrium.

DEVELOPMENT OF NOVEL METHODS TO MINIMIZE THE IMPACT OF SEQUENCING ERRORS IN THE NEXT-GENERATION SEQUENCING DATA ANALYSIS

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Next-generation sequencing (NGS) technology has become a prominent tool in biological and biomedical research. However, NGS data analysis, such as de novo assembly, mapping and variants detection is far from maturity, and the high sequencing error-rate is one of the major problems. . To minimize the impact of sequencing errors, we developed a highly robust and efficient method, MTM, to correct the errors in NGS reads. We demonstrated the effectiveness of MTM on both single-cell data with highly non-uniform coverage and normal data with uniformly high coverage, reflecting that MTM’s performance does not rely on the coverage of the sequencing reads. MTM was also compared with Hammer and Quake, the best methods for correcting non-uniform and uniform data respectively. For non-uniform data, MTM outperformed both Hammer and Quake. For uniform data, MTM showed better performance than Quake and comparable results to Hammer. By making better error correction with MTM, the quality of downstream analysis, such as mapping and SNP detection, was improved. SNP calling is a major application of NGS technologies. However, the existence of sequencing errors complicates this process, especially for the low coverage (

High-speed UDP Data Transmission with Multithreading and Automatic Resource Allocation

Relevância:

90.00% 90.00%

Publicador:

Resumo:

In this paper a utilization of the high data-rates channels by threading of sending and receiving is studied. As a communication technology evolves the higher speeds are used more and more in various applications. But generating traffic with Gbps data-rates also brings some complications. Especially if UDP protocol is used and it is necessary to avoid packet fragmentation, for example for high-speed reliable transport protocols based on UDP. For such situation the Ethernet network packet size has to correspond to standard 1500 bytes MTU[1], which is widely used in the Internet. System may not has enough capacity to send messages with necessary rate in a single-threaded mode. A possible solution is to use more threads. It can be efficient on widespread multicore systems. Also the fact that in real network non-constant data flow can be expected brings another object of study –- an automatic adaptation to the traffic which is changing during runtime. Cases investigated in this paper include adjusting number of threads to a given speed and keeping speed on a given rate when CPU gets heavily loaded by other processes while sending data.

Negative data in DEA:a directional distance approach applied to bank branches

Relevância:

90.00% 90.00%

Publicador:

Resumo:

This paper is drawn from the use of data envelopment analysis (DEA) in helping a Portuguese bank to manage the performance of its branches. The bank wanted to set targets for the branches on such variables as growth in number of clients, growth in funds deposited and so on. Such variables can take positive and negative values but apart from some exceptions, traditional DEA models have hitherto been restricted to non-negative data. We report on the development of a model to handle unrestricted data in a DEA framework and illustrate the use of this model on data from the bank concerned.

Novel visualization methods for protein data

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Visualization of high-dimensional data has always been a challenging task. Here we discuss and propose variants of non-linear data projection methods (Generative Topographic Mapping (GTM) and GTM with simultaneous feature saliency (GTM-FS)) that are adapted to be effective on very high-dimensional data. The adaptations use log space values at certain steps of the Expectation Maximization (EM) algorithm and during the visualization process. We have tested the proposed algorithms by visualizing electrostatic potential data for Major Histocompatibility Complex (MHC) class-I proteins. The experiments show that the variation in the original version of GTM and GTM-FS worked successfully with data of more than 2000 dimensions and we compare the results with other linear/nonlinear projection methods: Principal Component Analysis (PCA), Neuroscale (NSC) and Gaussian Process Latent Variable Model (GPLVM).

Using interior point algorithms for the solution of linear programs with special structural features

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Linear Programming (LP) is a powerful decision making tool extensively used in various economic and engineering activities. In the early stages the success of LP was mainly due to the efficiency of the simplex method. After the appearance of Karmarkar's paper, the focus of most research was shifted to the field of interior point methods. The present work is concerned with investigating and efficiently implementing the latest techniques in this field taking sparsity into account. The performance of these implementations on different classes of LP problems is reported here. The preconditional conjugate gradient method is one of the most powerful tools for the solution of the least square problem, present in every iteration of all interior point methods. The effect of using different preconditioners on a range of problems with various condition numbers is presented. Decomposition algorithms has been one of the main fields of research in linear programming over the last few years. After reviewing the latest decomposition techniques, three promising methods were chosen the implemented. Sparsity is again a consideration and suggestions have been included to allow improvements when solving problems with these methods. Finally, experimental results on randomly generated data are reported and compared with an interior point method. The efficient implementation of the decomposition methods considered in this study requires the solution of quadratic subproblems. A review of recent work on algorithms for convex quadratic was performed. The most promising algorithms are discussed and implemented taking sparsity into account. The related performance of these algorithms on randomly generated separable and non-separable problems is also reported.

Practical methods of tracking of nonstationary time series applied to real-world data

Relevância:

90.00% 90.00%

Publicador:

Resumo:

In this paper, we discuss some practical implications for implementing adaptable network algorithms applied to non-stationary time series problems. Two real world data sets, containing electricity load demands and foreign exchange market prices, are used to test several different methods, ranging from linear models with fixed parameters, to non-linear models which adapt both parameters and model order on-line. Training with the extended Kalman filter, we demonstrate that the dynamic model-order increment procedure of the resource allocating RBF network (RAN) is highly sensitive to the parameters of the novelty criterion. We investigate the use of system noise for increasing the plasticity of the Kalman filter training algorithm, and discuss the consequences for on-line model order selection. The results of our experiments show that there are advantages to be gained in tracking real world non-stationary data through the use of more complex adaptive models.

«
1
2
3
4
5
6
7
8
...
58
59
»