919 results for VLE data sets
Abstract:
The Morse-Smale complex is a useful topological data structure for the analysis and visualization of scalar data. This paper describes an algorithm that processes all mesh elements of the domain in parallel to compute the Morse-Smale complex of large two-dimensional data sets at interactive speeds. We employ a reformulation of the Morse-Smale complex using Forman's Discrete Morse Theory and achieve scalability by computing the discrete gradient using local accesses only. We also introduce a novel approach to merging gradient paths that ensures accurate geometry of the computed complex. We demonstrate that our algorithm performs well both in multicore environments and on massively parallel architectures such as the GPU.
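To illustrate why a discrete gradient can be built from local accesses only, here is a minimal, sequential Python sketch of a steepest-descent pairing on a 2D grid. It is only a toy stand-in, not the paper's Discrete Morse Theory construction; the function and array names are hypothetical.

```python
import numpy as np

def steepest_descent_pairing(field):
    """Toy discrete-gradient sketch: pair each grid vertex with its lowest
    strictly-lower 4-neighbour; vertices with no lower neighbour are local
    minima (critical). Only an illustration of 'local accesses only'."""
    rows, cols = field.shape
    pairing = {}                     # vertex -> paired neighbour (gradient arrow)
    critical = []                    # local minima
    for i in range(rows):
        for j in range(cols):
            nbrs = [(i + di, j + dj)
                    for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1))
                    if 0 <= i + di < rows and 0 <= j + dj < cols]
            lower = [n for n in nbrs if field[n] < field[i, j]]
            if lower:
                pairing[(i, j)] = min(lower, key=lambda n: field[n])
            else:
                critical.append((i, j))
    return pairing, critical

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    arrows, minima = steepest_descent_pairing(rng.random((64, 64)))
    print(len(arrows), "gradient arrows,", len(minima), "local minima")
```

Because each vertex inspects only its own neighbourhood, the loop body could be executed independently per vertex, which is the kind of locality the paper exploits for parallelism.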
Abstract:
Long-distance dispersal (LDD) events, although rare for most plant species, can strongly influence population and community dynamics. Animals function as a key biotic vector of seeds and thus, a mechanistic and quantitative understanding of how individual animal behaviors scale to dispersal patterns at different spatial scales is a question of critical importance from both basic and applied perspectives. Using a diffusion-theory based analytical approach for a wide range of animal movement and seed transportation patterns, we show that the scale (a measure of local dispersal) of the seed dispersal kernel increases with the organisms' rate of movement and mean seed retention time. We reveal that variation in seed retention time is a key determinant of various measures of LDD such as the kurtosis (or shape) of the kernel, the thickness of its tails and the absolute number of seeds falling beyond a threshold distance. Using empirical data sets of frugivores, we illustrate the importance of variability in retention times for predicting the key disperser species that influence LDD. Our study makes testable predictions linking animal movement behaviors and gut retention times to dispersal patterns and, more generally, highlights the potential importance of animal behavioral variability for the LDD of seeds.
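As a rough illustration of how movement rate and retention time shape a dispersal kernel, the following Monte Carlo sketch assumes 2D diffusion with coefficient D and gamma-distributed gut-retention times. It is not the paper's analytical treatment, and all parameter values are hypothetical.

```python
import numpy as np
from scipy import stats

def simulate_dispersal(n_seeds=100_000, D=0.5, mean_T=2.0, cv_T=1.0, seed=0):
    """Monte Carlo sketch: an animal diffuses in 2D with diffusion coefficient
    D (km^2/h) and drops each seed after a gamma-distributed retention time
    with mean mean_T (h) and coefficient of variation cv_T."""
    rng = np.random.default_rng(seed)
    shape = 1.0 / cv_T ** 2
    T = rng.gamma(shape, mean_T / shape, size=n_seeds)   # retention times
    sigma = np.sqrt(2.0 * D * T)                          # per-axis spread after time T
    x, y = rng.normal(0.0, sigma), rng.normal(0.0, sigma)
    return x, np.hypot(x, y)

x, r = simulate_dispersal()
print("kernel scale (median distance, km):", np.median(r))
print("excess kurtosis of displacements:", stats.kurtosis(x))
print("fraction of seeds beyond 5 km:", (r > 5.0).mean())
```

Increasing cv_T fattens the tail (higher kurtosis and more seeds beyond the threshold) even when the mean retention time, and hence the kernel scale, stays fixed, which mirrors the qualitative claim of the abstract.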
Abstract:
1. Dispersal ability of a species is a key ecological characteristic, affecting a range of processes from adaptation, community dynamics and genetic structure, to distribution and range size. It is determined by both intrinsic species traits and extrinsic landscape-related properties. 2. Using butterflies as a model system, the following questions were addressed: (i) given similar extrinsic factors, which intrinsic species trait(s) explain dispersal ability? (ii) can one of these traits be used as a proxy for dispersal ability? (iii) how do interactions between the traits, and phylogenetic relatedness, affect dispersal ability? 3. Four data sets, using different measures of dispersal, were compiled from the published literature. The first data set uses mean dispersal distances from capture-mark-recapture studies, and the other three use mobility indices. Data were collected for six traits that can potentially affect dispersal ability: wingspan, larval host plant specificity, adult habitat specificity, mate location strategy, voltinism and flight period duration. Each data set was subjected to both unifactorial and multifactorial phylogenetically controlled analyses. 4. Among the factors considered, wingspan was the most important determinant of dispersal ability, although the predictive power of the regression models was low. Voltinism and flight period duration also affect dispersal ability, especially in the case of temperate species. Interactions between the factors did not affect dispersal ability, and phylogenetic relatedness was significant in one data set. 5. While using wingspan as the sole proxy for dispersal ability may be problematic, it is usually the only easily accessible species-specific trait for a large number of species. It can thus be a satisfactory proxy when carefully interpreted, especially for analyses involving many species from across the world.
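For readers unfamiliar with this kind of trait-based analysis, here is a minimal ordinary least-squares sketch regressing dispersal distance on wingspan and flight-period duration. The trait values are synthetic, the model is purely illustrative, and the study's phylogenetically controlled methods are not reproduced.

```python
import numpy as np

# Hypothetical illustration only: OLS regression of mean dispersal distance on
# wingspan and flight-period duration, using synthetic trait values.
rng = np.random.default_rng(42)
n_species = 40
wingspan = rng.uniform(25.0, 75.0, n_species)        # mm, synthetic
flight_period = rng.uniform(1.0, 6.0, n_species)     # months, synthetic
dispersal = 0.03 * wingspan + 0.2 * flight_period + rng.normal(0.0, 0.4, n_species)

X = np.column_stack([np.ones(n_species), wingspan, flight_period])
coef, *_ = np.linalg.lstsq(X, dispersal, rcond=None)
pred = X @ coef
r2 = 1.0 - np.sum((dispersal - pred) ** 2) / np.sum((dispersal - dispersal.mean()) ** 2)
print("intercept, wingspan slope, flight-period slope:", np.round(coef, 3))
print("R^2:", round(r2, 3))
```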
Abstract:
In this paper, we explore a novel idea of using high dynamic range (HDR) technology for uncertainty visualization. We focus on scalar volumetric data sets where every data point is associated with scalar uncertainty. We design a transfer function that maps each data point to a color in HDR space. The luminance component of the color is exploited to capture uncertainty. We modify existing tone mapping techniques and suitably integrate them with volume ray casting to obtain a low dynamic range (LDR) image. The resulting image is displayed on a conventional 8-bits-per-channel display device. The use of HDR mapping reveals fine details in the uncertainty distribution and enables users to interactively study the data in the context of the corresponding uncertainty information. We demonstrate the utility of our method and evaluate the results using data sets from ocean modeling.
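A minimal sketch of the general idea follows: data value sets the colour, uncertainty modulates luminance in an HDR range, and a global Reinhard operator maps the result back to 8-bit LDR. The viridis colormap, the exposure factor and the L/(1+L) operator are my own illustrative choices, not the paper's transfer function or its ray-casting integration.

```python
import numpy as np
from matplotlib import cm

def uncertainty_to_ldr(values, uncertainty, exposure=4.0):
    """Sketch: hue from the data value, HDR luminance from certainty,
    then a global Reinhard tone-mapping step to an 8-bit image."""
    v = (values - values.min()) / np.ptp(values)          # normalise data values
    u = np.clip(uncertainty, 0.0, 1.0)
    rgb = cm.viridis(v)[..., :3]                          # colour encodes data value
    hdr = rgb * (exposure * (1.0 - u))[..., None]         # bright = certain
    ldr = hdr / (1.0 + hdr)                               # Reinhard: L / (1 + L)
    return (255 * ldr).astype(np.uint8)

vals = np.random.rand(128, 128)
unc = np.random.rand(128, 128)
print(uncertainty_to_ldr(vals, unc).shape)                # (128, 128, 3) LDR image
```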
Intelligent Approach for Fault Diagnosis in Power Transmission Systems Using Support Vector Machines
Abstract:
This paper presents an approach for identifying the faulted line section and fault location on transmission systems using support vector machines (SVMs) for diagnosis/post-fault analysis purposes. Power system disturbances are often caused by faults on transmission lines. When a fault occurs on a transmission system, the protective relay detects the fault and initiates the tripping operation, which isolates the affected part from the rest of the power system. Based on the fault section identified, rapid and corrective restoration procedures can then be taken to minimize the power interruption and limit the impact of the outage on the system. The approach is particularly important for post-fault diagnosis of any mal-operation of relays following a disturbance in a neighboring line connected to the same substation. This may help in improving the fault monitoring/diagnosis process, thus assuring secure operation of the power system. In this paper we compare SVMs with radial basis function neural networks (RBFNN) on data sets corresponding to different faults on a transmission system. Classification and regression accuracy is reported for both strategies. Studies on a practical 24-bus equivalent EHV transmission system of the Indian Southern region are presented, indicating the improved generalization of the large-margin classifiers and the enhanced efficacy of the chosen model.
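A minimal scikit-learn sketch of the SVM side of such a comparison is shown below, using synthetic data. The features, class count and hyperparameters are hypothetical (real inputs would be measured phase voltages and currents), and the RBFNN baseline and the 24-bus test system are not reproduced.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Illustrative sketch only: classify synthetic "fault section" examples with an
# RBF-kernel SVM; four hypothetical fault classes, twelve hypothetical features.
X, y = make_classification(n_samples=2000, n_features=12, n_informative=8,
                           n_classes=4, n_clusters_per_class=1, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

clf = SVC(kernel="rbf", C=10.0, gamma="scale").fit(X_tr, y_tr)
print("SVM test accuracy:", clf.score(X_te, y_te))
```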
Abstract:
Fragment Finder 2.0 is a web-based interactive computing server which can be used to retrieve structurally similar protein fragments from 25% and 90% nonredundant data sets. The computing server identifies structurally similar fragments using the protein backbone C-alpha angles. In addition, the identified fragments can be superimposed using either of the two structural superposition programs, STAMP and PROFIT, provided on the server. The freely available Java plug-in Jmol has been interfaced with the server for the visualization of the query and superposed fragments. The server is an updated version of a previously developed search engine and employs an in-house-developed fast pattern matching algorithm. This server can be accessed freely over the World Wide Web through the URL http://cluster.physics.iisc.ernet.in/ff/.
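To make the idea of angle-based fragment retrieval concrete, here is a toy sliding-window match over backbone angle sequences. It is not the server's in-house pattern matching algorithm, and the tolerance, data and names are hypothetical.

```python
import numpy as np

def find_similar_fragments(query_angles, database, tol=120.0):
    """Toy sketch: slide the query's C-alpha angle sequence over each database
    entry and report windows whose RMS angular deviation is below tol degrees."""
    hits = []
    q = np.asarray(query_angles, dtype=float)
    for name, angles in database.items():
        a = np.asarray(angles, dtype=float)
        for start in range(len(a) - len(q) + 1):
            diff = (a[start:start + len(q)] - q + 180.0) % 360.0 - 180.0  # wrap to [-180, 180)
            rms = np.sqrt(np.mean(diff ** 2))
            if rms < tol:
                hits.append((name, start, rms))
    return sorted(hits, key=lambda h: h[2])

db = {"frag_A": np.random.uniform(-180, 180, 60),
      "frag_B": np.random.uniform(-180, 180, 60)}
print(find_similar_fragments(np.random.uniform(-180, 180, 8), db)[:3])  # loose tol for the demo
```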
Abstract:
Convergence of the vast sequence space of proteins into a highly restricted fold/conformational space suggests a simple yet unique underlying mechanism of protein folding that has been the subject of much debate over the last several decades. One of the major challenges in understanding protein folding, and in in silico protein structure prediction, is the discrimination of non-native structures/decoys from the native structure. Applications of knowledge-based potentials to attain this goal have been extensively reported in the literature. Scoring functions based on accessible surface area and amino acid neighbourhood considerations have also been used to discriminate decoys from native structures. In this article, we explore the potential of protein structure network (PSN) parameters to validate native proteins against a large number of decoy structures generated by diverse methods. We are guided by two principles: (a) the PSNs capture local properties from a global perspective and (b) the inclusion of non-covalent interactions at the all-atom level, including the side-chain atoms, in the network construction accommodates sequence-dependent features. Several network parameters, such as the size of the largest cluster, community size and clustering coefficient, are evaluated and scored on the basis of the rank of the native structures and the Z-scores. The network analysis of decoy structures highlights the importance of the global properties contributing to the uniqueness of native structures. The analysis also shows that the network parameters can be used as metrics to identify native structures and filter out non-native structures/decoys across a large number of data sets; they thus also have potential to be used in the protein 'structure prediction' problem.
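A simplified sketch of the workflow follows: build a contact network from coordinates, measure two of the network parameters mentioned, and Z-score a "native" against "decoys". The random coordinates, distance cutoff and unweighted residue-level contacts are stand-ins; the paper's networks are all-atom and interaction-strength based.

```python
import numpy as np
import networkx as nx

def structure_network_params(coords, cutoff=6.5):
    """Sketch: contact graph (edge if distance < cutoff, angstroms); return the
    size of the largest connected cluster and the average clustering coefficient."""
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    g = nx.Graph()
    g.add_nodes_from(range(len(coords)))
    g.add_edges_from(zip(*np.where(np.triu(d < cutoff, k=1))))
    largest = max(nx.connected_components(g), key=len)
    return len(largest), nx.average_clustering(g)

rng = np.random.default_rng(1)
native = rng.normal(scale=8.0, size=(120, 3))                      # stand-in coordinates
decoys = [rng.normal(scale=10.0, size=(120, 3)) for _ in range(20)]
native_cc = structure_network_params(native)[1]
decoy_cc = [structure_network_params(c)[1] for c in decoys]
z = (native_cc - np.mean(decoy_cc)) / np.std(decoy_cc)
print("clustering-coefficient Z-score of 'native' vs decoys:", round(z, 2))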
Abstract:
In this paper we study the problem of designing SVM classifiers when the kernel matrix, K, is affected by uncertainty. Specifically, K is modeled as a positive affine combination of given positive semidefinite kernels, with the coefficients ranging over a norm-bounded uncertainty set. We treat the problem using the Robust Optimization methodology. This reduces the uncertain SVM problem to a deterministic conic quadratic problem which can be solved in principle by a polynomial-time Interior Point (IP) algorithm. However, for large-scale classification problems, IP methods become intractable and one has to resort to first-order gradient-type methods. The strategy we use here is to reformulate the robust counterpart of the uncertain SVM problem as a saddle point problem and employ a special gradient scheme which works directly on the convex-concave saddle function. The algorithm is a simplified version of a general scheme due to Juditski and Nemirovski (2011). It achieves an O(1/T^2) reduction of the initial error after T iterations. A comprehensive empirical study on both synthetic data and real-world protein structure data sets shows that the proposed formulations achieve the desired robustness, and that the saddle point based algorithm outperforms the IP method significantly.
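The sketch below illustrates only the uncertainty model, not the saddle-point algorithm: the kernel is an affine combination K(theta) = sum_i theta_i K_i with theta confined to a ball around a nominal weight vector, and an SVM is trained at one realisation of theta. The base kernels, radius and projection step are my own illustrative choices.

```python
import numpy as np
from sklearn.metrics.pairwise import linear_kernel, rbf_kernel, polynomial_kernel
from sklearn.svm import SVC

X = np.random.randn(200, 5)
y = np.sign(np.random.randn(200))
kernels = [linear_kernel(X), rbf_kernel(X, gamma=0.2), polynomial_kernel(X, degree=2)]

def project_to_ball(theta, center, radius):
    """Project theta onto the Euclidean ball around `center`, then clip to the
    nonnegative orthant (an approximate projection, good enough for a sketch)."""
    step = theta - center
    norm = np.linalg.norm(step)
    if norm > radius:
        theta = center + radius * step / norm
    return np.clip(theta, 0.0, None)

theta0 = np.ones(3) / 3                                   # nominal kernel weights
theta = project_to_ball(theta0 + 0.2 * np.random.randn(3), theta0, radius=0.1)
K = sum(t * Ki for t, Ki in zip(theta, kernels))          # K(theta), positive combination
clf = SVC(kernel="precomputed", C=1.0).fit(K, y)
print("training accuracy at this kernel realisation:", clf.score(K, y))
```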
Abstract:
In many real-world prediction problems the output is a structured object such as a sequence, a tree or a graph. Such problems range from natural language processing to computational biology and computer vision, and have been tackled using algorithms referred to as structured output learning algorithms. We consider the problem of structured classification. In the last few years, large margin classifiers like support vector machines (SVMs) have shown much promise for structured output learning. The related optimization problem is a convex quadratic program (QP) with a large number of constraints, which makes the problem intractable for large data sets. This paper proposes a fast sequential dual method (SDM) for structural SVMs. The method makes repeated passes over the training set and optimizes the dual variables associated with one example at a time. The use of additional heuristics makes the proposed method more efficient. We present an extensive empirical evaluation of the proposed method on several sequence learning problems. Our experiments on large data sets demonstrate that the proposed method is an order of magnitude faster than state-of-the-art methods such as the cutting-plane method and stochastic gradient descent (SGD). Further, SDM reaches steady-state generalization performance faster than the SGD method. The proposed SDM is thus a useful alternative for large-scale structured output learning.
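As a flat (binary, linear) analogue of the "one dual variable per example" idea, here is a dual coordinate descent sketch for a standard linear SVM, in the style of Hsieh et al. It is not the paper's structural SVM method; the data are synthetic.

```python
import numpy as np

def dual_coordinate_descent_svm(X, y, C=1.0, passes=20):
    """Dual coordinate descent for a binary linear SVM with hinge loss:
    sweep the examples, update each dual variable alpha_i in closed form,
    and maintain w = sum_i alpha_i * y_i * x_i."""
    n, d = X.shape
    alpha, w = np.zeros(n), np.zeros(d)
    sq_norms = np.einsum("ij,ij->i", X, X)
    for _ in range(passes):
        for i in np.random.permutation(n):
            grad = y[i] * X[i].dot(w) - 1.0                 # partial derivative of the dual
            new_alpha = np.clip(alpha[i] - grad / sq_norms[i], 0.0, C)
            w += (new_alpha - alpha[i]) * y[i] * X[i]
            alpha[i] = new_alpha
    return w

X = np.random.randn(500, 20)
y = np.sign(X[:, 0] + 0.3 * np.random.randn(500))
w = dual_coordinate_descent_svm(X, y)
print("training accuracy:", np.mean(np.sign(X @ w) == y))
```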
Abstract:
There are many applications, such as software for processing customer records in telecom, patient records in hospitals, or email software accessing a single email in a mailbox, which require access to a single record in a database consisting of millions of records. A basic feature of these applications is that they need to access data sets which are very large but simple. Cloud computing provides the computing requirements for this new generation of applications involving very large data sets, which cannot be handled efficiently using traditional computing infrastructure. In this paper, we describe the storage services provided by three well-known cloud service providers and compare their features, with a view to characterizing the storage requirements of very large data sets; we hope this comparison will act as a catalyst for the design of storage services for very large data sets in the future. We also give a brief overview of other kinds of storage that have emerged in the recent past for cloud computing.
Abstract:
Ranking problems have become increasingly important in machine learning and data mining in recent years, with applications ranging from information retrieval and recommender systems to computational biology and drug discovery. In this paper, we describe a new ranking algorithm that directly maximizes the number of relevant objects retrieved at the absolute top of the list. The algorithm is a support vector style algorithm, but due to the different objective, it no longer leads to a quadratic programming problem. Instead, the dual optimization problem involves l1,∞ constraints; we solve this dual problem using the recent l1,∞ projection method of Quattoni et al. (2009). Our algorithm can be viewed as an l∞-norm extreme of the lp-norm based algorithm of Rudin (2009) (albeit in a support vector setting rather than a boosting setting); we therefore refer to the algorithm as the ‘Infinite Push’. Experiments on real-world data sets confirm the algorithm’s focus on accuracy at the absolute top of the list.
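The quantity this objective targets is easy to state in code: the fraction of relevant items whose score exceeds that of every irrelevant item. The evaluation sketch below illustrates that notion of "accuracy at the absolute top"; the training algorithm itself is not reproduced.

```python
import numpy as np

def positives_above_all_negatives(scores, labels):
    """Fraction of relevant (label 1) items scored above every irrelevant
    (label 0) item -- the 'absolute top' quantity the Infinite Push targets."""
    scores, labels = np.asarray(scores), np.asarray(labels)
    top_negative = scores[labels == 0].max()
    return np.mean(scores[labels == 1] > top_negative)

scores = [0.9, 0.8, 0.75, 0.6, 0.4]
labels = [1, 1, 0, 1, 0]
print(positives_above_all_negatives(scores, labels))   # 2 of 3 positives beat all negatives
```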
Abstract:
In this paper, we approach the classical problem of clustering using solution concepts from cooperative game theory, such as the Nucleolus and the Shapley value. We formulate clustering as a characteristic form game and develop a novel algorithm, DRAC (Density-Restricted Agglomerative Clustering), for clustering. With extensive experimentation on standard data sets, we compare the performance of DRAC with that of well-known algorithms. We show the interesting result that four prominent solution concepts, the Nucleolus, Shapley value, Gately point and τ-value, coincide for the defined characteristic form game. This vindicates the choice of the characteristic function of the clustering game and also provides a strong intuitive foundation for our approach.
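For readers unfamiliar with these solution concepts, the sketch below approximates Shapley values by Monte Carlo permutation sampling for a simple similarity-based characteristic function. Both the sampling scheme and the characteristic function are generic illustrations, not DRAC or the game defined in the paper.

```python
import numpy as np

def monte_carlo_shapley(similarity, value_fn, n_perms=500, seed=0):
    """Approximate each point's Shapley value by averaging its marginal
    contribution over random orderings of the players (points)."""
    rng = np.random.default_rng(seed)
    n = similarity.shape[0]
    shapley = np.zeros(n)
    for _ in range(n_perms):
        coalition, prev_value = [], 0.0
        for i in rng.permutation(n):
            coalition.append(i)
            value = value_fn(similarity, coalition)
            shapley[i] += value - prev_value
            prev_value = value
    return shapley / n_perms

def within_similarity(similarity, coalition):
    block = similarity[np.ix_(coalition, coalition)]
    return (block.sum() - np.trace(block)) / 2.0        # sum over unordered pairs

points = np.random.randn(30, 2)
sim = np.exp(-np.linalg.norm(points[:, None] - points[None, :], axis=-1) ** 2)
phi = monte_carlo_shapley(sim, within_similarity)
print("highest-Shapley (densest-neighbourhood) points:", np.argsort(phi)[-5:])
```

Points in dense regions earn larger marginal contributions, which is the intuition behind using Shapley-style values to seed clusters.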
Abstract:
Future space-based gravity wave (GW) experiments such as the Big Bang Observatory (BBO), with their excellent projected one-sigma angular resolution, will measure the luminosity distance to a large number of GW sources to high precision, and the redshifts of single galaxies in the narrow solid angles towards the sources will provide the redshifts of the gravity wave sources. One-sigma BBO beams contain the actual source in only 68% of cases; the beams that do not contain the source may contain a spurious single galaxy, leading to misidentification. To increase the probability of the source falling within the beam, larger beams have to be considered, which decreases the chances of finding single galaxies in the beams. Saini et al. [T.D. Saini, S.K. Sethi, and V. Sahni, Phys. Rev. D 81, 103009 (2010)] argued, largely analytically, that identifying even a small number of GW source galaxies furnishes a rough distance-redshift relation, which could be used to further resolve sources that have multiple objects in the angular beam. In this work we further develop this idea by introducing a self-calibrating iterative scheme which works in conjunction with Monte Carlo simulations to determine the luminosity distance to GW sources with progressively greater accuracy. This iterative scheme allows one to determine the equation of state of dark energy to within an accuracy of a few percent for a gravity wave experiment possessing a beam width an order of magnitude larger than BBO's (and therefore having a far poorer angular resolution). This is achieved with no prior information about the nature of dark energy from other data sets such as type Ia supernovae, baryon acoustic oscillations, or the cosmic microwave background. DOI: 10.1103/PhysRevD.87.083001
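The quantity being constrained is the distance-redshift relation. As background, here is a sketch of the luminosity distance in a flat wCDM cosmology with constant equation-of-state parameter w; an iterative scheme like the one described would compare GW-inferred distances against such a relation. The parameter values below are standard illustrative choices, not the paper's.

```python
import numpy as np
from scipy.integrate import quad

C_KM_S = 299792.458   # speed of light, km/s
H0 = 70.0             # Hubble constant, km/s/Mpc (illustrative)
OMEGA_M = 0.3         # matter density parameter (illustrative)

def luminosity_distance(z, w=-1.0):
    """d_L(z) = (1 + z) * c * integral_0^z dz'/H(z') for flat wCDM, in Mpc."""
    def inv_H(zp):
        return 1.0 / (H0 * np.sqrt(OMEGA_M * (1 + zp) ** 3 +
                                   (1 - OMEGA_M) * (1 + zp) ** (3 * (1 + w))))
    comoving, _ = quad(inv_H, 0.0, z)
    return (1 + z) * C_KM_S * comoving

for w in (-1.2, -1.0, -0.8):
    print(f"w = {w:+.1f}:  d_L(z = 1) = {luminosity_distance(1.0, w):.0f} Mpc")
```

The few-percent shifts in d_L between these values of w indicate why percent-level GW distances can constrain the dark-energy equation of state.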
Abstract:
Moving shadow detection and removal from the extracted foreground regions of video frames aims to limit the risk of moving shadows being treated as part of the moving objects. This operation thus enhances the accuracy of detection and classification of moving objects. With similar reasoning, the present paper proposes an efficient method for discriminating moving object and moving shadow regions in a video sequence with no human intervention. It requires little computational effort and works effectively under dynamic traffic conditions on highways and streets, both with and without lane markings. Further, we use scale-invariant feature transform-based features for the classification of moving vehicles (with and without shadow regions), which enhances the effectiveness of the proposed method. The potential of the method is tested on various data sets collected from different road traffic scenarios, and its superiority is compared with existing methods.
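For context, a common chromaticity-based baseline for shadow discrimination is sketched below: a foreground pixel is called "shadow" if it is darker than the background by a bounded ratio while keeping similar chromaticity. This is not the paper's method, and the thresholds are hypothetical.

```python
import numpy as np

def shadow_mask(frame_bgr, background_bgr, alpha=0.4, beta=0.9, tau=0.1):
    """Generic baseline: shadow = darker than background (ratio in [alpha, beta])
    with nearly unchanged chromaticity (per-channel colour proportions)."""
    f = frame_bgr.astype(float) + 1e-6
    b = background_bgr.astype(float) + 1e-6
    ratio = f.sum(axis=-1) / b.sum(axis=-1)                  # brightness ratio
    chroma_f = f / f.sum(axis=-1, keepdims=True)
    chroma_b = b / b.sum(axis=-1, keepdims=True)
    chroma_diff = np.abs(chroma_f - chroma_b).sum(axis=-1)   # chromaticity change
    return (ratio > alpha) & (ratio < beta) & (chroma_diff < tau)

frame = np.random.randint(0, 256, (120, 160, 3), dtype=np.uint8)
background = np.random.randint(0, 256, (120, 160, 3), dtype=np.uint8)
print("candidate shadow pixels:", int(shadow_mask(frame, background).sum()))
```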
Abstract:
The contour tree is a topological abstraction of a scalar field that captures the evolution of level set connectivity. It is an effective representation for visual exploration and analysis of scientific data. We describe a work-efficient, output-sensitive, and scalable parallel algorithm for computing the contour tree of a scalar field defined on a domain that is represented using either an unstructured mesh or a structured grid. A hybrid implementation of the algorithm using the GPU and a multi-core CPU can compute the contour tree of an input containing 16 million vertices in less than ten seconds, with a speedup factor of up to 13. Experiments based on an implementation in a multi-core CPU environment show near-linear speedup for large data sets.
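As background on what is being computed, here is a small sequential sketch of one half of contour-tree construction: a union-find sweep that finds the join (saddle) vertices of the join tree. It is only the textbook serial idea on an arbitrary graph; the paper's parallel, output-sensitive GPU/CPU algorithm is not reproduced.

```python
import numpy as np

def join_tree_saddles(values, edges):
    """Sweep vertices from high to low value with union-find; a vertex whose
    already-processed neighbours lie in more than one component merges them
    and is recorded as a join (saddle) of the join tree."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, []).append(b)
        adjacency.setdefault(b, []).append(a)
    parent = {}
    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]   # path halving
            v = parent[v]
        return v
    saddles = []
    for v in np.argsort(values)[::-1]:      # high to low
        parent[v] = v
        roots = {find(u) for u in adjacency.get(v, []) if u in parent}
        roots.discard(v)
        if len(roots) > 1:
            saddles.append(v)               # v joins two or more superlevel components
        for r in roots:
            parent[r] = v
    return saddles

# Tiny example: a path graph 0-1-2-3-4 with peaks at vertices 1 and 3.
vals = np.array([1.0, 5.0, 0.5, 4.0, 2.0])
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
print("join saddles:", join_tree_saddles(vals, edges))   # vertex 2 joins the two peaks
```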