969 results for Real-world


Relevance: 60.00%

Abstract:

This paper discusses a method for scaling SVMs with the Gaussian kernel function to handle large data sets by using a selective sampling strategy for the training set. It employs a scalable hierarchical clustering algorithm to construct cluster indexing structures of the training data in the kernel-induced feature space. These are then used for selective sampling of the training data for the SVM to impart scalability to the training process. Empirical studies made on real-world data sets show that the proposed strategy performs well on large data sets.
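
As a rough illustration of the selective-sampling idea, the sketch below clusters the training data, keeps all points from clusters that mix both classes, and keeps a single representative from pure clusters before training the SVM. The cluster count, the sampling rule, and the use of scikit-learn's agglomerative clustering in input space (rather than the paper's hierarchical indexing structures in the kernel-induced feature space) are illustrative assumptions, not the paper's method.

```python
# Minimal sketch: cluster-based selective sampling before SVM training.
# Assumptions (not from the paper): a fixed number of clusters, and a rule
# that keeps all points of "mixed" clusters (both classes present) plus one
# representative per pure cluster.
import numpy as np
from sklearn.cluster import AgglomerativeClustering
from sklearn.svm import SVC

def selective_sample(X, y, n_clusters=50):
    labels = AgglomerativeClustering(n_clusters=n_clusters).fit_predict(X)
    keep = np.zeros(len(X), dtype=bool)
    for c in range(n_clusters):
        idx = np.where(labels == c)[0]
        if len(np.unique(y[idx])) > 1:
            keep[idx] = True          # boundary-like cluster: keep all points
        else:
            centroid = X[idx].mean(axis=0)
            nearest = idx[np.argmin(np.linalg.norm(X[idx] - centroid, axis=1))]
            keep[nearest] = True      # pure cluster: keep one representative
    return X[keep], y[keep]

# Usage: train on the reduced set instead of the full data.
rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 5))
y = (X[:, 0] + X[:, 1] > 0).astype(int)
Xs, ys = selective_sample(X, y)
clf = SVC(kernel="rbf", gamma="scale").fit(Xs, ys)
print(len(Xs), clf.score(X, y))
```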

Relevance: 60.00%

Abstract:

With the objective of better understanding the significance of New Car Assessment Program (NCAP) tests conducted by the National Highway Traffic Safety Administration (NHTSA), head-on collisions between two identical cars of different sizes and between cars and a pickup truck are studied in the present paper using LS-DYNA models. Available finite element models of a compact car (Dodge Neon), midsize car (Dodge Intrepid), and pickup truck (Chevrolet C1500) are first improved and validated by comparing the analysis-based vehicle deceleration pulses against corresponding NCAP crash test histories reported by NHTSA. In confirmation of prevalent perception, simulation-based results indicate that an NCAP test against a rigid barrier is a good representation of a collision between two similar cars approaching each other at a speed of 56.3 km/h (35 mph), both in terms of peak deceleration and intrusions. However, analyses carried out for collisions between two incompatible vehicles, such as an Intrepid or Neon against a C1500, point to the inability of NCAP tests to represent the substantially higher intrusions in the front upper regions experienced by the cars, although peak decelerations in the cars are comparable to those observed in NCAP tests. In an attempt to improve the capability of a frontal NCAP test to better represent real-world crashes between incompatible vehicles, i.e., ones with contrasting ride height and lower body stiffness, two modified rigid barriers are studied. One of these barriers, which is of stepped geometry with a curved front face, leads to significantly improved correlation of intrusions in the upper regions of the cars with respect to those yielded in the simulation of collisions between incompatible vehicles, while producing vehicle peak decelerations similar to those obtained in NCAP tests.

Relevance: 60.00%

Abstract:

Modern-day economics is increasingly biased towards believing that institutions matter for growth, an argument that has been further reinforced by the recent economic crisis. There is also a wide consensus on what these growth-promoting institutions should look like, and countries are periodically ranked depending on how their institutional structure compares with the best-practice institutions, mostly in place in the developing world. In this paper, it is argued that "non-desirable" or "second-best" institutions can be beneficial for fostering investment and thus provide a starting point for sustained growth, and that what matters is the appropriateness of institutions to the economy's distance to the frontier or current phase of development. Anecdotal evidence from Japan and South Korea is used as motivation for studying the subject, and a model is presented to describe this phenomenon. In the model, the rigidity or non-rigidity of the institutions is described by entrepreneurial selection. It is assumed that entrepreneurs are the ones taking part in the imitation and innovation of technologies, and that decisions on whether or not their projects are refinanced come from capitalists. The capitalists in turn have no entrepreneurial skills and act merely as financiers of projects. The model has two periods and two kinds of entrepreneurs: those with high skills and those with low skills. The society's choice between an imitation-based and an innovation-based strategy is modeled as the trade-off between refinancing a low-skill entrepreneur and investing in the selection of entrepreneurs, which results in a larger fraction of high-skill entrepreneurs with the ability to innovate but in less total investment. Finally, a real-world example from India is presented as an initial attempt to test the theory. The data from the example is not included in this paper. It is noted that the model may lack explanatory power due to difficulties in testing the predictions, but that this should not be seen as a reason to disregard the theory; the solution might lie in developing better tools, not just better theories. The conclusion presented is that institutions do matter. There is no one-size-fits-all solution when it comes to institutional arrangements in different countries, and developing countries should be given space to develop their own institutional structures that cater to their specific needs.

Relevance: 60.00%

Abstract:

Ecology and evolutionary biology is the study of life on this planet. One of the many methods applied to answering the great diversity of questions regarding the lives and characteristics of individual organisms is the use of mathematical models. Such models are used in a wide variety of ways. Some help us to reason, functioning as aids to, or substitutes for, our own fallible logic, thus making argumentation and thinking clearer. Models which help our reasoning can lead to conceptual clarification; by expressing ideas in algebraic terms, the relationship between different concepts becomes clearer. Other mathematical models are used to better understand yet more complicated models, or to develop mathematical tools for their analysis. Though they help us to reason and serve as tools in the craftsmanship of science, many models do not tell us much about the real biological phenomena we are, at least initially, interested in. The main reason for this is that any mathematical model is a simplification of the real world, reducing the complexity and variety of interactions and the idiosyncrasies of individual organisms. What such models can tell us, however, both is and has been very valuable throughout the history of ecology and evolution. Minimally, a model simplifying the complex world can tell us that, in principle, the patterns produced in a model could also be produced in the real world. We can never know how different a simplified mathematical representation is from the real world, but the similarity that models strive for gives us confidence that their results could apply. This thesis deals with a variety of different models, used for different purposes. One model deals with how one can measure and analyse invasions, the expansion phase of invasive species. Earlier analyses claim to have shown that such invasions can be a regulated phenomenon, in that higher invasion speeds at a given point in time lead to a subsequent reduction in speed. Two simple mathematical models show that analyses of this particular measure of invasion speed need not be evidence of regulation. In the context of dispersal evolution, two proof-of-principle models are presented. Parent-offspring conflict emerges when there are different evolutionary optima for adaptive behavior for parents and offspring. We show that the evolution of dispersal distances can entail such a conflict, and that under parental control of dispersal (as, for example, in higher plants) wider dispersal kernels are optimal. We also show that dispersal homeostasis can be optimal; in a setting where dispersal decisions (to leave or to stay in the natal patch) are made, strategies that divide their seeds or eggs into fractions that disperse or do not, as opposed to randomizing the decision for each seed, can prevail. We also present a model of the evolution of bet-hedging strategies: evolutionary adaptations that occur despite their fitness, on average, being lower than that of a competing strategy. Such strategies can win in the long run because their reduction in mean fitness is coupled with a reduced variance in fitness, and fitness is multiplicative across generations and therefore sensitive to variability. This model is used for conceptual clarification: by developing a population genetic model with uncertain fitness and expressing genotypic variance in fitness as a product of individual-level variance and correlations between individuals of a genotype, we arrive at expressions that intuitively reflect two of the main categorizations of bet-hedging strategies, conservative vs. diversifying and within- vs. between-generation bet hedging. In addition, this model shows that these divisions are in fact false dichotomies.
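
The multiplicative-fitness argument behind bet hedging can be made concrete with a small numerical illustration: a genotype with a lower arithmetic-mean fitness but a lower variance can have the higher geometric-mean (long-run) growth rate. The fitness values below are illustrative assumptions, not data from the thesis.

```python
# Illustrative sketch of bet hedging under multiplicative fitness.
# A "risky" genotype has higher mean fitness but larger variance between
# generations; a "hedging" genotype trades mean fitness for reduced variance.
import numpy as np

rng = np.random.default_rng(1)
T = 10_000                                   # generations
good_year = rng.random(T) < 0.5              # fluctuating environment

w_risky = np.where(good_year, 2.0, 0.4)      # mean ~1.20, high variance
w_hedge = np.where(good_year, 1.3, 0.9)      # mean ~1.10, low variance

# Long-run growth is governed by the geometric mean of fitness,
# i.e. the average of log fitness across generations.
for name, w in [("risky", w_risky), ("hedging", w_hedge)]:
    print(name, "arithmetic mean:", w.mean().round(3),
          "geometric mean:", np.exp(np.log(w).mean()).round(3))
# Despite its lower arithmetic mean, the hedging genotype has the higher
# geometric mean (roughly 1.08 vs 0.89) and therefore wins in the long run.
```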

Relevance: 60.00%

Abstract:

It is important to identify the "correct" number of topics in mechanisms like Latent Dirichlet Allocation (LDA), as it determines the quality of the features presented to classifiers like SVM. In this work we propose a measure to identify the correct number of topics and offer empirical evidence in its favor in terms of classification accuracy and the number of topics that are naturally present in the corpus. We show the merit of the measure by applying it to real-world as well as synthetic data sets (both text and images). In proposing this measure, we view LDA as a matrix factorization mechanism, wherein a given corpus C is split into two matrix factors M1 and M2 as given by C(d×w) = M1(d×t) × M2(t×w), where d is the number of documents present in the corpus and w is the size of the vocabulary. The quality of the split depends on t, the right number of topics chosen. The measure is computed in terms of the symmetric KL-divergence of salient distributions derived from these matrix factors. We observe that the divergence values are higher for non-optimal numbers of topics, which shows up as a "dip" at the right value of t.
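
A rough sketch of how such a measure can be scanned over candidate topic counts is given below. It follows the general recipe described above (factor the corpus, derive one distribution from each factor, compare them with a symmetric KL divergence); the specific distributions used here (singular values of the topic-word factor versus document-length-weighted topic proportions) and the use of scikit-learn's LDA are assumptions for illustration only.

```python
# Sketch: scan candidate topic counts t and score each with a symmetric KL
# divergence between distributions derived from the two LDA factors.
# The particular salient distributions chosen here are assumptions.
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation

def sym_kl(p, q, eps=1e-12):
    p = p / p.sum(); q = q / q.sum()
    return np.sum(p * np.log((p + eps) / (q + eps))) + \
           np.sum(q * np.log((q + eps) / (p + eps)))

def topic_count_score(counts, t):
    """counts: d x w document-term matrix (numpy array); t: candidate topic count."""
    lda = LatentDirichletAllocation(n_components=t, random_state=0).fit(counts)
    m2 = lda.components_                       # t x w topic-word factor
    m1 = lda.transform(counts)                 # d x t document-topic factor
    sv = np.linalg.svd(m2, compute_uv=False)   # distribution 1: singular values of M2
    doc_len = counts.sum(axis=1)
    topic_mass = np.sort(doc_len @ m1)[::-1]   # distribution 2: length-weighted topic mass
    return sym_kl(np.sort(sv)[::-1], topic_mass)

# Usage: the t with the lowest score (the "dip") is the suggested topic count,
# e.g. scores = {t: topic_count_score(C, t) for t in range(2, 30)}
# where C is the document-term count matrix.
```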

Relevance: 60.00%

Abstract:

This dissertation develops a strategic management accounting perspective on inventory routing. The thesis studies the drivers of cost efficiency gains by identifying the role of the underlying cost structure, demand, information sharing, forecasting accuracy, service levels, vehicle fleet, planning horizon and other strategic factors, as well as the interaction effects among these factors with respect to performance outcomes. The task is to enhance the knowledge of the strategic situations that favor the implementation of inventory routing systems, understanding cause-and-effect relationships and linkages, and gaining a holistic view of the value proposition of inventory routing. The thesis applies an exploratory case study design, which is based on normative quantitative empirical research using optimization, simulation and factor analysis. Data and results are drawn from a real-world application to cash supply chains. The first research paper shows that performance gains require a common cost component and cannot be explained by simple linear or affine cost structures. Inventory management and distribution decisions become separable in the absence of a set-dependent cost structure, and neither economies of scope nor coordination problems are present in this case. The second research paper analyzes whether information sharing improves the overall forecasting accuracy. The analysis suggests that the potential for information sharing is limited to coordination of replenishments and that central information does not yield more accurate forecasts based on joint forecasting. The third research paper develops a novel formulation of the stochastic inventory routing model that accounts for minimal service levels and forecasting accuracy. The developed model allows studying the interaction of minimal service levels and forecasting accuracy with the underlying cost structure in inventory routing. Interestingly, results show that the factors minimal service level and forecasting accuracy are not statistically significant, and subsequently not relevant for the strategic decision problem of whether to introduce inventory routing, or in other words, to effectively internalize inventory management and distribution decisions at the supplier. Consequently, the main contribution of this thesis is the result that cost benefits of inventory routing are derived from the joint decision model that accounts for the underlying set-dependent cost structure rather than from the level of information sharing. This result suggests that the value of sharing demand and inventory data is likely to be overstated in the prior literature. In other words, cost benefits of inventory routing are primarily determined by the cost structure (i.e. the level of fixed costs and transportation costs) rather than by the level of information sharing, joint forecasting, forecasting accuracy or service levels.

Relevance: 60.00%

Abstract:

Reorganizing a dataset so that its hidden structure can be observed is useful in any data analysis task. For example, detecting a regularity in a dataset helps us to interpret the data, compress the data, and explain the processes behind the data. We study datasets that come in the form of binary matrices (tables with 0s and 1s). Our goal is to develop automatic methods that bring out certain patterns by permuting the rows and columns. We concentrate on the following patterns in binary matrices: consecutive-ones (C1P), simultaneous consecutive-ones (SC1P), nestedness, k-nestedness, and bandedness. These patterns reflect specific types of interplay and variation between the rows and columns, such as continuity and hierarchies. Furthermore, their combinatorial properties are interlinked, which helps us to develop the theory of binary matrices and efficient algorithms. Indeed, we can detect all these patterns in a binary matrix efficiently, that is, in time polynomial in the size of the matrix. Since real-world datasets often contain noise and errors, we rarely witness perfect patterns. Therefore, we also need to assess how far an input matrix is from a pattern: we count the number of flips (from 0s to 1s or vice versa) needed to bring out the perfect pattern in the matrix. Unfortunately, for most patterns it is an NP-complete problem to find the minimum distance to a matrix that has the perfect pattern, which means that the existence of a polynomial-time algorithm is unlikely. To find patterns in datasets with noise, we need methods that are noise-tolerant and work in practical time on large datasets. The theory of binary matrices gives rise to robust heuristics that perform well on synthetic data and discover easily interpretable structures in real-world datasets: dialectal variation in spoken Finnish, a division of European locations by the hierarchies found in mammal occurrences, and co-occurring groups in network data. In addition to determining the distance from a dataset to a pattern, we need to determine whether the pattern is significant or a mere product of random chance. To this end, we use significance testing: we deem a dataset significant if it appears exceptional when compared to datasets generated from a certain null hypothesis. After detecting a significant pattern in a dataset, it is up to domain experts to interpret the results in terms of the application.
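
As a small illustration of one of these patterns, the sketch below checks whether a binary matrix is perfectly nested, i.e. whether the sets of columns holding 1s in the rows form a chain under inclusion. It tests only the idealized pattern and does not implement the flip-counting or noise-tolerant heuristics developed in the thesis.

```python
# Sketch: check whether a 0/1 matrix is perfectly nested, i.e. the row supports
# form a chain under set inclusion. Sorting rows by support size (descending)
# and checking adjacent containments suffices, by transitivity.
import numpy as np

def is_nested(M):
    M = np.asarray(M, dtype=bool)
    order = np.argsort(M.sum(axis=1))[::-1]      # rows by support size, descending
    for a, b in zip(order, order[1:]):
        if not np.all(M[b] <= M[a]):             # row b's 1s must lie inside row a's
            return False
    return True

print(is_nested([[1, 1, 1],
                 [1, 1, 0],
                 [1, 0, 0]]))     # True: the supports form a chain
print(is_nested([[1, 1, 0],
                 [0, 1, 1]]))     # False: neither support contains the other
```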

Relevance: 60.00%

Abstract:

Gaussian Processes (GPs) are promising Bayesian methods for classification and regression problems. They have also been used for semi-supervised learning tasks. In this paper, we propose a new algorithm for solving the semi-supervised binary classification problem using sparse GP regression (GPR) models. It is closely related to semi-supervised learning based on support vector regression (SVR) and maximum margin clustering. The proposed algorithm is simple and easy to implement. Unlike the SVR-based algorithm, it gives a sparse solution directly. Also, the hyperparameters are estimated easily without resorting to expensive cross-validation techniques. Use of the sparse GPR model helps in making the proposed algorithm scalable. Preliminary results on synthetic and real-world data sets demonstrate the efficacy of the new algorithm.
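
The regression-for-classification building block can be illustrated generically: fit a GP regressor to +/-1 labels and classify by the sign of the predictive mean. The sketch below uses scikit-learn's dense GaussianProcessRegressor on labeled data only; it is an assumption-laden stand-in and does not reproduce the paper's sparse GPR model or its use of unlabeled points.

```python
# Sketch: GP regression used as a binary classifier by regressing on +/-1
# labels and thresholding the predictive mean at zero. Supervised building
# block only; the paper's semi-supervised, sparse algorithm is not shown.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = np.sign(X[:, 0] + 0.3 * rng.normal(size=100))        # noisy +/-1 labels

gpr = GaussianProcessRegressor(kernel=RBF(length_scale=1.0), alpha=0.1)
gpr.fit(X, y)

X_test = rng.normal(size=(50, 2))
pred = np.sign(gpr.predict(X_test))                      # sign of predictive mean
print(np.mean(pred == np.sign(X_test[:, 0])))            # accuracy vs. the clean rule
```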

Relevance: 60.00%

Abstract:

The study analyzes the effort to build political legitimacy in the Republic of Turkey by exploring a group of influential texts produced by Kemalist writers. The study explores how the Kemalist regime reproduced a certain long-lasting enlightenment meta-narrative in its effort to build political legitimacy. Central in this process was a hegemonic representation of history, namely the interpretation of the Anatolian Resistance Struggle of 1919-1922 as a Turkish Revolution executing the enlightenment in the Turkish nation-state. The method employed in the study is contextualizing narratological analysis. The Kemalist texts are analyzed with a repertoire of concepts originally developed in the theory of narrative. By bringing these concepts together with the epistemological foundations of the historical sciences, the study creates a theoretical frame within which it is possible to highlight how initially very controversial historical representations in the end managed to construct long-lasting, emotionally and intellectually convincing bases of national identity for the secular middle classes in Turkey. The two most important explanatory concepts in this sense are diegesis and the implied reader. Diegesis refers to the ability of narrative representation to create an inherently credible story-world that works as the basis of national community. The implied reader refers to the process whereby a certain hegemonic narrative creates a formula of identification and a position through which any individual real-world reader of a story can step inside the narrative story-world and identify oneself as one of 'us' in the national narrative. The study demonstrates that the Kemalist enlightenment meta-narrative created a group of narrative accruals which enabled generations of secular middle classes to internalize Kemalist ideology. In this sense, the narrative in question has not only worked as a tool utilized by the so-called Kemalist state-elite to justify its leadership, but has been internalized by various groups in Turkey, working as their genuine world-view. It is shown in the study that secularism must be seen as the core ingredient of these groups' national identity. The study proposes that the enlightenment narrative reproduced in the Kemalist ideology had its origin in a similar totalizing cultural narrative created in and for Europe. Currently this enlightenment project is challenged in Turkey by those who seek to give religion a greater role in Turkish society. The study argues that the enduring practice of legitimizing political power through the enlightenment meta-narrative has not only become a major factor contributing to social polarization in Turkey, but has also, in contradiction to the very real potential for critical approaches inherent in the Enlightenment tradition, crucially restricted the development of critical and rational modes of thinking in the Republic of Turkey.

Relevance: 60.00%

Abstract:

Our study concerns an important current problem, that of diffusion of information in social networks. This problem has received significant attention from the Internet research community in recent times, driven by many potential applications such as viral marketing and sales promotions. In this paper, we focus on the target set selection problem, which involves discovering a small subset of influential players in a given social network to perform a certain task of information diffusion. The target set selection problem manifests in two forms: 1) the top-k nodes problem and 2) the lambda-coverage problem. In the top-k nodes problem, we are required to find a set of k key nodes that would maximize the number of nodes being influenced in the network. The lambda-coverage problem is concerned with finding a minimum-size set of key nodes that can influence a given percentage lambda of the nodes in the entire network. We propose a new way of solving these problems using the concept of the Shapley value, a well-known solution concept in cooperative game theory. Our approach leads to algorithms which we call the Shapley Value-based Influential Nodes (SPIN) algorithms for solving the top-k nodes problem and the lambda-coverage problem. We compare the performance of the proposed SPIN algorithms with well-known algorithms in the literature. Through extensive experimentation on four synthetically generated random graphs and six real-world data sets (Celegans, Jazz, NIPS coauthorship data set, Netscience data set, High-Energy Physics data set, and Political Books data set), we show that the proposed SPIN approach is more powerful and computationally efficient. Note to Practitioners: In recent times, social networks have received a high level of attention due to their proven ability to improve the performance of web search, recommendations in collaborative filtering systems, spreading a technology in the market using viral marketing techniques, etc. It is well known that the interpersonal relationships (or ties or links) between individuals cause change or improvement in the social system, because the decisions made by individuals are influenced heavily by the behavior of their neighbors. An interesting and key problem in social networks is to discover the most influential nodes in the social network, which can influence other nodes in a strong and deep way. This problem is called the target set selection problem and has two variants: 1) the top-k nodes problem, where we are required to identify a set of k influential nodes that maximize the number of nodes being influenced in the network, and 2) the lambda-coverage problem, which involves finding a set of influential nodes of minimum size that can influence a given percentage lambda of the nodes in the entire network. There are many existing algorithms in the literature for solving these problems. In this paper, we propose a new algorithm which is based on a novel interpretation of information diffusion in a social network as a cooperative game. Using this analogy, we develop an algorithm based on the Shapley value of the underlying cooperative game. The proposed algorithm outperforms the existing algorithms in terms of generality or computational complexity or both. Our results are validated through extensive experimentation on both synthetically generated and real-world data sets.
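
The Shapley-value idea behind such algorithms can be illustrated with a generic Monte Carlo sketch: define the value of a coalition of seed nodes as the number of nodes it covers, estimate each node's average marginal contribution over random orderings, and rank nodes by that estimate. The simple one-hop coverage game and the permutation-sampling estimator below are assumptions for illustration; they are not the SPIN algorithms or the diffusion model used in the paper.

```python
# Sketch: Monte Carlo Shapley values for a simple coverage game on a graph.
# value(S) = number of nodes in S plus their direct neighbours; each node's
# Shapley value is its average marginal contribution over random orderings.
import random
from collections import defaultdict

def shapley_coverage(adj, n_samples=200, seed=0):
    """adj: dict mapping node -> iterable of neighbours."""
    rng = random.Random(seed)
    nodes = list(adj)
    phi = defaultdict(float)
    for _ in range(n_samples):
        rng.shuffle(nodes)
        covered = set()
        for v in nodes:
            gain = len({v, *adj[v]} - covered)   # marginal coverage of adding v
            phi[v] += gain
            covered |= {v, *adj[v]}
    return {v: phi[v] / n_samples for v in phi}

# Usage: rank nodes by estimated Shapley value and pick the top-k as seeds.
adj = {1: [2, 3], 2: [1, 3], 3: [1, 2, 4], 4: [3, 5], 5: [4]}
ranking = sorted(shapley_coverage(adj).items(), key=lambda kv: -kv[1])
print(ranking)
```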

Relevance: 60.00%

Abstract:

This paper studies the problem of constructing robust classifiers when the training data is plagued with uncertainty. The problem is posed as a Chance-Constrained Program (CCP) which ensures that the uncertain data points are classified correctly with high probability. Unfortunately, such a CCP turns out to be intractable. The key novelty is in employing Bernstein bounding schemes to relax the CCP as a convex second-order cone program whose solution is guaranteed to satisfy the probabilistic constraint. Prior to this work, only Chebyshev-based relaxations were exploited in learning algorithms. Bernstein bounds employ richer partial information and hence can be far less conservative than Chebyshev bounds. Due to this efficient modeling of uncertainty, the resulting classifiers achieve higher classification margins and hence better generalization. Methodologies for classifying uncertain test data points and error measures for evaluating classifiers robust to uncertain data are discussed. Experimental results on synthetic and real-world datasets show that the proposed classifiers are better equipped to handle data uncertainty and outperform the state of the art in many cases.
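
For context, the earlier Chebyshev-style relaxation that the paper improves upon can itself be written as a second-order cone program: each uncertain training point is summarized by a mean and a covariance factor, and the margin constraint is tightened by a norm term in the whitened weight vector. The cvxpy sketch below illustrates that generic SOCP form under assumed per-point means and covariance factors; it is not the Bernstein-based relaxation proposed here.

```python
# Sketch of a Chebyshev-style robust (chance-constrained) linear classifier
# as a second-order cone program. Point i has mean mu_i and covariance factor
# S_i (Sigma_i = S_i S_i^T); kappa grows with the required classification
# probability. Illustrative only; not the paper's Bernstein relaxation.
import numpy as np
import cvxpy as cp

def robust_linear_classifier(mus, Ss, y, kappa=1.0, C=1.0):
    n, d = mus.shape
    w, b = cp.Variable(d), cp.Variable()
    xi = cp.Variable(n, nonneg=True)
    cons = [y[i] * (mus[i] @ w + b) + xi[i]
            - kappa * cp.norm(Ss[i].T @ w, 2) >= 1 for i in range(n)]
    prob = cp.Problem(cp.Minimize(0.5 * cp.sum_squares(w) + C * cp.sum(xi)), cons)
    prob.solve()
    return w.value, b.value

# Usage with toy uncertain points: identical spherical uncertainty per point.
rng = np.random.default_rng(0)
mus = np.vstack([rng.normal(2, 1, (20, 2)), rng.normal(-2, 1, (20, 2))])
y = np.array([1] * 20 + [-1] * 20)
Ss = np.array([0.3 * np.eye(2)] * 40)
w, b = robust_linear_classifier(mus, Ss, y)
print(w, b)
```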

Relevance: 60.00%

Abstract:

We study the problem of uncertainty in the entries of the kernel matrix, arising in the SVM formulation. Using chance-constrained programming and a novel large deviation inequality, we derive a formulation which is robust to such noise. The resulting formulation applies when the noise is Gaussian or has finite support. The formulation in general is non-convex, but in several cases of interest it reduces to a convex program. The problem of uncertainty in the kernel matrix is motivated by the real-world problem of classifying proteins whose structures are provided with some uncertainty. The formulation derived here naturally incorporates such uncertainty in a principled manner, leading to significant improvements over the state of the art.

Relevance: 60.00%

Abstract:

In this paper we consider the problem of learning an n × n kernel matrix from m (≥ 1) similarity matrices under a general convex loss. Past research has extensively studied the m = 1 case and has derived several algorithms which require sophisticated techniques like ACCP, SOCP, etc. The existing algorithms do not apply if one uses arbitrary losses and often cannot handle the m > 1 case. We present several provably convergent iterative algorithms, where each iteration requires either an SVM or a Multiple Kernel Learning (MKL) solver for the m > 1 case. One of the major contributions of the paper is to extend the well-known Mirror Descent (MD) framework to handle the Cartesian product of psd matrices. This novel extension leads to an algorithm, called EMKL, which solves the problem in O(m^2 log n^2) iterations; each iteration solves an MKL problem involving m kernels and performs m eigen-decompositions of n × n matrices. By suitably defining a restriction on the objective function, a faster version of EMKL is proposed, called REKL, which avoids the eigen-decompositions. An alternative to both EMKL and REKL is also suggested, which requires only an SVM solver. Experimental results on a real-world protein data set involving several similarity matrices illustrate the efficacy of the proposed algorithms.
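
The Mirror Descent ingredient can be illustrated in its simplest setting: entropic mirror descent (exponentiated gradient) over the probability simplex, here used to learn nonnegative weights that combine m given kernels so as to match a target similarity matrix under a squared loss. This is a generic sketch of the MD update with an assumed toy objective; it is not EMKL or REKL and omits the psd-cone handling and the SVM/MKL inner solvers.

```python
# Sketch: entropic mirror descent (exponentiated gradient) on the simplex,
# learning weights eta so that K(eta) = sum_j eta_j K_j approximates a target
# similarity matrix T under a mean squared Frobenius loss. Generic MD update
# for illustration; not the paper's EMKL/REKL algorithms.
import numpy as np

def md_kernel_weights(Ks, T, steps=500, lr=0.1):
    m = len(Ks)
    eta = np.full(m, 1.0 / m)                    # start at the simplex centre
    for _ in range(steps):
        K = sum(e * Kj for e, Kj in zip(eta, Ks))
        grad = np.array([2 * np.mean((K - T) * Kj) for Kj in Ks])  # grad of mean sq. loss
        eta = eta * np.exp(-lr * grad)           # multiplicative (entropic) update
        eta /= eta.sum()                         # Bregman projection back onto the simplex
    return eta

# Usage with toy psd kernels: the target is a hidden mixture of the candidates,
# so the learned weights should approach roughly [0.7, 0.2, 0.1].
rng = np.random.default_rng(0)
Ks = [A @ A.T for A in rng.normal(size=(3, 10, 4))]
T = 0.7 * Ks[0] + 0.2 * Ks[1] + 0.1 * Ks[2]
print(md_kernel_weights(Ks, T).round(2))
```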

Relevance: 60.00%

Abstract:

In this work, we evaluate the performance of a real-world image processing application that uses a cross-correlation algorithm to compare a given image with a reference one. The algorithm processes individual images represented as 2-dimensional matrices of single-precision floating-point values using O(n^4) operations involving dot-products and additions. We implement this algorithm on an nVidia GTX 285 GPU using CUDA, and also parallelize it for the Intel Xeon (Nehalem) and IBM Power7 processors, using both manual and automatic techniques. Pthreads and OpenMP with SSE and VSX vector intrinsics are used for the manually parallelized version, while a state-of-the-art optimization framework based on the polyhedral model is used for automatic compiler parallelization and optimization. The performance of this algorithm on the nVidia GPU suffers from: (1) a smaller shared memory, (2) unaligned device memory access patterns, (3) expensive atomic operations, and (4) weaker single-thread performance. On commodity multi-core processors, the application dataset is small enough to fit in caches, and when parallelized using a combination of task and short-vector data parallelism (via SSE/VSX) or through fully automatic optimization from the compiler, the application matches or beats the performance of the GPU version. The primary reasons for better multi-core performance include larger and faster caches, higher clock frequency, higher on-chip memory bandwidth, and better compiler optimization and support for parallelization. The best performing versions on the Power7, Nehalem, and GTX 285 run in 1.02s, 1.82s, and 1.75s, respectively. These results conclusively demonstrate that, under certain conditions, it is possible for a FLOP-intensive structured application running on a multi-core processor to match or even beat the performance of an equivalent GPU version.
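
To make the O(n^4) cost concrete, the computational kernel of such an application can be reduced to the sketch below: for each of the roughly n^2 relative shifts of an n x n image against the reference, an O(n^2) dot product is accumulated over the overlapping region. This is a plain single-threaded reference version for illustration; it is not the CUDA, SSE/VSX, or compiler-parallelized implementations evaluated in the paper.

```python
# Sketch: naive 2D cross-correlation of an n x n image against a reference.
# ~n^2 relative shifts, each accumulating an O(n^2) dot product over the
# overlap, giving the O(n^4) operation count discussed above.
import numpy as np

def cross_correlate(img, ref):
    n = img.shape[0]
    out = np.zeros((2 * n - 1, 2 * n - 1), dtype=np.float32)
    for dy in range(-n + 1, n):
        for dx in range(-n + 1, n):
            # Overlapping windows of img and the shifted ref.
            y0, y1 = max(0, dy), min(n, n + dy)
            x0, x1 = max(0, dx), min(n, n + dx)
            a = img[y0:y1, x0:x1]
            b = ref[y0 - dy:y1 - dy, x0 - dx:x1 - dx]
            out[dy + n - 1, dx + n - 1] = np.sum(a * b)   # dot product over overlap
    return out

img = np.random.rand(32, 32).astype(np.float32)
ref = np.random.rand(32, 32).astype(np.float32)
print(cross_correlate(img, ref).shape)                    # (63, 63)
```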

Relevance: 60.00%

Abstract:

Conservation of natural resources through sustainable ecosystem management and development is the key to our secure future. The management of ecosystems involves inventorying and monitoring, and applying integrated technologies, methodologies and interdisciplinary approaches for their conservation. Hence, it is now more critical than ever before for humans to be environmentally literate. To realise this vision, both ecological and environmental education must become a fundamental part of the education system at all levels of education. Currently, it is more critical than ever before for humankind as a whole to have a clear understanding of environmental concerns and to follow sustainable development practices. The degradation of our environment is linked to continuing problems of pollution, loss of forest, solid waste disposal, and issues related to economic productivity and national as well as ecological security. Environmental management has gained momentum in recent years, with initiatives focusing on managing environmental hazards and preventing possible disasters. Environmental issues make better sense when one can understand them in the context of one's own cognitive sphere. Environmental education focusing on real-world contexts and issues often begins close to home, encouraging learners to understand and forge connections with their immediate surroundings. The awareness, knowledge, and skills needed for these local connections and understandings provide a base for moving out into larger systems, broader issues, and a more sophisticated comprehension of causes, connections, and consequences. The Environmental Education Programme at CES, in collaboration with the Karnataka Environment Research Foundation (KERF) and referred to as 'Know your Ecosystem', focuses on the importance of investigating ecosystems within the context of human influences, incorporating an examination of ecology, economics, culture, political structure, and social equity as well as natural processes and systems. The ultimate goal of environmental education is to develop an environmentally literate public. It needs to address the connection between our conception and practice of education and our relationship, as human cultures, to life-sustaining ecological systems. For each environmental issue there are many perspectives and much uncertainty. Environmental education cultivates the ability to recognise uncertainty, envision alternative scenarios, and adapt to changing conditions and information. Such knowledge, skills, and mindsets translate into a citizenry that is better equipped to address its common problems and take advantage of opportunities, whether environmental concerns are involved or not.