901 resultados para Clustering search algorithm
Resumo:
This thesis presents research within empirical financial economics with focus on liquidity and portfolio optimisation in the stock market. The discussion on liquidity is focused on measurement issues, including TAQ data processing and measurement of systematic liquidity factors (FSO). Furthermore, a framework for treatment of the two topics in combination is provided. The liquidity part of the thesis gives a conceptual background to liquidity and discusses several different approaches to liquidity measurement. It contributes to liquidity measurement by providing detailed guidelines on the data processing needed for applying TAQ data to liquidity research. The main focus, however, is the derivation of systematic liquidity factors. The principal component approach to systematic liquidity measurement is refined by the introduction of moving and expanding estimation windows, allowing for time-varying liquidity co-variances between stocks. Under several liability specifications, this improves the ability to explain stock liquidity and returns, as compared to static window PCA and market average approximations of systematic liquidity. The highest ability to explain stock returns is obtained when using inventory cost as a liquidity measure and a moving window PCA as the systematic liquidity derivation technique. Systematic factors of this setting also have a strong ability in explaining a cross-sectional liquidity variation. Portfolio optimisation in the FSO framework is tested in two empirical studies. These contribute to the assessment of FSO by expanding the applicability to stock indexes and individual stocks, by considering a wide selection of utility function specifications, and by showing explicitly how the full-scale optimum can be identified using either grid search or the heuristic search algorithm of differential evolution. The studies show that relative to mean-variance portfolios, FSO performs well in these settings and that the computational expense can be mitigated dramatically by application of differential evolution.
Resumo:
This thesis presents a large scale numerical investigation of heterogeneous terrestrial optical communications systems and the upgrade of fourth generation terrestrial core to metro legacy interconnects to fifth generation transmission system technologies. Retrofitting (without changing infrastructure) is considered for commercial applications. ROADM are crucial enabling components for future core network developments however their re-routing ability means signals can be switched mid-link onto sub-optimally configured paths which raises new challenges in network management. System performance is determined by a trade-off between nonlinear impairments and noise, where the nonlinear signal distortions depend critically on deployed dispersion maps. This thesis presents a comprehensive numerical investigation into the implementation of phase modulated signals in transparent reconfigurable wavelength division multiplexed fibre optic communication terrestrial heterogeneous networks. A key issue during system upgrades is whether differential phase encoded modulation formats are compatible with the cost optimised dispersion schemes employed in current 10 Gb/s systems. We explore how robust transmission is to inevitable variations in the dispersion mapping and how large the margins are when suboptimal dispersion management is applied. We show that a DPSK transmission system is not drastically affected by reconfiguration from periodic dispersion management to lumped dispersion mapping. A novel DPSK dispersion map optimisation methodology which reduces drastically the optimisation parameter space and the many ways to deploy dispersion maps is also presented. This alleviates strenuous computing requirements in optimisation calculations. This thesis provides a very efficient and robust way to identify high performing lumped dispersion compensating schemes for use in heterogeneous RZ-DPSK terrestrial meshed networks with ROADMs. A modified search algorithm which further reduces this number of configuration combinations is also presented. The results of an investigation of the feasibility of detouring signals locally in multi-path heterogeneous ring networks is also presented.
Resumo:
2000 Mathematics Subject Classification: 62P10, 92D10, 92D30, 94A17, 62L10.
Resumo:
We describe an approach for recovering the plaintext in block ciphers having a design structure similar to the Data Encryption Standard but with improperly constructed S-boxes. The experiments with a backtracking search algorithm performing this kind of attack against modified DES/Triple-DES in ECB mode show that the unknown plaintext can be recovered with a small amount of uncertainty and this algorithm is highly efficient both in time and memory costs for plaintext sources with relatively low entropy. Our investigations demonstrate once again that modifications resulting to S-boxes which still satisfy some design criteria may lead to very weak ciphers. ACM Computing Classification System (1998): E.3, I.2.7, I.2.8.
Resumo:
The primary aim of this dissertation is to develop data mining tools for knowledge discovery in biomedical data when multiple (homogeneous or heterogeneous) sources of data are available. The central hypothesis is that, when information from multiple sources of data are used appropriately and effectively, knowledge discovery can be better achieved than what is possible from only a single source. ^ Recent advances in high-throughput technology have enabled biomedical researchers to generate large volumes of diverse types of data on a genome-wide scale. These data include DNA sequences, gene expression measurements, and much more; they provide the motivation for building analysis tools to elucidate the modular organization of the cell. The challenges include efficiently and accurately extracting information from the multiple data sources; representing the information effectively, developing analytical tools, and interpreting the results in the context of the domain. ^ The first part considers the application of feature-level integration to design classifiers that discriminate between soil types. The machine learning tools, SVM and KNN, were used to successfully distinguish between several soil samples. ^ The second part considers clustering using multiple heterogeneous data sources. The resulting Multi-Source Clustering (MSC) algorithm was shown to have a better performance than clustering methods that use only a single data source or a simple feature-level integration of heterogeneous data sources. ^ The third part proposes a new approach to effectively incorporate incomplete data into clustering analysis. Adapted from K-means algorithm, the Generalized Constrained Clustering (GCC) algorithm makes use of incomplete data in the form of constraints to perform exploratory analysis. Novel approaches for extracting constraints were proposed. For sufficiently large constraint sets, the GCC algorithm outperformed the MSC algorithm. ^ The last part considers the problem of providing a theme-specific environment for mining multi-source biomedical data. The database called PlasmoTFBM, focusing on gene regulation of Plasmodium falciparum, contains diverse information and has a simple interface to allow biologists to explore the data. It provided a framework for comparing different analytical tools for predicting regulatory elements and for designing useful data mining tools. ^ The conclusion is that the experiments reported in this dissertation strongly support the central hypothesis.^
Resumo:
Adjoint methods have proven to be an efficient way of calculating the gradient of an objective function with respect to a shape parameter for optimisation, with a computational cost nearly independent of the number of the design variables [1]. The approach in this paper links the adjoint surface sensitivities (gradient of objective function with respect to the surface movement) with the parametric design velocities (movement of the surface due to a CAD parameter perturbation) in order to compute the gradient of the objective function with respect to CAD variables.
For a successful implementation of shape optimization strategies in practical industrial cases, the choice of design variables or parameterisation scheme used for the model to be optimized plays a vital role. Where the goal is to base the optimization on a CAD model the choices are to use a NURBS geometry generated from CAD modelling software, where the position of the NURBS control points are the optimisation variables [2] or to use the feature based CAD model with all of the construction history to preserve the design intent [3]. The main advantage of using the feature based model is that the optimized model produced can be directly used for the downstream applications including manufacturing and process planning.
This paper presents an approach for optimization based on the feature based CAD model, which uses CAD parameters defining the features in the model geometry as the design variables. In order to capture the CAD surface movement with respect to the change in design variable, the “Parametric Design Velocity” is calculated, which is defined as the movement of the CAD model boundary in the normal direction due to a change in the parameter value.
The approach presented here for calculating the design velocities represents an advancement in terms of capability and robustness of that described by Robinson et al. [3]. The process can be easily integrated to most industrial optimisation workflows and is immune to the topology and labelling issues highlighted by other CAD based optimisation processes. It considers every continuous (“real value”) parameter type as an optimisation variable, and it can be adapted to work with any CAD modelling software, as long as it has an API which provides access to the values of the parameters which control the model shape and allows the model geometry to be exported. To calculate the movement of the boundary the methodology employs finite differences on the shape of the 3D CAD models before and after the parameter perturbation. The implementation procedure includes calculating the geometrical movement along a normal direction between two discrete representations of the original and perturbed geometry respectively. Parametric design velocities can then be directly linked with adjoint surface sensitivities to extract the gradients to use in a gradient-based optimization algorithm.
The optimisation of a flow optimisation problem is presented, in which the power dissipation of the flow in an automotive air duct is to be reduced by changing the parameters of the CAD geometry created in CATIA V5. The flow sensitivities are computed with the continuous adjoint method for a laminar and turbulent flow [4] and are combined with the parametric design velocities to compute the cost function gradients. A line-search algorithm is then used to update the design variables and proceed further with optimisation process.
Resumo:
A dissertation submitted in fulfillment of the requirements to the degree of Master in Computer Science and Computer Engineering
Resumo:
The majority of research work carried out in the field of Operations-Research uses methods and algorithms to optimize the pick-up and delivery problem. Most studies aim to solve the vehicle routing problem, to accommodate optimum delivery orders, vehicles etc. This paper focuses on green logistics approach, where existing Public Transport infrastructure capability of a city is used for the delivery of small and medium sized packaged goods thus, helping improve the situation of urban congestion and greenhouse gas emissions reduction. It carried out a study to investigate the feasibility of the proposed multi-agent based simulation model, for efficiency of cost, time and energy consumption. Multimodal Dijkstra Shortest Path algorithm and Nested Monte Carlo Search have been employed for a two-phase algorithmic approach used for generation of time based cost matrix. The quality of the tour is dependent on the efficiency of the search algorithm implemented for plan generation and route planning. The results reveal a definite advantage of using Public Transportation over existing delivery approaches in terms of energy efficiency.
Resumo:
Overrecentdecades,remotesensinghasemergedasaneffectivetoolforimprov- ing agriculture productivity. In particular, many works have dealt with the problem of identifying characteristics or phenomena of crops and orchards on different scales using remote sensed images. Since the natural processes are scale dependent and most of them are hierarchically structured, the determination of optimal study scales is mandatory in understanding these processes and their interactions. The concept of multi-scale/multi- resolution inherent to OBIA methodologies allows the scale problem to be dealt with. But for that multi-scale and hierarchical segmentation algorithms are required. The question that remains unsolved is to determine the suitable scale segmentation that allows different objects and phenomena to be characterized in a single image. In this work, an adaptation of the Simple Linear Iterative Clustering (SLIC) algorithm to perform a multi-scale hierarchi- cal segmentation of satellite images is proposed. The selection of the optimal multi-scale segmentation for different regions of the image is carried out by evaluating the intra- variability and inter-heterogeneity of the regions obtained on each scale with respect to the parent-regions defined by the coarsest scale. To achieve this goal, an objective function, that combines weighted variance and the global Moran index, has been used. Two different kinds of experiment have been carried out, generating the number of regions on each scale through linear and dyadic approaches. This methodology has allowed, on the one hand, the detection of objects on different scales and, on the other hand, to represent them all in a sin- gle image. Altogether, the procedure provides the user with a better comprehension of the land cover, the objects on it and the phenomena occurring.
Resumo:
With the eye-catching advances in sensing technologies, smart water networks have been attracting immense research interest in recent years. One of the most overarching tasks in smart water network management is the reduction of water loss (such as leaks and bursts in a pipe network). In this paper, we propose an efficient scheme to position water loss event based on water network topology. The state-of-the-art approach to this problem, however, utilizes the limited topology information of the water network, that is, only one single shortest path between two sensor locations. Consequently, the accuracy of positioning water loss events is still less desirable. To resolve this problem, our scheme consists of two key ingredients: First, we design a novel graph topology-based measure, which can recursively quantify the "average distances" for all pairs of senor locations simultaneously in a water network. This measure will substantially improve the accuracy of our positioning strategy, by capturing the entire water network topology information between every two sensor locations, yet without any sacrifice of computational efficiency. Then, we devise an efficient search algorithm that combines the "average distances" with the difference in the arrival times of the pressure variations detected at sensor locations. The viable experimental evaluations on real-world test bed (WaterWiSe@SG) demonstrate that our proposed positioning scheme can identify water loss event more accurately than the best-known competitor.
Resumo:
In the framework of a global transition to a low-carbon energy mix, the interest in advanced nuclear Small Modular Reactors (SMRs) has been growing at the international level. Due to the high level of maturity reached by Severe Accident Codes for currently operating rectors, their applicability to advanced SMRs is starting to be studied. Within the present work of thesis and in the framework of a collaboration between ENEA, UNIBO and IRSN, an ASTEC code model of a generic IRIS reactor has been developed. The simulation of a DBA sequence involving the operation of all the passive safety systems of the generic IRIS has been carried out to investigate the code model capability in the prediction of the thermal-hydraulics characterizing an integral SMR adopting a passive mitigation strategy. The following simulation of 4 BDBAs sequences explores the applicability of Severe Accident Codes to advance SMRs in beyond-design and core-degradation conditions. The uncertainty affecting a code simulation can be estimated by using the method of Input Uncertainty Propagation, whose application has been realized through the RAVEN-ASTEC coupling and implementation on an HPC platform. This probabilistic methodology has been employed in a study of the uncertainty affecting the passive safety system operation in the DBA simulation of ASTEC, providing a further characterization of the thermal-hydraulics of this sequence. The application of the Uncertainty Quantification method to early core-melt phenomena has been investigated in the framework of a BEPU analysis of the ASTEC simulation of the QUENCH test-6 experiment. A possible solution to the encountered challenges has been proposed through the application of a Limit Surface search algorithm.
Resumo:
Latency can be defined as the sum of the arrival times at the customers. Minimum latency problems are specially relevant in applications related to humanitarian logistics. This thesis presents algorithms for solving a family of vehicle routing problems with minimum latency. First the latency location routing problem (LLRP) is considered. It consists of determining the subset of depots to be opened, and the routes that a set of homogeneous capacitated vehicles must perform in order to visit a set of customers such that the sum of the demands of the customers assigned to each vehicle does not exceed the capacity of the vehicle. For solving this problem three metaheuristic algorithms combining simulated annealing and variable neighborhood descent, and an iterated local search (ILS) algorithm, are proposed. Furthermore, the multi-depot cumulative capacitated vehicle routing problem (MDCCVRP) and the multi-depot k-traveling repairman problem (MDk-TRP) are solved with the proposed ILS algorithm. The MDCCVRP is a special case of the LLRP in which all the depots can be opened, and the MDk-TRP is a special case of the MDCCVRP in which the capacity constraints are relaxed. Finally, a LLRP with stochastic travel times is studied. A two-stage stochastic programming model and a variable neighborhood search algorithm are proposed for solving the problem. Furthermore a sampling method is developed for tackling instances with an infinite number of scenarios. Extensive computational experiments show that the proposed methods are effective for solving the problems under study.
Resumo:
A large amount of biological data has been produced in the last years. Important knowledge can be extracted from these data by the use of data analysis techniques. Clustering plays an important role in data analysis, by organizing similar objects from a dataset into meaningful groups. Several clustering algorithms have been proposed in the literature. However, each algorithm has its bias, being more adequate for particular datasets. This paper presents a mathematical formulation to support the creation of consistent clusters for biological data. Moreover. it shows a clustering algorithm to solve this formulation that uses GRASP (Greedy Randomized Adaptive Search Procedure). We compared the proposed algorithm with three known other algorithms. The proposed algorithm presented the best clustering results confirmed statistically. (C) 2009 Elsevier Ltd. All rights reserved.
Resumo:
In this paper we propose a nature-inspired approach that can boost the Optimum-Path Forest (OPF) clustering algorithm by optimizing its parameters in a discrete lattice. The experiments in two public datasets have shown that the proposed algorithm can achieve similar parameters' values compared to the exhaustive search. Although, the proposed technique is faster than the traditional one, being interesting for intrusion detection in large scale traffic networks. © 2012 IEEE.
Resumo:
A graph clustering algorithm constructs groups of closely related parts and machines separately. After they are matched for the least intercell moves, a refining process runs on the initial cell formation to decrease the number of intercell moves. A simple modification of this main approach can deal with some practical constraints, such as the popular constraint of bounding the maximum number of machines in a cell. Our approach makes a big improvement in the computational time. More importantly, improvement is seen in the number of intercell moves when the computational results were compared with best known solutions from the literature. (C) 2009 Elsevier Ltd. All rights reserved.