947 results for Model Mining
Abstract:
Telecommunications network management is based on huge amounts of data that are continuously collected from elements and devices all around the network. The data is monitored and analysed to provide information for decision making in all operation functions. Knowledge discovery and data mining methods can support fast-paced decision making in network operations. In this thesis, I analyse decision making on different levels of network operations. I identify the requirements that decision making sets for knowledge discovery and data mining tools and methods, and I study the resources that are available to them. I then propose two methods for augmenting and applying frequent sets to support everyday decision making. The proposed methods are Comprehensive Log Compression for log data summarisation and Queryable Log Compression for semantic compression of log data. Finally, I suggest a model for a continuous knowledge discovery process and outline how it can be implemented and integrated into the existing network operations infrastructure.
Abstract:
Segmentation is a data mining technique yielding simplified representations of sequences of ordered points. A sequence is divided into some number of homogeneous blocks, and all points within a segment are described by a single value. The focus in this thesis is on piecewise-constant segments, where the most likely description for each segment and the most likely segmentation into some number of blocks can be computed efficiently. Representing sequences as segmentations is useful in, e.g., storage and indexing tasks in sequence databases, and segmentation can be used as a tool in learning about the structure of a given sequence. The discussion in this thesis begins with basic questions related to segmentation analysis, such as choosing the number of segments, and evaluating the obtained segmentations. Standard model selection techniques are shown to perform well for the sequence segmentation task. Segmentation evaluation is proposed with respect to a known segmentation structure. Applying segmentation on certain features of a sequence is shown to yield segmentations that are significantly close to the known underlying structure. Two extensions to the basic segmentation framework are introduced: unimodal segmentation and basis segmentation. The former is concerned with segmentations where the segment descriptions first increase and then decrease, and the latter with the interplay between different dimensions and segments in the sequence. These problems are formally defined and algorithms for solving them are provided and analyzed. Practical applications for segmentation techniques include time series and data stream analysis, text analysis, and biological sequence analysis. In this thesis segmentation applications are demonstrated in analyzing genomic sequences.
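As an illustration of piecewise-constant segmentation, the following is a minimal sketch of the classic dynamic-programming approach, not the thesis's own code. It assumes that the "most likely description" of a block is its mean and that the most likely segmentation minimises the total squared error, which holds under Gaussian noise with fixed variance. The function name `segment` and its return format are hypothetical.

```python
from typing import List, Tuple


def segment(points: List[float], k: int) -> Tuple[float, List[int]]:
    """Split `points` into k contiguous blocks minimising total squared error.

    Returns (total_error, starts), where `starts` holds the start index
    of each block. Assumption: each block is described by its mean.
    """
    n = len(points)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and len(points)")

    # Prefix sums let the squared error of any block be computed in O(1).
    s1 = [0.0] * (n + 1)
    s2 = [0.0] * (n + 1)
    for i, x in enumerate(points):
        s1[i + 1] = s1[i] + x
        s2[i + 1] = s2[i] + x * x

    def cost(i: int, j: int) -> float:
        """Squared error of describing points[i:j] by their mean."""
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    inf = float("inf")
    # best[m][j]: minimal error of splitting points[:j] into m blocks.
    best = [[inf] * (n + 1) for _ in range(k + 1)]
    back = [[0] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0.0
    for m in range(1, k + 1):
        for j in range(m, n + 1):
            for i in range(m - 1, j):
                cand = best[m - 1][i] + cost(i, j)
                if cand < best[m][j]:
                    best[m][j] = cand
                    back[m][j] = i

    # Walk the back-pointers from the end to recover block start indices.
    starts: List[int] = []
    j = n
    for m in range(k, 0, -1):
        j = back[m][j]
        starts.append(j)
    starts.reverse()
    return best[k][n], starts
```

For example, `segment([1, 1, 1, 5, 5, 5, 9, 9], 3)` recovers the three constant blocks starting at indices 0, 3 and 6 with zero error. The triple loop runs in O(k n^2) time, which is adequate for short sequences; choosing `k` itself is the model selection question the abstract mentions and is not addressed here.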
Abstract:
This study presents a comprehensive mathematical formulation model for a short-term open-pit mine block sequencing problem, which considers nearly all relevant technical aspects in open-pit mining. The proposed model aims to obtain the optimum extraction sequences of the original-size (smallest) blocks over short time intervals and in the presence of real-life constraints, including precedence relationship, machine capacity, grade requirements, processing demands and stockpile management. A hybrid branch-and-bound and simulated annealing algorithm is developed to solve the problem. Computational experiments show that the proposed methodology is a promising way to provide quantitative recommendations for mine planning and scheduling engineers.
Abstract:
This paper proposes a new multi-stage mine production timetabling (MMPT) model to optimise open-pit mine production operations, including drilling, blasting and excavating, under real-time mining constraints. The MMPT problem is formulated as a mixed integer programming model and can be solved optimally for small-size MMPT instances by IBM ILOG-CPLEX. Because the problem is NP-hard, an improved shifting-bottleneck-procedure algorithm based on the extended disjunctive graph is developed to solve large-size MMPT instances effectively and efficiently. Extensive computational experiments are presented to validate the proposed algorithm, which efficiently obtains a near-optimal operational timetable of mining equipment units. The advantages are indicated by sensitivity analysis under various real-life scenarios. The proposed MMPT methodology is a promising tool for the mining industry because it is straightforwardly modelled as a standard scheduling model, efficiently solved by the heuristic algorithm, and flexibly expanded by adopting additional industrial constraints.
Abstract:
A three-dimensional mathematical model has been developed to simulate the gas flow, composition, and temperature profiles inside a cupola. Comparison of the model with the reported experimental data shows the presence of a zone with low combustion rate at the tuyere level. For a 24 in (610 mm) cupola with four rows of tuyeres, the combustion zones from each tuyere overlap each other, forming an overall combustion zone of cylindrical shape with a height of approximately 0.2 m. Using the model, it is found that the spout temperature initially increases with increasing blast velocity and attains a maximum. Further increase in blast velocity does not change the spout temperature. This suggests that smaller size tuyeres and higher permeability of the bed can give superior cupola performance. © 1997 The Institute of Materials.
Abstract:
Rapid urbanisation in India has posed serious challenges to decision makers in regional planning, involving a plethora of issues including the provision of basic amenities (electricity, water, sanitation, transport, etc.). Urban planning entails an understanding of landscape and urban dynamics together with their causal factors. Identifying, delineating and mapping landscapes on a temporal scale provides an opportunity to monitor the changes, which is important for natural resource management and sustainable planning activities. Multi-source, multi-sensor, multi-temporal, multi-frequency or multi-polarization remote sensing data, combined with efficient classification algorithms and pattern recognition techniques, aid in capturing these dynamics. This paper analyses the landscape dynamics of Greater Bangalore by: (i) characterisation of direct impervious surface, (ii) computation of forest fragmentation indices and (iii) modeling to quantify and categorise urban changes. Linear unmixing is used to solve the mixed pixel problem of coarse resolution super spectral MODIS data for impervious surface characterisation. Fragmentation indices were used to classify forests as interior, perforated, edge, transitional, patch and undetermined. Based on this, an urban growth model was developed to determine the type of urban growth: infill, expansion or outlying growth. This helped in visualising urban growth poles and the consequences of earlier policy decisions, which can help in evolving strategies for effective land use policies.
Suite of tools for statistical N-gram language modeling for pattern mining in whole genome sequences
Abstract:
Genome sequences contain a number of patterns that have biomedical significance. Repetitive sequences of various kinds are a primary component of most of the genomic sequence patterns. We extended the suffix-array based Biological Language Modeling Toolkit to compute n-gram frequencies as well as n-gram language-model based perplexity in windows over the whole genome sequence to find biologically relevant patterns. We present the suite of tools and their application for analysis on whole human genome sequence.
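To make the windowed n-gram perplexity idea concrete, here is a minimal sketch. It is not the Biological Language Modeling Toolkit: it counts n-grams naively with a `Counter` rather than a suffix array, and the add-one smoothing, the `ACGT` alphabet and the function name `window_perplexities` are assumptions made for illustration.

```python
import math
from collections import Counter
from typing import List


def window_perplexities(seq: str, n: int, window: int, step: int,
                        alphabet: str = "ACGT") -> List[float]:
    """Perplexity of each sliding window under an n-gram model of `seq`.

    The model is estimated from the whole sequence with add-one smoothing
    (an assumption of this sketch). Low perplexity marks windows made of
    n-grams that are common in the sequence, such as repeats.
    """
    if n < 1 or window < n or step < 1:
        raise ValueError("require n >= 1, window >= n and step >= 1")

    # Frequencies of every n-gram in the whole sequence.
    ngrams = Counter(seq[i:i + n] for i in range(len(seq) - n + 1))
    # Frequencies of each (n-1)-symbol context, summed over its n-grams.
    contexts: Counter = Counter()
    for gram, count in ngrams.items():
        contexts[gram[:-1]] += count

    def log_prob(gram: str) -> float:
        """Smoothed log P(last symbol | preceding n-1 symbols)."""
        num = ngrams[gram] + 1
        den = contexts[gram[:-1]] + len(alphabet)
        return math.log(num / den)

    result = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        grams = [chunk[i:i + n] for i in range(window - n + 1)]
        mean_lp = sum(log_prob(g) for g in grams) / len(grams)
        result.append(math.exp(-mean_lp))
    return result
```

Calling `window_perplexities(genome, n=3, window=1000, step=500)` on a sequence string `genome` would return one value per window; windows whose perplexity stands out from the rest are candidates for closer inspection. A whole-genome analysis would need a more memory-efficient index, which is the role the suffix array plays in the tools described above.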
Abstract:
To obtain the distribution rules of in situ stress and mining-induced stress at Beiminghe Iron Mine, the stress relief method by overcoring was used to measure the in situ stress, and the MC type bore-hole stress gauge was adopted to measure the mining-induced stress. For the in situ stress measurement, the technique of improved hollow inclusion cells was adopted, which can realize complete temperature compensation. Based on the measurement results, the distribution model of in situ stress was established and analyzed. The in situ stress measurements show that the maximum horizontal stress is 1.75-2.45 times the vertical stress and almost 1.83 times the minimum horizontal stress in this mineral field. The mining-induced stress measurements show that, according to the magnitude of the front abutment pressure, the stress region can be separated into a stress-relaxed area, a stress-concentrated area and an initial stress area. At the -50 m mining level of this mine, the stress-relaxed area extends 0-3 m ahead of the mining face and the stress-concentrated area extends 3-55 m ahead of the mining face; the maximum mining-induced stress is 16.5-17.5 MPa and occurs 15-20 m from the mining face. The coefficient of stress concentration is 1.85.
Abstract:
A data mining model that predicts whether a flight will leave late because of a weather delay. It can be used to arrange a later connection if you have a connecting flight.
Abstract:
This paper describes the application of variable-horizon model predictive control to trajectory generation in surface excavation. A nonlinear dynamic model of a surface mining machine digging in oil sand is developed as a test platform. This model is then stabilised with an inner-loop controller before being linearised to generate a prediction model. The linear model is used to design a predictive controller for trajectory generation. A variable horizon formulation is augmented with extra terms in the cost function to allow more control over digging, whilst still preserving the guarantee of finite-time completion. Simulations show the generation of realistic trajectories, motivating new applications of variable horizon MPC for autonomy that go beyond the realm of vehicle path planning. ©2010 IEEE.
Abstract:
Although single nucleotide polymorphisms (SNPs) are important resources for population genetics, pedigree analysis and genomic mapping, such loci have not been reported in Pacific abalone so far. In this study, a bioinformatics strategy was adopted to discover SNPs within the expressed sequence tags (ESTs) of Pacific abalone, Haliotis discus hannai. Polymerase chain reaction direct sequencing (PCR-DS) and allele-specific PCR (AS-PCR) were then used for SNP detection and genotype scoring, respectively. A total of 5893 ESTs were assembled and 302 putative SNPs were identified. The average density of SNPs in ESTs was 1%. Fifty-two sets of sequencing primers were designed from ESTs flanking the SNPs to amplify the genomic DNA, and 13 generated products of the expected size. Polymerase chain reaction direct sequencing of the amplification products from pooled DNA samples revealed 40 polymorphic SNP loci. Using a modified tetra-primer AS-PCR, seven mitochondrial and six nuclear SNPs were typed and characterized among 37 wild abalones. In conclusion, it is feasible to discover SNPs from a limited number of ESTs, and AS-PCR, as a simple, robust and reliable assay, could be a primary method for small- and medium-scale SNP detection in abalones as well as other non-model organisms.
Abstract:
Ferré, S. and King, R. D. (2004) A dichotomic search algorithm for mining and learning in domain-specific logics. Fundamenta Informaticae. IOS Press. To appear.
Abstract:
High-integrity castings require sophisticated design and manufacturing procedures to ensure they are essentially macrodefect free. Unfortunately, an important class of such defects (macroporosity, misruns, and pipe shrinkage) are all functions of the interactions of free surface flow, heat transfer, and solidification in complex geometries. Because these defects arise as an interaction of the preceding continuum phenomena, genuinely predictive models of these defects must represent these interactions explicitly. This work describes an attempt to model the formation of macrodefects explicitly as a function of the interacting continuum phenomena in arbitrarily complex three-dimensional geometries. The computational approach exploits a compatible set of finite volume procedures extended to unstructured meshes. The implementation of the model is described together with its testing and a measure of validation. The model demonstrates the potential to reliably predict shrinkage macroporosity, misruns, and pipe shrinkage directly as a result of interactions among free-surface fluid flow, heat transfer, and solidification.
Abstract:
The design and development of a comprehensive computational model of a copper stockpile leach process is summarized. The computational fluid dynamic software framework PHYSICA+ and various phenomena were used to model transport phenomena, mineral reaction kinetics, bacterial effects, and heat, energy and acid balances for the overall leach process. In this paper, the performance of the model is investigated, in particular its sensitivity to particle size and ore permeability. A combination of literature and laboratory sources was used to parameterize the model. The simulation results from the leach model are compared with closely controlled column pilot scale tests. The main performance characteristics (e.g. copper recovery rate) predicted by the model compare reasonably well with the experimental data and clearly reflect the qualitative behavior of the process in many respects. The model is used to provide a measure of the sensitivity of leach behavior to ore permeability, and simulation results are examined for several different particle size distributions.