771 results for Multi-relational data mining


Relevance:

100.00%

Publisher:

Abstract:

The van der Heijden Studies Database has been reviewed to identify 'Draw Studies' with sub-7-man positions in the main line which are not draws. The data-mining method is described. Some 1,500 studies were faulted, 700 for the first time: 14 of the more interesting faults are highlighted and discussed.

Relevance:

100.00%

Publisher:

Abstract:

The ground surface net solar radiation is the energy that drives physical and chemical processes at the ground surface. In this paper, multi-spectral data from the Landsat-5 TM, topographic data from a gridded digital elevation model, field measurements, and the atmosphere model LOWTRAN 7 are used to estimate surface net solar radiation over the FIFE site. First, an improved method is presented and used for calculating total surface incoming radiation. Then, surface albedo is integrated from surface reflectance factors derived from remotely sensed Landsat-5 TM data. Finally, surface net solar radiation is calculated by subtracting surface upwelling radiation from the total surface incoming radiation.
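The final step described above can be sketched as follows. This is a minimal illustration, not the paper's method: it assumes that upwelling radiation equals albedo times incoming radiation (the usual definition of broadband albedo), and the function name and units are hypothetical.

```python
def net_solar_radiation(incoming_w_m2, albedo):
    """Return net shortwave radiation (W/m^2) as incoming minus upwelling."""
    if not 0.0 <= albedo <= 1.0:
        raise ValueError("albedo must lie in [0, 1]")
    # Assumption: the upwelling (reflected) part is albedo * incoming.
    upwelling = albedo * incoming_w_m2
    return incoming_w_m2 - upwelling
```

For instance, 800 W/m^2 of incoming radiation over a surface with albedo 0.25 leaves 600 W/m^2 of net radiation under this assumption.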

Relevance:

100.00%

Publisher:

Abstract:

Many recent inverse scattering techniques have been designed for single-frequency scattered fields in the frequency domain. In practice, however, the data are collected in the time domain. Frequency domain inverse scattering algorithms obviously apply to time-harmonic scattering, or nearly time-harmonic scattering, through application of the Fourier transform. Fourier transform techniques can also be applied to non-time-harmonic scattering from pulses. Our goal here is twofold: first, to establish conditions on the time-dependent waves that provide a correspondence between time domain and frequency domain inverse scattering via Fourier transforms without recourse to the conventional limiting amplitude principle; second, to apply that analysis to extend a particular scattering technique, namely the point source method, to scattering from the requisite pulses. Numerical examples illustrate the method and suggest that admissible pulses deliver superior reconstructions compared to straight averaging of multi-frequency data. Copyright (C) 2006 John Wiley & Sons, Ltd.

Relevance:

100.00%

Publisher:

Abstract:

This is a review of progress in the Chess Endgame field. It includes news of the promulgation of Endgame Tables, their use, non-use and potential runtime creation. It includes news of data-mining achievements related to 7-man chess and to the field of Chess Studies. It includes news of an algorithm to create Endgame Tables for variants of the normal game of chess.

Relevance:

100.00%

Publisher:

Abstract:

One of the most influential and popular data mining methods is the k-Means algorithm for cluster analysis. Techniques for improving the efficiency of k-Means have largely been explored in two main directions. First, the amount of computation can be significantly reduced by adopting geometrical constraints and an efficient data structure, notably a multidimensional binary search tree (KD-Tree). These techniques reduce the number of distance computations the algorithm performs at each iteration. The second direction is parallel processing, where data and computation loads are distributed over many processing nodes. However, little work has been done to provide a parallel formulation of the efficient sequential techniques based on KD-Trees. Such approaches are expected to have an irregular distribution of computation load and can suffer from load imbalance. This issue has so far limited the adoption of these efficient k-Means variants in parallel computing environments. In this work, we provide a parallel formulation of the KD-Tree based k-Means algorithm for distributed memory systems and address its load balancing issue. Three solutions have been developed and tested: two are based on a static partitioning of the data set, and the third incorporates a dynamic load balancing policy.
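The idea of static partitioning can be sketched in a few lines. This is only an illustration under stated assumptions: it uses a plain k-Means iteration with no KD-Tree pruning and no message passing, and a sequential loop stands in for the processing nodes. All function names are hypothetical.

```python
def nearest(point, centroids):
    """Index of the centroid closest to point (squared Euclidean distance)."""
    return min(range(len(centroids)),
               key=lambda j: sum((p - c) ** 2 for p, c in zip(point, centroids[j])))

def partial_stats(partition, centroids):
    """Per-cluster coordinate sums and counts for one data partition."""
    dim = len(centroids[0])
    sums = [[0.0] * dim for _ in centroids]
    counts = [0] * len(centroids)
    for point in partition:
        j = nearest(point, centroids)
        counts[j] += 1
        for d in range(dim):
            sums[j][d] += point[d]
    return sums, counts

def kmeans_step(partitions, centroids):
    """One k-Means iteration that combines partial statistics from all partitions."""
    dim = len(centroids[0])
    total_sums = [[0.0] * dim for _ in centroids]
    total_counts = [0] * len(centroids)
    # In a distributed setting each partition would live on its own node and
    # only the small (sums, counts) pairs would be exchanged.
    for partition in partitions:
        sums, counts = partial_stats(partition, centroids)
        for j in range(len(centroids)):
            total_counts[j] += counts[j]
            for d in range(dim):
                total_sums[j][d] += sums[j][d]
    # Empty clusters keep their previous centroid.
    return [
        [s / total_counts[j] for s in total_sums[j]] if total_counts[j] else list(centroids[j])
        for j in range(len(centroids))
    ]
```

Because only per-cluster sums and counts are combined, the result does not depend on how the points are split across partitions. The load imbalance the abstract mentions arises when KD-Tree pruning makes some partitions much cheaper to process than others, which this sketch does not model.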

Relevance:

100.00%

Publisher:

Abstract:

In many data mining applications, automated retrieval of text and image information is needed. This becomes essential with the growth of the Internet and digital libraries. Our approach is based on latent semantic indexing (LSI) and the corresponding term-by-document matrix suggested by Berry and his co-authors. Instead of using deterministic methods to find the required number of first "k" singular triplets, we propose a stochastic approach. First, we use a Monte Carlo method to sample and build a much smaller term-by-document matrix (e.g. a k x k matrix), from which we then find the first "k" triplets using standard deterministic methods. Second, we investigate how the problem can be reduced to finding the "k" largest eigenvalues using parallel Monte Carlo methods. We apply these methods to the initial matrix and also to the reduced one. The algorithms run on a cluster of workstations under MPI. Results of experiments in textual retrieval of Web documents are presented, together with a comparison of the proposed stochastic methods. (C) 2003 IMACS. Published by Elsevier Science B.V. All rights reserved.
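To make the notion of a singular triplet concrete, the sketch below approximates the largest triplet (u, sigma, v) of a small term-by-document matrix. It uses deterministic power iteration on A^T A, which is a stand-in chosen for brevity and not the Monte Carlo approach the abstract proposes. The function names and the toy matrix are assumptions.

```python
import math

def matvec(matrix, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, vec)) for row in matrix]

def transpose(matrix):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*matrix)]

def dominant_singular_triplet(matrix, iterations=200):
    """Approximate the largest singular triplet (u, sigma, v) by power iteration."""
    a_t = transpose(matrix)
    v = [1.0] * len(matrix[0])  # arbitrary non-zero start vector
    for _ in range(iterations):
        w = matvec(a_t, matvec(matrix, v))  # apply A^T A
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            raise ValueError("matrix is zero or the start vector lies in its null space")
        v = [x / norm for x in w]
    av = matvec(matrix, v)
    sigma = math.sqrt(sum(x * x for x in av))  # singular value is ||A v||
    u = [x / sigma for x in av]
    return u, sigma, v
```

The link to eigenvalues is that sigma squared is the largest eigenvalue of A^T A, which is why finding singular triplets can be recast as an eigenvalue problem. Power iteration converges slowly when the two largest singular values are close, and it fails if the start vector is orthogonal to the dominant direction.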

Relevance:

100.00%

Publisher:

Abstract:

This paper is concerned with the selection of inputs for classification models based on ratios of measured quantities. For this purpose, all possible ratios are built from the quantities involved and variable selection techniques are used to choose a convenient subset of ratios. In this context, two selection techniques are proposed: one based on a pre-selection procedure and another based on a genetic algorithm. In an example involving the financial distress prediction of companies, the models obtained from ratios selected by the proposed techniques compare favorably to a model using ratios usually found in the financial distress literature.
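The first step, building every candidate ratio from the measured quantities, can be sketched as below. The quantity names are hypothetical, and treating a/b and b/a as separate candidates is an assumption; the selection stage (pre-selection or genetic algorithm) is not shown.

```python
from itertools import permutations

def all_ratios(quantities):
    """Map 'a/b' to a / b for every ordered pair of distinct quantities."""
    ratios = {}
    for (name_a, a), (name_b, b) in permutations(quantities.items(), 2):
        if b != 0:  # skip undefined ratios
            ratios[f"{name_a}/{name_b}"] = a / b
    return ratios
```

With n non-zero quantities this yields n(n-1) candidates, so the candidate set grows quadratically and some form of variable selection becomes necessary.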

Relevance:

100.00%

Publisher:

Abstract:

Technology-enhanced or Computer Aided Learning (e-learning) can be institutionally integrated and supported by learning management systems or Virtual Learning Environments (VLEs) to offer efficiency gains, effectiveness and scalability of the e-learning paradigm. However, this can only be achieved through the integration of pedagogically intelligent approaches, a lesson preparation tools environment and a VLE that is well accepted by both students and teachers. This paper critically explores some of the issues relevant to the scalable routinisation of e-learning at the tertiary level, typically for first-year university undergraduates, with the teaching of Relational Data Analysis (RDA), as supported by multimedia authoring, as a case study. The paper concludes that blended learning approaches, which balance the deployment of e-learning with other modalities of learning delivery such as instructor-mediated group learning, offer the most flexible and scalable route to e-learning. This, however, requires the graceful integration of platforms for multimedia production, distribution and delivery through advanced interactive spaces that provoke learner engagement and promote learning autonomy and group learning, facilitated by a cooperative-creative learning environment that remains open to personal exploration of constructivist-constructionist pathways to learning.

Relevance:

100.00%

Publisher:

Abstract:

This review of recent developments starts with the publication of Harold van der Heijden's Study Database Edition IV, John Nunn's second trilogy on the endgame, and a range of endgame tables (EGTs) to the DTC, DTZ and DTZ50 metrics. It then summarises data-mining work by Eiko Bleicher and Guy Haworth in 2010. This used CQL and pgn2fen to find some 3,000 EGT-faulted studies in the database above, and the Type A (value-critical) and Type B-DTM (DTM-depth-critical) zugzwangs in the mainlines of those studies. The same technique was used to mine Chessbase's BIG DATABASE 2010 to identify Type A/B zugzwangs, and to identify the pattern of value-concession and DTM-depth concession in sub-7-man play.

Relevance:

100.00%

Publisher:

Abstract:

Peak picking is an early key step in MS data analysis. We compare three commonly used approaches to peak picking and discuss their merits by means of statistical analysis. Methods investigated encompass signal-to-noise ratio, continuous wavelet transform, and a correlation-based approach using a Gaussian template. Functionality of the three methods is illustrated and discussed in a practical context using a mass spectral data set created with MALDI-TOF technology. Sensitivity and specificity are investigated using a manually defined reference set of peaks. As an additional criterion, the robustness of the three methods is assessed by a perturbation analysis and illustrated using ROC curves.
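A signal-to-noise criterion of the kind compared above can be sketched as follows. The baseline (median intensity), the noise estimate (median absolute deviation) and the default threshold are assumptions of this sketch, not details taken from the paper, and the function name is hypothetical.

```python
from statistics import median

def pick_peaks_snr(intensities, snr_threshold=3.0):
    """Return indices of local maxima whose signal-to-noise ratio exceeds the threshold."""
    baseline = median(intensities)
    # Median absolute deviation as a robust noise estimate (an assumption of this sketch).
    noise = median(abs(x - baseline) for x in intensities)
    if noise == 0:
        raise ValueError("noise estimate is zero; cannot compute a signal-to-noise ratio")
    peaks = []
    for i in range(1, len(intensities) - 1):
        is_local_max = intensities[i - 1] < intensities[i] >= intensities[i + 1]
        if is_local_max and (intensities[i] - baseline) / noise > snr_threshold:
            peaks.append(i)
    return peaks
```

Sensitivity and specificity against a manually defined reference set would then be computed by comparing the returned indices with the reference peak positions while varying `snr_threshold`.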

Relevance:

100.00%

Publisher:

Abstract:

This paper introduces a new fast, effective and practical model structure construction algorithm for a mixture of experts network system utilising only process data. The algorithm is based on a novel forward constrained regression procedure. Given a full set of the experts as potential model bases, the structure construction algorithm, formed on the forward constrained regression procedure, selects the most significant model base one by one so as to minimise the overall system approximation error at each iteration, while the gate parameters in the mixture of experts network system are accordingly adjusted so as to satisfy the convex constraints required in the derivation of the forward constrained regression procedure. The procedure continues until a proper system model is constructed that utilises some or all of the experts. A pruning algorithm of the consequent mixture of experts network system is also derived to generate an overall parsimonious construction algorithm. Numerical examples are provided to demonstrate the effectiveness of the new algorithms. The mixture of experts network framework can be applied to a wide variety of applications ranging from multiple model controller synthesis to multi-sensor data fusion.

Relevance:

100.00%

Publisher:

Abstract:

Van der Heijden’s ENDGAME STUDY DATABASE IV, HHDBIV, is the definitive collection of 76,132 chess studies. The zugzwang position or zug, one in which the side to move would prefer not to, is a frequent theme in the literature of chess studies. In this third data-mining of HHDBIV, we report on the occurrence of sub-7-man zugs there as discovered by the use of CQL and Nalimov endgame tables (EGTs). We also mine those Zugzwang Studies in which a zug more significantly appears in both its White-to-move (wtm) and Black-to-move (btm) forms. We provide some illustrative and extreme examples of zugzwangs in studies.

Relevance:

100.00%

Publisher:

Abstract:

The Self-Organizing Map (SOM) is a popular unsupervised neural network able to provide effective clustering and data visualization for multidimensional input datasets. In this paper, we present an application of the simulated annealing procedure to the SOM learning algorithm, with the aim of obtaining faster learning and better performance in terms of quantization error. The proposed learning algorithm is called Fast Learning Self-Organized Map, and it preserves the simplicity of the basic learning algorithm of the standard SOM. It also improves the quality of the resulting maps by providing better clustering quality and topology preservation of the multi-dimensional input data. Several experiments are used to compare the proposed approach with the original algorithm and some of its modifications and speed-up techniques.