880 resultados para structuration of lexical data bases
Resumo:
The high temperature region of the MnO-A1203 phase diagram has been redetermined to resolve some discrepancies reported in the literature regarding the melting behaviour of MnA1,04. This spinel was found to melt congruently at 2108 (+ 15) K. Theactivity of MnOin MnO-Al,03 meltsand in the two phase regions, melt + MnAI,04 and MnAI2O4 + A1203, has been determined by measuring the manganese concentration in platinum foils in equilibrium under controlled oxygen potentials. The activity of MnO obtained in this study for M ~ O - A I ,m~el~ts is in fair agreement with the results of Sharma and Richardson.However. the alumina-rich melt is found to be in equilibrium with MnAl,04 rather than AI2O3. as suggested by ~ha rmaan d Richardson. The value for the acthity of MnO in the M~AI ,O,+ A1,03 two phaseregion permits a rigorous application of the Gibbs-Duhem equation for calculating the activity of A1203 and the integral Gibbs' energy of mixing of MnO-A1203 melts, which are significantly different from those reported in the literature.
Resumo:
Data mining is concerned with analysing large volumes of (often unstructured) data to automatically discover interesting regularities or relationships which in turn lead to better understanding of the underlying processes. The field of temporal data mining is concerned with such analysis in the case of ordered data streams with temporal interdependencies. Over the last decade many interesting techniques of temporal data mining were proposed and shown to be useful in many applications. Since temporal data mining brings together techniques from different fields such as statistics, machine learning and databases, the literature is scattered among many different sources. In this article, we present an overview of techniques of temporal data mining.We mainly concentrate on algorithms for pattern discovery in sequential data streams.We also describe some recent results regarding statistical analysis of pattern discovery methods.
Resumo:
A system for temporal data mining includes a computer readable medium having an application configured to receive at an input module a temporal data series having events with start times and end times, a set of allowed dwelling times and a threshold frequency. The system is further configured to identify, using a candidate identification and tracking module, one or more occurrences in the temporal data series of a candidate episode and increment a count for each identified occurrence. The system is also configured to produce at an output module an output for those episodes whose count of occurrences results in a frequency exceeding the threshold frequency.
Resumo:
This paper proposes a new approach for solving the state estimation problem. The approach is aimed at producing a robust estimator that rejects bad data, even if they are associated with leverage-point measurements. This is achieved by solving a sequence of Linear Programming (LP) problems. Optimization is carried via a new algorithm which is a combination of “upper bound optimization technique" and “an improved algorithm for discrete linear approximation". In this formulation of the LP problem, in addition to the constraints corresponding to the measurement set, constraints corresponding to bounds of state variables are also involved, which enables the LP problem more efficient in rejecting bad data, even if they are associated with leverage-point measurements. Results of the proposed estimator on IEEE 39-bus system and a 24-bus EHV equivalent system of the southern Indian grid are presented for illustrative purpose.
Resumo:
Purpose: To optimize the data-collection strategy for diffuse optical tomography and to obtain a set of independent measurements among the total measurements using the model based data-resolution matrix characteristics. Methods: The data-resolution matrix is computed based on the sensitivity matrix and the regularization scheme used in the reconstruction procedure by matching the predicted data with the actual one. The diagonal values of data-resolution matrix show the importance of a particular measurement and the magnitude of off-diagonal entries shows the dependence among measurements. Based on the closeness of diagonal value magnitude to off-diagonal entries, the independent measurements choice is made. The reconstruction results obtained using all measurements were compared to the ones obtained using only independent measurements in both numerical and experimental phantom cases. The traditional singular value analysis was also performed to compare the results obtained using the proposed method. Results: The results indicate that choosing only independent measurements based on data-resolution matrix characteristics for the image reconstruction does not compromise the reconstructed image quality significantly, in turn reduces the data-collection time associated with the procedure. When the same number of measurements (equivalent to independent ones) are chosen at random, the reconstruction results were having poor quality with major boundary artifacts. The number of independent measurements obtained using data-resolution matrix analysis is much higher compared to that obtained using the singular value analysis. Conclusions: The data-resolution matrix analysis is able to provide the high level of optimization needed for effective data-collection in diffuse optical imaging. The analysis itself is independent of noise characteristics in the data, resulting in an universal framework to characterize and optimize a given data-collection strategy. (C) 2012 American Association of Physicists in Medicine. http://dx.doi.org/10.1118/1.4736820]
Resumo:
This paper considers the problem of weak signal detection in the presence of navigation data bits for Global Navigation Satellite System (GNSS) receivers. Typically, a set of partial coherent integration outputs are non-coherently accumulated to combat the effects of model uncertainties such as the presence of navigation data-bits and/or frequency uncertainty, resulting in a sub-optimal test statistic. In this work, the test-statistic for weak signal detection is derived in the presence of navigation data-bits from the likelihood ratio. It is highlighted that averaging the likelihood ratio based test-statistic over the prior distributions of the unknown data bits and the carrier phase uncertainty leads to the conventional Post Detection Integration (PDI) technique for detection. To improve the performance in the presence of model uncertainties, a novel cyclostationarity based sub-optimal PDI technique is proposed. The test statistic is analytically characterized, and shown to be robust to the presence of navigation data-bits, frequency, phase and noise uncertainties. Monte Carlo simulation results illustrate the validity of the theoretical results and the superior performance offered by the proposed detector in the presence of model uncertainties.
Resumo:
The presence of a large number of spectral bands in the hyperspectral images increases the capability to distinguish between various physical structures. However, they suffer from the high dimensionality of the data. Hence, the processing of hyperspectral images is applied in two stages: dimensionality reduction and unsupervised classification techniques. The high dimensionality of the data has been reduced with the help of Principal Component Analysis (PCA). The selected dimensions are classified using Niche Hierarchical Artificial Immune System (NHAIS). The NHAIS combines the splitting method to search for the optimal cluster centers using niching procedure and the merging method is used to group the data points based on majority voting. Results are presented for two hyperspectral images namely EO-1 Hyperion image and Indian pines image. A performance comparison of this proposed hierarchical clustering algorithm with the earlier three unsupervised algorithms is presented. From the results obtained, we deduce that the NHAIS is efficient.
Resumo:
The paper describes an algorithm for multi-label classification. Since a pattern can belong to more than one class, the task of classifying a test pattern is a challenging one. We propose a new algorithm to carry out multi-label classification which works for discrete data. We have implemented the algorithm and presented the results for different multi-label data sets. The results have been compared with the algorithm multi-label KNN or ML-KNN and found to give good results.
Resumo:
Gene microarray technology is highly effective in screening for differential gene expression and has hence become a popular tool in the molecular investigation of cancer. When applied to tumours, molecular characteristics may be correlated with clinical features such as response to chemotherapy. Exploitation of the huge amount of data generated by microarrays is difficult, however, and constitutes a major challenge in the advancement of this methodology. Independent component analysis (ICA), a modern statistical method, allows us to better understand data in such complex and noisy measurement environments. The technique has the potential to significantly increase the quality of the resulting data and improve the biological validity of subsequent analysis. We performed microarray experiments on 31 postmenopausal endometrial biopsies, comprising 11 benign and 20 malignant samples. We compared ICA to the established methods of principal component analysis (PCA), Cyber-T, and SAM. We show that ICA generated patterns that clearly characterized the malignant samples studied, in contrast to PCA. Moreover, ICA improved the biological validity of the genes identified as differentially expressed in endometrial carcinoma, compared to those found by Cyber-T and SAM. In particular, several genes involved in lipid metabolism that are differentially expressed in endometrial carcinoma were only found using this method. This report highlights the potential of ICA in the analysis of microarray data.
Resumo:
The stress release model, a stochastic version of the elastic-rebound theory, is applied to the historical earthquake data from three strong earthquake-prone regions of China, including North China, Southwest China, and the Taiwan seismic regions. The results show that the seismicity along a plate boundary (Taiwan) is more active than in intraplate regions (North and Southwest China). The degree of predictability or regularity of seismic events in these seismic regions, based on both the Akaike information criterion (AIC) and fitted sensitivity parameters, follows the order Taiwan, Southwest China, and North China, which is further identified by numerical simulations. (c) 2004 Elsevier Ltd. All rights reserved.
Resumo:
This report presents discharge, chemical analyses, temperatures, and specific conductance records collected at 25 surface-water sites and chemical analyses of ground water, well descriptions and records of ground-water levels collected at 164 ground-water sites. It also contains 35 logs of the sedimentary rocks penetrated in the drilling of wells and test borings ranging in depth from 147 to 625 feet. These hydrologic data were collected as part of an investigation of the water resources of the county. The interpretative results of the investigation are in the report entitled, "Water resources of Walton County," by C. A. Pascale (in preparation, 1971). (108 page document)