Digital Library

939 results for 650200 Mining and Extraction

Environmental data mining and modeling based on machine learning algorithms and geostatistics

Relevance:

100.00%

Abstract:

The paper presents some contemporary approaches to spatial environmental data analysis. The main topics are the decision-oriented problems of environmental spatial data mining and modeling: valorization and representativity of data with the help of exploratory data analysis, spatial predictions, probabilistic and risk mapping, and the development and application of conditional stochastic simulation models. The innovative part of the paper presents an integrated/hybrid model: machine learning (ML) residuals sequential simulations (MLRSS). The models are based on multilayer perceptron and support vector regression ML algorithms used for modeling long-range spatial trends, combined with sequential simulations of the residuals. ML algorithms deliver non-linear solutions for spatially non-stationary problems, which are difficult for the geostatistical approach. Geostatistical tools (variography) are used to characterize the performance of ML algorithms by analyzing the quality and quantity of the spatially structured information extracted from data with ML algorithms. Sequential simulations provide efficient assessment of uncertainty and spatial variability. A case study from the Chernobyl fallout illustrates the performance of the proposed model. It is shown that probability mapping, provided by the combination of ML data-driven and geostatistical model-based approaches, can be efficiently used in the decision-making process. (C) 2003 Elsevier Ltd. All rights reserved.
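The trend-plus-residuals idea described in this abstract can be illustrated with a minimal sketch. It makes several assumptions that are not in the paper: a least-squares line stands in for the multilayer perceptron or support vector regression trend model, the data are a synthetic 1-D transect, and only the variography step is shown (no sequential simulation). All function and variable names are hypothetical.

```python
import random

def fit_linear_trend(xs, zs):
    """Least-squares line z = a + b*x (an assumed stand-in for the ML trend model)."""
    n = len(xs)
    mx = sum(xs) / n
    mz = sum(zs) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxz = sum((x - mx) * (z - mz) for x, z in zip(xs, zs))
    b = sxz / sxx
    a = mz - b * mx
    return a, b

def semivariogram(xs, values, lag, n_lags):
    """Experimental semivariogram: mean of 0.5*(v_i - v_j)^2 over pairs in each lag bin."""
    sums = [0.0] * n_lags
    counts = [0] * n_lags
    for i in range(len(xs)):
        for j in range(i + 1, len(xs)):
            k = int(abs(xs[i] - xs[j]) / lag)
            if k < n_lags:
                sums[k] += 0.5 * (values[i] - values[j]) ** 2
                counts[k] += 1
    return [s / c if c else float("nan") for s, c in zip(sums, counts)]

# Synthetic transect (hypothetical data): linear trend plus uncorrelated noise.
rng = random.Random(0)
xs = [float(i) for i in range(50)]
zs = [2.0 + 0.5 * x + rng.gauss(0.0, 1.0) for x in xs]

# Step 1: model the long-range trend; step 2: analyze what is left over.
a, b = fit_linear_trend(xs, zs)
residuals = [z - (a + b * x) for x, z in zip(xs, zs)]

gamma_raw = semivariogram(xs, zs, lag=5.0, n_lags=4)
gamma_res = semivariogram(xs, residuals, lag=5.0, n_lags=4)
print("raw data:  ", [round(g, 2) for g in gamma_raw])
print("residuals: ", [round(g, 2) for g in gamma_res])
```

In this toy setting the semivariogram of the raw data keeps growing with lag because of the trend, while the residual semivariogram stays roughly flat, which suggests the trend model has absorbed the spatial structure. That is the sense in which variography can be used to judge what a trend model has extracted.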

Influence of rootstocks and pruning times on yield and on nutrient content and extraction in 'Niagara Rosada' grapevine

Relevance:

100.00%

Abstract:

The objective of this work was to evaluate the influence of rootstocks and pruning times on yield and on nutrient content and extraction by pruned branches and harvested bunches of 'Niagara Rosada' grapevine in a subtropical climate. The rootstocks 'IAC 766', 'IAC 572', 'IAC 313', 'IAC 571-6', and '106-8 Mgt' were evaluated. Treatments consisted of the combinations of five rootstocks and three pruning times. At pruning, the fresh and dry matter mass of branches were evaluated to estimate biomass accumulation. At harvest, yield was estimated by weighing the bunches of each plant. Branches and bunches were sampled at pruning and at harvest, respectively, for nutrient content analysis. Nutrient content and dry matter mass of branches and bunches were used to estimate total nutrient extraction. 'Niagara Rosada' grapevine grafted onto the 'IAC 572' rootstock had the highest yield and dry matter mass of bunches, which were significantly different from those observed in 'Niagara Rosada'/'IAC 313'. 'Niagara Rosada' grafted onto the 'IAC 572' rootstock extracted the largest quantity of K, P, Mg, S, Cu, and Fe, differing from 'IAC 313' and 'IAC 766' in K and P extraction, and from '106-8 Mgt' in Mg and S extraction. Winter pruning resulted in higher yield, dry matter accumulation by branches, and total nutrient content and extraction.

Educational data mining and learning analytics: Classification of Business Administration and Management (A.D.E.) enrolments at the UOC

Relevance:

100.00%

Abstract:

Research work presenting a classification study of the courses enrolled in within the Business Administration and Management degree at the UOC, in relation to their results. Different methods and models are proposed for understanding the environment in which the study is carried out.

On Option Price Information Content and Extraction of Implied Probability Distributions

Relevance:

100.00%

Abstract:

In this study we used market settlement prices of European call options on stock index futures to extract the implied probability distribution function (PDF). The method produces a PDF of the returns of an underlying asset at the expiration date from the implied volatility smile. With this method, the assumption of a lognormal distribution (Black-Scholes model) is tested. The market view of the asset price dynamics can then be used for various purposes (hedging, speculation). We used the so-called smoothing approach for implied PDF extraction presented by Shimko (1993). In our analysis we obtained implied volatility smiles from index futures markets (S&P 500 and DAX indices) and standardized them. The method introduced by Breeden and Litzenberger (1978) was then used for PDF extraction. The results show significant deviations from the assumption of lognormal returns for S&P 500 options, while DAX options mostly fit the lognormal distribution. A deviant subjective view of the PDF can be used to form a strategy, as discussed in the last section.
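The Breeden and Litzenberger (1978) result referred to here states that the risk-neutral density is the discounted second derivative of the call price with respect to strike, f(K) = exp(rT) * d²C/dK². A minimal sketch follows, under assumptions that are not in the study: call prices come from a flat-volatility Black-Scholes formula on a spot asset (rather than a smoothed market smile on futures options), and the derivative is taken by central finite differences. All parameter values are hypothetical.

```python
import math

def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(s0, k, r, sigma, t):
    """Black-Scholes price of a European call (assumed price source for this sketch)."""
    d1 = (math.log(s0 / k) + (r + 0.5 * sigma ** 2) * t) / (sigma * math.sqrt(t))
    d2 = d1 - sigma * math.sqrt(t)
    return s0 * norm_cdf(d1) - k * math.exp(-r * t) * norm_cdf(d2)

def implied_density(call_price, k, r, t, dk):
    """Breeden-Litzenberger density at strike k via a central second difference."""
    second = (call_price(k + dk) - 2.0 * call_price(k) + call_price(k - dk)) / dk ** 2
    return math.exp(r * t) * second

# Hypothetical market parameters.
s0, r, sigma, t = 100.0, 0.02, 0.2, 0.5

def price(k):
    return bs_call(s0, k, r, sigma, t)

strikes = [40.0 + i for i in range(211)]  # strikes 40 .. 250, spacing 1
density = [implied_density(price, k, r, t, dk=0.5) for k in strikes]
total_mass = sum(density)  # Riemann sum with unit strike spacing
print("probability mass over the strike grid:", round(total_mass, 4))
```

With flat volatility the recovered density is simply the lognormal one, so this sketch only checks the mechanics. In the setting the abstract describes, `price` would instead be built from a smoothed implied volatility smile, and departures from lognormality would show up in the shape of `density`.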

Construction of an index of information from clinical practice in Radiology and Imaging Diagnosis based on text mining and thesaurus

Relevance:

100.00%

Abstract:

Objective: To construct a Portuguese-language index of information on the practice of diagnostic radiology in order to improve the standardization of medical language and terminology. Materials and Methods: A total of 61,461 definitive reports were collected from the database of the Radiology Information System at Hospital das Clínicas – Faculdade de Medicina de Ribeirão Preto (RIS/HCFMRP) as follows: 30,000 chest x-ray reports; 27,000 mammography reports; and 4,461 thyroid ultrasonography reports. The text mining technique was applied for the selection of terms, and the ANSI/NISO Z39.19-2005 standard was used to construct the index based on a thesaurus structure. The system was created in HTML. Results: The text mining resulted in a set of 358,236 (n = 100%) words. Out of this total, 76,347 (n = 21%) terms were selected to form the index. Such terms refer to anatomical pathology description, imaging techniques, equipment, type of study and some other composite terms. The index system was developed with 78,538 HTML web pages. Conclusion: The use of text mining on a radiological reports database allowed the construction of a lexical system in the Portuguese language consistent with clinical practice in radiology.

Data mining and mall users profile

Relevance:

100.00%

Abstract:

Marketing scholars have suggested a need for more empirical research on consumer response to malls, in order to better understand the variables that explain consumer behavior. The segmentation methodology CHAID (Chi-square Automatic Interaction Detection) was used to identify the profiles of consumers with regard to their activities at malls, on the basis of socio-demographic and behavioral variables (how and with whom they go to the malls). A sample of 790 subjects answered an online questionnaire. Among the variables analyzed, the transport used to go shopping and the frequency of visits to centers are the main predictors of behavior in malls. The results provide guidelines for the development of effective strategies to attract consumers to malls and retain them there.
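The core step of CHAID is to test each candidate predictor against the target with a chi-square test and split on the most strongly associated one. The sketch below shows only that step, under stated assumptions: it ranks predictors by the raw Pearson statistic, whereas full CHAID compares adjusted p-values and also merges similar categories. The records are invented for illustration and are not the study's survey data.

```python
from collections import Counter

def chi_square(pairs):
    """Pearson chi-square statistic for a list of (predictor_category, outcome) pairs."""
    n = len(pairs)
    row = Counter(p for p, _ in pairs)
    col = Counter(o for _, o in pairs)
    cell = Counter(pairs)
    stat = 0.0
    for p in row:
        for o in col:
            expected = row[p] * col[o] / n
            stat += (cell[(p, o)] - expected) ** 2 / expected
    return stat

def best_predictor(records, predictors, target):
    """Return the predictor with the largest chi-square statistic, plus all scores."""
    scores = {
        name: chi_square([(rec[name], rec[target]) for rec in records])
        for name in predictors
    }
    return max(scores, key=scores.get), scores

def make(n, transport, gender, activity):
    """Build n identical hypothetical respondents."""
    return [{"transport": transport, "gender": gender, "activity": activity}] * n

# Hypothetical respondents: activity depends on transport but not on gender.
records = (
    make(30, "car", "F", "shopping") + make(30, "car", "M", "shopping")
    + make(10, "car", "F", "leisure") + make(10, "car", "M", "leisure")
    + make(10, "bus", "F", "shopping") + make(10, "bus", "M", "shopping")
    + make(30, "bus", "F", "leisure") + make(30, "bus", "M", "leisure")
)

best, scores = best_predictor(records, ["transport", "gender"], "activity")
print(best, scores)
```

A tree is then grown by repeating the same selection inside each resulting segment until no predictor is significantly associated with the target.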

Use of data mining and spectral profiles to differentiate condition after harvest of coffee plants

Relevance:

100.00%

Abstract:

This study aimed to identify different conditions of coffee plants after the harvesting period, using data mining and spectral behavior profiles from the Hyperion/EO1 sensor. The Hyperion image, with a spatial resolution of 30 m, was acquired on August 28, 2008, at the end of the coffee harvest season in the studied area. For image pre-processing, atmospheric and signal/noise effect corrections were carried out using the FLAASH and MNF (Minimum Noise Fraction Transform) algorithms, respectively. Spectral behavior profiles (38) of different coffee varieties were generated from 150 Hyperion bands. The spectral behavior profiles were analyzed with the Expectation-Maximization (EM) algorithm considering 2, 3, 4, and 5 clusters. A t-test at the 5% significance level was used to verify the similarity among the wavelength cluster means. The results demonstrated that it is possible to separate five different clusters, which were composed of different coffee crop conditions, making it possible to improve future intervention actions.
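As an illustration of EM clustering, the following minimal sketch fits a Gaussian mixture to one-dimensional values. It is a simplification under stated assumptions: the study clustered 38 profiles over 150 bands, whereas this example uses ten invented single-band reflectance values and two clusters; the starting means and starting variance are supplied by hand.

```python
import math

def em_gmm_1d(data, init_means, init_var, n_iter=50):
    """EM for a 1-D Gaussian mixture; returns (weights, means, variances)."""
    k = len(init_means)
    means = list(init_means)
    variances = [init_var] * k
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        # E-step: responsibility of each component for each data point.
        resp = []
        for x in data:
            dens = [
                w * math.exp(-(x - m) ** 2 / (2.0 * v)) / math.sqrt(2.0 * math.pi * v)
                for w, m, v in zip(weights, means, variances)
            ]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: re-estimate weights, means and variances from the responsibilities.
        for j in range(k):
            nj = sum(row[j] for row in resp)
            weights[j] = nj / len(data)
            means[j] = sum(row[j] * x for row, x in zip(resp, data)) / nj
            var = sum(row[j] * (x - means[j]) ** 2 for row, x in zip(resp, data)) / nj
            variances[j] = max(var, 1e-6)  # floor to avoid a collapsing component
    return weights, means, variances

# Hypothetical reflectance values forming two groups.
reflectance = [0.18, 0.19, 0.20, 0.21, 0.22, 0.38, 0.39, 0.40, 0.41, 0.42]

# Start the two components at the data extremes with a moderate spread.
weights, means, variances = em_gmm_1d(
    reflectance, init_means=[min(reflectance), max(reflectance)], init_var=0.0036
)
print([round(m, 3) for m in means], [round(w, 2) for w in weights])
```

Running the same fit for several cluster counts and comparing the resulting cluster means, as the abstract describes doing with a t-test, is one way to judge how many groups the data support.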

Identification of research trends in the field of separation processes. Application of epidemiological model, citation analysis, text mining, and technical analysis of the financial markets

Relevance:

100.00%

Abstract:

Choosing industrial development options and allocating research funds accordingly is becoming more and more difficult because of increasing R&D costs and pressure for shorter development periods. Forecasting research progress is based on the analysis of publication activity in the field of interest as well as on the dynamics of its change. Moreover, allocation of funds is hindered by exponential growth in the number of publications and patents. Thematic clusters are becoming more and more difficult to identify, and their evolution harder to follow. The existing approaches to structuring a research field and identifying its development are very limited. They do not identify thematic clusters with adequate precision, and the identified trends are often ambiguous. Therefore, there is a clear need to develop methods and tools that are able to identify developing fields of research. The main objective of this thesis is to develop tools and methods that help identify promising research topics in the field of separation processes. Two structuring methods and three approaches for identifying development trends have been proposed. The proposed methods have been applied to the analysis of research on distillation and filtration. The results show that the developed methods are universal and could be used to study various fields of research. The identified thematic clusters and the forecasted trends of their development have been confirmed in almost all tested cases, which supports the universality of the proposed methods. The results allow for identification of fast-growing scientific fields as well as topics characterized by stagnant or diminishing research activity.

A Phase Model for Building a Customer Value Based Sales Tool – Case Mining and Metals Industry

Relevance:

100.00%

Abstract:

The objective of this research is to observe the state of customer value management in Outotec Oyj, determine the key development areas and develop a phase model with which to guide the development of a customer value based sales tool. The study was conducted with a constructive research approach, with the focus on identifying a problem and developing a solution for it. As a basis for the study, the current literature on customer value assessment and on solution and customer value selling was studied. The data was collected by conducting 16 interviews in two rounds within the company and was analyzed using open coding. First, seven important development areas were identified, of which the most critical were “Customer value mindset inside the company” and “Coordination of customer value management activities”. Using these seven areas, three functionality requirements (“Preparation”, “Outotec’s value creation and communication” and “Documentation”) and three development requirements for a customer value sales tool were identified. The study concluded with the formulation of a phase model for building a customer value based sales tool. The model included five steps that were defined as 1) Enable customer value utilization, 2) Connect with the customer, 3) Create customer value, 4) Define tool to facilitate value selling and 5) Develop sales tool. Further practical activities were also recommended as a guide for executing the phase model.

Committing employees through internal communication in cross-border acquisitions : Case: Sandvik Mining and Construction

Relevance:

100.00%

Sharing cost information in the management of a subcontracting network: case: Sandvik Mining and Construction, UHM

Relevance:

100.00%

Characterization and extraction of volatile compounds from pineapple (Ananas comosus L. Merril) processing residues

Relevance:

100.00%

Abstract:

The aim of this study was to extract and identify volatile compounds from pineapple residues generated during concentrated juice processing. Distillates of pineapple residues were obtained using the following techniques: simple hydrodistillation and hydrodistillation by passing nitrogen gas. The volatile compounds present in the distillates were captured by the solid-phase microextraction technique. The volatile compounds were identified in a high-resolution gas chromatography system coupled with mass spectrometry, using a polyethylene glycol polar capillary column as the stationary phase. The volatile compounds of the pineapple residues consisted mostly of esters (35%), followed by ketones (26%), alcohols (18%), aldehydes (9%), acids (3%) and other compounds (9%). Odor-active volatile compounds were mainly identified in the distillate obtained using hydrodistillation by passing nitrogen gas, namely decanal, ethyl octanoate, acetic acid, 1-hexanol, and ketones such as γ-hexalactone, γ-octalactone, δ-octalactone, γ-decalactone, and γ-dodecalactone. This suggests that the use of an inert gas and lower temperatures helped maintain higher amounts of flavor compounds. These data indicate that pineapple processing residue contains important volatile compounds which can be extracted and used as aroma-enhancing products and which have high potential for the production of value-added natural essences.

QUANTIFICATION AND EXTRACTION OF SURFACE FEATURES FROM DIGITAL TERRAIN MODELS

Relevance:

100.00%

Abstract:

Digital Terrain Models (DTMs) are important in geology and geomorphology, since elevation data contain a great deal of information about the geomorphological processes that influence topography. The first derivative of topography is attitude; the second is curvature. GIS tools were developed for deriving strike, dip, curvature and curvature orientation from Digital Elevation Models (DEMs). A method for displaying both strike and dip simultaneously as a colour-coded visualization (AVA) was implemented. A plug-in for calculating strike and dip via least squares regression was first created using VB.NET. Further research produced a more computationally efficient solution, convolution filtering, which was implemented as Python scripts. These scripts were also used for calculating curvature and curvature orientation. The application of these tools was demonstrated by performing morphometric studies on datasets from Earth and Mars. The tools show promise; however, more work is needed to explore their full potential and possible uses.
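Deriving dip and strike from a DEM by convolution amounts to estimating the two elevation gradients with small difference kernels and converting them to angles. The sketch below is a minimal illustration and not the thesis scripts. It assumes a north-up raster (row 0 is the northern edge), square cells, a simple central-difference kernel, azimuths measured clockwise from north, and the right-hand rule (dip direction is 90 degrees clockwise from strike).

```python
import math

def dip_and_strike(dem, row, col, cell_size):
    """Dip, strike and dip direction (degrees) at an interior cell of a north-up DEM."""
    # Central differences, equivalent to convolving with the kernel [-1, 0, 1] / 2.
    gx = (dem[row][col + 1] - dem[row][col - 1]) / (2.0 * cell_size)  # dz/d(east)
    gy = (dem[row - 1][col] - dem[row + 1][col]) / (2.0 * cell_size)  # dz/d(north)
    dip = math.degrees(math.atan(math.hypot(gx, gy)))
    # Steepest descent points along (-gx, -gy); azimuth is clockwise from north.
    dip_direction = math.degrees(math.atan2(-gx, -gy)) % 360.0
    strike = (dip_direction - 90.0) % 360.0
    return dip, strike, dip_direction

# Hypothetical 3x3 test surfaces with 1 m cells.
east_dipping = [[-float(c) for c in range(3)] for _ in range(3)]  # drops 1 m per cell eastward
south_dipping = [[2.0] * 3, [1.0] * 3, [0.0] * 3]                 # drops 1 m per cell southward

dip_e, strike_e, dir_e = dip_and_strike(east_dipping, 1, 1, 1.0)
dip_s, strike_s, dir_s = dip_and_strike(south_dipping, 1, 1, 1.0)
print(dip_e, strike_e, dir_e)
print(dip_s, strike_s, dir_s)
```

On a perfectly flat cell the dip is zero and the dip direction is undefined, so real code would need to mask such cells. Applying the same kernels a second time to the gradient grids is one way to approximate curvature.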

Semantic web mining and the representation, analysis, and evolution of web space

Relevance:

100.00%

Abstract:

Semantic Web Mining aims at combining the two fast-developing research areas Semantic Web and Web Mining. This survey analyzes the convergence of trends from both areas: Growing numbers of researchers work on improving the results of Web Mining by exploiting semantic structures in the Web, and they use Web Mining techniques for building the Semantic Web. Last but not least, these techniques can be used for mining the Semantic Web itself. The second aim of this paper is to use these concepts to circumscribe what Web space is, what it represents and how it can be represented and analyzed. This is used to sketch the role that Semantic Web Mining and the software agents and human agents involved in it can play in the evolution of Web space.

Adaptive Real-time Anomaly-based Intrusion Detection using Data Mining and Machine Learning Techniques

Relevance:

100.00%

Abstract:

The increasing interconnection of information and communication systems leads to a further rise in complexity and thus to a further increase in security vulnerabilities. Classical protection mechanisms such as firewall systems and anti-malware solutions have long ceased to offer protection against intrusion attempts into IT infrastructures. Intrusion detection systems (IDS) have established themselves as a very effective instrument for protection against cyber attacks. Such systems collect and analyze information from network components and hosts in order to detect unusual behavior and security violations automatically. While signature-based approaches can detect only already known attack patterns, anomaly-based IDS are also able to detect new, previously unknown attacks (zero-day attacks) at an early stage. The core problem of intrusion detection systems, however, lies in the optimal processing of the enormous volume of network data and in the development of an adaptive detection model that works in real time. To address these challenges, this dissertation provides a framework consisting of two main parts. The first part, called OptiFilter, uses a dynamic "queuing concept" to process the large amounts of incoming network data, continuously constructs network connections, and exports structured input data for the IDS. The second part is an adaptive classifier comprising a classifier model based on the "Enhanced Growing Hierarchical Self Organizing Map" (EGHSOM), a model of normal network behavior (NNB), and an "update model". In OptiFilter, Tcpdump and SNMP traps are used to continuously aggregate network packets and host events. These aggregated network packets and host events are further analyzed and converted into connection vectors.
To improve the detection rate of the adaptive classifier, the artificial neural network GHSOM is studied intensively and substantially extended. This dissertation proposes and discusses several approaches. A classification-confidence margin threshold is defined to uncover unknown malicious connections; the stability of the growth topology is increased through novel approaches to initializing the weight vectors and through strengthening the winner neurons; and a self-adaptive procedure is introduced so that the model can be updated continuously. In addition, the main task of the NNB model is to further examine the unknown connections detected by the EGHSOM and to check whether they are normal. However, network traffic data change constantly because of the concept drift phenomenon, which produces non-stationary network data in real time. This phenomenon is better controlled by the update model. The EGHSOM model can detect new anomalies effectively, and the NNB model adapts optimally to changes in the network data. In the experimental investigations the framework showed promising results. In the first experiment the framework was evaluated in offline mode. OptiFilter was evaluated with offline, synthetic, and realistic data. The adaptive classifier was evaluated using 10-fold cross-validation to estimate its accuracy. In the second experiment the framework was installed on a 1 to 10 GB network link and evaluated online in real time. OptiFilter successfully converted the enormous volume of network data into structured connection vectors, and the adaptive classifier classified them precisely.
The comparative study between the developed framework and other well-known IDS approaches shows that the proposed IDS framework outperforms all other approaches. This can be attributed to the following key points: processing of the collected network data, achieving the best performance (such as overall accuracy), detecting unknown connections, and developing a real-time intrusion detection model.
