792 results for Pattern mining
Abstract:
The Mississippi Valley-type (MVT) Pb-Zn ore district at Mezica is hosted by Middle to Upper Triassic platform carbonate rocks in the Northern Karavanke/Drau Range geotectonic units of the Eastern Alps, northeastern Slovenia. The mineralization at Mezica covers an area of 64 km² with more than 350 orebodies and numerous galena and sphalerite occurrences, which formed epigenetically, both conformable and discordant to bedding. While knowledge on the style of mineralization has grown considerably, the origin of discordant mineralization is still debated. Sulfur stable isotope analyses of 149 sulfide samples from the different types of orebodies provide new insights on the genesis of these mineralizations and their relationship. Over the whole mining district, sphalerite and galena have δ³⁴S values in the range of -24.7 to -1.5‰ VCDT (-13.5 ± 5.0‰) and -24.7 to -1.4‰ (-10.7 ± 5.9‰), respectively. These values are in the range of the main MVT deposits of the Drau Range. All sulfide δ³⁴S values are negative within a broad range, with δ³⁴S(pyrite) < δ³⁴S(sphalerite) < δ³⁴S(galena) for both conformable and discordant orebodies, indicating isotopically heterogeneous H₂S in the ore-forming fluids and precipitation of the sulfides at thermodynamic disequilibrium. This clearly supports that the main sulfide sulfur originates from bacterially mediated reduction (BSR) of Middle to Upper Triassic seawater sulfate or evaporite sulfate. Thermochemical sulfate reduction (TSR) by organic compounds contributed a minor amount of ³⁴S-enriched H₂S to the ore fluid. The variations of δ³⁴S values of galena and coarse-grained sphalerite at orefield scale are generally larger than the differences observed in single hand specimens. The progressively more negative δ³⁴S values with time along the different sphalerite generations are consistent with mixing of different H₂S sources, with a decreasing contribution of H₂S from regional TSR, and an increase from a local H₂S reservoir produced by BSR (i.e., sedimentary biogenic pyrite, organo-sulfur compounds). Galena in discordant ore (-11.9 to -1.7‰; -7.0 ± 2.7‰, n=12) tends to be depleted in ³⁴S compared with conformable ore (-24.7 to -2.8‰, -11.7 ± 6.2‰, n=39). A similar trend is observed from fine-crystalline sphalerite I to coarse open-space filling sphalerite II. Some variation of the sulfide δ³⁴S values is attributed to the inherent variability of bacterial sulfate reduction, including metabolic recycling in a locally partially closed system and contribution of H₂S from hydrolysis of biogenic pyrite and thermal cracking of organo-sulfur compounds. The results suggest that the conformable orebodies originated by mixing of hydrothermal saline metal-rich fluid with H₂S-rich pore waters during late burial diagenesis, while the discordant orebodies formed by mobilization of the earlier conformable mineralization.
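For readers unfamiliar with the delta notation used in this abstract, the standard definition of δ³⁴S is given below. The abstract itself does not state it. Values are reported in per mil relative to the VCDT reference standard.

```latex
\delta^{34}\mathrm{S}
  = \left(
      \frac{\left({}^{34}\mathrm{S}/{}^{32}\mathrm{S}\right)_{\mathrm{sample}}}
           {\left({}^{34}\mathrm{S}/{}^{32}\mathrm{S}\right)_{\mathrm{VCDT}}}
      - 1
    \right) \times 1000
```

A negative value therefore means the sample holds less ³⁴S relative to ³²S than the reference does.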
Abstract:
Recent advances in machine learning increasingly enable the automatic construction of computer-assisted methods that have been difficult or laborious for human experts to program. Tasks that need such tools arise in many areas, here especially in bioinformatics and natural language processing. Machine learning methods may not work satisfactorily unless they are appropriately tailored to the task in question. However, their learning performance can often be improved by exploiting deeper insight into the application domain or the learning problem at hand. This thesis develops kernel-based learning algorithms that incorporate this kind of prior knowledge about the task in an advantageous way. Moreover, computationally efficient algorithms for training the learning machines for specific tasks are presented. In kernel-based learning methods, prior knowledge is often incorporated by designing appropriate kernel functions. Another well-known way is to develop cost functions that fit the task under consideration. For disambiguation tasks in natural language, we develop kernel functions that take into account the positional information and the mutual similarities of words. It is shown that the use of this information significantly improves the disambiguation performance of the learning machine. Further, we design a new cost function that is better suited to information retrieval and to more general ranking problems than the cost functions designed for regression and classification. We also consider other applications of kernel-based learning algorithms, such as text categorization and pattern recognition in differential display. We develop computationally efficient algorithms for training the considered learning machines with the proposed kernel functions. We also design a fast cross-validation algorithm for regularized least-squares type learning algorithms. Further, we propose an efficient version of the regularized least-squares algorithm that can be used together with the new cost function for preference learning and ranking tasks. In summary, we demonstrate that incorporating prior knowledge is possible and beneficial, and that novel advanced kernels and cost functions can be used efficiently in algorithms.
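The idea behind fast cross-validation for regularized least squares can be sketched with the classical leave-one-out shortcut. This is not the thesis's own algorithm, which the abstract does not describe. The sketch assumes a one-dimensional linear model without an intercept, and all function names and data values are hypothetical.

```python
# Sketch: leave-one-out (LOO) residuals for 1-D regularized least squares
# (no intercept). Names and data are illustrative assumptions only.

def fit_rls_1d(xs, ys, lam):
    """Return the weight w minimizing sum((y - w*x)^2) + lam * w^2."""
    s_xx = sum(x * x for x in xs)
    s_xy = sum(x * y for x, y in zip(xs, ys))
    return s_xy / (s_xx + lam)

def loo_residuals_fast(xs, ys, lam):
    """LOO residuals from ONE fit, using e_i / (1 - h_i)."""
    s_xx = sum(x * x for x in xs)
    w = fit_rls_1d(xs, ys, lam)
    residuals = []
    for x, y in zip(xs, ys):
        h = x * x / (s_xx + lam)  # i-th diagonal entry of the hat matrix
        residuals.append((y - w * x) / (1.0 - h))
    return residuals

def loo_residuals_naive(xs, ys, lam):
    """LOO residuals by refitting the model n times (for comparison)."""
    residuals = []
    for i in range(len(xs)):
        w = fit_rls_1d(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:], lam)
        residuals.append(ys[i] - w * xs[i])
    return residuals

# Hypothetical toy data
xs = [0.5, 1.0, 1.5, 2.0, 3.0]
ys = [1.1, 1.9, 3.2, 3.9, 6.1]
fast = loo_residuals_fast(xs, ys, lam=0.1)
naive = loo_residuals_naive(xs, ys, lam=0.1)
```

Both functions return the same residuals, but the shortcut needs a single fit instead of one fit per held-out example.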
Abstract:
The aim of this study was to group by similarity the temporal profiles of the 10-day composite NDVI product obtained by the SPOT Vegetation sensor for municipalities with high soybean production in the state of Paraná, Brazil, in the 2005/2006 cropping season. Data mining is a valuable tool that allows knowledge to be extracted from a database by identifying valid, new, potentially useful and understandable patterns. Therefore, clusters were generated with the K-Means, MAXVER and DBSCAN algorithms, implemented in the WEKA software package. Clusters were created based on the average NDVI temporal profiles of the 277 municipalities with high soybean production in the state. The best results were found with the K-Means algorithm, which grouped the municipalities into six clusters over the period from the beginning of October until the end of March, which corresponds to the crop vegetative cycle. Half of the generated clusters presented a spectro-temporal pattern characteristic of soybeans and lay mostly within the soybean belt of the state of Paraná, which shows that the proposed methodology performed well in identifying homogeneous areas. These results will be useful for the creation of regional soybean "masks" to estimate the planted area for this crop.
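The clustering step can be illustrated with a minimal K-Means sketch in plain Python. The study itself used the WEKA implementation. The NDVI values below are invented for illustration, and the deterministic initialization (the first k profiles as centroids) is an assumption made for reproducibility.

```python
# Minimal K-Means sketch on temporal profiles (hypothetical NDVI values).

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(profiles, k, iterations=20):
    # Assumption: initialise centroids with the first k profiles.
    centroids = [list(p) for p in profiles[:k]]
    labels = [0] * len(profiles)
    for _ in range(iterations):
        # Assignment step: each profile joins its nearest centroid.
        for i, p in enumerate(profiles):
            labels[i] = min(range(k), key=lambda c: sq_dist(p, centroids[c]))
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(profiles, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Hypothetical six-step profiles: two rise and fall over the season
# (crop-like), two stay nearly flat.
profiles = [
    [0.30, 0.50, 0.80, 0.85, 0.60, 0.35],
    [0.50, 0.50, 0.52, 0.50, 0.50, 0.50],
    [0.28, 0.55, 0.82, 0.80, 0.55, 0.30],
    [0.48, 0.50, 0.50, 0.52, 0.50, 0.49],
]
labels, centroids = kmeans(profiles, k=2)
```

With these toy profiles, the two rise-and-fall curves end up in one cluster and the two flat curves in the other.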
Abstract:
This study aimed to identify differences in swine vocalization pattern according to animal gender and different stress conditions. A total of 150 barrow males and 150 females (Dalland® genetic strain), aged 100 days, were used in the experiment. Pigs were exposed to different stressful situations: thirst (no access to water), hunger (no access to food), and thermal stress (THI exceeding 74). For the control treatment, animals were kept under comfort conditions (full access to food and water, with environmental THI lower than 70). Acoustic signals were recorded every 30 minutes, totaling six samples for each stress situation. The audio recordings were then analyzed with Praat® 5.1.19 software, generating a sound spectrum. To determine the stress conditions, data were processed with WEKA® 3.5 software, using the C4.5 decision tree algorithm, known as J48 in the software environment, with 10-fold cross-validation (each fold holding out 10% of the samples). According to the decision tree, the most important acoustic attribute for classifying stress conditions was sound intensity (root node). It was not possible to identify the animal gender from vocal register using the tested attributes. A decision tree was generated for recognition of situations of swine hunger, thirst, and heat stress from records of sound intensity, pitch frequency, and Formant 1.
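C4.5 chooses the attribute at each node, including the root, by gain ratio. The sketch below computes gain ratio for pre-binned categorical attributes. The data are invented and are not the study's measurements, and the binning is an assumption, since C4.5 itself searches thresholds for continuous attributes such as intensity.

```python
# Sketch of the gain-ratio criterion used by C4.5 (toy, pre-binned data).
import math
from collections import Counter

def entropy(items):
    """Shannon entropy (bits) of a list of categorical values."""
    total = len(items)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(items).values())

def gain_ratio(values, labels):
    """Information gain of splitting on `values`, divided by split info."""
    total = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder
    split_info = entropy(values)
    return info_gain / split_info if split_info > 0 else 0.0

# Hypothetical recordings: binned intensity and pitch, with condition labels.
intensity = ["high", "high", "high", "low", "low", "low"]
pitch = ["high", "low", "high", "low", "high", "low"]
condition = ["stress", "stress", "stress", "comfort", "comfort", "comfort"]

gr_intensity = gain_ratio(intensity, condition)
gr_pitch = gain_ratio(pitch, condition)
```

In this toy data the intensity attribute separates the two conditions perfectly, so it has the higher gain ratio and would be placed at the root.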
Abstract:
This thesis is entitled "Environmental Impact of Sand Mining: A Case Study in the River Catchments of Vembanad Lake, Southwest India". The entire study is addressed in nine chapters. Chapter 1 deals with the general introduction about rivers, problems of river sand mining, objectives, location of the study area and scope of the study. A detailed review of river classification, classic concepts in riverine studies, geological work of rivers and channel processes, and the importance of river ecosystems and the need for their management is given in Chapter 2. Chapter 3 gives a comprehensive account of the study area: its location, administrative divisions, physiography, soil, geology, land use, and living and non-living resources. The various methods adopted in the study are dealt with in Chapter 4. Chapter 5 covers river characteristics such as drainage, environmental and geologic setting, channel characteristics, river discharge and water quality of the study area. Chapter 6 gives an account of river sand mining (instream and floodplain mining) in the study area. The various environmental problems that river sand mining causes for the land adjoining the river banks, the river channel, water, and the biotic and social/human environments of the area, together with data interpretation, are presented in Chapter 7. Chapter 8 deals with the Environmental Impact Assessment (EIA) and Environmental Management Plan (EMP) of sand mining in the river catchments of Vembanad Lake.
Abstract:
Data mining is currently one of the most active research areas, with a wide variety of everyday applications. It is concerned with finding interesting hidden patterns in large historical databases. For example, from a sales database one can use data mining to find a pattern such as "people who buy magazines tend to buy newspapers also". From a sales point of view, the advantage is that these items can be placed together in the shop to increase sales. In this research work, data mining is applied to the domain of placement chance prediction, since making a wise career decision is crucial for everyone. In India, technical manpower analysis is carried out by the National Technical Manpower Information System (NTMIS), established in 1983-84 by India's Ministry of Education & Culture. The NTMIS comprises a lead centre in the IAMR, New Delhi, and 21 nodal centres located in different parts of the country. The Kerala State Nodal Centre is located at Cochin University of Science and Technology. The nodal centre collects placement information by regularly sending postal questionnaires to graduates. From the raw data available in the nodal centre, a history database was prepared. Each record in this database includes entrance rank range, reservation, sector, sex, and a particular engineering branch. For each such combination of attributes in the history database of student records, the corresponding placement chance is computed and stored in the history database. From this data, various popular data mining models are built and tested. These models can be used to predict the most suitable branch for a new student with one of the above combinations of criteria. A detailed performance comparison of the various data mining models is also carried out. This research work proposes using a combination of data mining models, namely a hybrid stacking ensemble, for better predictions. Strategies are also proposed to predict the overall absorption rate for various branches, as well as the time it takes for all the students of a particular branch to be placed. Finally, this research work puts forward a new data mining algorithm for numeric data sets, namely C 4.5 * stat, which has been shown to have competitive accuracy on the standard benchmark data sets known as the UCI data sets. It also proposes an optimization strategy called parameter tuning to improve the standard C 4.5 algorithm. In summary, this research work covers all four dimensions of a typical data mining research work, namely application to a domain, development of classifier models, optimization and ensemble methods.
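The magazine and newspaper pattern mentioned in this abstract is an association rule, and its strength is usually measured by support and confidence. The sketch below computes both over a handful of invented shopping baskets. The function names and data are hypothetical and are not taken from the thesis.

```python
# Sketch: support and confidence of the rule {magazine} -> {newspaper}.

def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent) over the transactions."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)

# Hypothetical sales baskets
baskets = [
    {"magazine", "newspaper"},
    {"magazine", "newspaper", "pen"},
    {"magazine"},
    {"newspaper"},
    {"pen"},
]

rule_support = support(baskets, {"magazine", "newspaper"})
rule_confidence = confidence(baskets, {"magazine"}, {"newspaper"})
```

Here two of the five baskets contain both items, and two of the three magazine baskets also contain a newspaper.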
Abstract:
Frequent pattern discovery in structured data is receiving increasing attention in many application areas of science. However, the computational complexity and the large amount of data to be explored often make sequential algorithms unsuitable. In this context, high-performance distributed computing becomes a very interesting and promising approach. In this paper we present a parallel formulation of the frequent subgraph mining problem to discover interesting patterns in molecular compounds. The application is characterized by a highly irregular tree-structured computation. No estimate is available for task workloads, which show a power-law distribution over a wide range. The proposed approach allows dynamic resource aggregation and provides fault and latency tolerance. These features make the distributed application suitable for multi-domain heterogeneous environments, such as computational Grids. The distributed application has been evaluated on the well-known National Cancer Institute's HIV-screening dataset.
Abstract:
This paper provides an extended analysis of livelihood diversification in rural Tanzania, with special emphasis on artisanal and small-scale mining (ASM). Over the past decade, this sector of industry, which is labour-intensive and comprises an array of rudimentary and semi-mechanized operations, has become an indispensable economic activity throughout Sub-Saharan Africa, providing employment to a host of redundant public sector workers, retrenched large-scale mine labourers and poor farmers. In many of the region’s rural areas, it is overtaking subsistence agriculture as the primary industry. Such a pattern appears to be unfolding within the Morogoro and Mbeya regions of southern Tanzania, where findings from recent research suggest that a growing number of smallholder farmers are turning to ASM for employment and financial support. It is imperative that national rural development programmes take this trend into account and provide support to these people.
Abstract:
Artisanal and small-scale mining (ASM) is replacing smallholder farming as the principal income source in parts of rural Ghana. Structural adjustment policies have removed support for the country’s smallholders, devalued their produce substantially and stiffened competition with large-scale counterparts. Over one million people nationwide are now engaged in ASM. Findings from qualitative research in Ghana’s Eastern Region are drawn upon to improve understanding of the factors driving this pattern of rural livelihood diversification. The ASM sector and farming are shown to be complementary, contrary to common depictions in policy and academic literature.
Abstract:
In recent years, the area of data mining has experienced considerable demand for technologies that extract knowledge from large and complex data sources. There has been substantial commercial interest, as well as active research, aimed at developing new and improved approaches for extracting information, relationships, and patterns from large datasets. Artificial neural networks (NNs) are popular biologically inspired intelligent methodologies, whose classification, prediction, and pattern recognition capabilities have been used successfully in many areas, including science, engineering, medicine, business, banking, telecommunication, and many other fields. This paper highlights, from a data mining perspective, the implementation of NNs using supervised and unsupervised learning for pattern recognition, classification, prediction, and cluster analysis, and focuses the discussion on their use in bioinformatics and financial data analysis tasks. © 2012 Wiley Periodicals, Inc.
Abstract:
Groundwaters and surface waters from an area where sand is treated for industrial purposes in Analandia municipality, near the center of Sao Paulo State, Brazil, were chemically and isotopically analyzed with two aims: to evaluate whether the anthropogenic activities that have taken place over the last 6 years are affecting the quality of the hydrological resources, and to relate the hydrogeochemical behaviour of the uranium isotopes ²³⁴U and ²³⁸U to the pattern of groundwater circulation.
Abstract:
The increase in the amount of spatial data collected has motivated the development of geovisualisation techniques, which aim to provide an important resource to support knowledge extraction and decision making. One of these techniques is 3D graphs, which provide a dynamic and flexible enhancement of the analysis of results obtained by spatial data mining algorithms, particularly when several georeferenced objects occur at the same location. The original contribution of this work is the enhancement of visual resources in a computational environment for spatial data mining; the efficiency of these techniques is then demonstrated using a real database. The application proved very useful for interpreting the results obtained, such as patterns that occurred in the same locality, and for supporting activities that can follow from the visualisation of results. © 2013 Springer-Verlag.
Abstract:
Multi-element analysis of honey samples was carried out with the aim of developing a reliable method of tracing the origin of honey. Forty-two chemical elements were determined (Al, Cu, Pb, Zn, Mn, Cd, Tl, Co, Ni, Rb, Ba, Be, Bi, U, V, Fe, Pt, Pd, Te, Hf, Mo, Sn, Sb, P, La, Mg, I, Sm, Tb, Dy, Sd, Th, Pr, Nd, Tm, Yb, Lu, Gd, Ho, Er, Ce, Cr) by inductively coupled plasma mass spectrometry (ICP-MS). Then, three machine learning tools for classification and two for attribute selection were applied in order to prove that it is possible to use data mining tools to find the region where honey originated. Our results clearly demonstrate the potential of Support Vector Machine (SVM), Multilayer Perceptron (MLP) and Random Forest (RF) chemometric tools for honey origin identification. Moreover, the selection tools allowed a reduction from 42 trace element concentrations to only 5. © 2012 Elsevier Ltd. All rights reserved.
Abstract:
The Dipteran a native Brazilian insect that has become a valuable model system for developmental biology research because it provides an interesting opportunity to study a different type of insect oogenesis. Sequences from a cDNA library that was constructed with poly A + RNA from the ovaries of larvae at different ages were analyzed. Molecular characterization confirmed interesting findings, such as the presence of . The gene encodes a conserved RNA-binding protein that is required during early development for the maintenance and division of the primordial germ cells of Diptera. plays an important role in specifying the posterior regions of insect embryos and is important for abdomen formation. In the present work, we showed the spatial and temporal expression profiles of this important gene, which is involved in oogenesis and early development. Data mining techniques were used to obtain the complete sequence of . Bioinformatic tools were used to determine the following: (1) the secondary structure of the 3'-untranslated region of the mRNA, (2) the encoded protein of the isolated gene, (3) the conserved zinc-finger domains of the Nanos protein, and (4) phylogenetic analyses. Furthermore, RNA in situ hybridization and immunolocalization were used to determine mRNA and protein expression in the tissues that were studied and to define as a germ cell molecular marker.