807 results for Link Mining
Abstract:
This paper critically reflects on why, in many rural stretches of sub-Saharan Africa, scores of people engage in artisanal and small-scale mining (ASM) activity – low-tech, labour intensive mineral extraction – for lengthy periods of time. It argues that a large share of the region’s ASM operators have mounting debts which prevent them from pursuing alternative, less arduous, employment. The paper concludes with an analysis of findings from research carried out by the author in Talensi-Nabdam District, Northern Ghana, which captures the essence of the poverty trap now plaguing so many ASM communities in sub-Saharan Africa.
Abstract:
To understand the resilience of aquatic ecosystems to environmental change, it is important to determine how multiple, related environmental factors, such as near-surface air temperature and river flow, will change during the next century. This study develops a novel methodology that combines statistical downscaling and fish species distribution modeling to enhance the understanding of how global climate changes (modeled by global climate models at coarse resolution) may affect local riverine fish diversity. The novelty of this work is the downscaling framework developed to provide suitable future projections of fish habitat descriptors, focusing particularly on the hydrology, which has rarely been considered in previous studies. The proposed modeling framework was developed and tested in a major European system, the Adour-Garonne river basin (SW France, 116,000 km²), which covers distinct hydrological and thermal regions from the Pyrenees to the Atlantic coast. The simulations suggest that, by 2100, the mean annual stream flow will decrease by approximately 15% and temperature will increase by approximately 1.2 °C, on average. As a consequence, the majority of cool- and warm-water fish species are projected to expand their geographical range within the basin, while the few cold-water species will experience a reduction in their distribution. The limitations and potential benefits of the proposed modeling approach are discussed. Copyright © 2012 Elsevier B.V. All rights reserved.
Abstract:
Artisanal and small-scale mining (ASM) is replacing smallholder farming as the principal income source in parts of rural Ghana. Structural adjustment policies have removed support for the country’s smallholders, devalued their produce substantially and stiffened competition with large-scale counterparts. Over one million people nationwide are now engaged in ASM. Findings from qualitative research in Ghana’s Eastern Region are drawn upon to improve understanding of the factors driving this pattern of rural livelihood diversification. The ASM sector and farming are shown to be complementary, contrary to common depictions in policy and academic literature.
Abstract:
An analysis of observational data in the Barents Sea along a meridian at 33°30' E between 70°30' and 72°30' N has reported a negative correlation between El Niño/La Niña Southern Oscillation (ENSO) events and water temperature in the top 200 m: the temperature drops about 0.5 °C during warm ENSO events, while during cold ENSO events the top 200 m layer of the Barents Sea is warmer. Results from 1 and 1/4-degree global NEMO models show a similar response for the whole Barents Sea. During the strong warm ENSO event in 1997–1998, an anomalous anticyclonic atmospheric circulation over the Barents Sea enhances heat losses and substantially influences the Barents Sea inflow from the North Atlantic via changes in ocean currents. Under normal conditions, a warm current flows along the Scandinavian peninsula and enters the Barents Sea from the North Atlantic; after the 1997–1998 event, however, this current is weakened. During 1997–1998 the model annual mean temperature in the Barents Sea decreases by about 0.8 °C, which also results in a higher sea ice volume. In contrast, during the cold ENSO events in 1999–2000 and 2007–2008, the model shows a lower sea ice volume and annual mean temperatures in the upper layer of the Barents Sea that are higher by about 0.7 °C. An analysis of model data shows that the strength of the Atlantic inflow in the Barents Sea is the main cause of heat content variability, and is forced by changing pressure and winds in the North Atlantic. However, surface heat exchange with the atmosphere provides the means by which the Barents Sea heat budget relaxes to normal in the year following the ENSO events.
Abstract:
Many researchers have tried to assess the number of words adults know. A general conclusion which emerges from such studies is that vocabularies of English monolingual adults are very large with considerable variation. This variation is important given that the vocabulary size of schoolchildren in the early years of school is thought to materially affect subsequent educational attainment. The data is difficult to interpret, however, because of the different methodologies which researchers use. The study in this paper uses the frequency-based vocabulary size test from Goulden et al. (1990) and investigates the vocabulary knowledge of undergraduates in three British universities. The results suggest that monolingual speaker vocabulary sizes may be much smaller than is generally thought, with far less variation than is usually reported. An average figure of about 10,000 English word families emerges for entrants to university. This figure suggests that many students must struggle with the comprehension of university-level texts.
Abstract:
OBJECTIVES: The prediction of protein structure and the precise understanding of protein folding and unfolding processes remain among the greatest challenges in structural biology and bioinformatics. Computer simulations based on molecular dynamics (MD) are at the forefront of the effort to gain a deeper understanding of these complex processes. Currently, these MD simulations are usually on the order of tens of nanoseconds, generate a large amount of conformational data and are computationally expensive. More and more groups run such simulations and generate a myriad of data, which raises new challenges in managing and analyzing these data. Because of the vast range of proteins researchers want to study and simulate, the computational effort needed to generate data, the large data volumes involved, and the different types of analyses scientists need to perform, it is desirable to provide a public repository allowing researchers to pool and share protein unfolding data. METHODS: To adequately organize, manage, and analyze the data generated by unfolding simulation studies, we designed a data warehouse system that is embedded in a grid environment to facilitate the seamless sharing of available computer resources and thus enable many groups to share complex molecular dynamics simulations on a more regular basis. RESULTS: To gain insight into the conformational fluctuations and stability of the monomeric forms of the amyloidogenic protein transthyretin (TTR), molecular dynamics unfolding simulations of the monomer of human TTR have been conducted. Trajectory data and meta-data of the wild-type (WT) protein and the highly amyloidogenic variant L55P-TTR represent the test case for the data warehouse. CONCLUSIONS: Web and grid services, especially pre-defined data mining services that can run on or 'near' the data repository of the data warehouse, are likely to play a pivotal role in the analysis of molecular dynamics unfolding data.
Abstract:
The Distributed Rule Induction (DRI) project at the University of Portsmouth is concerned with distributed data mining algorithms for automatically generating rules of all kinds. In this paper we present a system architecture and its implementation for inducing modular classification rules in parallel in a local area network using a distributed blackboard system. We present initial results of a prototype implementation based on the Prism algorithm.
Abstract:
Pocket Data Mining (PDM) is our new term describing collaborative mining of streaming data in mobile and distributed computing environments. With sheer amounts of data streams now available for subscription on our smart mobile phones, using this data for decision making by means of data stream mining techniques has become achievable owing to the increasing power of these handheld devices. Wireless communication among these devices using Bluetooth and WiFi technologies has opened the door wide for collaborative mining among mobile devices within the same range that are running data mining techniques targeting the same application. This paper proposes a new architecture that we have prototyped for realizing the significant applications in this area. We have proposed using mobile software agents in this application for several reasons. Most importantly, the autonomic intelligent behaviour of the agent technology has been the driving force for using it in this application. Other efficiency reasons are discussed in detail in this paper. Experimental results showing the feasibility of the proposed architecture are presented and discussed.
Abstract:
In a world where data is captured on a large scale, the major challenge for data mining algorithms is to be able to scale up to large datasets. There are two main approaches to inducing classification rules: one is the divide and conquer approach, also known as the top down induction of decision trees; the other is the separate and conquer approach. A considerable amount of work has been done on scaling up the divide and conquer approach. However, very little work has been conducted on scaling up the separate and conquer approach. In this work we describe a parallel framework that allows the parallelisation of a certain family of separate and conquer algorithms, the Prism family. Parallelisation helps the Prism family of algorithms to harvest additional computer resources in a network of computers in order to make the induction of classification rules scale better on large datasets. Our framework also incorporates a pre-pruning facility for parallel Prism algorithms.
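For readers unfamiliar with the separate and conquer approach named in this abstract, the following is a minimal serial sketch in the spirit of the Prism family. It is not the parallel framework the abstract describes and includes no pre-pruning. The function name `learn_rules_for_class`, the tie-breaking rule, and the toy data are illustrative assumptions.

```python
def learn_rules_for_class(rows, target, label):
    """Induce rules (dicts of attribute -> value) that predict `label`.

    rows   : list of dicts mapping attribute names to categorical values
    target : name of the key holding the class
    label  : the class value the rules should predict
    """
    rules = []
    remaining = list(rows)
    # Keep inducing rules while instances of the target class are uncovered.
    while any(r[target] == label for r in remaining):
        covered = remaining
        rule = {}
        # "Conquer": add terms until the covered subset is pure.
        while any(r[target] != label for r in covered):
            candidates = sorted(
                {(a, r[a]) for r in covered for a in r
                 if a != target and a not in rule}
            )
            best, best_score = None, (-1.0, -1)
            for attr, value in candidates:
                subset = [r for r in covered if r[attr] == value]
                positives = sum(r[target] == label for r in subset)
                # Prefer the highest p(label | term); break ties by coverage.
                score = (positives / len(subset), positives)
                if score > best_score:
                    best, best_score = (attr, value), score
            if best is None:
                break  # attributes exhausted: accept an impure rule
            rule[best[0]] = best[1]
            covered = [r for r in covered if r[best[0]] == best[1]]
        rules.append(rule)
        # "Separate": drop every instance the new rule covers.
        remaining = [
            r for r in remaining
            if not all(r[a] == v for a, v in rule.items())
        ]
    return rules


# Made-up toy data, for illustration only.
toy_rows = [
    {"outlook": "sunny", "windy": "no", "play": "yes"},
    {"outlook": "sunny", "windy": "yes", "play": "no"},
    {"outlook": "rainy", "windy": "no", "play": "yes"},
    {"outlook": "rainy", "windy": "yes", "play": "no"},
]
print(learn_rules_for_class(toy_rows, "play", "yes"))  # [{'windy': 'no'}]
```

Each induced rule stands alone ("modular"), which is the property that distinguishes this family from decision trees built by divide and conquer.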
Abstract:
Inducing rules from very large datasets is one of the most challenging areas in data mining. Several approaches exist to scaling up classification rule induction to large datasets, namely data reduction and the parallelisation of classification rule induction algorithms. In the area of parallelisation of classification rule induction algorithms, most of the work has concentrated on the Top Down Induction of Decision Trees (TDIDT), also known as the ‘divide and conquer’ approach. However, powerful alternative algorithms exist that induce modular rules. Most of these alternative algorithms follow the ‘separate and conquer’ approach of inducing rules, but very little work has been done to make the ‘separate and conquer’ approach scale better on large training data. This paper examines the potential of the recently developed blackboard-based J-PMCRI methodology for parallelising modular classification rule induction algorithms that follow the ‘separate and conquer’ approach. A concrete implementation of the methodology is evaluated empirically on very large datasets.
Abstract:
Collaborative mining of distributed data streams in a mobile computing environment is referred to as Pocket Data Mining (PDM). Hoeffding tree techniques have been experimentally and analytically validated for data stream classification. In this paper, we have proposed, developed and evaluated the adoption of distributed Hoeffding trees for classifying streaming data in PDM applications. We have identified a realistic scenario in which different users equipped with smart mobile devices run a local Hoeffding tree classifier on a subset of the attributes. Thus, we have investigated the mining of vertically partitioned datasets with possible overlap of attributes, which is the more likely case. Our experimental results have validated the efficiency of our proposed model, achieving promising accuracy for real deployment.
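As background to the Hoeffding tree technique mentioned in this abstract, the sketch below shows the standard Hoeffding bound, ε = sqrt(R² ln(1/δ) / (2n)), and the usual split test built on it. It is a general illustration and is not taken from the paper. The function names are hypothetical, and setting R = log2(number of classes) assumes information gain is the split heuristic.

```python
import math


def hoeffding_bound(value_range, delta, n):
    """Return epsilon such that, with probability 1 - delta, the true mean of a
    variable with range `value_range` lies within epsilon of the mean observed
    over n independent samples."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def should_split(best_gain, second_gain, n_classes, delta, n):
    """Split a leaf once the best attribute beats the runner-up by more than
    epsilon. Assumes information gain, whose range is log2(n_classes)."""
    value_range = math.log2(n_classes)
    return (best_gain - second_gain) > hoeffding_bound(value_range, delta, n)


# Two classes, 1000 examples seen at the leaf, delta = 1e-7.
print(should_split(0.30, 0.15, n_classes=2, delta=1e-7, n=1000))  # True
print(should_split(0.30, 0.25, n_classes=2, delta=1e-7, n=1000))  # False
```

Because ε shrinks as n grows, a leaf that cannot yet separate two close attributes simply waits for more stream examples, which is what makes the method suitable for incremental learning on devices with limited memory.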
Abstract:
Pocket Data Mining (PDM) describes the full process of analysing data streams in mobile ad hoc distributed environments. Advances in mobile devices like smart phones and tablet computers have made it possible for a wide range of applications to run in such an environment. In this paper, we propose the adoption of data stream classification techniques for PDM. A thorough experimental study demonstrates that running heterogeneous (different) or homogeneous (similar) data stream classification techniques over vertically partitioned data (data partitioned according to the feature space) results in performance comparable to that of batch and centralised learning techniques.
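To make "vertically partitioned" concrete, here is a small sketch in which every device sees all records but only its own subset of the features. The round-robin assignment and the name `vertical_partitions` are assumptions made for illustration. The abstract does not say how features are assigned to devices.

```python
def vertical_partitions(rows, feature_names, n_devices):
    """Split records by feature: device i receives features i, i + n_devices, ...

    Returns one list of projected records per device. Every device keeps all
    rows, which is what distinguishes vertical from horizontal partitioning.
    """
    groups = [feature_names[i::n_devices] for i in range(n_devices)]
    return [
        [{f: row[f] for f in group} for row in rows]
        for group in groups
    ]


# Made-up record, for illustration only.
records = [{"a": 1, "b": 2, "c": 3}]
print(vertical_partitions(records, ["a", "b", "c"], 2))
# [[{'a': 1, 'c': 3}], [{'b': 2}]]
```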
Abstract:
In recent years, the area of data mining has been experiencing considerable demand for technologies that extract knowledge from large and complex data sources. There has been substantial commercial interest as well as active research in the area that aims to develop new and improved approaches for extracting information, relationships, and patterns from large datasets. Artificial neural networks (NNs) are popular biologically inspired intelligent methodologies whose classification, prediction, and pattern recognition capabilities have been utilized successfully in many areas, including science, engineering, medicine, business, banking, telecommunication, and many other fields. This paper highlights, from a data mining perspective, the implementation of NNs using supervised and unsupervised learning for pattern recognition, classification, prediction, and cluster analysis, and focuses the discussion on their usage in bioinformatics and financial data analysis tasks. © 2012 Wiley Periodicals, Inc.
Abstract:
The P-found protein folding and unfolding simulation repository is designed to allow scientists to perform data mining and other analyses across large, distributed simulation data sets. There are two storage components in P-found: a primary repository of simulation data that is used to populate the second component, and a data warehouse that contains important molecular properties. These properties may be used for data mining studies. Here we demonstrate how grid technologies can support multiple, distributed P-found installations. In particular, we look at two aspects: firstly, how grid data management technologies can be used to access the distributed data warehouses; and secondly, how the grid can be used to transfer analysis programs to the primary repositories — this is an important and challenging aspect of P-found, due to the large data volumes involved and the desire of scientists to maintain control of their own data. The grid technologies we are developing with the P-found system will allow new large data sets of protein folding simulations to be accessed and analysed in novel ways, with significant potential for enabling scientific discovery.
Abstract:
This article clarifies what was done with the sub-7-man positions in data-mining Harold van der Heijden's 'HHdbIV' database of chess studies prior to its publication. It emphasises that only positions in the main lines of studies were examined and that the information about uniqueness of move was not incorporated in HHdbIV. There is some reflection on the separate technical and artistic dimensions of study evaluation.