775 results for mining data streams
Abstract:
This paper presents visual detection and classification of light vehicles and personnel on a mine site. We capitalise on the rapid advances of ConvNet-based object recognition but highlight that a naive black-box approach results in a significant number of false positives. In particular, the lack of domain-specific training data and the unique landscape of a mine site cause a high rate of errors. We exploit the abundance of background-only images to train a k-means classifier to complement the ConvNet. Furthermore, localisation of objects of interest and a reduction in computation are enabled through region proposals. Our system is tested on over 10 km of real mine site data, and we were able to detect both light vehicles and personnel. We show that the introduction of our background model can reduce the false positive rate by an order of magnitude.
Abstract:
This research proposes a multi-dimensional model for Opinion Mining, which integrates customers' characteristics and their opinions about products (or services). Customer opinions are valuable for companies seeking to deliver the right products or services to their customers. This research presents a comprehensive framework to evaluate opinions' orientation based on products' hierarchy attributes. It also provides an alternative way to obtain opinion summaries for different groups of customers and different categories of products.
Abstract:
This paper discusses some of the sensing technologies and control approaches available for guiding robot manipulators for a class of underground mining tasks including drilling jumbos, bolting arms, shotcreters or explosive chargers. Data acquired with such sensors, in the laboratory and underground, is presented.
Abstract:
Effectively capturing opportunities requires rapid decision-making. We investigate the speed of opportunity evaluation decisions by focusing on firms' venture termination and venture advancement decisions. Experience, standard operating procedures, and confidence allow firms to make opportunity evaluation decisions faster; we propose that a firm's attentional orientation, as reflected in its project portfolio, limits the number of domains in which these speed-enhancing mechanisms can be developed. Hence firms' decision speed is likely to vary between different types of decisions. Using unique data on 3,269 mineral exploration ventures in the Australian mining industry, we find that firms with a higher degree of attention toward earlier-stage exploration activities are quicker to abandon potential opportunities in early development but slower to do so later, and that such firms are also slower to advance on potential opportunities at all stages compared to firms that focus their attention differently. Market dynamism moderates these relationships, but only with regard to initial evaluation decisions. Our study extends research on decision speed by showing that firms are not necessarily fast or slow regarding all the decisions they make, and by offering an opportunity evaluation framework that recognizes that decision makers can, in fact often do, pursue multiple potential opportunities simultaneously.
Abstract:
Product reviews are the foremost source of information for customers and manufacturers to help them make appropriate purchasing and production decisions. Natural language data is typically very sparse; the most common words are those that do not carry a lot of semantic content, and occurrences of any particular content-bearing word are rare, while co-occurrences of these words are rarer. Mining product aspects, along with corresponding opinions, is essential for Aspect-Based Opinion Mining (ABOM) as a result of the e-commerce revolution. Therefore, the need for automatic mining of reviews has reached a peak. In this work, we treat ABOM as a sequence labelling problem and propose a supervised extraction method to identify product aspects and corresponding opinions. We use Conditional Random Fields (CRFs) to solve the extraction problem and propose a feature function to enhance accuracy. The proposed method is evaluated using two different datasets. We also evaluate the effectiveness of the feature function and the optimisation through multiple experiments.
Abstract:
Bird species richness survey is one of the most intriguing ecological topics for evaluating environmental health. Here, bird species richness denotes the number of unique bird species in a particular area. Factors affecting the investigation of bird species richness include weather, observation bias, and most importantly, the prohibitive costs of conducting surveys at large spatiotemporal scales. Thanks to advances in recording techniques, these problems have been alleviated by deploying sensors for acoustic data collection. Although automated detection techniques have been introduced to identify various bird species, the innate complexity of bird vocalizations, the background noise present in the recordings and the escalating volumes of acoustic data make the determination of bird species richness a challenging task. In this paper we propose a two-step computer-assisted sampling approach for determining bird species richness in one-day acoustic data. First, a classification model is built based on acoustic indices for filtering out minutes that contain few bird species. Then the classified bird minutes are ordered by an acoustic index and the redundant temporal minutes are removed from the ranked minute sequence. The experimental results show that our method is more efficient than previous methods in directing experts to determine bird species.
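The two-step sampling described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the field names (`bird_score`, `index`), the threshold, and the reading of "redundant temporal minutes" as minutes lying too close in time to an already selected minute are all assumptions made for illustration.

```python
def select_minutes(minutes, keep_threshold=0.5, min_gap=2):
    """Return minute numbers to hand to an expert, best first.

    minutes: list of dicts with hypothetical keys
      'minute'     - minute of the day (int)
      'bird_score' - classifier score that the minute is bird-rich (assumed)
      'index'      - value of the acoustic index used for ranking (assumed)
    """
    # Step 1: filter out minutes the classifier considers poor in bird species.
    kept = [m for m in minutes if m["bird_score"] >= keep_threshold]

    # Step 2: rank the remaining minutes by the acoustic index, highest first.
    ranked = sorted(kept, key=lambda m: m["index"], reverse=True)

    # Drop temporally redundant minutes: skip any minute closer than
    # min_gap to a minute that has already been selected (an assumption).
    selected = []
    for m in ranked:
        if all(abs(m["minute"] - s["minute"]) >= min_gap for s in selected):
            selected.append(m)
    return [m["minute"] for m in selected]


example = [
    {"minute": 300, "bird_score": 0.9, "index": 0.80},
    {"minute": 301, "bird_score": 0.8, "index": 0.70},
    {"minute": 310, "bird_score": 0.2, "index": 0.95},
    {"minute": 320, "bird_score": 0.7, "index": 0.60},
]
print(select_minutes(example))
```

In this toy input, minute 310 is removed by the classifier step despite its high index, and minute 301 is dropped as redundant because it sits next to the higher-ranked minute 300.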
Abstract:
With the explosion of information resources, there is an imminent need to understand interesting text features or topics in massive text information. This thesis proposes a theoretical model to accurately weight specific text features, such as patterns and n-grams. The proposed model achieves impressive performance in two data collections, Reuters Corpus Volume 1 (RCV1) and Reuters 21578.
Abstract:
This research studied distributed computing of all-to-all comparison problems with big data sets. The thesis formalised the problem, and developed a high-performance and scalable computing framework with a programming model, data distribution strategies and task scheduling policies to solve the problem. The study considered storage usage, data locality and load balancing for performance improvement in solving the problem. The research outcomes can be applied in bioinformatics, biometrics and data mining and other domains in which all-to-all comparisons are a typical computing pattern.
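To make the all-to-all comparison pattern concrete, the sketch below enumerates every unordered pair of items and spreads the comparison tasks over workers. The round-robin policy and the function names are assumptions chosen for illustration; the thesis's actual data distribution strategies and scheduling policies are not described in the abstract.

```python
from itertools import combinations


def schedule_pairs(n_items, n_workers):
    """Assign each unordered pair (i, j), i < j, to a worker round-robin."""
    assignments = {w: [] for w in range(n_workers)}
    for k, pair in enumerate(combinations(range(n_items), 2)):
        assignments[k % n_workers].append(pair)
    return assignments


def items_needed(pairs):
    """Items a worker must hold locally to run its comparisons."""
    return sorted({i for pair in pairs for i in pair})


plan = schedule_pairs(n_items=4, n_workers=2)
for worker, pairs in plan.items():
    print(worker, pairs, items_needed(pairs))
```

Round-robin balances the number of tasks per worker but ignores data locality: in this small example one worker still needs all four items. That tension between load balancing and storage usage is the kind of trade-off the abstract refers to.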
Abstract:
Precipitation-induced runoff and leaching from milled peat mining mires by peat types: a comparative method for estimating the loading of water bodies during peat production. This research project in environmental geology has arisen out of an observed need to be able to predict more accurately the loading of watercourses with detrimental organic substances and nutrients from already existing and planned peat production areas, since the authorities' capacity for insisting on such predictions covering the whole duration of peat production in connection with evaluations of environmental impact is at present highly limited. National and international decisions regarding monitoring of the condition of watercourses and their improvement and restoration require more sophisticated evaluation methods in order to be able to forecast watercourse loading and its environmental impacts at the stage of land-use planning and preparations for peat production. The present project thus set out from the premise that it would be possible on the basis of existing mire and peat data properties to construct estimates for the typical loading from production mires over the whole duration of their exploitation. Finland has some 10 million hectares of peatland, accounting for almost a third of its total area. Macroclimatic conditions have varied in the course of the Holocene growth and development of this peatland, and with them the habitats of the peat-forming plants. Temperatures and moisture conditions have played a significant role in determining the dominant species of mire plants growing there at any particular time, the resulting mire types and the accumulation and deposition of plant remains to form the peat.
The above climatic, environmental and mire development factors, together with ditching, have contributed, and continue to contribute, to the existence of peat horizons that differ in their physical and chemical properties, leading to differences in material transport between peatlands in a natural state and mires that have been ditched or prepared for forestry and peat production. Watercourse loading from the ditching of mires or their use for peat production can have detrimental effects on river and lake environments and their recreational use, especially where oxygen-consuming organic solids and soluble organic substances and nutrients are concerned. It has not previously been possible, however, to estimate in advance the watercourse loading likely to arise from ditching and peat production on the basis of the characteristics of the peat in a mire, although earlier observations have indicated that watercourse loading from peat production can vary greatly and it has been suggested that differences in peat properties may be of significance in this. Sprinkling is used here in combination with simulations of conditions in a milled peat production area to determine the influence of the physical and chemical properties of milled peats in production mires on surface runoff into the drainage ditches and the concentrations of material in the runoff water. Sprinkling and extraction experiments were carried out on 25 samples of milled Carex (C) and Sphagnum (S) peat of humification grades H 2.5–8.5 with moisture content in the range 23.4–89% on commencement of the first sprinkling, which was followed by a second sprinkling 24 hours later. The water retention capacity of the peat was best, and surface runoff lowest, with Sphagnum and Carex peat samples of humification grades H 2.5–6 in the moisture content class 56–75%. On account of the hydrophobicity of dry peat, runoff increased in a fairly regular manner with drying of the sample from 55% to 24–30%.
Runoff from the samples with an original moisture content over 55% increased by 63% in the second round of sprinkling relative to the first, as they had practically reached saturation point on the first occasion, while those with an original moisture content below 55% retained their high runoff in the second round, due to continued hydrophobicity. The well-humified samples (H 6.5–8.5) with a moisture content over 80% showed a low water retention capacity and high runoff in both rounds of sprinkling. Loading of the runoff water with suspended solids, total phosphorus and total nitrogen, and also the chemical oxygen demand (CODMn O2), varied greatly in the sprinkling experiment, depending on the peat type and degree of humification, but concentrations of the same substances in the two sprinklings were closely or moderately closely correlated and these correlations were significant. The concentrations of suspended solids in the runoff water observed in the simulations of a peat production area and the direct surface runoff from it into the drainage ditch system in response to rain (sprinkling intensity 1.27 mm/min) varied c. 60-fold between the degrees of humification in the case of the Carex peats and c. 150-fold for the Sphagnum peats, while chemical oxygen demand varied c. 30-fold and c. 50-fold, respectively, total phosphorus c. 60-fold and c. 66-fold, total nitrogen c. 65-fold and c. 195-fold and ammonium nitrogen c. 90-fold and c. 30-fold. The increases in concentrations in the runoff water were very closely correlated with increases in humification of the peat. The correlations of the concentrations measured in extraction experiments (48 h) with peat type and degree of humification corresponded to those observed in the sprinkler experiments.
The resulting figures for the surface runoff from a peat production area into the drainage ditches simulated by means of sprinkling and material concentrations in the runoff water were combined with statistics on the mean extent of daily rainfall (0–67 mm) during the frost-free period of the year (May–October) over an observation period of 30 years to yield typical annual loading figures (kg/ha) for suspended solids (SS), chemical oxygen demand of organic matter (CODMn O2), total phosphorus (tot. P) and total nitrogen (tot. N) entering the ditches with respect to milled Carex (C) and Sphagnum (S) peats of humification grades H 2.5–8.5. In order to calculate the loading of drainage ditches from a milled peat production mire with the aid of these annual comparative values (in kg/ha), information is required on the properties of the intended production mire and its peat. Once data are available on the area of the mire, its peat depth, peat types and their degrees of humification, dry matter content, calorific value and corresponding energy content, it is possible to produce mutually comparable estimates for individual mires with respect to the annual loading of the drainage ditch system and the surrounding watercourse for the whole service life of the production area, the duration of this service life, determinations of energy content and the amount of loading per unit of energy generated (kg/MWh). In the 8 mires in the Köyhäjoki basin, Central Ostrobothnia, taken as an example, the loading of suspended solids (SS) in the drainage ditch networks calculated on the basis of the typical values obtained here and existing mire and peat data and expressed per unit of energy generated varied between the mires and horizons in the range 0.9–16.5 kg/MWh. One of the aims of this work was to develop means of making better use of existing mire and peat data and the results of corings and other field investigations.
In this respect, combination of the typical loading values (kg/ha) obtained here for S, SC, CS and C peats and the various degrees of humification (H 2.5–8.5) with the above mire and peat data by means of a computer program for the acquisition and handling of such data would enable all the information currently available and that deposited in the system in the future to be used for defining watercourse loading estimates for mires and comparing them with the corresponding estimates of energy content. The intention behind this work has been to respond to the challenge facing the energy generation industry to find larger peat production areas that exert less loading on the environment and to that facing the environmental authorities to improve the means available for estimating watercourse loading from peat production and its environmental impacts in advance. The results conform well to the initial hypothesis and to the goals laid down for the research and should enable watercourse loading from existing and planned peat production to be evaluated better in the future and the resulting impacts to be taken into account when planning land use and energy generation. The advance loading information available in this way would be of value in the selection of individual peat production areas, the planning of their exploitation, the introduction of water protection measures and the planning of loading inspections, in order to achieve controlled peat production that pays due attention to environmental considerations.
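The arithmetic behind a loading figure expressed per unit of energy can be sketched as follows. The formula (annual loading in kg/ha multiplied by area and service life, then divided by total energy content) is a plausible reading of the abstract, not a procedure confirmed by it, and every input value below is hypothetical.

```python
def loading_per_energy(annual_loading_kg_per_ha, area_ha,
                       service_life_years, energy_content_mwh):
    """Return (total loading in kg, loading in kg/MWh) for one mire.

    All arguments are illustrative inputs: a typical annual loading value
    for the peat type and humification grade, the production area, the
    service life of the area, and the energy content of its peat.
    """
    total_kg = annual_loading_kg_per_ha * area_ha * service_life_years
    return total_kg, total_kg / energy_content_mwh


# Hypothetical mire: 100 kg/ha per year of suspended solids, 50 ha,
# 20 years of production, 100,000 MWh of energy in the extracted peat.
total, per_mwh = loading_per_energy(100.0, 50.0, 20.0, 100_000.0)
print(total, per_mwh)
```

Computing the same two numbers for several mires gives mutually comparable figures of the kind the abstract reports for the Köyhäjoki basin.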
Abstract:
Contamination of urban streams is a rising topic worldwide, but the assessment and investigation of stormwater-induced contamination is limited by the large amount of water quality data needed to obtain reliable results. In this study, stream bed sediments were studied to determine their degree of contamination and their applicability in monitoring aquatic metal contamination in urban areas. The interpretation of sedimentary metal concentrations is, however, not straightforward, since the concentrations commonly show spatial and temporal variations as a response to natural processes. The variations of and controls on metal concentrations were examined at different scales to increase the understanding of the usefulness of sediment metal concentrations in detecting anthropogenic metal contamination patterns. The acid-extractable concentrations of Zn, Cu, Pb and Cd were determined from the surface sediments and water of small streams in the Helsinki Metropolitan region, southern Finland. The data consist of two datasets: sediment samples from 53 sites located in the catchment of the Stream Gräsanoja and sediment and water samples from 67 independent catchments scattered around the metropolitan region. Moreover, the sediment samples were analyzed for their physical and chemical composition (e.g. total organic carbon, clay-%, Al, Li, Fe, Mn) and the speciation of metals (in the dataset of the Stream Gräsanoja). The metal concentrations revealed that the stream sediments were moderately contaminated and caused no immediate threat to the biota. However, at some sites the sediments appeared to be polluted with Cu or Zn. The metal concentrations increased with increasing intensity of urbanization, but site-specific factors, such as point sources, were responsible for the occurrence of the highest metal concentrations. The sediment analyses thus revealed a need for more detailed studies on the processes and factors that cause the hot spot metal concentrations.
The sediment composition and metal speciation analyses indicated that organic matter is a very strong indirect control on metal concentrations, and it should be accounted for when studying anthropogenic metal contamination patterns. The fine-scale spatial and temporal variations of metal concentrations were low enough to allow meaningful interpretation of substantial metal concentration differences between sites. Furthermore, the metal concentrations in the stream bed sediments were correlated with the urbanization of the catchment better than the total metal concentrations in the water phase. These results suggest that stream sediments show true potential for wider use in detecting the spatial differences in metal contamination of urban streams. Consequently, using the sediment approach, regional estimates of the stormwater-related metal contamination could be obtained fairly cost-effectively, and the stability and reliability of results would be higher compared to analyses of single water samples. Nevertheless, water samples are essential in analysing the dissolved concentrations of metals, particularly in momentary discharges from point sources.
Abstract:
A central tenet in the theory of reliability modelling is the quantification of the probability of asset failure. In general, reliability depends on asset age and the maintenance policy applied. Usually, failure and maintenance times are the primary inputs to reliability models. However, for many organisations, different aspects of these data are often recorded in different databases (e.g. work order notifications, event logs, condition monitoring data, and process control data). These recorded data cannot be interpreted individually, since they typically do not have all the information necessary to ascertain failure and preventive maintenance times. This paper presents a methodology for the extraction of failure and preventive maintenance times using commonly-available, real-world data sources. A text-mining approach is employed to extract keywords indicative of the source of the maintenance event. Using these keywords, a Naïve Bayes classifier is then applied to attribute each machine stoppage to one of two classes: failure or preventive. The accuracy of the algorithm is assessed and the classified failure time data are then presented. The applicability of the methodology is demonstrated on a maintenance data set from an Australian electricity company.
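The classification step in this abstract, attributing each stoppage to "failure" or "preventive" from extracted keywords, can be illustrated with a small multinomial Naïve Bayes sketch using Laplace smoothing. The keywords and training records are invented for illustration, and this is a generic textbook formulation rather than the paper's algorithm.

```python
import math
from collections import Counter, defaultdict


def train_nb(examples):
    """examples: iterable of (keywords, label) pairs."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for keywords, label in examples:
        class_counts[label] += 1
        for word in keywords:
            word_counts[label][word] += 1
            vocab.add(word)
    return class_counts, word_counts, vocab


def classify(model, keywords):
    """Return the label with the highest log-posterior score."""
    class_counts, word_counts, vocab = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n in class_counts.items():
        score = math.log(n / total)  # log prior
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in keywords:
            if word in vocab:  # ignore keywords never seen in training
                score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical keywords extracted from work-order text.
records = [
    (["tripped", "fault"], "failure"),
    (["broken", "fault"], "failure"),
    (["scheduled", "inspection"], "preventive"),
    (["scheduled", "service"], "preventive"),
]
model = train_nb(records)
print(classify(model, ["fault", "tripped"]))
print(classify(model, ["scheduled", "service"]))
```

Once every stoppage carries a class label, the timestamps of the "failure" events give the failure times that a reliability model needs as input.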
Abstract:
This thesis increased researchers' understanding of the relationship between operations and maintenance in underground longwall coal mines, using data from a Queensland underground coal mine. The thesis explores various relationships between recorded variables. Issues with human-recorded data were uncovered, and the results emphasised the significance of variables associated with conveyor operation in explaining production.
Abstract:
Multi-document summarization, which addresses the problem of information overload, has been widely used in various real-world applications. Most existing approaches adopt a term-based representation for documents, which limits the performance of multi-document summarization systems. In this paper, we propose a novel pattern-based topic model (PBTMSum) for multi-document summarization. PBTMSum combines pattern mining techniques with LDA topic modelling to generate discriminative and semantically rich representations for topics and documents, so that the most representative and non-redundant sentences can be selected to form a succinct and informative summary. Extensive experiments are conducted on the Document Understanding Conference (DUC) 2007 data. The results demonstrate the effectiveness and efficiency of the proposed approach.
Abstract:
The idea of extracting knowledge in process mining is a descendant of data mining. Both mining disciplines emphasise data flow and relations among elements in the data. Unfortunately, challenges have been encountered when working with the data flow and relations. One of the challenges is that the representation of the data flow between a pair of elements or tasks is insufficiently simplified and formulated, as it considers only a one-to-one data flow relation. In this paper, we discuss how the effectiveness of knowledge representation can be extended in both disciplines. To this end, we introduce a new representation of the data flow and dependency formulation using a flow graph. The flow graph solves the issue of the insufficiency of presenting other relation types, such as many-to-one and one-to-many relations. As an experiment, a new evaluation framework is applied to the Teleclaim process in order to show how this method can provide us with more precise results when compared with other representations.
Abstract:
The Social Water Assessment Protocol (SWAP) is a tool consisting of a series of questions on fourteen themes designed to capture the social context of water around a mine site. A pilot study of the SWAP, conducted in Prestea-Huni Valley, Ghana, showed that some communities were concerned about whether the groundwater was potable. The mining company’s concern was that there was a cycle of dependency amongst communities that received treated water from the mining company. The pilot identified potential data sources and stakeholder groups for each theme, gaps in themes and suggested refinements to questions to improve the SWAP.