788 results for data mining applications


Relevance:

90.00%

Publisher:

Abstract:

The interest in using information to improve the quality of life in large urban areas and the efficiency of their governance has been around for decades. Nevertheless, improvements in Information and Communications Technology have sparked a new dynamic in academic research, usually under the umbrella term of Smart Cities. The concept of a Smart City can probably be translated, in a simplified version, into cities that are lived in, managed and developed in an information-saturated environment. While this makes perfect sense and the benefits of such a concept are easy to foresee, several significant challenges still need to be tackled before this vision can materialize. In this work we aim to provide a small contribution in this direction, one that maximizes the relevance of the available information resources. One of the most detailed and geographically relevant information resources available for the study of cities is the census, more specifically the data available at block level (Subsecção Estatística). In this work, we use Self-Organizing Maps (SOM) and the Geo-SOM variant to explore the block-level data from the Portuguese census for the city of Lisbon, for the years 2001 and 2011. We focus on gauging change, proposing ways to compare the two time periods, which have two different underlying geographical bases. We proceed with the analysis of the data using different SOM variants, aiming to produce a two-fold portrait: one of the evolution of Lisbon during the first decade of the 21st century, the other of how the census dataset and SOMs can be used to produce an informational framework for the study of cities.
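As a rough illustration of the Self-Organizing Map idea mentioned in this abstract, the sketch below performs a single online update of a tiny one-dimensional SOM. It is a minimal sketch under stated assumptions, not the authors' implementation: the names `best_matching_unit` and `som_step`, the Gaussian neighbourhood, and the toy data are all hypothetical, and the Geo-SOM variant is not reproduced here.

```python
import math

def best_matching_unit(weights, x):
    """Index of the unit whose weight vector is closest to x in data space."""
    return min(range(len(weights)), key=lambda i: math.dist(weights[i], x))

def som_step(weights, x, lr, radius):
    """One online update of a 1-D SOM whose units are laid out on a line.

    Assumption: a Gaussian neighbourhood measured on the map grid
    (index distance), which is a common textbook choice.
    """
    bmu = best_matching_unit(weights, x)
    for i, w in enumerate(weights):
        h = math.exp(-((i - bmu) ** 2) / (2 * radius ** 2))
        # Move each unit towards the sample, scaled by its grid proximity to the BMU.
        weights[i] = [wi + lr * h * (xi - wi) for wi, xi in zip(w, x)]
    return bmu

# Toy data (hypothetical): three units and one two-attribute sample.
units = [[0.0, 0.0], [0.5, 0.5], [1.0, 1.0]]
sample = [0.9, 1.0]
winner = som_step(units, sample, lr=0.5, radius=1.0)
```

In a real analysis the update would be repeated over many samples while the learning rate and radius shrink; the single step here only shows how the winning unit and its map neighbours are pulled towards an observation.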

Relevance:

90.00%

Publisher:

Abstract:

Due to advances in information technology (e.g., digital video cameras, ubiquitous sensors), the automatic detection of human behaviors from video has recently emerged as a research topic. In this paper, we perform a systematic literature review of recent work on this topic, from 2000 to 2014, covering a selection of 193 papers retrieved from six major scientific publishers. The selected papers were classified into three main subjects: detection techniques, datasets and applications. The detection techniques were divided into four categories (initialization, tracking, pose estimation and recognition). The list of datasets includes eight examples (e.g., Hollywood action). Finally, several application areas were identified, including human detection, abnormal activity detection, action recognition, player modeling and pedestrian detection. Our analysis provides a road map to guide future research on designing automatic visual human behavior detection systems.

Relevance:

90.00%

Publisher:

Abstract:

Customer lifetime value (LTV) uses client characteristics, such as recency, frequency and monetary (RFM) value, to describe the value of a client through time in terms of profitability. We present the concept of LTV applied to telemarketing for improving the return on investment, using a recent (from 2008 to 2013) and real case study of bank campaigns to sell long-term deposits. The goal was to benefit from the history of past contacts to extract additional knowledge. A total of twelve LTV input variables were tested, under a forward selection method and using a realistic rolling windows scheme, highlighting the validity of five new LTV features. The results achieved by our LTV data-driven approach using neural networks allowed an improvement of up to 4 pp in the Lift cumulative curve for targeting the deposit subscribers when compared with a baseline model (with no history data). Explanatory knowledge was also extracted from the proposed model, revealing two highly relevant LTV features: the last result of the previous campaign to sell the same product and the frequency of past client successes. The obtained results are particularly valuable for contact center companies, which can improve predictive performance without even having to ask the companies they serve for more information.
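To make the RFM idea in this abstract concrete, the sketch below derives recency, frequency and monetary values from a client's past contacts. It is only an illustrative sketch: the `Contact` record, the `rfm_features` function and the sample history are hypothetical, and the twelve LTV variables tested in the paper are not listed in the abstract and are not reproduced here.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class Contact:
    day: date        # when the client was contacted
    success: bool    # whether the client subscribed in that contact
    amount: float    # value of the subscription (0.0 if none)

def rfm_features(history, today):
    """Return (recency_days, frequency, monetary) using past contacts only.

    Assumed definitions: recency is days since the last contact,
    frequency is the number of past successes, and monetary is the
    total value of those successes.
    """
    past = [c for c in history if c.day < today]
    wins = [c for c in past if c.success]
    recency = (today - max(c.day for c in past)).days if past else None
    frequency = len(wins)
    monetary = sum(c.amount for c in wins)
    return recency, frequency, monetary

# Hypothetical contact history for one client.
history = [
    Contact(date(2012, 1, 10), True, 500.0),
    Contact(date(2012, 6, 1), False, 0.0),
    Contact(date(2013, 3, 15), True, 250.0),
]
features = rfm_features(history, date(2013, 4, 14))  # (30, 2, 750.0)
```

Filtering on `c.day < today` keeps the features free of information from the contact being predicted, which is in the spirit of the rolling windows evaluation the abstract describes.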

Relevance:

90.00%

Publisher:

Abstract:

In highway construction, earthworks refer to the tasks of excavation, transportation, spreading and compaction of geomaterial (e.g. soil, rockfill and soil-rockfill mixture). Because they rely heavily on machinery and repetitive processes, these tasks are highly amenable to optimization. In this context, Artificial Intelligence techniques, such as Data Mining (DM) and modern optimization, can be applied to earthworks. A survey of these applications shows that they focus on the optimization of specific objectives and/or construction phases, making it possible to identify the capabilities and limitations of the analyzed techniques. Building on the drawbacks identified in these techniques, this paper describes a novel intelligent earthwork optimization system, capable of integrating DM, modern optimization and GIS technologies in order to optimize the earthwork processes throughout all phases of design and construction work. This integrated system allows significant savings in time, cost and gas emissions, contributing to more sustainable construction.

Relevance:

90.00%

Publisher:

Abstract:

Integrated master's dissertation in Information Systems Engineering and Management

Relevance:

90.00%

Publisher:

Abstract:

Similarity-based operations, similarity join, similarity grouping, data integration

Relevance:

90.00%

Publisher:

Abstract:

Data mining, frequent pattern mining, database mining, mining algorithms in SQL

Relevance:

90.00%

Publisher:

Abstract:

Visual data mining, multi-dimensional scaling, POLARMAP, Sammon's mapping, clustering, outlier detection

Relevance:

90.00%

Publisher:

Abstract:

Magdeburg, Univ., Fak. für Informatik, Diss., 2013

Relevance:

90.00%

Publisher:

Abstract:

The algorithmic approach to data modelling has developed rapidly in recent years; in particular, methods based on data mining and machine learning have been used in a growing number of applications. These methods follow a data-driven methodology, aiming at providing the best possible generalization and predictive abilities instead of concentrating on the properties of the data model. One of the most successful groups of such methods is known as Support Vector algorithms. Following the fruitful developments in applying Support Vector algorithms to spatial data, this paper introduces a new extension of the traditional support vector regression (SVR) algorithm. This extension allows for the simultaneous modelling of environmental data at several spatial scales. The joint influence of environmental processes presenting different patterns at different scales is learned automatically from data, providing the optimum mixture of short- and large-scale models. The method is adaptive to the spatial scale of the data. With this advantage, it can provide an efficient means of modelling local anomalies that may typically arise in the early phase of an environmental emergency. However, the proposed approach still requires some prior knowledge of the possible existence of such short-scale patterns. This is a possible limitation of the method for its implementation in early warning systems. The purpose of this paper is to present the multi-scale SVR model and to illustrate its use with an application to the mapping of Cs-137 activity given the measurements taken in the region of Briansk following the Chernobyl accident.
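One common way to let a kernel method see several spatial scales at once is to combine Gaussian kernels of different widths. The sketch below shows that idea only; it is an assumption about how such a mixture could look, not the paper's formulation. In particular, the function names, the two length scales and the fixed mixing weights are hypothetical, whereas the abstract states that the mixture is learned from data.

```python
import math

def rbf(x, y, sigma):
    """Gaussian (RBF) kernel with length scale sigma."""
    d2 = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-d2 / (2 * sigma ** 2))

def multiscale_kernel(x, y, scales, weights):
    """Weighted sum of RBF kernels, one per spatial scale.

    A non-negative weighted sum of valid kernels is itself a valid kernel,
    so this could be plugged into a standard SVR solver.
    """
    return sum(w * rbf(x, y, s) for s, w in zip(scales, weights))

# Hypothetical settings: a short-range and a long-range component, equally weighted.
scales = (1.0, 10.0)
weights = (0.5, 0.5)

near = multiscale_kernel((0.0, 0.0), (0.5, 0.0), scales, weights)
far = multiscale_kernel((0.0, 0.0), (3.0, 4.0), scales, weights)
```

For the distant pair the short-scale term is almost zero, so the similarity comes almost entirely from the long-scale term; for the nearby pair both terms contribute. This is the behaviour that lets a local anomaly be modelled without disturbing the regional trend.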

Relevance:

90.00%

Publisher:

Abstract:

In the past, sensor networks in cities have been limited to fixed sensors, embedded in particular locations, under centralised control. Today, new applications can leverage wireless devices and use them as sensors to create aggregated information. In this paper, we show that the emerging patterns unveiled through the analysis of large sets of aggregated digital footprints can provide novel insights into how people experience the city and into some of the drivers behind these emerging patterns. In particular, we explore the capacity to quantify the evolution of the attractiveness of urban space with a case study in the area of the New York City Waterfalls, a public art project of four man-made waterfalls rising from the New York Harbor. Methods to study the impact of an event of this nature are traditionally based on the collection of static information such as surveys and ticket-based people counts, which allow estimates of visitors' presence in specific areas over time to be generated. In contrast, our contribution makes use of the dynamic data that visitors generate, such as the density and distribution of aggregate phone calls and photos taken in different areas of interest and over time. Our analysis provides novel ways to quantify the impact of a public event on the distribution of visitors and on the evolution of the attractiveness of nearby points of interest. This information has potential uses for local authorities and researchers, as well as service providers such as mobile network operators.

Relevance:

90.00%

Publisher:

Abstract:

Metabolite profiling is critical in many aspects of the life sciences, particularly natural product research. Obtaining precise information on the chemical composition of complex natural extracts (metabolomes) that are primarily obtained from plants or microorganisms is a challenging task that requires sophisticated, advanced analytical methods. In this respect, significant advances in hyphenated chromatographic techniques (LC-MS, GC-MS and LC-NMR in particular), as well as data mining and processing methods, have occurred over the last decade. Together, these tools, in combination with bioassay profiling methods, play an important role in metabolomics for the purposes of both peak annotation and dereplication in natural product research. In this review, a survey of the techniques that are used for generic and comprehensive profiling of secondary metabolites in natural extracts is provided. The various approaches (chromatographic methods: LC-MS, GC-MS and LC-NMR; direct spectroscopic methods: NMR and DIMS) are discussed with respect to their resolution and sensitivity for extract profiling. In addition, the structural information that can be generated through these techniques, alone or in combination, is compared in relation to the identification of metabolites in complex mixtures. Analytical strategies with applications to natural extracts and novel methods that have strong potential, regardless of how often they are used, are discussed with respect to their potential applications and future trends.

Relevance:

90.00%

Publisher:

Abstract:

Recently, kernel-based Machine Learning methods have gained great popularity in many data analysis and data mining fields: pattern recognition, biocomputing, speech and vision, engineering, remote sensing, etc. The paper describes the use of kernel methods for processing large datasets from environmental monitoring networks. Several typical problems of the environmental sciences and their solutions provided by kernel-based methods are considered: classification of categorical data (soil type classification), mapping of continuous environmental and pollution data (pollution of soil by radionuclides), and mapping with auxiliary information (climatic data from the Aral Sea region). Promising developments, such as automatic emergency hot spot detection and monitoring network optimization, are discussed as well.

Relevance:

90.00%

Publisher:

Abstract:

Many classifiers achieve high levels of accuracy but have limited applicability in real world situations because they do not lead to a greater understanding of, or insight into, the way features influence the classification. In areas such as health informatics, a classifier that clearly identifies the influences on classification can be used to direct research and formulate interventions. This research investigates the practical applications of Automated Weighted Sum (AWSum), a classifier that provides accuracy comparable to other techniques whilst providing insight into the data. This is achieved by calculating a weight for each feature value that represents its influence on the class value. The merits of this approach in classification and insight are evaluated on Cystic Fibrosis and Diabetes datasets, with positive results.
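The abstract does not give the weighting formula, so the sketch below shows only one plausible reading of "a weight for each feature value": the difference between the share of positive and negative training examples that carry that value, summed over a record's feature values. The functions `fit_weights` and `score`, the weighting rule and the toy records are assumptions for illustration and should not be taken as the published AWSum definition.

```python
from collections import Counter, defaultdict

def fit_weights(rows, labels, positive):
    """Weight each (feature index, value) pair in [-1, 1].

    Assumed rule: P(positive | value) - P(not positive | value),
    estimated from simple counts.
    """
    counts = defaultdict(Counter)
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            counts[(i, value)][label] += 1
    weights = {}
    for key, label_counts in counts.items():
        total = sum(label_counts.values())
        pos = label_counts[positive]
        weights[key] = (pos - (total - pos)) / total
    return weights

def score(row, weights):
    """Sum the weights of a record's feature values; unseen values count as 0."""
    return sum(weights.get((i, value), 0.0) for i, value in enumerate(row))

# Hypothetical toy records: (marker level, family history).
rows = [("high", "yes"), ("high", "no"), ("low", "no"), ("low", "no")]
labels = ["sick", "sick", "well", "well"]
weights = fit_weights(rows, labels, positive="sick")

positive_like = score(("high", "no"), weights)
negative_like = score(("low", "no"), weights)
```

Because every weight is tied to a single feature value, the `weights` dictionary can be read directly: values near 1 push towards the positive class and values near -1 push away from it, which is the kind of insight the abstract emphasises.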