Digital Library

847 results for Data Mining, Big Data, Consumi energetici, Weka Data Cleaning

Exploratory Multivariate Statistical Methods Applied to Pharmaceutical Industry CRM Data

Relevance: 70.00%

Abstract:

Dissertation presented as a partial requirement for obtaining the degree of Master in Statistics and Information Management.

Linked Data based Health Information Representation, Visualization and Retrieval System on the Semantic Web

Relevance: 70.00%

Abstract:

Dissertation submitted in partial fulfillment of the requirements for the Degree of Master of Science in Geospatial Technologies.

Web-application for gathering, analyzing, and processing health information about allergy data

Relevance: 70.00%

Abstract:

Dissertation submitted in partial fulfillment of the requirements for the Degree of Master of Science in Geospatial Technologies.

On the cloud deployment of a session abstraction for service/data aggregation

Relevance: 70.00%

Abstract:

Dissertation for obtaining the degree of Master in Informatics Engineering.

Collecting data for indoor mapping of the University of Münster via a location based game

Relevance: 70.00%

Abstract:

Dissertation submitted in partial fulfillment of the requirements for the Degree of Master of Science in Geospatial Technologies.

Evaluation and optimization of a session-based middleware for data management

Relevance: 70.00%

Abstract:

Dissertation for obtaining the degree of Master in Informatics Engineering.

Contributions towards smart cities: exploring block level census data for the characterization of change in Lisbon

Relevance: 70.00%

Abstract:

The interest in using information to improve the quality of life in large urban areas and the efficiency of their governance has been around for decades. Nevertheless, improvements in Information and Communications Technology have sparked a new dynamic in academic research, usually under the umbrella term of Smart Cities. The concept of a Smart City can probably be translated, in a simplified version, into cities that are lived in, managed and developed in an information-saturated environment. While this makes perfect sense and the benefits of such a concept are easy to foresee, several significant challenges still need to be tackled before this vision can materialize. In this work we aim to provide a small contribution in this direction, one that maximizes the relevance of the available information resources. One of the most detailed and geographically relevant information resources available for the study of cities is the census, more specifically the data available at block level (Subsecção Estatística). In this work, we use Self-Organizing Maps (SOM) and the variant Geo-SOM to explore the block level data from the Portuguese census of the city of Lisbon for the years 2001 and 2011. We focus on gauging change, proposing ways to compare the two time periods, which have two different underlying geographical bases. We proceed with the analysis of the data using different SOM variants, aiming to produce a two-fold portrait: one of the evolution of Lisbon during the first decade of the 21st century, the other of how the census dataset and SOMs can be used to produce an informational framework for the study of cities.

From a NoSQL data source to a business intelligence solution: An experiment

Relevance: 70.00%

Abstract:

We are living in the era of Big Data, a time characterized by the continuous creation of vast amounts of data, originating from different sources and in different formats. First with the rise of social networks and, more recently, with the advent of the Internet of Things (IoT), in which everyone and (eventually) everything is linked to the Internet, data with enormous potential for organizations is being continuously generated. In order to be more competitive, organizations want to access and explore all the richness present in those data. Indeed, Big Data is only as valuable as the insights organizations gather from it to make better decisions, which is the main goal of Business Intelligence. In this paper we describe an experiment in which data obtained from a NoSQL data source (a database technology explicitly developed to deal with the specificities of Big Data) is used to feed a Business Intelligence solution.

Discrimination of Brazilian propolis according to the seasoning using chemometrics and machine learning based on UV-Vis scanning data

Relevance: 70.00%

Abstract:

Propolis is a chemically complex biomass produced by honeybees (Apis mellifera) from plant resins combined with salivary enzymes, beeswax, and pollen. The biological activities described for propolis have also been identified for the resins of the donor plants, but standardizing the chemical composition and biological effects of propolis still requires a better understanding of how seasonality influences the chemical constituents of that raw material. Since propolis quality depends, among other variables, on the local flora, which is strongly influenced by (a)biotic factors over the seasons, unraveling the effect of the harvest season on the propolis chemical profile is an issue of recognized importance. Fast, cheap, and robust analytical techniques seem to be the best choice for large scale quality control processes in the most demanding markets, e.g., human health applications. To that end, UV-Visible (UV-Vis) scanning spectrophotometry was adopted for hydroalcoholic extracts (HE) of seventy-three propolis samples collected over the seasons in 2014 (summer, spring, autumn, and winter) and 2015 (summer and autumn) in Southern Brazil. Machine learning and chemometrics techniques were then applied to the UV-Vis dataset to gain insight into the effect of seasonality on the claimed chemical heterogeneity of propolis samples, as determined by changes in the flora of the geographic region under study. Descriptive and classification models were built following a chemometric approach, i.e., principal component analysis (PCA) and hierarchical clustering analysis (HCA), supported by scripts written in the R language. The UV-Vis profiles combined with chemometric analysis made it possible to identify a typical pattern in propolis samples collected in the summer. Importantly, the discrimination based on PCA could be improved by using the dataset of the fingerprint region of phenolic compounds (λ = 280-400 nm), suggesting that, besides their biological activities, those secondary metabolites also play a relevant role in the discrimination and classification of that complex matrix through bioinformatics tools. Finally, a series of machine learning approaches, e.g., partial least squares-discriminant analysis (PLS-DA), k-Nearest Neighbors (kNN), and Decision Trees, proved complementary to PCA and HCA, providing relevant information for sample discrimination.

Forecasting human entrances at a commercial store using facial recognition data

Relevance: 70.00%

Abstract:

Integrated master's dissertation in Engineering and Management of Information Systems.

Efficient similarity-based operations for data integration

Relevance: 70.00%

Abstract:

Similarity-based operations, similarity join, similarity grouping, data integration

Visual data analysis in air traffic management

Relevance: 70.00%

Abstract:

Visual data mining, multi-dimensional scaling, POLARMAP, Sammon's mapping, clustering, outlier detection

Comparison of Contemporary Protocols for High-speed Data Transport via 10 Gbps WAN Connections

Relevance: 70.00%

Abstract:

This work compares open-source and proprietary transport protocols for high-speed data transmission via IP networks. The commonly used TCP needs significant improvement, since it was developed as a general-purpose transport protocol and first introduced four decades ago. In today's networks, TCP does not meet all of society's communication needs. As a result, other transport protocols have been developed and successfully used, e.g., for Big Data movement. Within the scope of this research, the following protocols were investigated for their efficiency on 10 Gbps links: UDT, RBUDP, MTP and RWTP. The protocols were tested under different impairments, such as Round Trip Times of up to 400 ms and packet losses of up to 2%. The parameters investigated are the data rate under different network conditions, the CPU load of the sender and receiver during the experiments, the size of the feedback data, the CPU usage per Gbps, and the amount of feedback data per GiByte of effectively transmitted data. RWTP showed the best performance and fair resource consumption. Among the open-source projects, RBUDP showed the best behavior.

Mapping of Environmental Data Using Kernel-Based Methods

Relevance: 70.00%

Abstract:

Recently, kernel-based Machine Learning methods have gained great popularity in many data analysis and data mining fields: pattern recognition, biocomputing, speech and vision, engineering, remote sensing, etc. The paper describes the use of kernel methods to process large datasets from environmental monitoring networks. Several typical problems of the environmental sciences and their solutions provided by kernel-based methods are considered: classification of categorical data (soil type classification), mapping of continuous environmental and pollution information (pollution of soil by radionuclides), and mapping with auxiliary information (climatic data from the Aral Sea region). Promising developments, such as automatic emergency hot spot detection and monitoring network optimization, are discussed as well.

Statistical learning theory for geospatial data. Case study: Aral sea

Relevance: 70.00%

Abstract:

In recent years there has been an explosive growth in the development of adaptive and data-driven methods. One efficient data-driven approach is based on statistical learning theory (SLT) (Vapnik 1998). The theory is based on the Structural Risk Minimisation (SRM) principle and has a solid statistical background. When applying SRM we try not only to reduce the training error (to fit the available data with a model), but also to reduce the complexity of the model and the generalisation error. Many nonlinear learning procedures recently developed in neural networks and statistics can be understood and interpreted in terms of the structural risk minimisation inductive principle. A recent methodology based on SRM is called Support Vector Machines (SVM). At present SLT is still under intensive development and SVM are finding new areas of application (www.kernel-machines.org). SVM develop robust, nonlinear data models with excellent generalisation abilities, which is very important both for monitoring and for forecasting. SVM are extremely good when the input space is high dimensional and the training data set is not big enough to develop a corresponding nonlinear model. Moreover, SVM use only support vectors to derive decision boundaries. This opens the way to sampling optimization, estimation of noise in data, quantification of data redundancy, etc. A presentation of SVM for spatially distributed data is given in (Kanevski and Maignan 2004).
