818 resultados para Clustering algorithm


Relevância:

20.00% 20.00%

Publicador:

Resumo:

A Work Project, presented as part of the requirements for the Award of a Masters Degree in Management from the NOVA – School of Business and Economics

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Trabalho de Projeto apresentado como requisito parcial para obtenção do grau de Mestre em Estatística e Gestão de Informação

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Classical serological screening assays for Chagas' disease are time consuming and subjective. The objective of the present work is to evaluate the enzyme immuno-assay (ELISA) methodology and to propose an algorithm for blood banks to be applied to Chagas' disease. Seven thousand, nine hundred and ninety nine blood donor samples were screened by both reverse passive hemagglutination (RPHA) and indirect immunofluorescence assay (IFA). Samples reactive on RPHA and/or IFA were submitted to supplementary RPHA, IFA and complement fixation (CFA) tests. This strategy allowed us to create a panel of 60 samples to evaluate the ELISA methodology from 3 different manufacturers. The sensitivity of the screening by IFA and the 3 different ELISA's was 100%. The specificity was better on ELISA methodology. For Chagas disease, ELISA seems to be the best test for blood donor screening, because it showed high sensitivity and specificity, it is not subjective and can be automated. Therefore, it was possible to propose an algorithm to screen samples and confirm donor results at the blood bank.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Dissertação para obtenção do Grau de Mestre em Logica Computicional

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Human Activity Recognition systems require objective and reliable methods that can be used in the daily routine and must offer consistent results according with the performed activities. These systems are under development and offer objective and personalized support for several applications such as the healthcare area. This thesis aims to create a framework for human activities recognition based on accelerometry signals. Some new features and techniques inspired in the audio recognition methodology are introduced in this work, namely Log Scale Power Bandwidth and the Markov Models application. The Forward Feature Selection was adopted as the feature selection algorithm in order to improve the clustering performances and limit the computational demands. This method selects the most suitable set of features for activities recognition in accelerometry from a 423th dimensional feature vector. Several Machine Learning algorithms were applied to the used accelerometry databases – FCHA and PAMAP databases - and these showed promising results in activities recognition. The developed algorithm set constitutes a mighty contribution for the development of reliable evaluation methods of movement disorders for diagnosis and treatment applications.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Diffusion Kurtosis Imaging (DKI) is a fairly new magnetic resonance imag-ing (MRI) technique that tackles the non-gaussian motion of water in biological tissues by taking into account the restrictions imposed by tissue microstructure, which are not considered in Diffusion Tensor Imaging (DTI), where the water diffusion is considered purely gaussian. As a result DKI provides more accurate information on biological structures and is able to detect important abnormalities which are not visible in standard DTI analysis. This work regards the development of a tool for DKI computation to be implemented as an OsiriX plugin. Thus, as OsiriX runs under Mac OS X, the pro-gram is written in Objective-C and also makes use of Apple’s Cocoa framework. The whole program is developed in the Xcode integrated development environ-ment (IDE). The plugin implements a fast heuristic constrained linear least squares al-gorithm (CLLS-H) for estimating the diffusion and kurtosis tensors, and offers the user the possibility to choose which maps are to be generated for not only standard DTI quantities such as Mean Diffusion (MD), Radial Diffusion (RD), Axial Diffusion (AD) and Fractional Anisotropy (FA), but also DKI metrics, Mean Kurtosis (MK), Radial Kurtosis (RK) and Axial Kurtosis (AK).The plugin was subjected to both a qualitative and a semi-quantitative analysis which yielded convincing results. A more accurate validation pro-cess is still being developed, after which, and with some few minor adjust-ments the plugin shall become a valid option for DKI computation

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Saccharomyces cerevisiae as well as other microorganisms are frequently used in industry with the purpose of obtain different kind of products that can be applied in several areas (research investigation, pharmaceutical compounds, etc.). In order to obtain high yields for the desired product, it is necessary to make an adequate medium supplementation during the growth of the microorganisms. The higher yields are typically reached by using complex media, however the exact formulation of these media is not known. Moreover, it is difficult to control the exact composition of complex media, leading to batch-to-batch variations. So, to overcome this problem, some industries choose to use defined media, with a defined and known chemical composition. However these kind of media, many times, do not reach the same high yields that are obtained by using complex media. In order to obtain similar yield with defined media the addition of many different compounds has to be tested experimentally. Therefore, the industries use a set of empirical methods with which it is tried to formulate defined media that can reach the same high yields as complex media. In this thesis, a defined medium for Saccharomyces cerevisiae was developed using a rational design approach. In this approach a given metabolic network of Saccharomyces cerevisiae is divided into a several unique and not further decomposable sub networks of metabolic reactions that work coherently in steady state, so called elementary flux modes. The EFMtool algorithm was used in order to calculate the EFM’s for two Saccharomyces cerevisiae metabolic networks (amino acids supplemented metabolic network; amino acids non-supplemented metabolic network). For the supplemented metabolic network 1352172 EFM’s were calculated and then divided into: 1306854 EFM’s producing biomass, and 18582 EFM’s exclusively producing CO2 (cellular respiration). For the non-supplemented network 635 EFM’s were calculated and then divided into: 215 EFM’s producing biomass; 420 EFM’s producing exclusively CO2. The EFM’s of each group were normalized by the respective glucose consumption value. After that, the EFMs’ of the supplemented network were grouped again into: 30 clusters for the 1306854 EFMs producing biomass and, 20 clusters for the 18582 EFM’s producing CO2. For the non-supplemented metabolic network the respective EFM’s of each metabolic function were grouped into 10 clusters. After the clustering step, the concentrations of the other medium compounds were calculated by considering a reasonable glucose amount and by accounting for the proportionality between the compounds concentrations and the glucose ratios. The approach adopted/developed in this thesis may allow a faster and more economical way for media development.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In the last few years, we have observed an exponential increasing of the information systems, and parking information is one more example of them. The needs of obtaining reliable and updated information of parking slots availability are very important in the goal of traffic reduction. Also parking slot prediction is a new topic that has already started to be applied. San Francisco in America and Santander in Spain are examples of such projects carried out to obtain this kind of information. The aim of this thesis is the study and evaluation of methodologies for parking slot prediction and the integration in a web application, where all kind of users will be able to know the current parking status and also future status according to parking model predictions. The source of the data is ancillary in this work but it needs to be understood anyway to understand the parking behaviour. Actually, there are many modelling techniques used for this purpose such as time series analysis, decision trees, neural networks and clustering. In this work, the author explains the best techniques at this work, analyzes the result and points out the advantages and disadvantages of each one. The model will learn the periodic and seasonal patterns of the parking status behaviour, and with this knowledge it can predict future status values given a date. The data used comes from the Smart Park Ontinyent and it is about parking occupancy status together with timestamps and it is stored in a database. After data acquisition, data analysis and pre-processing was needed for model implementations. The first test done was with the boosting ensemble classifier, employed over a set of decision trees, created with C5.0 algorithm from a set of training samples, to assign a prediction value to each object. In addition to the predictions, this work has got measurements error that indicates the reliability of the outcome predictions being correct. The second test was done using the function fitting seasonal exponential smoothing tbats model. Finally as the last test, it has been tried a model that is actually a combination of the previous two models, just to see the result of this combination. The results were quite good for all of them, having error averages of 6.2, 6.6 and 5.4 in vacancies predictions for the three models respectively. This means from a parking of 47 places a 10% average error in parking slot predictions. This result could be even better with longer data available. In order to make this kind of information visible and reachable from everyone having a device with internet connection, a web application was made for this purpose. Beside the data displaying, this application also offers different functions to improve the task of searching for parking. The new functions, apart from parking prediction, were: - Park distances from user location. It provides all the distances to user current location to the different parks in the city. - Geocoding. The service for matching a literal description or an address to a concrete location. - Geolocation. The service for positioning the user. - Parking list panel. This is not a service neither a function, is just a better visualization and better handling of the information.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This study focuses on the implementation of several pair trading strategies across three emerging markets, with the objective of comparing the results obtained from the different strategies and assessing if pair trading benefits from a more volatile environment. The results show that, indeed, there are higher potential profits arising from emerging markets. However, the higher excess return will be partially offset by higher transaction costs, which will be a determinant factor to the profitability of pair trading strategies. Also, a new clustering approach based on the Principal Component Analysis was tested as an alternative to the more standard clustering by Industry Groups. The new clustering approach delivers promising results, consistently reducing volatility to a greater extent than the Industry Group approach, with no significant harm to the excess returns.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The aim of this work project is to analyze the current algorithm used by EDP to estimate their clients’ electrical energy consumptions, create a new algorithm and compare the advantages and disadvantages of both. This new algorithm is different from the current one as it incorporates some effects from temperature variations. The results of the comparison show that this new algorithm with temperature variables performed better than the same algorithm without temperature variables, although there is still potential for further improvements of the current algorithm, if the prediction model is estimated using a sample of daily data, which is the case of the current EDP algorithm.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Contém resumo

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Ship tracking systems allow Maritime Organizations that are concerned with the Safety at Sea to obtain information on the current location and route of merchant vessels. Thanks to Space technology in recent years the geographical coverage of the ship tracking platforms has increased significantly, from radar based near-shore traffic monitoring towards a worldwide picture of the maritime traffic situation. The long-range tracking systems currently in operations allow the storage of ship position data over many years: a valuable source of knowledge about the shipping routes between different ocean regions. The outcome of this Master project is a software prototype for the estimation of the most operated shipping route between any two geographical locations. The analysis is based on the historical ship positions acquired with long-range tracking systems. The proposed approach makes use of a Genetic Algorithm applied on a training set of relevant ship positions extracted from the long-term storage tracking database of the European Maritime Safety Agency (EMSA). The analysis of some representative shipping routes is presented and the quality of the results and their operational applications are assessed by a Maritime Safety expert.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Usually, data warehousing populating processes are data-oriented workflows composed by dozens of granular tasks that are responsible for the integration of data coming from different data sources. Specific subset of these tasks can be grouped on a collection together with their relationships in order to form higher- level constructs. Increasing task granularity allows for the generalization of processes, simplifying their views and providing methods to carry out expertise to new applications. Well-proven practices can be used to describe general solutions that use basic skeletons configured and instantiated according to a set of specific integration requirements. Patterns can be applied to ETL processes aiming to simplify not only a possible conceptual representation but also to reduce the gap that often exists between two design perspectives. In this paper, we demonstrate the feasibility and effectiveness of an ETL pattern-based approach using task clustering, analyzing a real world ETL scenario through the definitions of two commonly used clusters of tasks: a data lookup cluster and a data conciliation and integration cluster.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

When a pregnant woman is guided to a hospital for obstetrics purposes, many outcomes are possible, depending on her current conditions. An improved understanding of these conditions could provide a more direct medical approach by categorizing the different types of patients, enabling a faster response to risk situations, and therefore increasing the quality of services. In this case study, the characteristics of the patients admitted in the maternity care unit of Centro Hospitalar of Porto are acknowledged, allowing categorizing the patient women through clustering techniques. The main goal is to predict the patients’ route through the maternity care, adapting the services according to their conditions, providing the best clinical decisions and a cost-effective treatment to patients. The models developed presented very interesting results, being the best clustering evaluation index: 0.65. The evaluation of the clustering algorithms proved the viability of using clustering based data mining models to characterize pregnant patients, identifying which conditions can be used as an alert to prevent the occurrence of medical complications.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Lecture Notes in Computer Science, 9273