14 results for Dataset

in Repositório Científico do Instituto Politécnico de Lisboa - Portugal


Relevance:

20.00%

Publisher:

Abstract:

The Check Your Biosignals Here initiative (CYBHi) was developed as a way of creating a dataset and a consistently repeatable acquisition framework, to further extend research in electrocardiographic (ECG) biometrics. In particular, our work targets the novel trend towards off-the-person data acquisition, which opens a broad new set of challenges and opportunities for both research and industry. While datasets with ECG signals collected at the chest using medical-grade equipment can be easily found, for off-the-person ECG data the usual solution is for each team to collect its own corpus at considerable expense of resources. In this paper we describe the context, experimental considerations, methods, and preliminary findings of two public datasets created by our team, one for short-term and another for long-term assessment, with ECG data collected at the palms and fingers. © 2013 Elsevier Ireland Ltd. All rights reserved.

Relevance:

10.00%

Publisher:

Abstract:

Opposite enantiomers exhibit different NMR properties in the presence of a common external chiral element, and a chiral molecule exhibits different NMR properties in the presence of external enantiomeric chiral elements. Automatic prediction of such differences, and comparison with experimental values, leads to the assignment of the absolute configuration. Here two cases are reported, one using a dataset of 80 chiral secondary alcohols esterified with (R)-MTPA and the corresponding ¹H NMR chemical shifts, and the other using 94 ¹³C NMR chemical shifts of chiral secondary alcohols in two enantiomeric chiral solvents. For the first application, counterpropagation neural networks were trained to predict the sign of the difference between chemical shifts of opposite stereoisomers. The neural networks were trained to process the chirality code of the alcohol as the input, and to give the NMR property as the output. In the second application, similar neural networks were employed, but the property to predict was the difference of chemical shifts in the two enantiomeric solvents. For independent test sets of 20 objects, 100% correct predictions were obtained in both applications concerning the sign of the chemical shift differences. Additionally, with the second dataset, the difference of chemical shifts in the two enantiomeric solvents was quantitatively predicted, yielding r² = 0.936 between the predicted and experimental values for the test set.

Relevance:

10.00%

Publisher:

Abstract:

This work consists of the development of a Criminology Support System (Sistema de Apoio à Criminologia, SAC), intended to help detectives and analysts with the proactive prevention of crime and the management of their material and human resources, as well as to drive studies on the high incidence of certain types of crime in a given region. Historically, solving crimes has been a prerogative of criminal justice and its specialists. With the growing use of computer systems in the judicial system to record all data concerning crime occurrences, suspects and victims, individuals' criminal records, and other data flowing within the organization, there is a growing need to turn these data into information that is useful in fighting crime. The SAC takes advantage of techniques for extracting knowledge from information and applies them to a dataset of crime occurrences in a given region and time span, as well as to a set of variables that influence crime, which were studied and identified in this work. The work comprises a knowledge extraction model and an application that allows the user to supply a suitable dataset, ensuring the maximum effectiveness of the model.

Relevance:

10.00%

Publisher:

Abstract:

A primary tool for regional tsunami hazard assessment is a reliable historical and instrumental catalogue of events. Because of its geographical situation, with two marine sides stretching along the Atlantic coast to the west and along the Mediterranean coast to the north, Morocco is the Western African country most exposed to the risk of tsunamis. Previous information on tsunami events affecting Morocco is included in the Iberian and/or the Mediterranean lists of tsunami events, as is the case of the European GITEC Tsunami Catalogue, but there is a need to organize this information in a dataset and to assess the likelihood of claimed historical tsunamis in Morocco. Because Moroccan sources are scarce, this compilation relies on historical documentation from neighbouring countries (Portugal and Spain), and so the compatibility between the new tsunami catalogue presented here and those that correspond to the same source areas is also discussed.

Relevance:

10.00%

Publisher:

Abstract:

Background: A common task in analyzing microarray data is to determine which genes are differentially expressed across two (or more) kinds of tissue samples or samples submitted to different experimental conditions. Several statistical methods have been proposed to accomplish this goal, generally based on measures of distance between classes. It is well known that biological samples are heterogeneous because of factors such as molecular subtypes or genetic background that are often unknown to the experimenter. For instance, in experiments which involve molecular classification of tumors it is important to identify significant subtypes of cancer. Bimodal or multimodal distributions often reflect the presence of mixtures of subsamples. Consequently, there can be genes differentially expressed in sample subgroups which are missed if the usual statistical approaches are used. In this paper we propose a new graphical tool which identifies not only genes with up and down regulation, but also genes with differential expression in different subclasses, which are usually missed if current statistical methods are used. This tool is based on two measures of distance between samples, namely the overlapping coefficient (OVL) between two densities and the area under the receiver operating characteristic (ROC) curve. The methodology proposed here was implemented in the open-source R software.
Results: This method was applied to a publicly available dataset, as well as to a simulated dataset. We compared our results with the ones obtained using some of the standard methods for detecting differentially expressed genes, namely Welch t-statistic, fold change (FC), rank products (RP), average difference (AD), weighted average difference (WAD), moderated t-statistic (modT), intensity-based moderated t-statistic (ibmT), significance analysis of microarrays (samT) and area under the ROC curve (AUC). On both datasets, the differentially expressed genes with bimodal or multimodal distributions were not selected by all of the standard selection procedures. We also compared our results with (i) area between ROC curve and rising area (ABCR) and (ii) the test for not proper ROC curves (TNRC). We found our methodology more comprehensive, because it detects both bimodal and multimodal distributions and different variances can be considered in both samples. Another advantage of our method is that we can analyze graphically the behavior of different kinds of differentially expressed genes.
Conclusion: Our results indicate that the arrow plot represents a new flexible and useful tool for the analysis of gene expression profiles from microarrays.
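
To illustrate why pairing OVL with AUC can reveal genes that a single distance measure misses, here is a minimal Python sketch (the abstract states that the authors' implementation is in R, and this is not their code). The function names `auc` and `ovl`, the histogram-based OVL estimate, the bin count, and the toy expression values are all assumptions made for illustration; the abstract does not say how the densities are estimated.

```python
from typing import List, Sequence


def auc(x: Sequence[float], y: Sequence[float]) -> float:
    """Empirical AUC: fraction of (x, y) pairs where y exceeds x (ties count 1/2)."""
    wins = 0.0
    for a in x:
        for b in y:
            if b > a:
                wins += 1.0
            elif b == a:
                wins += 0.5
    return wins / (len(x) * len(y))


def ovl(x: Sequence[float], y: Sequence[float], bins: int = 10) -> float:
    """Histogram estimate of the overlapping coefficient (assumed estimator).

    Both samples are binned on a shared grid and the smaller of the two
    bin proportions is summed over all bins.
    """
    lo = min(min(x), min(y))
    hi = max(max(x), max(y))
    if hi == lo:
        return 1.0  # all values identical: the samples overlap completely
    width = (hi - lo) / bins

    def proportions(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for v in values:
            # Clamp so the maximum value falls in the last bin.
            idx = min(int((v - lo) / width), bins - 1)
            counts[idx] += 1
        return [c / len(values) for c in counts]

    return sum(min(p, q) for p, q in zip(proportions(x), proportions(y)))


# Hypothetical expression values for one gene in two groups.
control = [-1.0, 0.0, 0.0, 1.0]
shifted = [9.0, 10.0, 11.0, 12.0]    # plain up-regulation
bimodal = [-10.0, -9.0, 9.0, 10.0]   # two subgroups moving in opposite directions

print(auc(control, shifted), ovl(control, shifted))
print(auc(control, bimodal), ovl(control, bimodal))
```

In this toy data the bimodal gene has an AUC of 0.5, which on its own suggests no difference between groups, while its OVL is 0, which signals that the two distributions barely overlap. Plotting one measure against the other is what lets such genes stand apart from both unchanged and uniformly shifted genes.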

Relevance:

10.00%

Publisher:

Abstract:

Project work submitted for the degree of Master in Informatics and Computer Engineering (Engenharia Informática e de Computadores)

Relevance:

10.00%

Publisher:

Abstract:

In the last decade, local image features have been widely used in robot visual localization. To assess image similarity, a strategy exploiting these features compares raw descriptors extracted from the current image to those in the models of places. This paper addresses the ensuing step in this process, where a combining function must be used to aggregate results and assign each place a score. Casting the problem in the multiple classifier systems framework, we compare several candidate combiners with respect to their performance in the visual localization task. A deeper insight into the potential of the sum and product combiners is provided by testing two extensions of these algebraic rules: threshold and weighted modifications. In addition, a voting method, previously used in robot visual localization, is assessed. All combiners are tested on a visual localization task, carried out on a public dataset. It is experimentally demonstrated that the sum rule extensions globally achieve the best performance. The voting method, whilst competitive with the algebraic rules in their standard form, is shown to be outperformed by both their modified versions.

Relevance:

10.00%

Publisher:

Abstract:

The MCNPX code was used to calculate the TG-43U1 recommended parameters in water and prostate tissue in order to quantify the dosimetric impact in 30 patients treated with ¹²⁵I prostate implants when replacing the TG-43U1 formalism parameters calculated in water by a prostate-like medium in the planning system (PS) and to evaluate the uncertainties associated with Monte Carlo (MC) calculations. The prostate density was obtained from the CT of 100 patients with prostate cancer. The deviations between our results for water and the TG-43U1 consensus dataset values were -2.6% for prostate V100, -13.0% for V150, and -5.8% for D90; -2.0% for rectum V100, and -5.1% for D0.1; -5.0% for urethra D10, and -5.1% for D30. The same differences between our water and prostate results were all under 0.3%. Uncertainty estimates were up to 2.9% for the gL(r) function, 13.4% for the F(r,θ) function and 7.0% for Λ, mainly due to seed geometry uncertainties. Uncertainties in extracting the TG-43U1 parameters in the MC simulations as well as in the literature comparison are of the same order of magnitude as the differences between dose distributions computed for water and prostate-like medium. The selection of the parameters for the PS should be done carefully, as it may considerably affect the dose distributions. The uncertainties in the internal geometry of the seeds are a major limiting factor in deducing the MC parameters.

Relevance:

10.00%

Publisher:

Abstract:

Coastal low-level jets (CLLJ) are a low-tropospheric wind feature driven by the pressure gradient produced by a sharp contrast between high temperatures over land and lower temperatures over the sea. This contrast between the cold ocean and the warm land in the summer is intensified by the impact of the coast-parallel winds on the ocean, which generates upwelling currents, sharpens the temperature gradient close to the coast and gives rise to strong baroclinic structures at the coast. During summertime, the Iberian Peninsula is often under the effect of the Azores High and of a thermal low pressure system inland, leading to a seasonal wind on the west coast called the Nortada (northerly wind). This study presents a regional climatology of the CLLJ off the west coast of the Iberian Peninsula, based on a 9 km resolution downscaling dataset, produced using the Weather Research and Forecasting (WRF) mesoscale model, forced by 19 years of ERA-Interim reanalysis (1989-2007). The simulation results show that the hourly frequency of occurrence of the jet in the summer is above 30% and decreases to about 10% during spring and autumn. The monthly frequencies of occurrence can reach higher values, around 40% in summer months, and reveal large inter-annual variability in all three seasons. In the summer, on a daily basis, the CLLJ is present on almost 70% of the days. The CLLJ wind direction is mostly north-northeasterly, and the jet occurs more persistently in three areas where the interaction of the jet flow with local capes and headlands is more pronounced. The coastal jets in this area occur at heights between 300 and 400 m, and their mean speed is around 15 m/s, with maximum speeds reaching 25 m/s.

Relevance:

10.00%

Publisher:

Abstract:

In research on Silent Speech Interfaces (SSI), different sources of information (modalities) have been combined, aiming at obtaining better performance than the individual modalities. However, when combining these modalities, the dimensionality of the feature space rapidly increases, yielding the well-known "curse of dimensionality". As a consequence, in order to extract useful information from this data, one has to resort to feature selection (FS) techniques to lower the dimensionality of the learning space. In this paper, we assess the impact of FS techniques for silent speech data, in a dataset with 4 non-invasive and promising modalities, namely: video, depth, ultrasonic Doppler sensing, and surface electromyography. We consider two supervised (mutual information and Fisher's ratio) and two unsupervised (mean-median and arithmetic mean geometric mean) FS filters. The evaluation was made by assessing the classification accuracy (word recognition error) of three well-known classifiers (k-nearest neighbors, support vector machines, and dynamic time warping). The key results of this study show that both unsupervised and supervised FS techniques improve the classification accuracy on both individual and combined modalities. For instance, on the video component, we attain relative performance gains of 36.2% in error rates. FS is also useful as pre-processing for feature fusion. Copyright © 2014 ISCA.
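
As a rough illustration of how a supervised FS filter of the kind mentioned above ranks features, the following Python sketch scores each feature with one common two-class form of Fisher's ratio, (μ₁ − μ₂)² / (σ₁² + σ₂²), and keeps the top-scoring ones. The abstract does not give the exact formulation used in the paper, so the formula, the function names `fisher_ratio` and `select_features`, and the toy data are assumptions.

```python
from statistics import fmean, pvariance
from typing import List, Sequence


def fisher_ratio(a: Sequence[float], b: Sequence[float]) -> float:
    """Two-class Fisher's ratio (assumed form): squared mean gap over summed variances."""
    gap = (fmean(a) - fmean(b)) ** 2
    spread = pvariance(a) + pvariance(b)
    if spread == 0.0:
        # Degenerate case: no within-class variation at all.
        return float("inf") if gap > 0.0 else 0.0
    return gap / spread


def select_features(rows: Sequence[Sequence[float]],
                    labels: Sequence[int],
                    keep: int) -> List[int]:
    """Return the indices of the `keep` features with the highest Fisher's ratio.

    `rows` holds one feature vector per sample; `labels` holds 0 or 1 per sample.
    """
    n_features = len(rows[0])
    scores = []
    for j in range(n_features):
        class0 = [r[j] for r, y in zip(rows, labels) if y == 0]
        class1 = [r[j] for r, y in zip(rows, labels) if y == 1]
        scores.append(fisher_ratio(class0, class1))
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return ranked[:keep]


# Toy data: feature 0 separates the classes, feature 1 does not.
rows = [[0.0, 5.0], [1.0, 4.0], [10.0, 5.0], [11.0, 4.0]]
labels = [0, 0, 1, 1]
print(select_features(rows, labels, keep=1))
```

A filter like this looks at each feature independently of the classifier, which is why it can be applied as a pre-processing step before any of the classifiers listed in the abstract.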

Relevance:

10.00%

Publisher:

Abstract:

A new dataset of daily gridded observations of precipitation, computed from over 400 stations in Portugal, is used to assess the performance of 12 regional climate models at 25 km resolution, from the ENSEMBLES set, all forced by ERA-40 boundary conditions, for the 1961-2000 period. Standard point error statistics, calculated from grid point and basin aggregated data, and precipitation related climate indices are used to analyze the performance of the different models in representing the main spatial and temporal features of the regional climate, and its extreme events. As a whole, the ENSEMBLES models are found to achieve a good representation of those features, with good spatial correlations with observations. There is a small but relevant negative bias in precipitation, especially in the driest months, leading to systematic errors in related climate indices. The underprediction of precipitation occurs in most percentiles, although this deficiency is partially corrected at the basin level. Interestingly, some of the conclusions concerning the performance of the models are different from what has been found for the contiguous territory of Spain; in particular, ENSEMBLES models appear too dry over Portugal and too wet over Spain. Finally, models behave quite differently in the simulation of some important aspects of local climate, from the mean climatology to high precipitation regimes in localized mountain ranges and in the subsequent drier regions.

Relevance:

10.00%

Publisher:

Abstract:

In the last decade, local image features have been widely used in robot visual localization. In order to assess image similarity, a strategy exploiting these features compares raw descriptors extracted from the current image with those in the models of places. This paper addresses the ensuing step in this process, where a combining function must be used to aggregate results and assign each place a score. Casting the problem in the multiple classifier systems framework, in this paper we compare several candidate combiners with respect to their performance in the visual localization task. For this evaluation, we selected the most popular methods in the class of non-trained combiners, namely the sum rule and product rule. A deeper insight into the potential of these combiners is provided through a discriminativity analysis involving the algebraic rules and two extensions of these methods: the threshold, as well as the weighted modifications. In addition, a voting method, previously used in robot visual localization, is assessed. Furthermore, we address the process of constructing a model of the environment by describing how the model granularity impacts upon performance. All combiners are tested on a visual localization task, carried out on a public dataset. It is experimentally demonstrated that the sum rule extensions globally achieve the best performance, confirming the general agreement on the robustness of this rule in other classification problems. The voting method, whilst competitive with the product rule in its standard form, is shown to be outperformed by its modified versions.
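
The non-trained combiners named above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: each local feature is treated as a classifier that outputs one score per place, and the score matrix, the function names, and the toy values are hypothetical. The threshold and weighted extensions are not shown because the abstract does not define them.

```python
from collections import Counter
from math import prod
from typing import List, Sequence

Scores = Sequence[Sequence[float]]  # one row per local feature, one column per place


def sum_rule(scores: Scores) -> List[float]:
    """Sum rule: add the per-feature scores for each place."""
    return [sum(column) for column in zip(*scores)]


def product_rule(scores: Scores) -> List[float]:
    """Product rule: multiply the per-feature scores for each place."""
    return [prod(column) for column in zip(*scores)]


def vote(scores: Scores) -> List[int]:
    """Voting: each feature casts one vote for its highest-scoring place."""
    ballots = Counter(max(range(len(row)), key=row.__getitem__) for row in scores)
    return [ballots.get(place, 0) for place in range(len(scores[0]))]


def best_place(combined: Sequence[float]) -> int:
    """Index of the place with the highest combined score."""
    return max(range(len(combined)), key=combined.__getitem__)


# Hypothetical scores from three features for two places.
scores = [
    [0.60, 0.40],   # weak preference for place 0
    [0.60, 0.40],   # weak preference for place 0
    [0.05, 0.95],   # strong preference for place 1
]
print(best_place(sum_rule(scores)))      # place chosen by the sum rule
print(best_place(product_rule(scores)))  # place chosen by the product rule
print(vote(scores))                      # vote counts per place
```

With these toy scores, voting follows the two weak preferences and picks place 0, whereas the sum and product rules let the single confident feature outweigh them and pick place 1. Which behaviour is preferable depends on how reliable the per-feature scores are, which is the kind of question the comparison in the abstract addresses.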

Relevance:

10.00%

Publisher:

Abstract:

Remote hyperspectral sensors collect large amounts of data per flight, usually with low spatial resolution. It is known that the bandwidth of the connection between the satellite/airborne platform and the ground station is limited, thus an onboard compression method is desirable to reduce the amount of data to be transmitted. This paper presents a parallel implementation of a compressive sensing method, called parallel hyperspectral coded aperture (P-HYCA), for graphics processing units (GPU) using the compute unified device architecture (CUDA). This method takes into account two main properties of hyperspectral datasets, namely the high correlation existing among the spectral bands and the generally low number of endmembers needed to explain the data, which largely reduces the number of measurements necessary to correctly reconstruct the original data. Experimental results obtained using synthetic and real hyperspectral datasets on two different GPU architectures by NVIDIA, GeForce GTX 590 and GeForce GTX TITAN, reveal that the use of GPUs can provide real-time compressive sensing performance. The achieved speedup is up to 20 times when compared with the processing time of HYCA running on one core of the Intel i7-2600 CPU (3.4 GHz), with 16 Gbyte memory.

Relevance:

10.00%

Publisher:

Abstract:

The capability to anticipate a contact with another device can greatly improve the performance and user satisfaction not only of mobile social network applications but of any other application relying on some form of data harvesting or hoarding. One of the most promising approaches for contact prediction is to extrapolate from past experiences. This paper investigates the recurring contact patterns observed between groups of devices using an 8-year dataset of wireless access logs produced by more than 70,000 devices. This effort made it possible to model the probabilities of occurrence of a contact at a predefined date between groups of devices using a power law distribution that varies according to neighbourhood size and recurrence period. In the general case, the model can be used by applications that need to disseminate large datasets across groups of devices. As an example, the paper presents and evaluates an algorithm that provides daily contact predictions, based on the history of past pairwise contacts and their duration. Copyright © 2015 ICST.