956 results for Data Sets
Abstract:
Dissertation presented to obtain the degree of Doctor in Electrical and Computer Engineering – Digital and Perceptional Systems at Universidade Nova de Lisboa, Faculdade de Ciências e Tecnologia
Abstract:
Hyperspectral imaging has become one of the main topics in remote sensing. Hyperspectral images comprise hundreds of spectral bands at different (almost contiguous) wavelength channels over the same area, generating large data volumes of several GBs per flight. This high spectral resolution can be used for object detection and for discriminating between different objects based on their spectral characteristics. One of the main problems involved in hyperspectral analysis is the presence of mixed pixels, which arise when the spatial resolution of the sensor is not able to separate spectrally distinct materials. Spectral unmixing is one of the most important tasks for hyperspectral data exploitation. However, unmixing algorithms can be computationally very expensive and power consuming, which compromises their use in applications under on-board constraints. In recent years, graphics processing units (GPUs) have evolved into highly parallel and programmable systems. Several hyperspectral imaging algorithms have been shown to benefit from this hardware, taking advantage of the extremely high floating-point processing performance, compact size, huge memory bandwidth, and relatively low cost of these units, which make them appealing for on-board data processing. In this paper, we propose a parallel implementation of an augmented Lagrangian based method for unsupervised hyperspectral linear unmixing on GPUs using CUDA. The method, called simplex identification via split augmented Lagrangian (SISAL), aims to identify the endmembers of a scene, i.e., it is able to unmix hyperspectral data sets in which the pure pixel assumption is violated. The efficient implementation of the SISAL method presented in this work exploits the GPU architecture at a low level, using shared memory and coalesced memory accesses.
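For readers scanning this listing, the linear mixing model that underlies this kind of unmixing, and the minimum-volume criterion optimized by SISAL (stated here in slightly simplified form, following the SISAL literature rather than this particular paper), can be written as

\[ \mathbf{y}_i = \mathbf{M}\boldsymbol{\alpha}_i + \mathbf{n}_i, \qquad \boldsymbol{\alpha}_i \ge 0, \quad \mathbf{1}^{\top}\boldsymbol{\alpha}_i = 1, \]

where y_i is the observed spectrum of pixel i, the columns of M are the endmember signatures, and α_i holds the fractional abundances. After projection onto a p-dimensional signal subspace, SISAL seeks the minimum-volume simplex enclosing the data by solving

\[ \min_{\mathbf{Q}}\; -\log\lvert\det\mathbf{Q}\rvert \;+\; \lambda \sum_{i}\sum_{k} \max\{-(\mathbf{Q}\mathbf{y}_i)_k,\,0\}, \]

subject to a constraint enforcing the sum-to-one condition on the abundances; the hinge term softens the positivity constraint, and the problem is tackled through a sequence of augmented Lagrangian (variable splitting) steps.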
Abstract:
The application of compressive sensing (CS) to hyperspectral images has been an active area of research over the past few years, both in terms of hardware and signal processing algorithms. However, CS algorithms can be computationally very expensive due to the extremely large volumes of data collected by imaging spectrometers, a fact that compromises their use in applications under real-time constraints. This paper proposes four efficient implementations of hyperspectral coded aperture (HYCA) for CS on commodity graphics processing units (GPUs): two of them, termed P-HYCA and P-HYCA-FAST, and two additional implementations for its constrained version (CHYCA), termed P-CHYCA and P-CHYCA-FAST. The HYCA algorithm exploits the high correlation existing among the spectral bands of hyperspectral data sets and the generally low number of endmembers needed to explain the data, which largely reduces the number of measurements necessary to correctly reconstruct the original data. The proposed P-HYCA and P-CHYCA implementations have been developed using the compute unified device architecture (CUDA) and the cuFFT library. Moreover, this library has been replaced by a fast iterative method in the P-HYCA-FAST and P-CHYCA-FAST implementations, which leads to very significant speedup factors and makes it possible to achieve real-time requirements. The proposed algorithms are evaluated not only in terms of reconstruction error for different compression ratios but also in terms of computational performance using two different GPU architectures by NVIDIA: 1) GeForce GTX 590 and 2) GeForce GTX TITAN. Experiments conducted using both simulated and real data reveal considerable acceleration factors and good results in the task of compressing remotely sensed hyperspectral data sets.
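As a rough illustration of the compressive-sensing setting exploited by HYCA (this is not the HYCA algorithm itself; all dimensions and names below are made up for the example), each pixel's spectrum can be compressed with a random measurement operator and recovered by least squares once a low-dimensional endmember subspace is assumed:

    import numpy as np

    rng = np.random.default_rng(0)

    bands, pixels, p = 200, 1000, 5          # spectral bands, pixels, endmembers (illustrative)
    E = rng.random((bands, p))               # assumed endmember subspace
    S = rng.dirichlet(np.ones(p), pixels).T  # abundances, summing to one per pixel
    X = E @ S                                # hyperspectral data, one column per pixel

    m = 20                                   # measurements per pixel (m << bands)
    H = rng.standard_normal((m, bands))      # random measurement operator
    Y = H @ X                                # compressed measurements

    # Recovery: exploit the low-dimensional subspace -> solve for abundances, then re-expand
    S_hat, *_ = np.linalg.lstsq(H @ E, Y, rcond=None)
    X_hat = E @ S_hat

    print("relative reconstruction error:",
          np.linalg.norm(X - X_hat) / np.linalg.norm(X))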
Abstract:
One of the main problems of hyperspectral data analysis is the presence of mixed pixels due to the low spatial resolution of such images. Linear spectral unmixing aims at inferring pure spectral signatures and their fractions at each pixel of the scene. The huge data volumes acquired by hyperspectral sensors put stringent requirements on processing and unmixing methods. This letter proposes an efficient implementation of the method called simplex identification via split augmented Lagrangian (SISAL), which exploits the graphics processing unit (GPU) architecture at a low level using the Compute Unified Device Architecture (CUDA). SISAL aims to identify the endmembers of a scene, i.e., it is able to unmix hyperspectral data sets in which the pure pixel assumption is violated. The proposed implementation works in a pixel-by-pixel fashion, using coalesced memory accesses and exploiting shared memory to store temporary data. Furthermore, the kernels have been optimized to minimize thread divergence, thereby achieving high GPU occupancy. The experimental results obtained for simulated and real hyperspectral data sets reveal speedups of up to 49 times, which demonstrates that the GPU implementation can significantly accelerate the method's execution over big data sets while maintaining the method's accuracy.
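For context only: once the endmembers have been identified (for instance by SISAL), the fractional abundances are usually estimated pixel by pixel. The sketch below uses plain unconstrained least squares followed by clipping and renormalization as a crude stand-in for a properly constrained solver; it is not the implementation described in the letter, and all names are illustrative.

    import numpy as np

    def estimate_abundances(Y, M):
        """Rough per-pixel abundance estimates for data Y (bands x pixels)
        given endmembers M (bands x p): unconstrained least squares,
        then clip to >= 0 and renormalize so fractions sum to one."""
        A, *_ = np.linalg.lstsq(M, Y, rcond=None)        # p x pixels
        A = np.clip(A, 0.0, None)
        return A / np.maximum(A.sum(axis=0, keepdims=True), 1e-12)

    # Tiny synthetic example
    rng = np.random.default_rng(1)
    M = rng.random((50, 3))                              # 50 bands, 3 endmembers
    A_true = rng.dirichlet(np.ones(3), 100).T            # 3 x 100 abundances
    Y = M @ A_true + 0.01 * rng.standard_normal((50, 100))
    A_hat = estimate_abundances(Y, M)
    print("mean absolute abundance error:", np.abs(A_hat - A_true).mean())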
Abstract:
In the present paper we compare clustering solutions using indices of paired agreement. We propose a new method, IADJUST, to correct indices of paired agreement by excluding agreement by chance. This new method overcomes previous limitations known in the literature, as it permits the correction of any index. We illustrate its use in external clustering validation, to measure the agreement between clusters and an a priori known structure. The adjusted indices are intended to provide a realistic measure of clustering performance that excludes agreement by chance with the ground truth. We use simulated data sets, under a range of scenarios considering diverse numbers of clusters, cluster overlaps, and balances, to discuss the pertinence and the precision of our proposal. Precision is established by comparison with the analytical approach to correction; the specific indices that can be corrected analytically are used for this purpose. The pertinence of the proposed correction is discussed through a detailed comparison between the performance of two classical clustering approaches, namely the Expectation-Maximization (EM) and K-Means (KM) algorithms. Eight indices of paired agreement are studied and new corrected indices are obtained.
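To make the idea of correcting paired agreement for chance concrete (a simulation-based illustration, not the authors' IADJUST procedure), the sketch below computes the plain Rand index and then adjusts it by subtracting its estimated expectation under random label permutations that preserve cluster sizes:

    import numpy as np
    from itertools import combinations

    def rand_index(a, b):
        """Proportion of object pairs on which the two partitions agree."""
        pairs = list(combinations(range(len(a)), 2))
        agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
        return agree / len(pairs)

    def chance_corrected(a, b, n_sim=200, seed=0):
        """(index - expected index under chance) / (1 - expected index),
        with the expectation estimated by permuting one labelling."""
        rng = np.random.default_rng(seed)
        idx = rand_index(a, b)
        expected = np.mean([rand_index(rng.permutation(a), b) for _ in range(n_sim)])
        return (idx - expected) / (1 - expected)

    labels_true = np.array([0, 0, 0, 1, 1, 1, 2, 2, 2])
    labels_clust = np.array([0, 0, 1, 1, 1, 1, 2, 2, 0])
    print(rand_index(labels_true, labels_clust),
          chance_corrected(labels_true, labels_clust))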
Abstract:
Dissertation presented at Universidade Nova de Lisboa, Faculdade de Ciências e Tecnologia, in fulfilment of the requirements for the Master's degree in Mathematics and Applications, specialization in Actuarial Sciences, Statistics and Operations Research
Abstract:
In this work, kriging with covariates is used to model and map the spatial distribution of salinity measurements gathered by an autonomous underwater vehicle in a sea outfall monitoring campaign, with the aim of distinguishing the effluent plume from the receiving waters and characterizing its spatial variability in the vicinity of the discharge. Four different geostatistical linear models for salinity were assumed, where the distance to the diffuser, the west-east positioning, and the south-north positioning were used as covariates. Sample variograms were fitted by Matérn models using weighted least squares and maximum likelihood estimation methods as a way to detect eventual discrepancies. Typically, the maximum likelihood method estimated very low ranges, which limited the kriging process. So, at least for these data sets, weighted least squares proved to be the most appropriate estimation method for variogram fitting. The kriged maps clearly show the spatial variation of salinity, and it is possible to identify the effluent plume in the studied area. The results obtained provide some guidelines for sewage monitoring when a geostatistical analysis of the data is intended: it is important to treat anomalous values properly and to adopt a sampling strategy that includes transects parallel and perpendicular to the effluent dispersion.
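As a sketch of the modelling framework described above (the exact specification used in the campaign may differ), kriging with covariates assumes a linear trend in the covariates plus a spatially correlated residual, e.g.

\[ S(\mathbf{x}) = \beta_0 + \beta_1\, d(\mathbf{x}) + \beta_2\, E(\mathbf{x}) + \beta_3\, N(\mathbf{x}) + \varepsilon(\mathbf{x}), \]

where S is salinity, d the distance to the diffuser, E and N the west-east and south-north coordinates, and ε a zero-mean residual with Matérn correlation

\[ \rho(h) = \frac{2^{1-\nu}}{\Gamma(\nu)} \left(\frac{h}{\phi}\right)^{\nu} K_{\nu}\!\left(\frac{h}{\phi}\right); \]

the variogram parameters (nugget, partial sill, range φ, smoothness ν) are what the weighted least squares and maximum likelihood procedures estimate.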
Abstract:
Dissertation presented to obtain the Master's degree in Geological Engineering (Georesources)
Abstract:
Dissertation submitted in partial fulfillment of the requirements for the Degree of Master of Science in Geospatial Technologies.
Abstract:
Evidence suggests that human semen quality may have been deteriorating in recent years. Most of the evidence is retrospective, based on the analysis of data sets collected for other purposes. Measures of male infertility are needed if we want to monitor the biological capacity of males to reproduce over time or between different populations. We also need these measures in analytical epidemiology if we want to identify risk indicators, risk factors, or even causes of impaired male fecundity, that is, the male component of the biological ability to reproduce. The most direct evaluation of fecundity is to measure the time it takes to conceive. Since the time of conception may be missed in the case of an early abortion, time to pregnancy is often measured as the time it takes to obtain a conception that survives until a clinically recognized pregnancy, or even until a pregnancy that ends with a live-born child. A prolonged time to pregnancy may therefore be due to a failure to conceive or to a failure to maintain a pregnancy until clinical recognition. Studies that focus on quantitative changes in fecundity (changes that do not cause sterility) should in principle be possible in a pregnancy sample. The most important limitation of fertility studies is that the design requires equal persistence in trying to become pregnant and rather similar fertility desires and family planning methods in the groups being compared. This design is probably achievable in exposure studies that make comparisons between groups that are reasonably comparable with regard to social conditions and use of contraceptive methods.
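A textbook way to formalize the time-to-pregnancy measure discussed above (a simplification, not necessarily the model the authors have in mind): if a couple has a constant per-cycle probability p of conceiving (the fecundability), the number of cycles T to conception is geometric,

\[ \Pr(T = t) = p\,(1-p)^{t-1}, \qquad t = 1, 2, \ldots, \quad \mathbb{E}[T] = 1/p, \]

so reduced fecundability shows up directly as a longer expected time to pregnancy.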
Abstract:
We present a study on human mobility at small spatial scales. Unlike large-scale mobility, recently studied through dollar-bill tracking and mobile phone data sets within one large country or continent, we report Brownian features of human mobility at smaller scales. In particular, the scaling exponent found at the smallest scales is typically close to one-half, in contrast with the larger values of the exponent characterizing mobility at larger scales. We carefully analyze 12 months of data from the Eduroam database within the Portuguese University of Minho. A full procedure is introduced with the aim of properly characterizing human mobility within the network of access points composing the wireless system of the university. In particular, measures of flux are introduced for estimating a distance between access points. This distance is typically non-Euclidean, since the spatial constraints at such small scales distort the continuum space on which human mobility occurs. Since two different exponents are found depending on the scale at which human motion takes place, we raise the question of at which scale the transition from Brownian to non-Brownian motion occurs. In this context, we discuss how the numerical approach can be extended to larger scales, using the full Eduroam network in Europe and in Asia, to uncover the transition between the two dynamical regimes.
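For reference, the exponent "close to one-half" refers to the usual diffusive scaling of displacement with time (the standard definition, not anything specific to this data set):

\[ \sqrt{\langle |\mathbf{r}(t)-\mathbf{r}(0)|^{2}\rangle} \sim t^{\alpha}, \qquad \alpha = \tfrac{1}{2} \ \text{for Brownian motion}, \quad \alpha > \tfrac{1}{2} \ \text{for the superdiffusive behaviour reported at larger scales}. \]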
Abstract:
The healthy immunoglobulin repertoire has not been extensively evaluated, reflecting in part the challenge of generating sufficiently robust data sets by conventional clonal sequencing. Deep sequencing has revolutionized the capacity to evaluate the depth and breadth of the Ig repertoire along the B-cell developmental pathway, and can be used to pinpoint defect(s) of primary or acquired B-cell associated diseases. In this study, healthy IgM and IgG repertoires were studied by 454 pyrosequencing to establish healthy controls for diseased repertoires. (...)
Impact of preoperative risk factors on morbidity after esophagectomy: is there room for improvement?
Abstract:
BACKGROUND: Despite progress in the multidisciplinary treatment of esophageal cancer, oncologic esophagectomy is still the cornerstone of therapeutic strategies. Several scoring systems are used to predict postoperative morbidity, but in most cases they identify nonmodifiable parameters. The aim of this study was to identify potentially modifiable risk factors associated with complications after oncologic esophagectomy. METHODS: All consecutive patients with complete data sets undergoing oncologic esophagectomy in our department during 2001-2011 were included in this study. As potentially modifiable risk factors we assessed nutritional status, depicted by body mass index (BMI) and preoperative serum albumin levels, excessive alcohol consumption, and active smoking. Postoperative complications were graded according to a validated 5-grade system. Univariate and multivariate analyses were used to identify preoperative risk factors associated with the occurrence and severity of complications. RESULTS: Our series included 93 patients. The overall morbidity rate was 81% (n = 75), with 56% (n = 52) minor complications and 18% (n = 17) major complications. Active smoking and excessive alcohol consumption were associated with the occurrence of severe complications, whereas BMI and low preoperative albumin levels were not. The simultaneous presence of two or more of these risk factors significantly increased the risk of postoperative complications. CONCLUSIONS: A combination of malnutrition, active smoking, and alcohol consumption was found to have a negative impact on postoperative morbidity rates. Therefore, preoperative smoking and alcohol cessation counseling, together with monitoring and improving nutritional status, are strongly recommended.
Abstract:
We examine the timing of firms' operations in a formal model of labor demand. Merging a variety of data sets from Portugal covering 1995-2004, we describe temporal patterns of firms' demand for labor and estimate production functions and relative labor-demand equations. The results demonstrate the existence of substitution of employment across times of the day and week and show that legislated penalties for work at irregular hours induce firms to alter their operating schedules. The results suggest a role for such penalties in an unregulated labor market, such as that of the United States, in which unusually large fractions of work are performed at night and on weekends.
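As a stylized illustration of what a relative labor-demand equation can look like (the paper's exact specification is not given in the abstract): if day and night labor enter production through a CES aggregator with elasticity of substitution σ, cost minimization implies

\[ \ln\!\left(\frac{N^{\text{night}}}{N^{\text{day}}}\right) = \text{const} - \sigma\,\ln\!\left(\frac{w^{\text{night}}}{w^{\text{day}}}\right), \]

so a legislated wage penalty for night work raises the relative price of night hours and shifts employment toward regular schedules.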
Abstract:
The breakdown of the Bretton Woods system and the adoption of generalized floating exchange rates ushered in a new era of exchange rate volatility and uncertainty. This increased volatility led economists to search for economic models able to describe observed exchange rate behavior. In the present paper we propose more general STAR transition functions, which encompass both threshold nonlinearity and asymmetric effects. Our framework allows for a gradual adjustment from one regime to another, and accommodates threshold effects by encompassing other existing models, such as TAR models. We apply our methodology to three different exchange rate data sets: one for developing countries, using official nominal exchange rates; the second for emerging market economies, using black market exchange rates; and the third for OECD economies.
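For context, a generic two-regime STAR specification with a logistic transition function, of which the transition functions proposed in the paper are generalizations, is

\[ y_t = \boldsymbol{\phi}_1'\mathbf{x}_t\,\bigl(1 - G(s_t;\gamma,c)\bigr) + \boldsymbol{\phi}_2'\mathbf{x}_t\,G(s_t;\gamma,c) + \varepsilon_t, \qquad G(s_t;\gamma,c) = \bigl(1 + e^{-\gamma(s_t - c)}\bigr)^{-1}, \]

where x_t contains lags of y_t, s_t is the transition variable, c the threshold, and γ the smoothness; as γ → ∞ the transition becomes abrupt and the model reduces to a TAR model.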