905 results for partial least-squares regression
Abstract:
The development of high spatial resolution airborne and spaceborne sensors has improved the capability of ground-based data collection in the fields of agriculture, geography, geology, mineral identification, detection [2, 3], and classification [4–8]. The signal read by the sensor from a given spatial resolution element and at a given spectral band is a mixture of components originating from the constituent substances, termed endmembers, located in that resolution element. This chapter addresses hyperspectral unmixing, which is the decomposition of the pixel spectra into a collection of constituent spectra, or spectral signatures, and their corresponding fractional abundances indicating the proportion of each endmember present in the pixel [9, 10]. Depending on the mixing scales at each pixel, the observed mixture is either linear or nonlinear [11, 12]. The linear mixing model holds when the mixing scale is macroscopic [13]. The nonlinear model holds when the mixing scale is microscopic (i.e., intimate mixtures) [14, 15]. The linear model assumes negligible interaction among distinct endmembers [16, 17]. The nonlinear model assumes that incident solar radiation is scattered by the scene through multiple bounces involving several endmembers [18]. Under the linear mixing model, and assuming that the number of endmembers and their spectral signatures are known, hyperspectral unmixing is a linear problem, which can be addressed, for example, by the maximum likelihood setup [19], the constrained least-squares approach [20], spectral signature matching [21], the spectral angle mapper [22], and subspace projection methods [20, 23, 24]. Orthogonal subspace projection [23] reduces the data dimensionality, suppresses undesired spectral signatures, and detects the presence of a spectral signature of interest. The basic concept is to project each pixel onto a subspace that is orthogonal to the undesired signatures.
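As a minimal illustration of the linear mixing model and the constrained least-squares approach mentioned above (with made-up endmember signatures, not data from the chapter), the sum-to-one constraint can be handled in closed form via a Lagrange multiplier:

```python
import numpy as np

# Toy example: 3 hypothetical endmember signatures over 5 bands.
M = np.array([[0.9, 0.1, 0.2],
              [0.8, 0.2, 0.3],
              [0.4, 0.6, 0.3],
              [0.2, 0.8, 0.4],
              [0.1, 0.9, 0.5]])
a_true = np.array([0.6, 0.3, 0.1])        # abundances sum to one
x = M @ a_true                             # noise-free mixed pixel

# Unconstrained least-squares estimate of the abundances.
a_ls, *_ = np.linalg.lstsq(M, x, rcond=None)

# Sum-to-one constrained least squares via a Lagrange multiplier:
# minimize ||x - M a||^2 subject to 1^T a = 1.
G = M.T @ M
Ginv = np.linalg.inv(G)
ones = np.ones(3)
lam = (ones @ Ginv @ M.T @ x - 1) / (ones @ Ginv @ ones)
a_cls = Ginv @ (M.T @ x - lam * ones)

print(np.round(a_ls, 6))    # both recover a_true on noise-free data
print(np.round(a_cls, 6))
```

With noisy pixels the two estimates differ, and the constrained solution keeps the abundances full-additive.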
As shown by Settle [19], the orthogonal subspace projection technique is equivalent to the maximum likelihood estimator. This projection technique was extended by three unconstrained least-squares approaches [24] (signature space orthogonal projection, oblique subspace projection, and target signature space orthogonal projection). Other works using the maximum a posteriori probability (MAP) framework [25] and projection pursuit [26, 27] have also been applied to hyperspectral data. In most cases, the number of endmembers and their signatures are not known. Independent component analysis (ICA) is an unsupervised source separation method that has been applied with success to blind source separation, feature extraction, and unsupervised recognition [28, 29]. ICA consists of finding a linear decomposition of the observed data that yields statistically independent components. Given that hyperspectral data are, under given circumstances, linear mixtures, ICA comes to mind as a possible tool to unmix this class of data. In fact, the application of ICA to hyperspectral data has been proposed in reference 30, where endmember signatures are treated as sources and the mixing matrix is composed of the abundance fractions, and in references 9, 25, and 31–38, where the sources are the abundance fractions of each endmember. In the first approach, we face two problems: (1) the number of samples is limited to the number of channels, and (2) the process of pixel selection, playing the role of mixed sources, is not straightforward. In the second approach, ICA is based on the assumption of mutually independent sources, which is not the case for hyperspectral data, since the sum of the abundance fractions is constant, implying dependence among abundances. This dependence compromises ICA applicability to hyperspectral images. In addition, hyperspectral data are immersed in noise, which degrades the ICA performance.
IFA [39] was introduced as a method for recovering independent hidden sources from their observed noisy mixtures. IFA implements two steps. First, source densities and noise covariance are estimated from the observed data by maximum likelihood. Second, sources are reconstructed by an optimal nonlinear estimator. Although IFA is a well-suited technique to unmix independent sources under noisy observations, the dependence among abundance fractions in hyperspectral imagery compromises, as in the ICA case, the IFA performance. Under the linear mixing model, hyperspectral observations lie in a simplex whose vertices correspond to the endmembers. Several approaches [40–43] have exploited this geometric feature of hyperspectral mixtures [42]. The minimum volume transform (MVT) algorithm [43] determines the simplex of minimum volume containing the data. MVT-type approaches are complex from the computational point of view. Usually, these algorithms first find the convex hull defined by the observed data and then fit a minimum volume simplex to it. Aiming at lower computational complexity, algorithms such as vertex component analysis (VCA) [44], the pixel purity index (PPI) [42], and N-FINDR [45] still find the minimum volume simplex containing the data cloud, but they assume the presence in the data of at least one pure pixel of each endmember. This is a strong requirement that may not hold in some data sets. In any case, these algorithms find the set of purest pixels in the data. Hyperspectral sensors collect spatial images over many narrow contiguous bands, yielding large amounts of data. For this reason, the processing of hyperspectral data, including unmixing, is very often preceded by a dimensionality reduction step to reduce computational complexity and to improve the signal-to-noise ratio (SNR).
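The pure-pixel assumption behind PPI-type algorithms can be sketched as follows (a toy two-band example with fabricated data; real implementations work on full hyperspectral cubes): pixels that are repeatedly extreme under random one-dimensional projections ("skewers") are candidate endmembers.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 2-band data: two pure pixels plus random convex mixtures.
E = np.array([[1.0, 0.0],
              [0.0, 1.0]])                 # endmembers as rows
w = rng.uniform(0.05, 0.95, size=(200, 1))
mixed = w * E[0] + (1 - w) * E[1]
data = np.vstack([E, mixed])               # pure pixels sit at the simplex vertices

# PPI-style skewer test: count how often each pixel is the extreme of a
# random 1-D projection of the data cloud.
counts = np.zeros(len(data), dtype=int)
for _ in range(500):
    skewer = rng.normal(size=2)
    proj = data @ skewer
    counts[np.argmax(proj)] += 1
    counts[np.argmin(proj)] += 1

top2 = np.argsort(counts)[-2:]
print(sorted(top2))   # indices 0 and 1: the pure pixels
```

Since the mixtures lie strictly inside the segment between the two endmembers, every projection is maximized or minimized at a vertex, so the counts concentrate on the pure pixels.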
Principal component analysis (PCA) [46], maximum noise fraction (MNF) [47], and singular value decomposition (SVD) [48] are three well-known projection techniques widely used in remote sensing in general and in unmixing in particular. The newly introduced method [49] exploits the structure of hyperspectral mixtures, namely the fact that spectral vectors are nonnegative. The computational complexity associated with these techniques is an obstacle to real-time implementations. To overcome this problem, band selection [50] and non-statistical [51] algorithms have been introduced. This chapter addresses hyperspectral data source dependence and its impact on ICA and IFA performances. The study considers simulated and real data and is based on mutual information minimization. Hyperspectral observations are described by a generative model. This model takes into account the degradation mechanisms normally found in hyperspectral applications, namely signature variability [52–54], abundance constraints, topography modulation, and system noise. The computation of mutual information is based on fitting mixtures of Gaussians (MOG) to the data. The MOG parameters (number of components, means, covariances, and weights) are inferred using a minimum description length (MDL)-based algorithm [55]. We study the behavior of the mutual information as a function of the unmixing matrix. The conclusion is that the unmixing matrix minimizing the mutual information might be very far from the true one. Nevertheless, some abundance fractions might be well separated, mainly in the presence of strong signature variability, a large number of endmembers, and high SNR. We end this chapter by sketching a new methodology to blindly unmix hyperspectral data, where abundance fractions are modeled as a mixture of Dirichlet sources. This model enforces the positivity and constant-sum (full additivity) constraints on the sources. The mixing matrix is inferred by an expectation-maximization (EM)-type algorithm.
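A minimal sketch of the dimensionality reduction step via PCA/SVD, on synthetic data generated from three hypothetical endmembers (not the chapter's data):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical scene: 1000 pixel spectra over 50 bands generated from
# 3 endmembers plus low-power noise, so the data are nearly 3-dimensional.
A = rng.uniform(size=(1000, 3))          # abundances (unconstrained here)
S = rng.uniform(size=(3, 50))            # endmember signatures
X = A @ S + 0.01 * rng.normal(size=(1000, 50))

# PCA via SVD of the mean-centred data: project onto the first k components.
k = 3                                    # number of endmembers, assumed known
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
X_reduced = Xc @ Vt[:k].T                # 1000 x 3 instead of 1000 x 50
explained = np.sum(s[:k] ** 2) / np.sum(s ** 2)
print(X_reduced.shape, round(float(explained), 4))
```

The three leading components capture nearly all the variance here, which is why unmixing pipelines can safely work in the reduced space.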
This approach is in the vein of references 39 and 56, replacing the independent sources represented by MOG with a mixture of Dirichlet sources. Compared with the geometric-based approaches, the advantage of this model is that there is no need for pure pixels in the observations. The chapter is organized as follows. Section 6.2 presents a spectral radiance model and formulates spectral unmixing as a linear problem accounting for abundance constraints, signature variability, topography modulation, and system noise. Section 6.3 presents a brief review of the ICA and IFA algorithms. Section 6.4 illustrates the performance of IFA and of some well-known ICA algorithms with experimental data. Section 6.5 studies the ICA and IFA limitations in unmixing hyperspectral data. Section 6.6 presents results of ICA based on real data. Section 6.7 describes the new blind unmixing scheme and some illustrative examples. Section 6.8 concludes with some remarks.
Abstract:
The mathematical model of a real system provides knowledge of its dynamic behavior and is widely used in engineering problems. Sometimes the parameters used by the model are unknown or inaccurate. Aging and wear of the material are factors to take into account, since they can cause changes in the behavior of the real system, making it necessary to re-estimate its parameters. To solve this problem, software developed by MathWorks, namely Matlab and Simulink, is used together with the Arduino platform, whose hardware is open-source. From data obtained from the real system, curve fitting by the least-squares method is applied in order to bring the simulated model closer to the model of the real system. The developed system allows new parameter values to be obtained in a simple and effective way, with a view to a better approximation of the real system under study. The solution found is validated using different input signals applied to the system, and its results are compared with the results of the new model obtained. The performance of the solution is evaluated through the sum of squared errors between results obtained by simulation and results obtained experimentally from the real system.
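The thesis uses Matlab/Simulink; as a language-neutral sketch of the underlying idea, hypothetical input/output data from a first-order discrete-time model can be fitted by least squares to re-estimate the parameters:

```python
import numpy as np

# Hypothetical first-order discrete model y[k+1] = a*y[k] + b*u[k]
# with "true" parameters a = 0.9, b = 0.5, driven by a step input.
a_true, b_true = 0.9, 0.5
u = np.ones(100)
y = np.zeros(101)
for k in range(100):
    y[k + 1] = a_true * y[k] + b_true * u[k]

# Least-squares estimate of (a, b) from the input/output data:
# stack the regressors [y[k], u[k]] and solve in the least-squares sense.
Phi = np.column_stack([y[:-1], u])
theta, *_ = np.linalg.lstsq(Phi, y[1:], rcond=None)
a_hat, b_hat = theta
print(round(a_hat, 6), round(b_hat, 6))   # recovers 0.9 and 0.5
```

On noise-free data the fit is exact; with measured data the residual sum of squares plays the role of the validation metric described in the abstract.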
Abstract:
4th International Conference on Future Generation Communication Technologies (FGCT 2015), Luton, United Kingdom.
Abstract:
In this work, kriging with covariates is used to model and map the spatial distribution of salinity measurements gathered by an autonomous underwater vehicle in a sea outfall monitoring campaign, aiming to distinguish the effluent plume from the receiving waters and to characterize its spatial variability in the vicinity of the discharge. Four different geostatistical linear models for salinity were assumed, where the distance to the diffuser, the west-east positioning, and the south-north positioning were used as covariates. Sample variograms were fitted by Matérn models using weighted least squares and maximum likelihood estimation methods as a way to detect eventual discrepancies. Typically, the maximum likelihood method estimated very low ranges, which limited the kriging process. So, at least for these data sets, weighted least squares proved to be the most appropriate estimation method for variogram fitting. The kriged maps clearly show the spatial variation of salinity, and it is possible to identify the effluent plume in the studied area. The results obtained suggest some guidelines for sewage monitoring when a geostatistical analysis of the data is intended. It is important to treat properly the existence of anomalous values and to adopt a sampling strategy that includes transects parallel and perpendicular to the effluent dispersion.
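A minimal sketch of weighted-least-squares variogram fitting, on a fabricated sample variogram (the study fits Matérn models, of which the exponential model below is the smoothness-1/2 special case):

```python
import numpy as np

# Hypothetical sample variogram: lags h, semivariances gamma, bin counts n.
h = np.array([50.0, 100.0, 150.0, 200.0, 300.0, 400.0])
n = np.array([120, 110, 95, 80, 60, 40])
sill_true, rng_true = 2.0, 150.0
gamma = sill_true * (1 - np.exp(-h / rng_true))   # noise-free "observations"

# Exponential model (Matern with smoothness 1/2):
#   gamma(h) = sill * (1 - exp(-h / range))
# WLS fit: weights proportional to bin counts, grid search over the range,
# closed-form sill (a weighted slope through the origin) for each candidate.
best = None
for r in np.linspace(10, 500, 4901):
    f = 1 - np.exp(-h / r)
    sill = np.sum(n * f * gamma) / np.sum(n * f * f)
    sse = np.sum(n * (gamma - sill * f) ** 2)
    if best is None or sse < best[0]:
        best = (sse, sill, r)
_, sill_hat, rng_hat = best
print(round(sill_hat, 3), round(rng_hat, 1))   # recovers 2.0 and 150.0
```

A very low fitted range, as the abstract reports for maximum likelihood, would make the model forget neighboring observations almost immediately, which is what limits the kriging step.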
Abstract:
To determine whether the slope of a maximal bronchial challenge test (in which FEV1 falls by over 50%) could be extrapolated from a standard bronchial challenge test (in which FEV1 falls up to 20%), 14 asthmatic children performed a single maximal bronchial challenge test with methacholine (dose range: 0.097–30.08 μmol) by the dosimeter method. Maximal dose-response curves were included according to the following criteria: (1) at least one more dose beyond a fall in FEV1 ≥ 20%; and (2) a maximal fall in FEV1 ≥ 50%. PD20 FEV1 was calculated, and the slopes of the early part of the dose-response curve (standard dose-response slopes) and of the entire curve (maximal dose-response slopes) were calculated by two methods: the two-point slope (DRR) and the least squares method (LSS), in %FEV1 · μmol⁻¹. Maximal dose-response slopes were compared with the corresponding standard dose-response slopes by a paired Student's t test after logarithmic transformation of the data; the goodness of fit of the LSS was also determined. Maximal dose-response slopes were significantly different (p < 0.0001) from those calculated on the early part of the curve: DRR20% (91.2 ± 2.7 %FEV1 · μmol⁻¹) was 2.88 times higher than DRR50% (31.6 ± 3.4 %FEV1 · μmol⁻¹), and LSS20% (89.1 ± 2.8 %FEV1 · μmol⁻¹) was 3.10 times higher than LSS50% (28.8 ± 1.5 %FEV1 · μmol⁻¹). The goodness of fit of LSS50% was significant in all cases, whereas LSS20% failed to be significant in one. These results suggest that maximal dose-response slopes cannot be predicted from the data of standard bronchial challenge tests.
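A sketch of the two slope estimators on hypothetical dose-response data (not the study's measurements): the two-point slope (DRR) uses only the last point, while the least-squares slope (LSS) fits the whole curve through the origin.

```python
# Hypothetical dose-response data: cumulative methacholine dose (umol)
# versus the percentage fall in FEV1.
doses = [0.1, 0.4, 1.6, 6.4, 25.6]
fall = [2.0, 5.0, 12.0, 28.0, 55.0]

# Two-point dose-response ratio (DRR): last response over last dose.
drr = fall[-1] / doses[-1]

# Least-squares slope (LSS) through the origin: minimizes
# sum((fall - slope * dose)^2) over all points on the curve.
lss = sum(d * f for d, f in zip(doses, fall)) / sum(d * d for d in doses)

print(round(drr, 3), round(lss, 3))   # 2.148 and 2.301
```

The two estimators disagree whenever the curve bends, which is exactly why slopes fitted on the early part of the curve need not extrapolate to the maximal response.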
Abstract:
Dissertation presented as a partial requirement for obtaining the degree of Master in Statistics and Information Management
Abstract:
Geographic information systems give us the possibility to analyze, produce, and edit geographic information. However, these systems fall short on the analysis and support of complex spatial problems. Therefore, when a spatial problem, like land use management, requires a multi-criteria perspective, multi-criteria decision analysis is embedded into spatial decision support systems. The analytic hierarchy process is one of many multi-criteria decision analysis methods that can be used to support these complex problems. Using its capabilities, we develop a spatial decision support system to help land use management. Land use management can undertake a broad spectrum of spatial decision problems. The developed decision support system had to accept various formats and types of data as input, in raster or vector format, where the vector data could be of polygon, line, or point type. The support system was designed to perform its analysis for the Zambezi River Valley in Mozambique, the study area. The possible solutions for the emerging problems had to cover the entire region. This required the system to process large sets of data and to constantly adjust to the needs of new problems. The developed decision support system is able to process thousands of alternatives using the analytic hierarchy process and to produce an output suitability map for the problems faced.
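A minimal sketch of the analytic hierarchy process step, using a hypothetical 3-criteria pairwise comparison matrix and the common geometric-mean approximation to the principal eigenvector:

```python
import math

# Hypothetical AHP pairwise-comparison matrix for three land-use criteria
# (e.g. slope, soil quality, road access) on Saaty's 1-9 scale.
A = [[1.0, 3.0, 5.0],
     [1 / 3.0, 1.0, 3.0],
     [1 / 5.0, 1 / 3.0, 1.0]]

# Priority weights by the geometric-mean (approximate eigenvector) method.
gm = [math.prod(row) ** (1 / len(row)) for row in A]
weights = [g / sum(gm) for g in gm]

# Consistency check: estimate lambda_max from A @ w and compare against n.
n = len(A)
Aw = [sum(a * w for a, w in zip(row, weights)) for row in A]
lambda_max = sum(aw / w for aw, w in zip(Aw, weights)) / n
CI = (lambda_max - n) / (n - 1)
CR = CI / 0.58          # 0.58 = Saaty's random index for n = 3
print([round(w, 3) for w in weights], round(CR, 3))
```

A consistency ratio below 0.1 is the usual threshold for accepting the judgments; the weights then score each map alternative in the suitability analysis.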
Abstract:
This study assesses the quality of Cybersecurity as a service provided by the IT department in a corporate network and provides an analysis of the service quality impact on the user, seen as a consumer of the service, as well as on the organization. In order to evaluate the quality of this service, the multi-item SERVQUAL instrument was used for measuring consumer perceptions of service quality. To provide insights about the Cybersecurity service quality impact, the DeLone and McLean information systems success model was used. To test this approach, data were collected from over one hundred users from different industries, and partial least squares (PLS) was used to estimate the research model. This study found that SERVQUAL is adequate to assess Cybersecurity service quality, and also that Cybersecurity service quality positively influences Cybersecurity use and individual impact in Cybersecurity.
Abstract:
Propolis is a chemically complex biomass produced by honeybees (Apis mellifera) from plant resins with the addition of salivary enzymes, beeswax, and pollen. The biological activities described for propolis have also been identified for the donor plants' resins, but a major challenge for the standardization of the chemical composition and biological effects of propolis remains a better understanding of the influence of seasonality on the chemical constituents of that raw material. Since propolis quality depends, among other variables, on the local flora, which is strongly influenced by (a)biotic factors over the seasons, unraveling the effect of harvest season on the propolis chemical profile is an issue of recognized importance. Fast, cheap, and robust analytical techniques seem to be the best choice for large-scale quality control processes in the most demanding markets, e.g., human health applications. For that, UV-Visible (UV-Vis) scanning spectrophotometry of hydroalcoholic extracts (HE) of seventy-three propolis samples, collected over the seasons in 2014 (summer, spring, autumn, and winter) and 2015 (summer and autumn) in Southern Brazil, was adopted. Further machine learning and chemometric techniques were applied to the UV-Vis dataset, aiming to gain insights into the effect of seasonality on the claimed chemical heterogeneity of propolis samples determined by changes in the flora of the geographic region under study. Descriptive and classification models were built following a chemometric approach, i.e., principal component analysis (PCA) and hierarchical clustering analysis (HCA), supported by scripts written in the R language. The UV-Vis profiles associated with chemometric analysis allowed identifying a typical pattern in propolis samples collected in the summer.
Importantly, the discrimination based on PCA could be improved by using the dataset of the fingerprint region of phenolic compounds (λ = 280–400 nm), suggesting that besides the biological activities of those secondary metabolites, they also play a relevant role in the discrimination and classification of that complex matrix through bioinformatics tools. Finally, a series of machine learning approaches, e.g., partial least squares discriminant analysis (PLS-DA), k-nearest neighbors (kNN), and decision trees, proved complementary to PCA and HCA, allowing relevant information on sample discrimination to be obtained.
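As a toy illustration of one of the classifiers named above, a kNN sketch on fabricated two-feature spectral summaries (the study works on full UV-Vis spectra in R):

```python
import math
from collections import Counter

# Hypothetical 2-feature summaries of UV-Vis spectra (e.g. absorbance at
# two wavelengths) labelled by harvest season.
train = [((0.9, 0.2), "summer"), ((0.8, 0.3), "summer"), ((0.7, 0.2), "summer"),
         ((0.2, 0.8), "winter"), ((0.3, 0.9), "winter"), ((0.2, 0.7), "winter")]

def knn_predict(x, train, k=3):
    """Classify x by majority vote among its k nearest training samples."""
    nearest = sorted(train, key=lambda s: math.dist(x, s[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

print(knn_predict((0.75, 0.25), train))   # prints "summer"
print(knn_predict((0.25, 0.85), train))   # prints "winter"
```

kNN makes no distributional assumption, which is why it complements the linear projections of PCA and PLS-DA in this kind of screening.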
Abstract:
Olive oil quality grading is traditionally assessed by human sensory evaluation of positive and negative attributes (olfactory, gustatory, and final olfactory-gustatory sensations). However, it is not guaranteed that trained panelists can correctly classify monovarietal extra-virgin olive oils according to olive cultivar. In this work, the potential application of human (sensory panelists) and artificial (electronic tongue) sensory evaluation of olive oils was studied, aiming to discriminate eight single-cultivar extra-virgin olive oils. Linear discriminant, partial least squares discriminant, and sparse partial least squares discriminant analyses were evaluated. The best predictive classification was obtained using linear discriminant analysis with a simulated annealing selection algorithm. A low-level data fusion approach (18 electronic tongue signals and nine sensory attributes) enabled 100% leave-one-out cross-validation correct classification, improving the discrimination capability of the individual use of sensor profiles or sensory attributes (70% and 57% leave-one-out correct classifications, respectively). So, human sensory evaluation and electronic tongue analysis may be used as complementary tools, allowing successful monovarietal olive oil discrimination.
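A sketch of leave-one-out cross-validation on fused features, using a simple nearest-centroid classifier on fabricated data (the paper uses linear discriminant analysis on 18 e-tongue signals plus nine sensory attributes):

```python
import math

# Hypothetical fused feature vectors (e-tongue signals + sensory scores)
# for two olive cultivars, three samples each.
data = [([1.0, 0.1, 3.2], "A"), ([1.1, 0.2, 3.0], "A"), ([0.9, 0.15, 3.1], "A"),
        ([0.2, 0.9, 1.0], "B"), ([0.3, 1.0, 1.2], "B"), ([0.25, 0.8, 1.1], "B")]

def nearest_centroid(x, train):
    """Assign x to the class with the closest mean feature vector."""
    groups = {}
    for feats, label in train:
        groups.setdefault(label, []).append(feats)
    best = None
    for label, rows in groups.items():
        centroid = [sum(col) / len(rows) for col in zip(*rows)]
        d = math.dist(x, centroid)
        if best is None or d < best[0]:
            best = (d, label)
    return best[1]

# Leave-one-out cross-validation: hold out each sample in turn.
correct = sum(
    nearest_centroid(x, data[:i] + data[i + 1:]) == y
    for i, (x, y) in enumerate(data)
)
print(f"{100 * correct // len(data)} % LOO-CV accuracy")
```

Leave-one-out is the natural choice here because sensory panels produce few samples per cultivar, so no observation can be spared for a fixed test set.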
Abstract:
Master's dissertation in Industrial Economics and Business
Abstract:
The author proves that the equation

| Σy      n      ΣZ^x   |
| ΣxyZ^x  ΣxZ^x  ΣxZ^2x | = 0,
| ΣyZ^x   ΣZ^x   ΣZ^2x  |

where Z = 10^(-cq) and q is a numerical constant, used by Pimentel Gomes and Malavolta in several articles for the interpolation of Mitscherlich's equation y = A[1 - 10^(-c(x + b))] by the least squares method, always has a zero of order three at Z = 1. Therefore, the equation A0·Z^m + A1·Z^(m-1) + ... + Am = 0 obtained from that determinant can be divided by (Z - 1)³. This property provides a good test for the correctness of the computations and facilitates the solution of the equation.
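The divisibility property can be checked numerically by repeated synthetic division, as in this sketch with a made-up polynomial having a triple root at Z = 1:

```python
# A polynomial with a triple root at Z = 1: (Z - 1)^3 * (2Z + 5)
# expanded gives these coefficients (highest power first).
coeffs = [2, -1, -9, 13, -5]   # 2Z^4 - Z^3 - 9Z^2 + 13Z - 5

def divide_by_z_minus_1(c):
    """Synthetic division by (Z - 1); returns (quotient, remainder)."""
    q = [c[0]]
    for a in c[1:]:
        q.append(a + q[-1])
    return q[:-1], q[-1]

q = coeffs
for _ in range(3):              # divide out (Z - 1) three times
    q, r = divide_by_z_minus_1(q)
    assert r == 0               # zero remainder confirms the root each time

print(q)   # [2, 5] -> the remaining factor 2Z + 5
```

A nonzero remainder at any of the three divisions would flag a computational error, which is exactly the check the author proposes.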
Abstract:
The parameterized expectations algorithm (PEA) involves a long simulation and a nonlinear least squares (NLS) fit, both embedded in a loop. Both steps are natural candidates for parallelization. This note shows that parallelization can lead to important speedups for the PEA. I provide example code for a simple model that can serve as a template for parallelization of more interesting models, as well as a download link for an image of a bootable CD that allows creation of a cluster and execution of the example code in minutes, with no need to install any software.
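A minimal sketch of the parallelization idea (a toy deterministic "simulation" split across workers; the note's actual code targets a cluster, and CPU-bound work would use processes rather than the threads used here):

```python
from concurrent.futures import ThreadPoolExecutor

# The long PEA simulation is embarrassingly parallel across chunks of
# periods (given a fixed expectations rule): each worker simulates a slice,
# and the moments needed for the NLS step are accumulated afterwards.
def simulate_chunk(bounds):
    """Toy 'simulation': sum a deterministic series over [start, stop)."""
    start, stop = bounds
    return sum(0.99 ** t for t in range(start, stop))

chunks = [(i * 2500, (i + 1) * 2500) for i in range(4)]   # 10000 periods, 4 workers

with ThreadPoolExecutor(max_workers=4) as pool:
    partial = list(pool.map(simulate_chunk, chunks))

serial = simulate_chunk((0, 10000))
print(round(sum(partial), 4))   # matches the serial result
```

The combined result equals the serial loop's, which is the property that makes the simulation step a natural candidate for parallelization.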
Abstract:
The Republic of Haiti is the leading international remittances recipient country in the Latin American and Caribbean (LAC) region relative to its gross domestic product (GDP). The downside of this observation may be that this country is also the world's leading exporter of skilled workers relative to its population size. The present research uses a zero-altered negative binomial model (with logit inflation) to model households' international migration decision process, and Amemiya's Generalized Least Squares method for endogenous regressors (instrumental-variable Tobit, IV-Tobit) to account for selectivity and endogeneity issues in assessing the impact of remittances on labor market outcomes. Results are in line with what has been found so far in this literature in terms of a decline of labor supply in the presence of remittances. However, the impact of international remittances does not seem to be important in determining recipient households' labor participation behavior, particularly for women.
Abstract:
This paper demonstrates that an asset pricing model with least-squares learning can lead to bubbles and crashes as endogenous responses to the fundamentals driving asset prices. When agents are risk-averse, they need to make forecasts of the conditional variance of a stock’s return. Recursive updating of both the conditional variance and the expected return implies several mechanisms through which learning impacts stock prices. Extended periods of excess volatility, bubbles, and crashes arise with a frequency that depends on the extent to which past data is discounted. A central role is played by changes over time in agents’ estimates of risk.
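A sketch of the discounted (constant-gain) recursive updating of the expected return and conditional variance, on made-up return data:

```python
# Constant-gain recursive updating of the estimates of a stock return's
# mean and conditional variance, as in least-squares learning where past
# data is discounted.
gain = 0.05                     # higher gain -> faster discounting of the past
returns = [0.02, 0.01, -0.03, 0.04, 0.00, 0.05, -0.02, 0.03]

mean_hat, var_hat = 0.0, 0.01   # hypothetical initial beliefs
for r in returns:
    err = r - mean_hat
    mean_hat += gain * err                  # update the expected return
    var_hat += gain * (err ** 2 - var_hat)  # update the conditional variance

print(mean_hat, var_hat)
```

Because the gain is constant rather than decreasing, surprises never stop moving the estimates, which is the channel through which the paper's bubbles and crashes recur.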