881 results for Landmark-based spectral clustering
Abstract:
With recent advances in remote sensing processing technology, it has become more feasible to begin analysis of the enormous historic archive of remotely sensed data. This historical data provides valuable information on a wide variety of topics which can influence the lives of millions of people if processed correctly and in a timely manner. One such field of benefit is that of landslide mapping and inventory. This data provides a historical reference to those who live near high-risk areas so future disasters may be avoided. In order to properly map landslides remotely, an optimum method must first be determined. Historically, mapping has been attempted using pixel-based methods such as unsupervised and supervised classification. These methods are limited in that they can only characterize an image spectrally, based on single-pixel values. This produces results prone to false positives, often without meaningful objects. Recently, several reliable methods of Object-Oriented Analysis (OOA) have been developed which utilize a full range of spectral, spatial, textural, and contextual parameters to delineate regions of interest. A comparison of these two methods on a historical dataset of the landslide-affected city of San Juan La Laguna, Guatemala, has demonstrated the benefits of OOA methods over unsupervised classification. Overall accuracies of 96.5% and 94.3% and F-scores of 84.3% and 77.9% were achieved for the OOA and unsupervised classification methods, respectively. The larger difference in F-score results from the low precision of unsupervised classification, caused by poor false-positive removal, the greatest shortcoming of this method.
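As an illustration of the pixel-based baseline contrasted with OOA above, a minimal sketch of unsupervised (K-means) classification of single-pixel spectra, assuming scikit-learn; the image array, band count, and cluster count are placeholders, not the study's data:

```python
import numpy as np
from sklearn.cluster import KMeans

# Placeholder multispectral image: rows x cols x bands (not the study's imagery).
bands = np.random.rand(128, 128, 4)
pixels = bands.reshape(-1, bands.shape[-1])      # one spectral vector per pixel

# Each pixel is classified from its spectral values alone,
# with no spatial, textural, or contextual information.
kmeans = KMeans(n_clusters=5, n_init=10, random_state=0).fit(pixels)
class_map = kmeans.labels_.reshape(bands.shape[:2])
print(class_map.shape)
```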
Abstract:
Although DMTA is nowadays one of the most widely used techniques for characterizing polymers' thermo-mechanical behaviour, it is only effective for small-amplitude oscillatory tests and is limited to single-frequency analysis (linear regime). In this thesis work, a Fourier-transform-based experimental system has proven to give hints about structural and chemical changes in specimens during large-amplitude oscillatory tests by exploiting multi-frequency spectral analysis, turning out to be a more sensitive tool than the classical linear approach. The test campaign focused on three test typologies: strain sweep tests, damage investigation, and temperature sweep tests.
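A toy illustration of the multi-frequency idea: Fourier-transforming a periodic stress response and comparing higher harmonics of the drive frequency with the fundamental. The signal, frequencies, and harmonic amplitude below are synthetic assumptions, not the thesis setup:

```python
import numpy as np

fs, f0 = 1000.0, 1.0                                   # sampling rate (Hz), drive frequency (Hz)
t = np.arange(0, 20, 1 / fs)
# Synthetic nonlinear response: fundamental plus a weak third harmonic.
stress = np.sin(2 * np.pi * f0 * t) + 0.05 * np.sin(2 * np.pi * 3 * f0 * t)

spectrum = np.abs(np.fft.rfft(stress)) / len(t)
freqs = np.fft.rfftfreq(len(t), d=1 / fs)
i1 = np.argmin(np.abs(freqs - f0))
i3 = np.argmin(np.abs(freqs - 3 * f0))
print("I3/I1 =", spectrum[i3] / spectrum[i1])          # harmonic ratio as a nonlinearity indicator
```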
Abstract:
In this thesis we present a mathematical formulation of the interaction between microorganisms such as bacteria or amoebae and chemicals, often produced by the organisms themselves. This interaction is called chemotaxis and leads to cellular aggregation. We derive some models to describe chemotaxis. The first is the pioneering Keller-Segel parabolic-parabolic model, which is derived within two different frameworks: a macroscopic perspective and a microscopic perspective, in which we start from a stochastic differential equation and perform a mean-field approximation. This parabolic model may be generalized by the introduction of a degenerate diffusion parameter, which depends on the density itself via a power law. Then we derive a model for chemotaxis based on Cattaneo's law of heat propagation with finite speed, which is a hyperbolic model. The last model proposed here is a hydrodynamic model, which takes into account the inertia of the system through a friction force. In the limit of strong friction, the model reduces to the parabolic model, whereas in the limit of weak friction, we recover a hyperbolic model. Finally, we analyze the instability condition, which is the condition that leads to aggregation, and we describe the different kinds of aggregates we may obtain: the parabolic models lead to clusters or peaks whereas the hyperbolic models lead to the formation of network patterns or filaments. Moreover, we discuss the analogy between bacterial colonies and self-gravitating systems by comparing the chemotactic collapse and the gravitational collapse (Jeans instability).
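For reference, the parabolic-parabolic Keller-Segel model mentioned above is commonly written as follows (generic notation, not necessarily that of the thesis; ρ is the cell density, c the chemical concentration, D and D_c the diffusivities, χ the chemotactic sensitivity, and a, b the production and degradation rates):

```latex
\begin{aligned}
\partial_t \rho &= \nabla \cdot \left( D\,\nabla \rho - \chi\,\rho\,\nabla c \right),\\
\partial_t c    &= D_c\,\Delta c + a\,\rho - b\,c .
\end{aligned}
```

Replacing the constant D with a density-dependent diffusivity proportional to a power of ρ gives the degenerate, power-law generalization referred to in the abstract.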
Abstract:
The K-means algorithm is one of the most popular clustering algorithms in current use as it is relatively fast yet simple to understand and deploy in practice. Nevertheless, its use entails certain restrictive assumptions about the data, the negative consequences of which are not always immediately apparent, as we demonstrate. While more flexible algorithms have been developed, their widespread use has been hindered by their computational and technical complexity. Motivated by these considerations, we present a flexible alternative to K-means that relaxes most of the assumptions, whilst remaining almost as fast and simple. This novel algorithm, which we call MAP-DP (maximum a posteriori Dirichlet process mixtures), is statistically rigorous as it is based on nonparametric Bayesian Dirichlet process mixture modeling. This approach allows us to overcome most of the limitations imposed by K-means. The number of clusters K is estimated from the data instead of being fixed a priori as in K-means. In addition, while K-means is restricted to continuous data, the MAP-DP framework can be applied to many kinds of data, for example, binary, count or ordinal data. Also, it can efficiently separate outliers from the data. This additional flexibility does not incur a significant computational overhead compared to K-means, with MAP-DP convergence typically achieved on the order of seconds for many practical problems. Finally, in contrast to K-means, since the algorithm is based on an underlying statistical model, the MAP-DP framework can deal with missing data and enables model testing such as cross-validation in a principled way. We demonstrate the simplicity and effectiveness of this algorithm on the health informatics problem of clinical sub-typing in a cluster of diseases known as parkinsonism.
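A minimal sketch of the underlying idea that the number of clusters is inferred rather than fixed a priori, using scikit-learn's BayesianGaussianMixture with a Dirichlet-process prior as a stand-in; this is not the authors' MAP-DP algorithm, and the data are synthetic:

```python
import numpy as np
from sklearn.mixture import BayesianGaussianMixture

rng = np.random.default_rng(0)
# Synthetic data with three well-separated groups.
X = np.vstack([rng.normal(loc, 0.3, size=(100, 2)) for loc in (0.0, 3.0, 6.0)])

dpgmm = BayesianGaussianMixture(
    n_components=10,                      # upper bound only; unused components are pruned
    weight_concentration_prior_type="dirichlet_process",
    random_state=0,
).fit(X)
labels = dpgmm.predict(X)
print("effective number of clusters:", len(np.unique(labels)))
```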
Abstract:
Research regarding the use of social media among travelers has mainly focused on its impact on travelers' travel planning process, and there is consensus that travel decisions are highly influenced by social media. Yet, little attention has been paid to the differences among travelers regarding their use of social media for travel purposes. Based on the use of travel social media, cluster analysis was employed to identify different segments among travelers. Furthermore, the study profiles the clusters based on demographic and other travel-related characteristics. The findings of this study are important for online marketers to better understand travelers' use of social media and their characteristics, in order to adapt online marketing strategies to the profile of each segment.
Abstract:
A method is presented for accurate measurement of spectral flux-reflectance (albedo) in a laboratory, for media with long optical path lengths, such as snow and ice. The approach uses an acrylic hemispheric dome which, when placed over the surface being studied, serves two functions: (i) it creates an overcast “sky” to illuminate the target surface from all directions within a hemisphere, and (ii) it serves as a platform for measuring incident and backscattered spectral radiances, which can be integrated to obtain fluxes. The fluxes are relative measurements and, because their ratio is used to determine flux-reflectance, no absolute radiometric calibrations are required. The dome and surface must meet minimum size requirements based on the scattering properties of the surface. This technique is suited to media with long photon path lengths since the backscattered illumination is collected over a large enough area to include photons that re-emerge from the domain far from their point of entry because of multiple scattering and small absorption. Comparison between field and laboratory albedo of a portable test surface demonstrates the viability of this method.
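A minimal sketch of the flux-ratio idea: flux-reflectance follows from the ratio of backscattered to incident fluxes, so only relative (uncalibrated) measurements are needed. The wavelength grid and radiance curves below are placeholders, not measured data:

```python
import numpy as np

wavelengths = np.linspace(400, 1000, 301)                             # nm, uniform grid
incident = np.ones_like(wavelengths)                                  # relative incident radiance
backscattered = 0.8 * np.exp(-((wavelengths - 500.0) / 400.0) ** 2)   # relative backscattered radiance

spectral_albedo = backscattered / incident                            # per-wavelength flux-reflectance
# On a uniform grid the wavelength spacing cancels in the broadband ratio.
broadband_albedo = backscattered.sum() / incident.sum()
print(round(float(broadband_albedo), 3))
```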
Abstract:
The current approach to data analysis for the Laser Interferometer Space Antenna (LISA) depends on the time delay interferometry (TDI) observables, which have to be generated before any weak signal detection can be performed. These are linear combinations of the raw data with appropriate time shifts that lead to the cancellation of the laser frequency noises. This is possible because of the multiple occurrences of the same noises in the different raw data. Originally, these observables were manually generated starting with LISA as a simple stationary array and then adjusted to incorporate the antenna's motions. However, none of the observables survived the flexing of the arms in that they did not lead to cancellation with the same structure. The principal component approach is another way of handling these noises, presented by Romano and Woan, which simplified the data analysis by removing the need to create the observables before the analysis. This method also depends on the multiple occurrences of the same noises but, instead of using them for cancellation, it takes advantage of the correlations that they produce between the different readings. These correlations can be expressed in a noise (data) covariance matrix which occurs in the Bayesian likelihood function when the noises are assumed to be Gaussian. Romano and Woan showed that performing an eigendecomposition of this matrix produced two distinct sets of eigenvalues that can be distinguished by the absence of laser frequency noise from one set. The transformation of the raw data using the corresponding eigenvectors also produced data that were free from the laser frequency noises. This result led to the idea that the principal components may actually be time delay interferometry observables since they produced the same outcome, that is, data that are free from laser frequency noise. The aims here were (i) to investigate the connection between the principal components and these observables, (ii) to prove that the data analysis using them is equivalent to that using the traditional observables and (iii) to determine how this method adapts to real LISA, especially the flexing of the antenna. For testing the connection between the principal components and the TDI observables, a 10x10 covariance matrix containing integer values was used in order to obtain an algebraic solution for the eigendecomposition. The matrix was generated using fixed unequal arm lengths and stationary noises with equal variances for each noise type. Results confirm that all four Sagnac observables can be generated from the eigenvectors of the principal components. The observables obtained from this method, however, are tied to the length of the data and are not general expressions like the traditional observables; for example, the Sagnac observables for two different time stamps were generated from different sets of eigenvectors. It was also possible to generate the frequency-domain optimal AET observables from the principal components obtained from the power spectral density matrix. These results indicate that this method is another way of producing the observables; therefore, analysis using principal components should give the same results as that using the traditional observables. This was proven by the fact that the same relative likelihoods (within 0.3%) were obtained from the Bayesian estimates of the signal amplitude of a simple sinusoidal gravitational wave using the principal components and the optimal AET observables.
This method fails if the eigenvalues that are free from laser frequency noises are not generated. These are obtained from the covariance matrix, and the properties of LISA required for its computation are the phase-locking, arm lengths and noise variances. Preliminary results on the effects of these properties on the principal components indicate that only the absence of phase-locking prevented their production. The flexing of the antenna results in time-varying arm lengths which will appear in the covariance matrix and, from our toy model investigations, this did not prevent the occurrence of the principal components. The difficulty with flexing, and also non-stationary noises, is that the Toeplitz structure of the matrix will be destroyed, which will affect any computation methods that take advantage of this structure. In terms of separating the two sets of data for the analysis, this was not necessary because the laser frequency noises are very large compared to the photodetector noises, which resulted in a significant reduction in the data containing them after the matrix inversion. In the frequency domain the power spectral density matrices were block diagonal, which simplified the computation of the eigenvalues by allowing it to be carried out separately for each block. The results in general showed a lack of principal components in the absence of phase-locking, except for the zero bin. The major difference with the power spectral density matrix is that the time-varying arm lengths and non-stationarity do not show up because of the summation in the Fourier transform.
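A toy sketch of the eigendecomposition step described above: a few large common noises enter several readings, the sample covariance is eigendecomposed, and the eigenvectors with small eigenvalues give combinations free of the large noise. All sizes, scales, and thresholds are illustrative assumptions, not the LISA covariance model used in the work:

```python
import numpy as np

rng = np.random.default_rng(1)
laser = rng.normal(scale=100.0, size=(3, 4096))     # few large common (laser-like) noises
photo = rng.normal(scale=1.0, size=(6, 4096))       # small independent (photodetector-like) noises
mixing = rng.normal(size=(6, 3))                    # how each common noise enters each reading
data = mixing @ laser + photo                       # raw readings sharing the large noises

cov = np.cov(data)
eigvals, eigvecs = np.linalg.eigh(cov)              # eigenvalues in ascending order
quiet = eigvecs[:, eigvals < 10.0]                  # directions with no large-noise contribution
clean = quiet.T @ data                              # combinations free of the large common noises
print(eigvals.round(1))
```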
Abstract:
OBJECTIVES: The aims of this study were to establish Colombian smoothed centile charts and LMS tables for triceps, subscapular and triceps+subscapular skinfolds, to select appropriate cut-offs using receiver operating characteristic (ROC) analysis in a population-based sample of schoolchildren in Bogotá, Colombia, and to compare them with international studies. METHODS: A total of 9,618 children and adolescents attending public schools in Bogotá, Colombia, were assessed (55.7% girls; age range 9–17.9 years). Height, weight, body mass index (BMI), waist circumference, triceps and subscapular skinfold measurements were obtained using standardized methods. The triceps+subscapular skinfold sum (T+SS) was calculated. Smoothed percentile curves for triceps and subscapular skinfold thickness were derived by the LMS method. ROC curve analyses were used to evaluate the optimal cut-off points of triceps, subscapular and T+SS skinfolds for overweight and obesity based on the International Obesity Task Force (IOTF) definitions. Data were compared with international studies. RESULTS: Subscapular and triceps skinfolds and T+SS were significantly higher in girls than in boys (P < 0.001). The median values for triceps, subscapular and T+SS skinfold thickness increased with age in a sex-specific pattern. The ROC analysis showed that subscapular, triceps and T+SS skinfolds have high discriminatory power in the identification of overweight and obesity in the sample population of this study. Based on the raw non-adjusted data, we found that Colombian boys and girls had higher triceps and subscapular skinfold values than their counterparts from Spain, the UK, Germany and the US. CONCLUSIONS: Our results provide sex- and age-specific normative reference standards for triceps and subscapular skinfold thickness values in a large, population-based sample of schoolchildren and adolescents from a Latin-American population. By providing LMS tables for Latin-American people based on Colombian reference data, we hope to provide quantitative tools for the study of obesity and its complications.
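A minimal sketch of ROC-based cut-off selection like that described above, using the Youden index as the optimality criterion (an assumption; the study may use a different criterion). Assumes scikit-learn; the skinfold values and labels are synthetic:

```python
import numpy as np
from sklearn.metrics import roc_curve

rng = np.random.default_rng(2)
# Synthetic skinfold thicknesses (mm) and IOTF-style overweight/obesity labels.
skinfold = np.concatenate([rng.normal(12, 3, 400), rng.normal(22, 4, 100)])
is_overweight = np.concatenate([np.zeros(400), np.ones(100)])

fpr, tpr, thresholds = roc_curve(is_overweight, skinfold)
optimal_cutoff = thresholds[np.argmax(tpr - fpr)]   # maximizes Youden's J = sensitivity + specificity - 1
print("optimal cut-off (mm):", round(float(optimal_cutoff), 1))
```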
Abstract:
It is well known that rib cage dimensions depend on the gender and vary with the age of the individual. In this setting, a computational approach to the problem may be envisaged; consequently, this work focuses on the development of an Artificial Intelligence grounded decision support system to predict an individual's age based on such measurements. On the one hand, basic image processing techniques were used to extract such descriptors from chest X-rays (i.e., the rib cage's maximum width and height). On the other hand, the computational framework was built on top of a Logic Programming Case-Based approach to knowledge representation and reasoning, which caters for the handling of incomplete, unknown, or even contradictory information. Furthermore, clustering methods based on similarity analysis among cases were used to distinguish and aggregate collections of historical data in order to reduce the search space, thereby enhancing case retrieval and the overall computational process. The accuracy of the proposed model is satisfactory, close to 90%.
Abstract:
Solar radiation takes on increasing importance in today's world. Different devices are used to carry out spectral and integrated measurements of solar radiation. Thus the sensors can be divided into the following types: calorimetric, thermomechanical, thermoelectric and photoelectric. The first three categories are based on components converting the radiation into temperature (or heat) and then into an electrical quantity. On the other hand, the photoelectric sensors are based on semiconductor or optoelectronic elements that, when irradiated, change their impedance or generate a measurable electric signal. The response function of the sensor element depends not only on the intensity of the radiation but also on its wavelength. The radiation sensors most widely used fit into the first three categories but, thanks to reduced manufacturing costs and the increased integration of electronic systems, the use of photoelectric-type sensors has become more attractive. In this work we present a study of the behavior of different optoelectronic sensor elements. It is intended to verify the static response of the elements to the incident radiation. We study the optoelectronic elements using mathematical models that best fit their response as a function of wavelength. As an input to the model, the solar radiation values are generated with a radiative transfer model. We also model the spectral response of other sensor types in order to compare the behavior of the optoelectronic elements with sensors currently in use.
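An illustrative sketch of the static-response modelling idea: the output of a photoelectric element is the incident spectral irradiance weighted by the element's spectral response and integrated over wavelength. Both curves below are made-up placeholders, not the radiative-transfer output or sensor data used in the work:

```python
import numpy as np

wavelengths = np.linspace(300, 1100, 801)                                # nm, uniform grid
irradiance = np.interp(wavelengths, [300, 500, 1100], [0.2, 1.4, 0.4])   # toy spectral irradiance (W m-2 nm-1)
response = np.exp(-((wavelengths - 850.0) / 150.0) ** 2)                 # toy silicon-like spectral response

# Static sensor output: response-weighted irradiance integrated over wavelength.
step = wavelengths[1] - wavelengths[0]
sensor_signal = np.sum(irradiance * response) * step
print(round(float(sensor_signal), 1))
```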
Abstract:
This paper proposes a process for the classification of new residential electricity customers. The current state of the art is extended by using a combination of smart metering and survey data and by using model-based feature selection for the classification task. Firstly, the normalized representative consumption profiles of the population are derived through the clustering of data from households. Secondly, new customers are classified using survey data and a limited amount of smart metering data. Thirdly, regression analysis and model-based feature selection results explain the importance of the variables and which are the drivers of different consumption profiles, enabling the extraction of appropriate models. The results of a case study show that the use of survey data significantly increases the accuracy of the classification task (up to 20%). Considering four consumption groups, more than half of the customers are correctly classified with only one week of metering data; with more weeks, the accuracy is significantly improved. The use of model-based feature selection resulted in a significantly lower number of features, allowing an easy interpretation of the derived models.
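A minimal sketch of the first step described above: clustering normalized daily load profiles to obtain representative consumption profiles. K-means and the synthetic profile matrix are assumptions for illustration, not the paper's method or data:

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(3)
profiles = rng.random((500, 48))                        # 500 households, half-hourly daily readings
profiles /= profiles.sum(axis=1, keepdims=True)         # normalize each household's daily profile

model = KMeans(n_clusters=4, n_init=10, random_state=0).fit(profiles)
representative_profiles = model.cluster_centers_        # one representative profile per consumption group
print(representative_profiles.shape)
```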
Abstract:
This paper proposes a novel demand response model using a fuzzy subtractive clustering approach. The model development provides support for domestic consumers' decisions on controllable load management, considering consumers' consumption needs and the appropriate load shaping or rescheduling in order to achieve possible economic benefits. The model, based on the fuzzy subtractive clustering method, considers clusters of domestic consumption covering an adequate consumption range. Analysis of different scenarios is presented, considering available electric power and electric energy prices. Simulation results are presented and conclusions of the proposed demand response model are discussed.
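An illustrative sketch of subtractive clustering in the spirit of Chiu's method, on which the fuzzy approach above builds: each point receives a potential from its neighbour density, the highest-potential point becomes a cluster centre, and nearby potentials are suppressed before the next pick. The radii, number of centres, and data are illustrative assumptions, not the paper's model:

```python
import numpy as np

def subtractive_clustering(X, ra=0.5, rb=0.75, n_centres=3):
    alpha, beta = 4.0 / ra**2, 4.0 / rb**2
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)       # pairwise squared distances
    potential = np.exp(-alpha * d2).sum(axis=1)                # neighbour-density potential
    centres = []
    for _ in range(n_centres):
        c = int(np.argmax(potential))
        centres.append(X[c])
        potential -= potential[c] * np.exp(-beta * d2[:, c])   # suppress potential around the new centre
    return np.array(centres)

X = np.random.default_rng(4).random((200, 2))                  # e.g. normalized consumption features
print(subtractive_clustering(X))
```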
Abstract:
Modifications in vegetation cover can have an impact on the climate through changes in biogeochemical and biogeophysical processes. In this paper, the tree canopy cover percentage of a savannah-like ecosystem (montado/dehesa) was estimated at Landsat pixel level for 2011, and the role of different canopy cover percentages on land surface albedo (LSA) and land surface temperature (LST) was analysed. A modelling procedure using an SGB machine-learning algorithm, with Landsat 5-TM spectral bands and derived vegetation indices as explanatory variables, showed that montado canopy cover was estimated with good agreement (R2 = 78.4%). Overall, the montado canopy cover estimates showed that the low canopy cover class (MT_1) is the most representative, with 50.63% of the total montado area. MODIS LSA and LST products were used to investigate the magnitude of differences in mean annual LSA and LST values between contrasting montado canopy cover percentages. As a result, a significant statistical relationship was found between montado canopy cover percentage and mean annual surface albedo (R2 = 0.866, p < 0.001) and surface temperature (R2 = 0.942, p < 0.001). The comparisons between the four contrasting montado canopy cover classes showed marked differences in LSA (χ2 = 192.17, df = 3, p < 0.001) and LST (χ2 = 318.18, df = 3, p < 0.001). The highest montado canopy cover class (MT_4) generally had lower albedo than the lowest canopy cover class, presenting a difference of −11.2% in mean annual albedo values. It was also shown that MT_4 and MT_3 are the cooler canopy cover classes, and MT_2 and MT_1 the warmer, with the MT_1 class differing by 3.42 °C from the MT_4 class. Overall, this research highlighted the role that potential changes in montado canopy cover may play in local land surface albedo and temperature variations, as an increase in these two biogeophysical parameters may potentially bring about, in the long term, local/regional climatic changes moving towards greater aridity.
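A minimal sketch of the estimation step: stochastic gradient boosting (SGB is assumed here to stand for stochastic gradient boosting) regressing canopy cover on spectral bands and a derived vegetation index. Assumes scikit-learn; the feature table and target are random placeholders, not the Landsat 5-TM data used in the paper:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(5)
bands = rng.random((1000, 6))                                             # toy reflectances standing in for TM bands
ndvi = (bands[:, 3] - bands[:, 2]) / (bands[:, 3] + bands[:, 2] + 1e-9)   # derived vegetation index
X = np.column_stack([bands, ndvi])
canopy_cover = rng.random(1000) * 100.0                                   # % canopy cover (placeholder target)

# subsample < 1 makes the boosting stochastic (SGB-style).
sgb = GradientBoostingRegressor(subsample=0.5, random_state=0).fit(X, canopy_cover)
print("R2 on training data:", round(sgb.score(X, canopy_cover), 2))
```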
Abstract:
The success of regional development policies depends on the homogeneity of the territorial units. This paper aims to propose a framework for obtaining homogeneous territorial clusters based on a Pareto frontier, considering multiple criteria related to territories' endogenous resources, economic profile and socio-cultural features. This framework is developed in two phases. First, the criteria correlated with development at the territorial unit level are determined through statistical and econometric methods. Then, a multi-criteria approach is developed to allocate each territorial unit (parish) to a territorial agglomerate, according to the Pareto frontier established.