900 results for Fuzzy additive spectral clustering
Abstract:
In this paper we deal with the problem of obtaining the set of k-additive measures dominating a fuzzy measure. This problem extends the problem of deriving the set of probabilities dominating a fuzzy measure, an important problem appearing in Decision Making and Game Theory. The solution proposed in the paper follows the line developed by Chateauneuf and Jaffray for dominating probabilities and continued by Miranda et al. for dominating k-additive belief functions. Here, we address the general case by transforming the problem into a similar one in which the set functions involved have non-negative Möbius transform; this simplifies the problem and allows a result similar to the one developed for belief functions. Although the set obtained is very large, we show that the conditions cannot be sharpened. On the other hand, we also show that it is possible to define a more restrictive subset, providing a more natural extension of the result for probabilities, such that any k-additive dominating measure can be derived from it.
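To make the notions in this abstract concrete, here is a minimal Python sketch (not the paper's construction) that computes the Möbius transform of a set function on a small universe, reads off its additivity order k, and checks whether one measure dominates another. The function names and the two-element example are assumptions chosen for illustration.

```python
from itertools import combinations

def subsets(s):
    """Yield every subset of the frozenset s as a frozenset."""
    items = sorted(s)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)

def mobius_transform(mu, universe):
    """m(A) = sum over B subset of A of (-1)^(|A| - |B|) * mu(B)."""
    return {
        a: sum((-1) ** (len(a) - len(b)) * mu[b] for b in subsets(a))
        for a in subsets(universe)
    }

def additivity_order(m, tol=1e-12):
    """Largest |A| with m(A) != 0; the measure is k-additive for this k."""
    return max((len(a) for a, v in m.items() if abs(v) > tol), default=0)

def dominates(nu, mu, universe, tol=1e-12):
    """True if nu(A) >= mu(A) for every subset A of the universe."""
    return all(nu[a] >= mu[a] - tol for a in subsets(universe))

# Hypothetical fuzzy measure on X = {1, 2}.
X = frozenset({1, 2})
mu = {frozenset(): 0.0, frozenset({1}): 0.2, frozenset({2}): 0.3, X: 1.0}
m = mobius_transform(mu, X)

# A hypothetical probability (1-additive measure) that dominates mu.
p = {frozenset(): 0.0, frozenset({1}): 0.4, frozenset({2}): 0.6, X: 1.0}
```

In this example the Möbius mass of the pair {1, 2} is non-zero, so mu is 2-additive, while p is 1-additive and lies above mu on every subset.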
Abstract:
Virtually every sector of business and industry that uses computing, including financial analysis, search engines, and electronic commerce, incorporates Big Data analysis into its business model. Sophisticated clustering algorithms are popular for deducing the nature of data by assigning labels to unlabeled data. We address two main challenges in Big Data. First, by definition, the volume of Big Data is too large to be loaded into a computer’s memory (this volume changes based on the computer used or available, but there is always a data set that is too large for any computer). Second, in real-time applications, the velocity of new incoming data prevents historical data from being stored and future data from being accessed. Therefore, we propose our Streaming Kernel Fuzzy c-Means (stKFCM) algorithm, which significantly reduces both computational complexity and space complexity. The proposed stKFCM requires only O(n^2) memory, where n is the (predetermined) size of a data subset (or data chunk) at each time step, which makes this algorithm truly scalable (as n can be chosen based on the available memory). Furthermore, only 2n^2 elements of the full N × N (where N >> n) kernel matrix need to be calculated at each time step, thus reducing both the computation time in producing the kernel elements and the complexity of the FCM algorithm. Empirical results show that stKFCM, even with relatively small n, can provide clustering performance as accurate as kernel fuzzy c-means run on the entire data set while achieving a significant speedup.
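The memory argument can be illustrated with a short Python sketch. It does not implement stKFCM; it only shows that processing a stream in chunks of size n means holding n × n kernel blocks rather than the full N × N matrix. The Gaussian (RBF) kernel, the `gamma` parameter, the function names, and the toy data are all assumptions.

```python
import math

def rbf_kernel_block(rows, cols, gamma=1.0):
    """Gaussian (RBF) kernel values between two lists of points."""
    return [
        [math.exp(-gamma * sum((x - y) ** 2 for x, y in zip(a, b))) for b in cols]
        for a in rows
    ]

def iter_chunks(data, n):
    """Yield consecutive chunks of at most n points."""
    for start in range(0, len(data), n):
        yield data[start:start + n]

# Hypothetical stream of N = 6 two-dimensional points, processed with n = 2.
stream = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (1.1, 1.0), (5.0, 5.0), (5.0, 5.1)]
block_sizes = []
for chunk in iter_chunks(stream, 2):
    # Only this n x n block is held in memory for the current time step.
    block = rbf_kernel_block(chunk, chunk)
    block_sizes.append(len(block) * len(block[0]))
```

Each block here has 4 entries, against 36 for the full kernel matrix of the toy stream.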
Abstract:
In this work we compare Grapholita molesta Busck (Lepidoptera: Tortricidae) populations originating from Brazil, Chile, Spain, Italy and Greece using power spectral density and phylogenetic analysis to detect any similarities between the population macro-level and the molecular micro-level. Log-transformed population data were normalized and AR(p) models were developed to generate, for each case, population time series of equal length. The time-frequency/scale properties of the population data were further analyzed using wavelet analysis to detect any frequency changes in population dynamics and to cluster the populations. Based on the power spectra of each population time series and the hierarchical clustering schemes, populations originating from South America (Brazil and Chile) exhibit similar rhythmic properties and are both more closely related to populations originating from Greece. Populations from Spain, and especially Italy, are more distant in terms of periodic changes in their population dynamics. Moreover, the members within the same cluster share similar spectral information and are therefore presumed to participate in the same temporally regulated population process. In contrast, the phylogenetic approach revealed a less structured pattern that bears indications of panmixia, as the two clusters contain individuals from both Europe and South America. This preliminary outcome will be further assessed by incorporating more individuals and likely employing a second molecular marker.
Abstract:
The study carried out in this thesis is devoted to the spectral analysis of systems of PDEs that are also related to quantum physics models. Namely, the research deals with classes of systems that contain certain quantum optics models, such as the Jaynes-Cummings and Rabi models and their generalizations, which describe light-matter interaction. First, we investigate the spectral Weyl asymptotics for a class of semiregular systems, extending to the vector-valued case results of Helffer and Robert and, more recently, of Doll, Gannot and Wunsch. The asymptotics of Doll, Gannot and Wunsch are more precise (which is why we call them refined) than the classical result of Helffer and Robert, but they deal with a less general class of systems, since the authors make a hypothesis on the measure of the subset of the unit sphere on which the tangential derivatives of the X-ray transform of the semiprincipal symbol vanish to infinite order. Next, we give a meromorphic continuation of the spectral zeta function for semiregular differential systems with polynomial coefficients, generalizing the results of Ichinose and Wakayama and of Parmeggiani. Finally, we state and prove a quasi-clustering result for a class of systems including the aforementioned quantum optics models, and we conclude the thesis by showing a Weyl law result for the Rabi model and its generalizations.
Abstract:
Different types of water bodies, including lakes, streams, and coastal marine waters, are often susceptible to fecal contamination from a range of point and nonpoint sources, and have been evaluated using fecal indicator microorganisms. The most commonly used fecal indicator is Escherichia coli, but traditional cultivation methods do not allow discrimination of the source of pollution. The use of triplex PCR offers an approach that is fast and inexpensive, and here enabled the identification of phylogroups. The phylogenetic distribution of E. coli subgroups isolated from water samples revealed higher frequencies of subgroups A1 and B23 in rivers impacted by human pollution sources, while subgroups D1 and D2 were associated with pristine sites, and subgroup B1 with domesticated animal sources, suggesting their use as a first screening for pollution source identification. A simple classification is also proposed based on phylogenetic subgroup distribution using the w-clique metric, enabling differentiation of polluted and unpolluted sites.
Abstract:
Purpose. To investigate the effect of misalignments (MAs) on retinal nerve fiber layer thickness (RNFLT) measurements obtained with Cirrus(©) SD-OCT. Methods. This was a retrospective, observational, cross-sectional study. Twenty-seven healthy and 29 glaucomatous eyes of 56 individuals with one normal exam and another showing MA were included. MAs were defined as an improper alignment of vertical vessels in the en face image. MAs were classified as complete MA (CMA) or partial MA (PMA), and according to their site: 1 (superior, outside the measurement ring (MR)), 2 (superior, within MR), 3 (inferior, within MR), and 4 (inferior, outside MR). We compared RNFLT measurements of aligned versus misaligned exams in all 4 sectors, in the superior area (sectors 1 + 2), in the inferior area (sectors 3 + 4), and within the measurement ring (sectors 2 + 3). Results. RNFLT measurements at the 12 clock-hour of eyes with MAs in the superior area (sectors 1 + 2) were significantly lower than those obtained in the same eyes without MAs (P = 0.043). No significant difference was found in other areas (sectors 1 + 2 + 3 + 4, sectors 3 + 4, and sectors 2 + 3). Conclusion. SD-OCT scans with superior MAs may present lower superior RNFLT measurements compared to aligned exams.
Abstract:
A method using the ring-oven technique for pre-concentration in filter paper discs and near infrared hyperspectral imaging is proposed to identify four detergent and dispersant additives, and to determine their concentration in gasoline. Different approaches were used to select the best image data processing in order to gather the relevant spectral information. This was attained by selecting the pixels of the region of interest (ROI), using a pre-calculated threshold value of the PCA scores arranged as histograms, to select the spectra set; summing up the selected spectra to achieve representativeness; and compensating for the superimposed filter paper spectral information, also supported by scores histograms for each individual sample. The best classification model was achieved using linear discriminant analysis and genetic algorithm (LDA/GA), whose correct classification rate in the external validation set was 92%. Previous classification of the type of additive present in the gasoline is necessary to define the PLS model required for its quantitative determination. Considering that two of the additives studied present high spectral similarity, a PLS regression model was constructed to predict their content in gasoline, while two additional models were used for the remaining additives. The results for the external validation of these regression models showed a mean percentage error of prediction varying from 5 to 15%.
Abstract:
PURPOSE: To evaluate the sensitivity and specificity of machine learning classifiers (MLCs) for glaucoma diagnosis using Spectral Domain OCT (SD-OCT) and standard automated perimetry (SAP). METHODS: Observational cross-sectional study. Sixty-two glaucoma patients and 48 healthy individuals were included. All patients underwent a complete ophthalmologic examination, achromatic standard automated perimetry (SAP) and retinal nerve fiber layer (RNFL) imaging with SD-OCT (Cirrus HD-OCT; Carl Zeiss Meditec Inc., Dublin, California). Receiver operating characteristic (ROC) curves were obtained for all SD-OCT parameters and global indices of SAP. Subsequently, the following MLCs were tested using parameters from the SD-OCT and SAP: Bagging (BAG), Naive-Bayes (NB), Multilayer Perceptron (MLP), Radial Basis Function (RBF), Random Forest (RAN), Ensemble Selection (ENS), Classification Tree (CTREE), Ada Boost M1 (ADA), Support Vector Machine Linear (SVML) and Support Vector Machine Gaussian (SVMG). Areas under the receiver operating characteristic curves (aROC) obtained for isolated SAP and OCT parameters were compared with MLCs using OCT+SAP data. RESULTS: Combining OCT and SAP data, MLCs' aROCs varied from 0.777 (CTREE) to 0.946 (RAN). The best OCT+SAP aROC, obtained with RAN (0.946), was significantly larger than that of the best single OCT parameter (p<0.05), but was not significantly different from the aROC obtained with the best single SAP parameter (p=0.19). CONCLUSION: Machine learning classifiers trained on OCT and SAP data can successfully discriminate between healthy and glaucomatous eyes. The combination of OCT and SAP measurements improved the diagnostic accuracy compared with OCT data alone.
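For readers unfamiliar with the aROC metric reported here, the following Python sketch computes it from classifier scores using the standard rank-based interpretation: the probability that a randomly chosen positive case scores higher than a randomly chosen negative one, with ties counted as one half. The function name and the scores are hypothetical and are not taken from the study.

```python
def roc_auc(scores_pos, scores_neg):
    """Area under the ROC curve via pairwise comparison of scores."""
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in scores_pos
        for n in scores_neg
    )
    return wins / (len(scores_pos) * len(scores_neg))

# Hypothetical classifier scores (higher means "more likely glaucomatous").
glaucoma_scores = [0.9, 0.8, 0.4]
healthy_scores = [0.5, 0.3, 0.2]
auc = roc_auc(glaucoma_scores, healthy_scores)
```

A value of 1.0 means perfect separation of the two groups and 0.5 means chance-level discrimination.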
Abstract:
The sustainability of intensive swine production demands alternative destinations for the generated residues. Ashes from swine rice husk-based deep bedding were tested as a mineral addition for cement mortars. The ashes were obtained at 400 to 600°C, ground and sieved through a 325 mesh sieve (# 0.045 mm). The characterization of the ashes included the determination of the index of pozzolanic activity with lime. The ashes were also tested as partial substitutes for Portland cement. The mortars were prepared using a cement:sand proportion of 1:1.5 and a water/cement ratio of 0.4. Three percentages of mass substitution of the cement were tested: 10, 20 and 30%. Mortar performance was assessed at 7 and 28 days by determining compressive strength. The condition chosen for calcination at the laboratory scale was a maximum temperature of 600°C, since the resulting ashes contained vitreous materials and presented satisfactory values for the pozzolanic index under analysis. The pozzolanic activity indicated promising results for ashes produced at 600°C as a replacement of up to 30% of the cement mass.
Abstract:
The study of tokamak plasma light emissions in the vacuum ultraviolet (VUV) region is an important subject, since many impurity spectral emissions are present in this region. These spectral emissions can be used to determine the plasma ion temperature and density from different species and spatial positions inside the plasma according to their temperatures. We have analyzed VUV spectra from 500 Å to 3200 Å wavelength in the TCABR tokamak plasma, including higher diffraction order emissions. We identified 37 first diffraction order emissions, resulting in 28 second diffraction order, 24 third diffraction order, and 7 fourth diffraction order lines. The emissions are from impurity species such as OII, OIII, OIV, OV, OVI, OVII, CII, CIII, CIV, NIII, NIV, and NV. All the spectra beyond 1900 Å are from higher diffraction order emissions and possess much better spectral resolution. Each strong and isolated spectral line, as well as its higher diffraction order emissions suitable for plasma diagnostics, is identified and discussed. Finally, an example of ion temperature determination using different diffraction orders is presented.
Abstract:
In the southern region of Mato Grosso do Sul state, Brazil, a foot-and-mouth disease (FMD) epidemic started in September 2005. A total of 33 outbreaks were detected and 33,741 FMD-susceptible animals were slaughtered and destroyed. There were no reports of FMD cases in species other than bovines. Based on the data of this epidemic, an analysis was carried out using the K-function, and spatial clustering of outbreaks was observed within a range of 25 km. This observation may be related to the dynamics of foot-and-mouth disease spread and to the measures undertaken to control the dissemination of the disease. The control measures were effective, since the disease did not spread to farms more than 47 km from the initial outbreaks.
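As a rough illustration of the K-function mentioned in this abstract, the Python sketch below computes a naive estimate of Ripley's K without edge correction and compares it with the value pi * r^2 expected under complete spatial randomness. The study's exact estimator is not described in the abstract; the function name, the coordinates, and the unit-square study area are assumptions.

```python
import math

def ripley_k(points, r, area):
    """Naive Ripley's K estimate (no edge correction).

    K(r) = area / (n * (n - 1)) * number of ordered pairs (i, j), i != j,
    whose distance is at most r.
    """
    n = len(points)
    close_pairs = sum(
        1
        for i in range(n)
        for j in range(n)
        if i != j and math.dist(points[i], points[j]) <= r
    )
    return area * close_pairs / (n * (n - 1))

# Hypothetical outbreak locations in a unit square: three close together, one far away.
locations = [(0.10, 0.10), (0.12, 0.10), (0.10, 0.12), (0.90, 0.90)]
radius = 0.05
k_hat = ripley_k(locations, radius, area=1.0)
k_csr = math.pi * radius ** 2  # expected value under complete spatial randomness
```

An estimate well above the random-pattern value at a given radius, as in this toy example, is the usual indication of clustering at that scale.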
Abstract:
In this paper, we present a fuzzy approach to the Reed-Frost model for epidemic spreading, taking into account uncertainties in the diagnosis of the infection. The heterogeneity in the infected group is based on the clinical signs of the individuals (symptoms, laboratory exams, medical findings, etc.), which are incorporated into the dynamics of the epidemic. The infectivity level is time-varying and the classification of the individuals is performed through fuzzy relations. Simulations considering a real problem, with data from a viral epidemic in a children's daycare, are performed and the results are compared with a stochastic Reed-Frost generalization.
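For context, the classical (non-fuzzy) Reed-Frost model that this paper builds on can be sketched in a few lines of Python. In each generation, every susceptible escapes infection from each of the I current infectives independently with probability 1 - p, so the per-susceptible infection probability is 1 - (1 - p)^I. The sketch below is the textbook chain-binomial version only; the paper's fuzzy, time-varying infectivity is not modeled, and the parameter values are hypothetical.

```python
import random

def reed_frost(s0, i0, p, rng, max_steps=1000):
    """Simulate the classical Reed-Frost chain-binomial epidemic.

    Returns a list of (susceptible, infective, removed) tuples, one per generation.
    """
    s, i, r = s0, i0, 0
    history = [(s, i, r)]
    for _ in range(max_steps):
        if i == 0:
            break
        # Probability that a given susceptible is infected by at least one infective.
        p_inf = 1.0 - (1.0 - p) ** i
        new_i = sum(1 for _ in range(s) if rng.random() < p_inf)
        # Current infectives are removed; newly infected form the next generation.
        s, i, r = s - new_i, new_i, r + i
        history.append((s, i, r))
    return history

# Hypothetical daycare-sized population: 20 susceptibles, 1 initial case.
history = reed_frost(s0=20, i0=1, p=0.1, rng=random.Random(42))
```

The population size is conserved in every generation, and the epidemic ends as soon as a generation produces no new cases.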
Abstract:
Objective: The biochemical alterations between inflammatory fibrous hyperplasia (IFH) and normal tissues of buccal mucosa were probed by using the FT-Raman spectroscopy technique. The aim was to find the minimal set of Raman bands that would furnish the best discrimination. Background: Raman-based optical biopsy is widely recognized as a potential technique for noninvasive real-time diagnosis. However, few studies have been devoted to the discrimination of very common subtle or early pathologic states such as inflammatory processes, which are always present on, for example, cancer lesion borders. Methods: Seventy spectra of IFH from 14 patients were compared with 30 spectra of normal tissues from six patients. The statistical analysis was performed with principal components analysis and soft independent modeling of class analogy, cross-validated by the leave-one-out method. Results: Bands close to 574, 1,100, 1,250 to 1,350, and 1,500 cm^-1 (mainly amino acid and collagen bands) showed the main intragroup variations, which are due to the acanthosis process in the IFH epithelium. The 1,200 (C-C aromatic/DNA), 1,350 (CH2 bending/collagen I), and 1,730 cm^-1 (collagen III) regions presented the main intergroup variations. This finding was interpreted as originating in an extracellular matrix-degeneration process occurring in the inflammatory tissues. The statistical analysis results indicated that the best discrimination capability (sensitivity of 95% and specificity of 100%) was found by using the 530-580 cm^-1 spectral region. Conclusions: The existence of this narrow spectral window enabling normal and inflammatory diagnosis also had useful implications for an in vivo dispersive Raman setup for clinical applications.
Abstract:
Gene clustering is a useful exploratory technique to group together genes with similar expression levels under distinct cell cycle phases or distinct conditions. It helps the biologist to identify potentially meaningful relationships between genes. In this study, we propose a clustering method based on multivariate normal mixture models, where the number of clusters is predicted via sequential hypothesis tests: at each step, the method considers a mixture model of m components (m = 2 in the first step) and tests if in fact it should be m - 1. If the hypothesis is rejected, m is increased and a new test is carried out. The method continues (increasing m) until the hypothesis is accepted. The theoretical core of the method is the full Bayesian significance test, an intuitive Bayesian approach, which needs no model complexity penalization nor positive probabilities for sharp hypotheses. Numerical experiments were based on a cDNA microarray dataset consisting of expression levels of 205 genes belonging to four functional categories, for 10 distinct strains of Saccharomyces cerevisiae. To analyze the method's sensitivity to data dimension, we performed principal components analysis on the original dataset and predicted the number of classes using 2 to 10 principal components. Compared to Mclust (model-based clustering), our method shows more consistent results.
Abstract:
The aim of the present study was to evaluate the heterosis effects on weaning weight at 205 days (WW, N = 146,464), yearling weight at 390 days (YW, N = 69,315) and weight gain from weaning to yearling (WG, N = 59,307) in composite beef cattle. The fixed models were: RM, which included contemporary groups, class of age of dam, outcrossing percentages for direct and maternal effects, and additive direct and maternal (AM) breed effects; R, the RM model minus AM breed effects; and H, the RM model minus additive breed effects. The estimates for W205 were in general positive (P < 0.01). The R and H models resulted in similar estimates, but they were very different from the ones estimated by the RM model. For W390, the R and H models resulted in generally positive estimates (P < 0.05). For WG, the RM model resulted in generally significant heterosis effects (P < 0.05). It can be concluded that the RM model seems to supply estimates of better quality (P < 0.01).