806 results for fuzzy clustering


Relevance:

70.00% 70.00%

Publisher:

Abstract:

This paper uses folksonomies and fuzzy clustering algorithms to retrieve results related to a search term. It proposes a meta search engine that can search for vaguely associated terms and aggregate them into several meaningful cluster categories. The potential of this fuzzy weblog extraction is illustrated with a simple example, and its added value and possible future studies are discussed in the conclusion.

Relevance:

70.00% 70.00%

Publisher:

Abstract:

The Social Web offers increasingly simple ways to publish and disseminate personal or opinionated information, which can rapidly exhibit a disastrous influence on the online reputation of organizations. Based on Social Web data, this study describes the building of an ontology based on fuzzy sets. After a recurring harvesting of folksonomies by Web agents, the aggregated tags are purified, linked, and transformed into a so-called fuzzy grassroots ontology by means of a fuzzy clustering algorithm. This self-updating ontology is used for online reputation analysis, a crucial task of reputation management, whose goal is to follow the online conversation going on around an organization in order to discover and monitor its reputation. In addition, an application of the Fuzzy Online Reputation Analysis (FORA) framework, lessons learned, and potential extensions are discussed in this article.

Relevance:

70.00% 70.00%

Publisher:

Abstract:

The architecture and learning algorithm of a self-learning spiking neural network for fuzzy clustering tasks are outlined. Fuzzy receptive neurons for pulse-position transformation of input data are considered. It is proposed to treat a spiking neural network in terms of the classical automatic control theory apparatus based on the Laplace transform. It is shown that synapse functioning can be easily modeled by a second-order damped response unit. The spiking neuron soma is presented as a threshold detection unit. Thus, the proposed fuzzy spiking neural network is an analog-digital nonlinear pulse-position dynamic system. It is demonstrated how fuzzy probabilistic and possibilistic clustering approaches can be implemented on the basis of the presented spiking neural network.
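
For readers unfamiliar with the distinction between probabilistic and possibilistic fuzzy clustering mentioned above, the textbook membership formulas are sketched below. They are not taken from the paper, which may use different forms. The symbols are assumptions introduced here: d_{ij} is the distance between sample x_j and cluster centre v_i, c is the number of clusters, m > 1 is the fuzzifier, and \eta_i > 0 is a per-cluster scale parameter.

```latex
% Probabilistic (fuzzy c-means) membership: the memberships of one
% sample across all clusters are constrained to sum to 1.
u_{ij} = \frac{1}{\sum_{k=1}^{c} \left( d_{ij} / d_{kj} \right)^{2/(m-1)}},
\qquad \sum_{i=1}^{c} u_{ij} = 1

% Possibilistic typicality: no sum-to-one constraint, so a sample far
% from every centre receives a low value for every cluster.
t_{ij} = \frac{1}{1 + \left( d_{ij}^{2} / \eta_i \right)^{1/(m-1)}}
```

The practical difference is that the probabilistic form forces an outlier to share a total membership of 1 among the clusters, whereas the possibilistic form lets all its typicalities approach 0.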

Relevance:

70.00% 70.00%

Publisher:

Abstract:

With the rapid growth of the Internet, computer attacks are increasing at a fast pace and can easily cause millions of dollars in damage to an organization. Detecting these attacks is an important issue in computer security. There are many types of attacks, and they fall into four main categories: Denial of Service (DoS) attacks, Probe attacks, User to Root (U2R) attacks, and Remote to Local (R2L) attacks. Within these categories, DoS and Probe attacks show up with greater frequency within a short period of time when they attack systems. They differ from normal traffic data and can be easily separated from normal activities. In contrast, U2R and R2L attacks are embedded in the data portions of the packets and normally involve only a single connection, which makes it difficult to achieve satisfactory detection accuracy for these two attack types. Therefore, we focus on studying the ambiguity problem between normal activities and U2R/R2L attacks. The goal is to build a detection system that can accurately and quickly detect these two attacks. In this dissertation, we design a two-phase intrusion detection approach. In the first phase, a correlation-based feature selection algorithm is proposed to increase the speed of detection. Features with poor prediction ability for the signatures of attacks and features inter-correlated with one or more other features are considered redundant. Such features are removed, and only indispensable information about the original feature space remains. In the second phase, we develop an ensemble intrusion detection system to achieve accurate detection performance. The proposed method includes multiple feature-selecting intrusion detectors and a data mining intrusion detector. The former consist of a set of detectors, each of which uses a fuzzy clustering technique and belief theory to solve the ambiguity problem. 
The latter applies a data mining technique to automatically extract computer users’ normal behavior from training network traffic data. The final decision is a combination of the outputs of the feature-selecting and data mining detectors. The experimental results indicate that our ensemble approach not only significantly reduces the detection time but also effectively detects U2R and R2L attacks that contain degrees of ambiguous information.
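
The first phase described in this abstract (dropping features that predict the label poorly and features that are strongly inter-correlated with ones already kept) can be illustrated with a simple threshold filter. This is a minimal sketch and not the dissertation's algorithm. The function names, the use of Pearson correlation, and the thresholds `min_relevance` and `max_redundancy` are all assumptions made for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant column carries no correlation information
    return sxy / sqrt(sxx * syy)


def select_features(columns, labels, min_relevance=0.1, max_redundancy=0.9):
    """Return the names of the features to keep.

    columns: dict mapping feature name -> list of values (hypothetical layout).
    labels:  list of numeric class labels, same length as each column.
    Both thresholds are illustrative assumptions, not values from the source.
    """
    relevance = {name: abs(pearson(col, labels)) for name, col in columns.items()}
    kept = []
    # Visit the most label-correlated features first.
    for name in sorted(relevance, key=relevance.get, reverse=True):
        if relevance[name] < min_relevance:
            continue  # poor predictor of the label
        if any(abs(pearson(columns[name], columns[k])) > max_redundancy for k in kept):
            continue  # redundant with a feature that is already kept
        kept.append(name)
    return kept
```

For example, given one informative column, an exact scaled copy of it, and a column uncorrelated with the labels, the filter keeps a single feature from the first two and drops the third.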

Relevance:

60.00% 60.00%

Publisher:

Abstract:

The value of soil evidence in the forensic discipline is well known. However, it would be advantageous if an in-situ method were available that could record responses from tyre or shoe impressions in ground soil at the crime scene. The development of optical fibres and emerging portable NIR instruments has unveiled a potential methodology that could permit such a proposal. The NIR spectral region contains rich chemical information in the form of overtone and combination bands of the fundamental infrared absorptions and low-energy electronic transitions. This region has, in the past, been perceived as too complex for interpretation and consequently was scarcely utilized. The application of NIR in the forensic discipline is virtually non-existent, creating a vacancy for research in this area. NIR spectroscopy has great potential in the forensic discipline as it is simple, nondestructive and capable of rapidly providing information relating to chemical composition. The objective of this study is to investigate the ability of NIR spectroscopy combined with Chemometrics to discriminate between individual soils. A further objective is to apply the NIR process to a simulated forensic scenario where soil transfer occurs. NIR spectra were recorded from twenty-seven soils sampled from the Logan region in South-East Queensland, Australia. A series of three high-quartz soils were mixed with three different kaolinites in varying ratios and NIR spectra collected. Spectra were also collected from six soils as the temperature of the soils was ramped from room temperature up to 600 °C. Finally, a forensic scenario was simulated where the transferral of ground soil to shoe soles was investigated. Chemometrics methods such as the commonly known Principal Component Analysis (PCA), the less well known fuzzy clustering (FC) and ranking by means of multicriteria decision making (MCDM) methodology were employed to interpret the spectral results. 
All soils were characterised using Inductively Coupled Plasma Optical Emission Spectroscopy and X-Ray Diffractometry. Results were promising, revealing that NIR combined with Chemometrics is capable of discriminating between the various soils. Peak assignments were established by comparing the spectra of known minerals with the spectra collected from the soil samples. The temperature-dependent NIR analysis confirmed the assignments of the absorptions due to adsorbed and molecular bound water. The relative intensities of the identified NIR absorptions reflected the quantitative XRD and ICP characterisation results. PCA and FC analysis of the raw soils in the initial NIR investigation revealed that the soils were primarily distinguished on the basis of their relative quartz and kaolinite contents, and to a lesser extent on the horizon from which they originated. Furthermore, PCA could distinguish between the three kaolinites used in the study, suggesting that the NIR spectral region was sensitive enough to contain information describing variation within kaolinite itself. In the forensic scenario simulation, PCA successfully discriminated between the ‘Backyard Soil’ and ‘Melcann® Sand’, as well as between the two sampling methods employed. Further PCA exploration revealed that it was possible to distinguish between the various shoes used in the simulation. In addition, it was possible to establish an association between specific sampling sites on the shoe and the corresponding sites remaining in the impression. The forensic application revealed some limitations of the process relating to moisture content and homogeneity of the soil. Both limitations can be overcome by simple sampling practices and by maintaining the original integrity of the soil. The results from the forensic scenario simulation showed that the concept holds great promise in the forensic discipline.

Relevance:

60.00% 60.00%

Publisher:

Abstract:

This overview focuses on the application of chemometrics techniques for the investigation of soils contaminated by polycyclic aromatic hydrocarbons (PAHs) and metals because these two important and very diverse groups of pollutants are ubiquitous in soils. The salient features of various studies carried out in the micro- and recreational environments of humans are highlighted in the context of the various multivariate statistical techniques available across discipline boundaries that have been effectively used in soil studies. Particular attention is paid to techniques employed in the geosciences that may be effectively utilized for environmental soil studies; classical multivariate approaches that may be used in isolation or as complementary methods to these are also discussed. Chemometrics techniques widely applied in atmospheric studies for identifying sources of pollutants, or for determining the importance of contaminant source contributions to a particular site, have seen little use in soil studies but may be effectively employed in such investigations. Suitable programs are also available for suggesting mitigating measures in cases of soil contamination, and these are also considered. Specific techniques reviewed include pattern recognition techniques such as Principal Components Analysis (PCA), Fuzzy Clustering (FC) and Cluster Analysis (CA); geostatistical tools include variograms, Geographical Information Systems (GIS), contour mapping and kriging; source identification and contribution estimation methods reviewed include Positive Matrix Factorisation (PMF) and Principal Component Analysis on Absolute Principal Component Scores (PCA/APCS). Mitigating measures to limit or eliminate pollutant sources may be suggested through the use of ranking analysis and multi-criteria decision making (MCDM) methods. 
These methods are mainly represented in this review by studies employing the Preference Ranking Organisation Method for Enrichment Evaluation (PROMETHEE) and its associated graphic output, Geometrical Analysis for Interactive Aid (GAIA).

Relevance:

60.00% 60.00%

Publisher:

Abstract:

An investigation into the effects of changes in urban traffic characteristics due to rapid urbanisation, and of the predicted changes in rainfall characteristics due to climate change, on the build-up and wash-off of heavy metals was carried out in Gold Coast, Australia. The study sites encompassed three different urban land uses. Nine heavy metals commonly associated with traffic emissions were selected. The results were interpreted using multivariate data analysis and decision making tools, such as principal component analysis (PCA), fuzzy clustering (FC), PROMETHEE and GAIA. Initial analyses established high, low and moderate traffic scenarios as well as low, low to moderate, moderate, high and extreme rainfall scenarios for build-up and wash-off investigations. GAIA analyses established that moderate to high traffic scenarios could affect the build-up, while moderate to high rainfall scenarios could affect the wash-off of heavy metals under changed conditions. However, in wash-off, metal concentrations in the 1-75 µm fraction were found to be independent of the changes to rainfall characteristics. In build-up, high traffic activities in commercial and industrial areas influenced the accumulation of heavy metal concentrations in the particulate size range from 75 to >300 µm, whereas metal concentrations in the finer size range of <1-75 µm were not affected. As practical implications, solids <1 µm and organic matter from 1 to >300 µm can be targeted for removal of Ni, Cu, Pb, Cd, Cr and Zn from build-up, whilst organic matter from <1 to >300 µm can be targeted for removal of Cd, Cr, Pb and Ni from wash-off. Cu and Zn need to be removed as free ions from most fractions in wash-off.

Relevance:

60.00% 60.00%

Publisher:

Abstract:

Human hair fibres are ubiquitous in nature and are frequently found at crime scenes, often as a result of exchange between the perpetrator, victim and/or the surroundings according to Locard's Principle. Therefore, hair fibre evidence can provide important information for crime investigation. For human hair evidence, the current forensic methods of analysis rely on comparisons of either hair morphology by microscopic examination or nuclear and mitochondrial DNA analyses. Unfortunately, in some instances the utilisation of microscopy and DNA analyses is difficult and often not feasible. This dissertation is arguably the first comprehensive investigation aimed at comparing, classifying and identifying single human scalp hair fibres with the aid of FTIR-ATR spectroscopy in a forensic context. Spectra were collected from the hair of 66 subjects of Asian, Caucasian and African (i.e. African-type) origin. The fibres ranged from untreated to variously mildly and heavily cosmetically treated hairs. The collected spectra reflected the physical and chemical nature of a hair from the near-surface, particularly the cuticle layer. In total, 550 spectra were acquired and processed to construct a relatively large database. To assist with the interpretation of the complex spectra from various types of human hair, Derivative Spectroscopy and Chemometric methods such as Principal Component Analysis (PCA), Fuzzy Clustering (FC) and the Multi-Criteria Decision Making (MCDM) programs Preference Ranking Organisation Method for Enrichment Evaluation (PROMETHEE) and Geometrical Analysis for Interactive Aid (GAIA) were utilised. FTIR-ATR spectroscopy had two important advantages over previous methods: (i) sample throughput and spectral collection were significantly improved (no physical flattening or microscope manipulations), and (ii) given the recent advances in FTIR-ATR instrument portability, there is real potential to transfer this work's findings seamlessly to on-field applications. 
The "raw" spectra, spectral subtractions and second derivative spectra were compared to demonstrate the subtle differences in human hair. SEM images were used as corroborative evidence to demonstrate the surface topography of hair. They indicated that the condition of the cuticle surface could be of three types: untreated, mildly treated and treated hair. Extensive studies of potential spectral band regions responsible for matching and discrimination of various types of hair samples suggested that the 1690-1500 cm⁻¹ IR spectral region was to be preferred over the commonly used 1750-800 cm⁻¹ region. The principal reason was the presence of the highly variable spectral profiles of cystine oxidation products (1200-1000 cm⁻¹), which contributed significantly to spectral scatter and hence poor hair sample matching. In the preferred 1690-1500 cm⁻¹ region, conformational changes in the keratin protein, attributed to the α-helical to β-sheet transitions in the Amide I and Amide II vibrations, played a significant role in matching and discrimination of the spectra and hence the hair fibre samples. For gender comparison, the Amide II band is significant for differentiation. The results illustrated that the male hair spectra exhibit a more intense β-sheet vibration in the Amide II band at approximately 1511 cm⁻¹, whilst the female hair spectra display a more intense α-helical vibration at 1520-1515 cm⁻¹. In terms of chemical composition, female hair spectra exhibit greater intensity of the amino acids tryptophan (1554 cm⁻¹) and aspartic and glutamic acid (1577 cm⁻¹). It was also observed that, for the separation of samples based on racial differences, untreated Caucasian hair was discriminated from Asian hair as a result of having higher levels of the amino acid cystine and of cysteic acid. However, when mildly or chemically treated, Asian and Caucasian hair fibres are similar, whereas African-type hair fibres are different. 
In terms of the investigation's novel contribution to the field of forensic science, it has allowed for the development of a novel, multifaceted, methodical protocol where previously none had existed. The protocol is a systematic method to rapidly investigate unknown or questioned single human hair FTIR-ATR spectra from different genders and racial origin, including fibres of different cosmetic treatments. Unknown or questioned spectra are first separated on the basis of chemical treatment i.e. untreated, mildly treated or chemically treated, genders, and racial origin i.e. Asian, Caucasian and African-type. The methodology has the potential to complement the current forensic analysis methods of fibre evidence (i.e. Microscopy and DNA), providing information on the morphological, genetic and structural levels.

Relevance:

60.00% 60.00%

Publisher:

Abstract:

Genetic research of complex diseases is a challenging but exciting area of research. Its early development was limited, however, until the completion of the Human Genome and HapMap projects, along with the reduction in the cost of genotyping, paved the way for understanding the genetic composition of complex diseases. In this thesis, we focus on statistical methods for two aspects of genetic research: phenotype definition for diseases with complex etiology, and methods for identifying potentially associated Single Nucleotide Polymorphisms (SNPs) and SNP-SNP interactions. With regard to phenotype definition for diseases with complex etiology, we first investigate the effects of different statistical phenotyping approaches on the subsequent analysis. In light of the findings, and the difficulties in validating the estimated phenotype, we propose two different methods for reconciling phenotypes of different models, using Bayesian model averaging as a coherent mechanism for accounting for model uncertainty. In the second part of the thesis, the focus turns to methods for identifying associated SNPs and SNP interactions. We review the use of Bayesian logistic regression with variable selection for SNP identification and extend the model to detect interaction effects in population-based case-control studies. In this part of the study, we also develop a machine learning algorithm to cope with large-scale data analysis, namely modified Logic Regression with Genetic Program (MLR-GEP), which is then compared with the Bayesian model, Random Forests and other variants of logic regression.

Relevance:

60.00% 60.00%

Publisher:

Abstract:

Photochemistry has made significant contributions to our understanding of many important natural processes as well as the scientific discoveries of the man-made world. The measurements from such studies are often complex and may require advanced data interpretation with the use of multivariate or chemometrics methods. In general, such methods have been applied successfully for data display, classification, multivariate curve resolution and prediction in analytical chemistry, environmental chemistry, engineering, medical research and industry. However, in photochemistry, by comparison, applications of such multivariate approaches were found to be less frequent although a variety of methods have been used, especially with spectroscopic photochemical applications. The methods include Principal Component Analysis (PCA; data display), Partial Least Squares (PLS; prediction), Artificial Neural Networks (ANN; prediction) and several models for multivariate curve resolution related to Parallel Factor Analysis (PARAFAC; decomposition of complex responses). Applications of such methods are discussed in this overview and typical examples include photodegradation of herbicides, prediction of antibiotics in human fluids (fluorescence spectroscopy), non-destructive in- and on-line monitoring (near infrared spectroscopy) and fast-time resolution of spectroscopic signals from photochemical reactions. It is also quite clear from the literature that the scope of spectroscopic photochemistry was enhanced by the application of chemometrics. To highlight and encourage further applications of chemometrics in photochemistry, several additional chemometrics approaches are discussed using data collected by the authors. The use of a PCA biplot is illustrated with an analysis of a matrix containing data on the performance of photocatalysts developed for water splitting and hydrogen production. 
In addition, the applications of the Multi-Criteria Decision Making (MCDM) ranking methods and Fuzzy Clustering are demonstrated with an analysis of a water quality data matrix. Other example topics include the application of simultaneous kinetic spectroscopic methods for the prediction of pesticides, and the use of a response fingerprinting approach for the classification of medicinal preparations. In general, the overview endeavours to emphasise the advantages of chemometric interpretation of multivariate photochemical data, and an Appendix of references and summaries of common and less usual chemometrics methods noted in this work is provided. Crown Copyright © 2010.

Relevance:

60.00% 60.00%

Publisher:

Abstract:

Detecting anomalies in online social networks is a significant task, as it helps reveal useful and interesting information about user behavior on the network. This paper proposes a rule-based hybrid method using graph theory, fuzzy clustering and fuzzy rules for modeling the user relationships inherent in an online social network and for identifying anomalies. Fuzzy C-Means clustering is used to cluster the data, and a fuzzy inference engine is used to generate rules based on the cluster behavior. The proposed method achieves improved accuracy in identifying anomalies compared with existing methods.
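
Since this abstract names Fuzzy C-Means as its clustering step, a compact version of the standard algorithm may help readers see what it computes. This is a minimal sketch in plain Python and not the paper's implementation. The function name, the defaults (`m=2.0`, `iters=100`, `tol=1e-6`) and the random initialisation are assumptions. The graph-theoretic features and the fuzzy rule generation described in the abstract are not shown.

```python
import random


def fuzzy_c_means(points, c, m=2.0, iters=100, tol=1e-6, seed=0):
    """Standard fuzzy c-means on a list of equal-length numeric tuples.

    Returns (centers, memberships); memberships[i][k] is the degree to
    which point i belongs to cluster k, and every row sums to 1.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Random initial memberships, normalised so each row sums to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])
    centers = []
    for _ in range(iters):
        # Centre update: mean of all points weighted by membership**m.
        centers = []
        for k in range(c):
            w = [u[i][k] ** m for i in range(n)]
            total = sum(w)
            centers.append(tuple(
                sum(w[i] * points[i][d] for i in range(n)) / total
                for d in range(dim)))
        # Membership update from squared distances to the new centres.
        shift = 0.0
        for i in range(n):
            d2 = [sum((points[i][d] - ck[d]) ** 2 for d in range(dim))
                  for ck in centers]
            if min(d2) == 0.0:
                # The point sits exactly on a centre: give it full membership there.
                hits = [1.0 if x == 0.0 else 0.0 for x in d2]
                new = [h / sum(hits) for h in hits]
            else:
                inv = [x ** (-1.0 / (m - 1.0)) for x in d2]
                total = sum(inv)
                new = [v / total for v in inv]
            shift = max(shift, max(abs(a - b) for a, b in zip(new, u[i])))
            u[i] = new
        if shift < tol:
            break  # memberships have stopped changing
    return centers, u
```

A point with no clearly dominant membership lies between clusters, which is the kind of graded signal a downstream fuzzy rule base could use. How the paper actually derives its rules is not specified in the abstract.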

Relevance:

60.00% 60.00%

Publisher:

Abstract:

This work proposes a new family of methods for optimizing multimodal problems. In these techniques, initial solutions are first generated in order to explore the search space. Next, in order to find more than one optimum, these solutions are grouped into subspaces using a fuzzy clustering algorithm. Finally, local searches are performed with deterministic optimization methods within each subspace generated in the previous phase, in order to find the local optimum. The family comprises six variants, combining three schemes for initializing the solutions in the first phase and two local search algorithms in the third. To evaluate this new family of methods, its members are compared with other methodologies on problems from the literature, and the results obtained are promising.
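
The three phases described in this abstract (explore, group by fuzzy clustering, search locally inside each group) can be sketched for a one-dimensional objective. This is a hedged illustration, not the authors' method. The evenly spaced initial samples, the scalar fuzzy c-means, the golden-section local search and all function names are assumptions chosen to keep the example short, and the clustering here uses only the sample positions.

```python
def fcm_1d(xs, centers, m=2.0, iters=50):
    """Plain fuzzy c-means on scalars; returns (centers, memberships)."""
    u = []
    for _ in range(iters):
        u = []
        for x in xs:
            # The small constant avoids division by zero when x equals a centre.
            inv = [((x - c) ** 2 + 1e-12) ** (-1.0 / (m - 1.0)) for c in centers]
            total = sum(inv)
            u.append([v / total for v in inv])
        centers = [
            sum(u[i][k] ** m * xs[i] for i in range(len(xs)))
            / sum(u[i][k] ** m for i in range(len(xs)))
            for k in range(len(centers))
        ]
    return centers, u


def golden_section(f, a, b, tol=1e-8):
    """Deterministic local search for a minimum of f on the interval [a, b]."""
    r = (5 ** 0.5 - 1) / 2
    x1, x2 = b - r * (b - a), a + r * (b - a)
    while b - a > tol:
        if f(x1) < f(x2):
            b, x2 = x2, x1
            x1 = b - r * (b - a)
        else:
            a, x1 = x1, x2
            x2 = a + r * (b - a)
    return (a + b) / 2


def find_minima(f, lo, hi, n_clusters, n_samples=40):
    # Phase 1: explore the search space (evenly spaced samples, an assumption).
    xs = [lo + (hi - lo) * i / (n_samples - 1) for i in range(n_samples)]
    # Phase 2: group the samples into subspaces with fuzzy clustering.
    start = [lo + (hi - lo) * (k + 0.5) / n_clusters for k in range(n_clusters)]
    _, u = fcm_1d(xs, start)
    minima = []
    for k in range(n_clusters):
        # A sample joins the cluster in which its membership is highest.
        members = [x for x, row in zip(xs, u)
                   if max(range(n_clusters), key=row.__getitem__) == k]
        if members:
            # Phase 3: local search restricted to the cluster's extent.
            minima.append(golden_section(f, min(members), max(members)))
    return sorted(minima)
```

Calling `find_minima(lambda x: (x * x - 1) ** 2, -2.0, 2.0, n_clusters=2)` searches the two halves of the interval separately, so both minima of this double-well function are located and not just one.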

Relevance:

60.00% 60.00%

Publisher:

Abstract:

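First, the fuzzy C-means clustering algorithm is used to segment the images into regions in the feature space formed by multiple features, and on this basis the regions are decomposed by a multi-scale wavelet transform. Next, the Cauchy function is used to construct the fuzzy similarity of the regions, and the fuzzy similarity and the regional information content are used to construct weighting factors, from which the wavelet coefficients of the fused image are obtained. Finally, the fused image is obtained by the inverse wavelet transform. Five criteria (root mean square error, peak signal-to-noise ratio, entropy, cross entropy and mutual information) are used to evaluate the performance of the fusion algorithm. Experimental results show that the proposed method has good fusion characteristics.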
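
Three of the five evaluation criteria listed in this abstract have simple standard definitions, sketched below for grayscale images stored as flat lists of integer pixel values. This is an illustrative sketch only. The function names, the flat-list image layout and the default peak value of 255 are assumptions, and cross entropy and mutual information are omitted for brevity.

```python
from math import log2, log10, sqrt


def rmse(reference, fused):
    """Root mean square error between two equally sized pixel lists."""
    n = len(reference)
    return sqrt(sum((a - b) ** 2 for a, b in zip(reference, fused)) / n)


def psnr(reference, fused, peak=255):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    err = rmse(reference, fused)
    return float("inf") if err == 0 else 20 * log10(peak / err)


def entropy(image, levels=256):
    """Shannon entropy (bits per pixel) of the grey-level histogram."""
    counts = [0] * levels
    for value in image:
        counts[value] += 1
    n = len(image)
    return -sum((c / n) * log2(c / n) for c in counts if c)
```

Lower RMSE and higher PSNR indicate that the fused image is closer to a reference image, while higher entropy indicates that the fused image carries more grey-level information.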

Relevance:

60.00% 60.00%

Publisher:

Abstract:

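Generally speaking, outliers are data points that lie far from the other data points, yet they may well contain extremely important information. A new outlier fuzzy kernel clustering algorithm is proposed to discover the outliers in a sample set. The original data space is mapped to a feature space through a Mercer kernel, and a dynamic weight is assigned to each vector in the feature space. Building on the classical FCM fuzzy clustering algorithm, an entirely new clustering objective function in the feature space is obtained. By optimizing this objective function, the weight of each data point is finally obtained, and the outliers in the sample set are identified according to the magnitude of the weights. Simulation results demonstrate the feasibility and effectiveness of the outlier fuzzy kernel clustering algorithm.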
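
The reason a Mercer kernel lets clustering run in the feature space without computing the mapping explicitly is the identity sketched below. It is a standard result and not the paper's objective function, which the abstract does not give. The notation is an assumption introduced here: \Phi is the implicit feature map, K the kernel, x_k a sample, and v_i a cluster prototype, taken in this sketch to live in the input space.

```latex
% Squared feature-space distance expressed through kernel evaluations only:
\| \Phi(x_k) - \Phi(v_i) \|^2
  = K(x_k, x_k) - 2\,K(x_k, v_i) + K(v_i, v_i)

% For a Gaussian kernel K(x, y) = \exp(-\|x - y\|^2 / \sigma^2),
% K(x, x) = 1, so the distance simplifies to:
\| \Phi(x_k) - \Phi(v_i) \|^2 = 2\,\bigl(1 - K(x_k, v_i)\bigr)
```

Substituting this distance for the Euclidean one in an FCM-style objective gives a kernelised variant. The per-vector dynamic weights described in the abstract would enter as additional factors whose exact form is not stated there.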