910 results for clustering algorithm


Relevance: 60.00%

Abstract:

Agricultural pests are responsible for millions of dollars in crop losses and management costs every year. In order to implement optimal site-specific treatments and reduce control costs, new methods to accurately monitor and assess pest damage need to be investigated. In this paper we explore the combination of unmanned aerial vehicles (UAV), remote sensing and machine learning techniques as a promising technology to address this challenge. The deployment of UAVs as a sensor platform is a rapidly growing field of study for biosecurity and precision agriculture applications. In this experiment, a data collection campaign is performed over a sorghum crop severely damaged by white grubs (Coleoptera: Scarabaeidae). The larvae of these scarab beetles feed on the roots of plants, which in turn impairs root exploration of the soil profile. In the field, crop health status could be classified according to three levels: bare soil where plants were decimated, transition zones of reduced plant density and healthy canopy areas. In this study, we describe the UAV platform deployed to collect high-resolution RGB imagery as well as the image processing pipeline implemented to create an orthoimage. An unsupervised machine learning approach is formulated in order to create a meaningful partition of the image into each of the crop levels. The aim of the approach is to simplify the image analysis step by minimizing user input requirements and avoiding the manual data labeling necessary in supervised learning approaches. The implemented algorithm is based on the K-means clustering algorithm. In order to control high-frequency components present in the feature space, a neighbourhood-oriented parameter is introduced by applying Gaussian convolution kernels prior to K-means. The outcome of this approach is a soft K-means algorithm similar to the EM algorithm for Gaussian mixture models. 
The results show that the algorithm delivers decision boundaries that consistently classify the field into three clusters, one for each crop health level. The methodology presented in this paper represents an avenue for further research towards automated crop damage assessments and biosecurity surveillance.
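The smoothing-then-clustering idea described above can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes a single-channel intensity image (a list of rows) rather than RGB, uses a separable Gaussian blur followed by ordinary hard K-means on the smoothed intensities, and all function names and parameters (`sigma`, `radius`, `k`) are hypothetical.

```python
import math

def gaussian_kernel(sigma, radius):
    """Normalised 1-D Gaussian kernel of length 2 * radius + 1."""
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def blur_rows(image, kernel):
    """Convolve every row with the kernel, clamping indices at the borders."""
    radius = len(kernel) // 2
    width = len(image[0])
    blurred = []
    for row in image:
        new_row = []
        for x in range(width):
            acc = 0.0
            for offset, weight in enumerate(kernel):
                xi = min(max(x + offset - radius, 0), width - 1)
                acc += weight * row[xi]
            new_row.append(acc)
        blurred.append(new_row)
    return blurred

def transpose(image):
    return [list(column) for column in zip(*image)]

def gaussian_blur(image, sigma=1.0, radius=2):
    """Separable 2-D Gaussian blur: rows first, then columns."""
    kernel = gaussian_kernel(sigma, radius)
    horizontal = blur_rows(image, kernel)
    return transpose(blur_rows(transpose(horizontal), kernel))

def kmeans_1d(values, k=3, iterations=50):
    """Hard K-means on scalars; centres start evenly spaced (an assumption)."""
    lo, hi = min(values), max(values)
    centers = [lo + (hi - lo) * j / (k - 1) for j in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda j: abs(v - centers[j]))
            groups[nearest].append(v)
        updated = [sum(g) / len(g) if g else centers[j]
                   for j, g in enumerate(groups)]
        if updated == centers:
            break
        centers = updated
    return centers

def segment(image, k=3, sigma=1.0, radius=2):
    """Blur the image, cluster the smoothed intensities, return a label map."""
    smooth = gaussian_blur(image, sigma, radius)
    centers = kmeans_1d([v for row in smooth for v in row], k)
    return [[min(range(k), key=lambda j: abs(v - centers[j])) for v in row]
            for row in smooth]
```

In this sketch the blur pulls an isolated bright pixel inside a dark region back towards its neighbours before clustering, which is one way to read the "neighbourhood-oriented parameter" mentioned in the abstract.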

Relevance: 60.00%

Abstract:

Core Vector Machine (CVM) is suitable for efficient large-scale pattern classification. In this paper, a method is proposed for improving the performance of CVM with a Gaussian kernel function irrespective of the ordering of patterns belonging to different classes within the data set. This method employs selective-sampling-based training of CVM using a novel kernel-based scalable hierarchical clustering algorithm. Empirical studies on synthetic and real-world data sets show that the proposed strategy performs well on large data sets.

Relevance: 60.00%

Abstract:

This paper investigates a new Glowworm Swarm Optimization (GSO) clustering algorithm for hierarchical splitting and merging in automatic multi-spectral satellite image classification (the land cover mapping problem). Amongst the multiple benefits and uses of remote sensing, one of the most important has been its use in solving the problem of land cover mapping. Image classification forms the core of the solution to the land cover mapping problem. No single classifier can classify all the basic land cover classes of an urban region in a satisfactory manner. In unsupervised classification methods, the automatic generation of clusters to classify a huge database is not exploited to its full potential. The proposed methodology searches for the best possible number of clusters and their centers using GSO. Using these clusters, we classify by merging based on a parametric method (the k-means technique). The performance of the proposed unsupervised classification technique is evaluated on a Landsat 7 Thematic Mapper image. Results are evaluated in terms of the classification efficiency: individual, average and overall.

Relevance: 60.00%

Abstract:

Summary form only given. A scheme for code compression is proposed that has a fast decompression algorithm, which can be implemented using simple hardware. The effectiveness of the scheme on the TMS320C62x architecture, including the overheads of a line address table (LAT), is evaluated, and compression rates ranging from 70% to 80% are obtained. Two schemes for decompression are proposed. The basic idea underlying the scheme is a simple clustering algorithm that partially maps a block of instructions into a set of clusters. The clustering algorithm is a greedy algorithm based on the frequency of occurrence of various instructions.

Relevance: 60.00%

Abstract:

This paper describes a new method of color text localization from generic scene images containing text of different scripts and with arbitrary orientations. A representative set of colors is first identified using the edge information to initiate an unsupervised clustering algorithm. Text components are identified from each color layer using a combination of a support vector machine and a neural network classifier trained on a set of low-level features derived from the geometric, boundary, stroke and gradient information. Experiments on camera-captured images that contain variable fonts, size, color, irregular layout, non-uniform illumination and multiple scripts illustrate the robustness of the method. The proposed method yields precision and recall of 0.8 and 0.86 respectively on a database of 100 images. The method is also compared with others in the literature using the ICDAR 2003 robust reading competition dataset.

Relevance: 60.00%

Abstract:

This paper presents a new hierarchical clustering algorithm for crop stage classification using a hyperspectral satellite image. Amongst the multiple benefits and uses of remote sensing, one important application is solving the problem of crop stage classification. Modern commercial imaging satellites, owing to their large volume of satellite imagery, offer greater opportunities for automated image analysis. Hence, we propose an unsupervised algorithm, namely the Hierarchical Artificial Immune System (HAIS), consisting of two steps: splitting the cluster centers and merging them. The high dimensionality of the data is reduced with the help of Principal Component Analysis (PCA). The classification results are compared with those of the K-means and Artificial Immune System algorithms. From the results obtained, we conclude that the proposed hierarchical clustering algorithm is accurate.

Relevance: 60.00%

Abstract:

In this paper, we present a methodology for identifying the best features from a large feature space. In a high-dimensional feature space, nearest neighbor search becomes meaningless, and both its quality and its performance suffer. Many data mining algorithms rely on nearest neighbor search, so instead of searching over all the features we need to select the relevant ones. We propose feature selection using Non-negative Matrix Factorization (NMF) and apply it to nearest neighbor search. A recent clustering algorithm based on Locally Consistent Concept Factorization (LCCF) achieves better document clustering quality by using the local geometrical and discriminating structure of the data. Using our feature selection method, we show a further improvement in clustering performance.

Relevance: 60.00%

Abstract:

This paper primarily intends to develop a GIS (geographical information system)-based data mining approach for optimally selecting the locations and determining installed capacities for setting up distributed biomass power generation systems in the context of decentralized energy planning for rural regions. The optimal locations within a cluster of villages are obtained by matching the installed capacity needed with the demand for power, minimizing the cost of transportation of biomass from dispersed sources to the power generation system, and the cost of distribution of electricity from the power generation system to demand centers or villages. The methodology was validated by using it to develop an optimal plan for implementing distributed biomass-based power systems for meeting the rural electricity needs of Tumkur district in India, consisting of 2700 villages. The approach uses a k-medoid clustering algorithm to divide the total region into clusters of villages and locate biomass power generation systems at the medoids. The optimal value of k is determined iteratively by running the algorithm over the entire search space for different values of k along with demand-supply matching constraints. The optimal value of k is chosen such that it minimizes the total cost of system installation, transportation of biomass, and transmission and distribution. A smaller region, consisting of 293 villages, was selected to study the sensitivity of the results to varying demand and supply parameters. The results of clustering are represented on a GIS map for the region.
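The selection of k by total cost can be sketched as follows. This is a simplified illustration, not the paper's model: villages are assumed to be 2-D points, transport cost is assumed proportional to Euclidean distance to the nearest medoid, a fixed `plant_cost` per installed system stands in for installation cost, and the demand-supply matching constraints are omitted. The k-medoids routine is a basic alternating heuristic with a farthest-first start, and every name here is hypothetical.

```python
import math

def total_distance(points, medoids):
    """Sum over all points of the distance to the nearest medoid."""
    return sum(min(math.dist(p, points[m]) for m in medoids) for p in points)

def k_medoids(points, k, iterations=100):
    """Alternating k-medoids heuristic; returns sorted medoid indices."""
    # Farthest-first initialisation keeps the sketch deterministic.
    medoids = [0]
    while len(medoids) < k:
        farthest = max(
            range(len(points)),
            key=lambda i: min(math.dist(points[i], points[m]) for m in medoids),
        )
        medoids.append(farthest)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest medoid.
        clusters = {m: [] for m in medoids}
        for i, p in enumerate(points):
            nearest = min(medoids, key=lambda m: math.dist(p, points[m]))
            clusters[nearest].append(i)
        # Update step: the new medoid minimises distance to its cluster members.
        updated = []
        for m, members in clusters.items():
            if not members:
                updated.append(m)
                continue
            updated.append(min(
                members,
                key=lambda c: sum(math.dist(points[c], points[j]) for j in members),
            ))
        if sorted(updated) == sorted(medoids):
            break
        medoids = updated
    return sorted(medoids)

def choose_k(points, k_max, plant_cost, transport_cost=1.0):
    """Try k = 1..k_max and keep the k with the lowest total cost."""
    best = None
    for k in range(1, k_max + 1):
        medoids = k_medoids(points, k)
        cost = plant_cost * k + transport_cost * total_distance(points, medoids)
        if best is None or cost < best[0]:
            best = (cost, k, medoids)
    return best  # (total cost, chosen k, medoid indices)
```

The trade-off is visible in the cost expression: each additional plant adds a fixed cost but shortens haul distances, so the minimum sits where further splitting no longer pays for itself.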

Relevance: 60.00%

Abstract:

The presence of a large number of spectral bands in hyperspectral images increases the capability to distinguish between various physical structures. However, such images suffer from the high dimensionality of the data. Hence, the processing of hyperspectral images is carried out in two stages: dimensionality reduction and unsupervised classification. The high dimensionality of the data is reduced with the help of Principal Component Analysis (PCA). The selected dimensions are classified using the Niche Hierarchical Artificial Immune System (NHAIS). The NHAIS combines a splitting method, which searches for the optimal cluster centers using a niching procedure, with a merging method, which groups the data points based on majority voting. Results are presented for two hyperspectral images, namely the EO-1 Hyperion image and the Indian Pines image. A performance comparison of the proposed hierarchical clustering algorithm with three earlier unsupervised algorithms is presented. From the results obtained, we deduce that the NHAIS is efficient.

Relevance: 60.00%

Abstract:

In this paper, we present a novel algorithm for piecewise linear regression which can learn continuous as well as discontinuous piecewise linear functions. The main idea is to repeatedly partition the data and learn a linear model in each partition. The proposed algorithm is similar in spirit to the k-means clustering algorithm. We show that our algorithm can also be viewed as a special case of an EM algorithm for maximum likelihood estimation under a reasonable probability model. We empirically demonstrate the effectiveness of our approach by comparing its performance with that of state-of-the-art algorithms on various datasets. (C) 2014 Elsevier Inc. All rights reserved.
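A k-means-like loop for piecewise linear regression might look like the sketch below. It is only one plausible reading of "repeatedly partition the data and learn a linear model in each partition", not the paper's algorithm: it assumes scalar inputs, assigns each point to the line with the smallest absolute residual, refits each line by ordinary least squares, and starts from equal-sized contiguous chunks. All names are hypothetical.

```python
def fit_line(points, fallback):
    """Least-squares (slope, intercept); keep the old model if underdetermined."""
    n = len(points)
    if n < 2:
        return fallback
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0.0:
        return fallback
    slope = sum((x - mean_x) * (y - mean_y) for x, y in points) / sxx
    return slope, mean_y - slope * mean_x

def piecewise_linear_fit(points, k=2, iterations=50):
    """Alternate between refitting k lines and reassigning points to lines."""
    ordered = sorted(points)
    n = len(ordered)
    # Initial partition: k contiguous chunks along the x axis (an assumption).
    labels = [min(i * k // n, k - 1) for i in range(n)]
    models = [(0.0, 0.0)] * k
    for _ in range(iterations):
        # "Centroid" step: one least-squares line per partition.
        models = [
            fit_line([p for p, label in zip(ordered, labels) if label == j],
                     models[j])
            for j in range(k)
        ]
        # Assignment step: each point joins the line that predicts it best.
        updated = [
            min(range(k),
                key=lambda j: abs(y - (models[j][0] * x + models[j][1])))
            for x, y in ordered
        ]
        if updated == labels:
            break
        labels = updated
    return models, list(zip(ordered, labels))
```

As with k-means, the result depends on the initial partition, and the loop stops once no point changes its line.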

Relevance: 60.00%

Abstract:

Structural information over the entire course of binding interactions, based on the analysis of energy landscapes, is described; it provides a framework for understanding the events involved in biomolecular recognition. The conformational dynamics underlying malectin's exquisite selectivity for diglucosylated N-glycan (Dig-N-glycan), a highly flexible oligosaccharide comprising numerous dihedral torsion angles, are described as an example. A novel approach based on hierarchical sampling is proposed for acquiring the metastable molecular conformations that constitute low-energy minima, in order to understand the structural features involved in biological recognition. To this end, four variants of principal component analysis were employed recursively in both Cartesian space and dihedral angle space, each characterized by free energy landscapes, to select the most stable conformational substates. Subsequently, a k-means clustering algorithm was implemented for geometric separation of the major native state to acquire a final ensemble of metastable conformers. A comparison of malectin complexes was then performed to characterize their conformational properties. Analyses of stereochemical metrics and other concerted binding events revealed the surface complementarity, cooperative and bidentate hydrogen bonds, water-mediated hydrogen bonds, and carbohydrate-aromatic interactions, including CH-pi and stacking interactions, involved in this recognition. Additionally, a striking structural transition from loop to beta-strands in the malectin CRD upon specific binding to Dig-N-glycan is observed. The interplay of the above-mentioned binding events in malectin and Dig-N-glycan supports an extended conformational selection model as the underlying binding mechanism.

Relevance: 60.00%

Abstract:

Cells in the lateral intraparietal cortex (LIP) of rhesus macaques respond vigorously and in spatially-tuned fashion to briefly memorized visual stimuli. Responses to stimulus presentation, memory maintenance, and task completion are seen, in varying combination from neuron to neuron. To help elucidate this functional segmentation a new system for simultaneous recording from multiple neighboring neurons was developed. The two parts of this dissertation discuss the technical achievements and scientific discoveries, respectively.

Technology. Simultaneous recordings from multiple neighboring neurons were made with four-wire bundle electrodes, or tetrodes, which were adapted to the awake behaving primate preparation. Signals from these electrodes were partitionable into a background process with a 1/f-like spectrum and foreground spiking activity spanning 300-6000 Hz. Continuous voltage recordings were sorted into spike trains using a state-of-the-art clustering algorithm, producing a mean of 3 cells per site. The algorithm classified 96% of spikes correctly when tetrode recordings were confirmed with simultaneous intracellular signals. Recording locations were verified with a new technique that creates electrolytic lesions visible in magnetic resonance imaging, eliminating the need for histological processing. In anticipation of future multi-tetrode work, the chronic chamber microdrive, a device for long-term tetrode delivery, was developed.

Science. Simultaneously recorded neighboring LIP neurons were found to have similar preferred targets in the memory saccade paradigm, but dissimilar peristimulus time histograms (PSTH). A majority of neighboring cell pairs had a difference in preferred directions of under 45°, while the trial time of maximal response showed a broader distribution, suggesting homogeneity of tuning with heterogeneity of function. A continuum of response characteristics was present, rather than a set of specific response types; however, a mapping experiment suggests this may be because a given cell's PSTH changes shape as well as amplitude through the response field. Spike train autocovariance was tuned over target and changed through trial epoch, suggesting different mechanisms during memory versus background periods. Mean frequency-domain spike-to-spike coherence was concentrated below 50 Hz with a significant maximum of 0.08; mean time-domain coherence had a narrow peak in the range ±10 ms with a significant maximum of 0.03. Time-domain coherence was found to be untuned for short lags (10 ms), but significantly tuned at larger lags (50 ms).

Relevance: 60.00%

Abstract:

Wide field-of-view (FOV) microscopy is of high importance to biological research and clinical diagnosis where a high-throughput screening of samples is needed. This thesis presents the development of several novel wide FOV imaging technologies and demonstrates their capabilities in longitudinal imaging of living organisms, on the scale of viral plaques to live cells and tissues.

The ePetri Dish is a wide FOV on-chip bright-field microscope. Here we applied an ePetri platform for plaque analysis of murine norovirus 1 (MNV-1). The ePetri offers the ability to dynamically track plaques at the individual cell death event level over a wide FOV of 6 mm × 4 mm at 30 min intervals. A density-based clustering algorithm is used to analyze the spatial-temporal distribution of cell death events to identify plaques at their earliest stages. We also demonstrate the capabilities of the ePetri in viral titer count and dynamically monitoring plaque formation, growth, and the influence of antiviral drugs.
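To illustrate how a density-based method can group cell death events into plaques, here is a minimal DBSCAN-style sketch. The abstract does not say which density-based algorithm was used, so DBSCAN is an assumption, as are the event representation (tuples of x, y and time already scaled to comparable units) and the parameter names `eps` and `min_pts`.

```python
import math

NOISE = -1

def dbscan(events, eps, min_pts):
    """Label each event with a cluster index, or NOISE for isolated events."""
    labels = [None] * len(events)

    def neighbours(i):
        # All events within eps of event i, including i itself.
        return [j for j in range(len(events))
                if math.dist(events[i], events[j]) <= eps]

    cluster = 0
    for i in range(len(events)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may be relabelled later as a border point
            continue
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins but does not expand
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbours(j)
            if len(reachable) >= min_pts:
                queue.extend(reachable)  # core point: keep growing the cluster
        cluster += 1
    return labels
```

Under these assumptions, a tight group of deaths close in both space and time forms one cluster (a candidate plaque), while scattered background deaths are left as noise.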

We developed another wide FOV imaging technique, the Talbot microscope, for the fluorescence imaging of live cells. The Talbot microscope takes advantage of the Talbot effect and can generate a focal spot array to scan the fluorescence samples directly on-chip. It has a resolution of 1.2 μm and a FOV of ~13 mm². We further upgraded the Talbot microscope for the long-term time-lapse fluorescence imaging of live cell cultures, and analyzed the cells’ dynamic response to an anticancer drug.

We present two wide FOV endoscopes for tissue imaging, named the AnCam and the PanCam. The AnCam is based on the contact image sensor (CIS) technology, and can scan the whole anal canal within 10 seconds with a resolution of 89 μm, a maximum FOV of 100 mm × 120 mm, and a depth-of-field (DOF) of 0.65 mm. We also demonstrate the performance of the AnCam in whole anal canal imaging in both animal models and real patients. In addition to this, the PanCam is based on a smartphone platform integrated with a panoramic annular lens (PAL), and can capture a FOV of 18 mm × 120 mm in a single shot with a resolution of 100–140 μm. In this work we demonstrate the PanCam’s performance in imaging a stained tissue sample.

Relevance: 60.00%

Abstract:

In this work, a new family of methods is proposed for the optimization of multimodal problems. In these techniques, initial solutions are first generated in order to explore the search space. Next, in order to find more than one optimum, these solutions are grouped into subspaces using a fuzzy clustering algorithm. Finally, local searches are carried out with deterministic optimization methods within each subspace generated in the previous phase, in order to find the local optimum. The family comprises six variants, combining three schemes for initializing the solutions in the first phase and two local search algorithms in the third. To evaluate this new family of methods, its members are compared with other methodologies on problems from the literature, and the results achieved are promising.
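The three phases (sample, cluster, search locally) can be sketched in one dimension. This is an illustration under stated assumptions, not one of the six variants from the work: samples are evenly spaced, the fuzzy clustering is a plain fuzzy c-means with fuzzifier `m`, each subspace is taken to be the interval spanned by the samples whose highest membership falls in that cluster, and the deterministic local search is a golden-section search. All names are hypothetical.

```python
import math

def fuzzy_c_means(xs, c, m=2.0, iterations=100):
    """1-D fuzzy c-means; returns (centers, membership rows)."""
    lo, hi = min(xs), max(xs)
    centers = [lo + (hi - lo) * (j + 0.5) / c for j in range(c)]
    memberships = []
    for _ in range(iterations):
        memberships = []
        for x in xs:
            dists = [abs(x - v) for v in centers]
            if 0.0 in dists:
                # A sample sitting on a centre belongs fully to that centre.
                row = [1.0 if d == 0.0 else 0.0 for d in dists]
                total = sum(row)
                row = [r / total for r in row]
            else:
                inv = [d ** (-2.0 / (m - 1.0)) for d in dists]
                total = sum(inv)
                row = [w / total for w in inv]
            memberships.append(row)
        centers = [
            sum((u[j] ** m) * x for u, x in zip(memberships, xs))
            / sum(u[j] ** m for u in memberships)
            for j in range(c)
        ]
    return centers, memberships

def golden_section(f, lo, hi, tol=1e-6):
    """Deterministic minimiser for a function that is unimodal on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        left = b - inv_phi * (b - a)
        right = a + inv_phi * (b - a)
        if f(left) < f(right):
            b = right
        else:
            a = left
    return (a + b) / 2.0

def multimodal_minima(f, lo, hi, c, samples=20):
    """Phase 1: sample. Phase 2: fuzzy clustering. Phase 3: local search."""
    xs = [lo + (hi - lo) * i / (samples - 1) for i in range(samples)]
    _, memberships = fuzzy_c_means(xs, c)
    minima = []
    for j in range(c):
        members = [x for x, u in zip(xs, memberships) if u.index(max(u)) == j]
        if members:
            minima.append(golden_section(f, min(members), max(members)))
    return sorted(minima)
```

For example, `multimodal_minima(lambda x: (x * x - 1.0) ** 2, -2.0, 2.0, c=2)` searches the two halves of the interval separately and returns one candidate minimum from each.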

Relevance: 60.00%

Abstract:

A new method of finding the optimal group membership and number of groupings to partition population genetic distance data is presented. The software program Partitioning Optimization with Restricted Growth Strings (PORGS) visits all possible set partitions and deems acceptable partitions to be those that reduce mean intracluster distance. The optimal number of groups is determined with the gap statistic, which compares PORGS results with a reference distribution. The PORGS method was validated on a simulated data set with a known distribution. For efficiency, where values of n were larger, restricted growth strings (RGS) were used to bipartition populations during a nested search (bi-PORGS). Bi-PORGS was applied to a set of genetic data from 18 Chinook salmon (Oncorhynchus tshawytscha) populations from the west coast of Vancouver Island. The optimal grouping of these populations corresponded to four geographic locations: 1) Quatsino Sound, 2) Nootka Sound, 3) Clayoquot + Barkley sounds, and 4) southwest Vancouver Island. However, assignment of populations to groups did not strictly reflect the geographical divisions; fish of Barkley Sound origin that had strayed into the Gold River and close genetic similarity between transferred and donor populations meant groupings crossed geographic boundaries. Overall, stock structure determined by this partitioning method was similar to that determined by the unweighted pair-group method with arithmetic averages (UPGMA), an agglomerative clustering algorithm.
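Restricted growth strings give a duplicate-free way to visit every set partition: item 0 always gets label 0, and each later item may reuse an existing label or open the next new one. The sketch below enumerates them and picks, for a fixed number of groups, the partition with the lowest mean intracluster distance. It is not the PORGS program: the scoring function (mean over all within-group pairs) is an assumption, the gap statistic step is omitted, and exhaustive enumeration is only practical for small n.

```python
from itertools import combinations

def restricted_growth_strings(n):
    """Yield every RGS of length n; each one encodes a distinct set partition."""
    def extend(prefix, max_label):
        if len(prefix) == n:
            yield tuple(prefix)
            return
        # Reuse any existing label, or open exactly one new label.
        for label in range(max_label + 2):
            yield from extend(prefix + [label], max(max_label, label))

    if n == 0:
        yield ()
        return
    yield from extend([0], 0)

def mean_intracluster_distance(labels, dist):
    """Average distance over all pairs of items that share a group."""
    pairs = [dist[i][j]
             for i, j in combinations(range(len(labels)), 2)
             if labels[i] == labels[j]]
    return sum(pairs) / len(pairs) if pairs else 0.0

def best_partition(dist, k):
    """Exhaustively search the partitions of the items into exactly k groups."""
    candidates = [s for s in restricted_growth_strings(len(dist))
                  if max(s) + 1 == k]
    best = min(candidates, key=lambda s: mean_intracluster_distance(s, dist))
    return best, mean_intracluster_distance(best, dist)
```

Because this score drops to zero when every item is its own group, it cannot choose the number of groups by itself, which is presumably why the method pairs it with the gap statistic.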