938 results for Image processing, Microscopy, Histopathology, Classification, K-means


Relevance: 100.00%

Abstract:

Most distribution utilities do not have accurate information on the constituents of their loads. This information is very useful for managing and planning the network adequately and economically. Customer loads are normally categorized into three main sectors: 1) residential; 2) industrial; and 3) commercial. In this paper, penalized least-squares regression and Euclidean distance methods are developed to identify and quantify the makeup of a feeder load with unknown sectors/subsectors. This process is carried out on a monthly basis to account for seasonal and other load changes. The error between the actual and estimated load profiles is used as a benchmark of accuracy. This approach has been shown to be accurate in identifying customer types in unknown load profiles, and is used in cross-validation of the results and initial assumptions.
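
As a rough illustration of the idea in this abstract, the sketch below estimates how much each sector contributes to a feeder load by ridge-penalized least squares, then reports the Euclidean distance between the actual and reconstructed profiles. The ridge penalty, the clipping of negative weights, the function name `estimate_sector_mix` and the six-point profiles are all assumptions made for illustration; the abstract does not specify the paper's exact formulation.

```python
import numpy as np

def estimate_sector_mix(profiles, feeder_load, penalty=1e-3):
    """Minimize ||P w - y||^2 + penalty * ||w||^2 (ridge form, an assumption)."""
    P = np.asarray(profiles, dtype=float)      # shape (time steps, sectors)
    y = np.asarray(feeder_load, dtype=float)   # shape (time steps,)
    n_sectors = P.shape[1]
    # Closed-form ridge solution via the regularized normal equations.
    w = np.linalg.solve(P.T @ P + penalty * np.eye(n_sectors), P.T @ y)
    # Negative sector contributions are not physical, so clip them to zero.
    w = np.clip(w, 0.0, None)
    # Euclidean distance between the actual and the estimated load profile.
    residual = float(np.linalg.norm(P @ w - y))
    return w, residual

# Hypothetical normalized sector profiles (made-up numbers, not real data).
residential = [0.3, 0.2, 0.5, 0.6, 0.9, 1.0]
industrial = [0.4, 0.9, 1.0, 1.0, 0.6, 0.3]
commercial = [0.1, 0.5, 0.9, 1.0, 0.7, 0.2]
profiles = np.column_stack([residential, industrial, commercial])

true_mix = np.array([2.0, 1.0, 0.5])   # synthetic ground truth
feeder = profiles @ true_mix           # synthetic feeder load

weights, err = estimate_sector_mix(profiles, feeder, penalty=1e-8)
print(weights, err)
```

Repeating the fit on each month's feeder profile would give the monthly breakdown the abstract describes.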

Relevance: 100.00%

Abstract:

Data structures such as k-D trees and hierarchical k-means trees perform very well in approximate k nearest neighbour matching, but are only marginally more effective than linear search when performing exact matching in high-dimensional image descriptor data. This paper presents several improvements to linear search that allow it to outperform existing methods, and recommends two approaches to exact matching. The first method reduces the number of operations by evaluating the distance measure in order of significance of the query dimensions and terminating when the partial distance exceeds the search threshold. This method does not require preprocessing and significantly outperforms existing methods. The second method improves query speed further by presorting the data using a data structure called d-D sort. The order information is used as a priority queue to reduce the time taken to find the exact match and to restrict the range of data searched. The d-D sort structure is very simple to construct, does not require any parameter tuning, and takes significantly less time to build than the best-performing tree structure; data can also be added to the structure relatively efficiently.
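
The first method in this abstract can be sketched as a linear search that accumulates the squared distance one dimension at a time and abandons a candidate as soon as the partial sum reaches the current threshold. This is a minimal sketch, not the paper's implementation: ranking dimensions by the absolute value of the query component is an assumed stand-in for "order of significance", and the function name is hypothetical.

```python
import numpy as np

def partial_distance_search(data, query, threshold=np.inf):
    """Exact nearest neighbour by linear search with early termination."""
    data = np.asarray(data, dtype=float)
    query = np.asarray(query, dtype=float)
    # Assumption: visit dimensions from largest to smallest |query| value.
    order = np.argsort(-np.abs(query))
    best_index, best_sq = -1, float(threshold) ** 2
    for i, row in enumerate(data):
        partial = 0.0
        for d in order:
            diff = row[d] - query[d]
            partial += diff * diff
            if partial >= best_sq:
                break                  # cannot beat the current best; stop early
        else:
            # Every dimension was summed without reaching the threshold.
            best_index, best_sq = i, partial
    return best_index, float(np.sqrt(best_sq))

rng = np.random.default_rng(0)
data = rng.normal(size=(200, 16))
query = rng.normal(size=16)
idx, dist = partial_distance_search(data, query)
print(idx, dist)
```

The search threshold shrinks every time a closer match is found, so later candidates are usually rejected after only a few dimensions. If no point lies within the initial `threshold`, the function returns index -1.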

Relevance: 100.00%

Abstract:

Crashes on motorways contribute to a significant proportion (40-50%) of non-recurrent motorway congestion. Hence, reducing crashes will help address congestion issues (Meyer, 2008). Crash likelihood estimation studies commonly focus on traffic conditions in a short time window around the time of the crash, while longer-term pre-crash traffic flow trends are neglected. In this paper we will show, through data mining techniques, that a relationship between pre-crash traffic flow patterns and crash occurrence on motorways exists, and that this knowledge has the potential to improve the accuracy of existing models and opens the path for new development approaches. The data for the analysis were extracted from records collected between 2007 and 2009 on the Shibuya and Shinjuku lines of the Tokyo Metropolitan Expressway in Japan. The dataset includes a total of 824 rear-end and sideswipe crashes that have been matched with traffic flow data for the hour prior to the crash using an incident detection algorithm. Traffic flow trends (traffic speed/occupancy time series) revealed that crashes could be clustered with regard to the dominant traffic flow pattern prior to the crash. Using the k-means clustering method allowed the crashes to be clustered based on their flow trends rather than their distance. Four major trends have been found in the clustering results. Based on these findings, crash likelihood estimation algorithms can be fine-tuned based on the monitored traffic flow conditions with a sliding window of 60 minutes to increase the accuracy of the results and minimize false alarms.

Relevance: 100.00%

Abstract:

Crashes that occur on motorways contribute to a significant proportion (40-50%) of non-recurrent motorway congestion. Hence, reducing the frequency of crashes assists in addressing congestion issues (Meyer, 2008). Crash likelihood estimation studies commonly focus on traffic conditions in a short time window around the time of a crash, while longer-term pre-crash traffic flow trends are neglected. In this paper we will show, through data mining techniques, that a relationship between pre-crash traffic flow patterns and crash occurrence on motorways exists. We will compare these patterns with normal traffic trends and show that this knowledge has the potential to improve the accuracy of existing models and opens the path for new development approaches. The data for the analysis were extracted from records collected between 2007 and 2009 on the Shibuya and Shinjuku lines of the Tokyo Metropolitan Expressway in Japan. The dataset includes a total of 824 rear-end and sideswipe crashes that have been matched with their corresponding traffic flow data using an incident detection algorithm. Traffic trends (traffic speed time series) revealed that crashes can be clustered with regard to the dominant traffic patterns prior to the crash. The K-Means clustering method with a Euclidean distance function was used to cluster the crashes. Then, normal-situation data were extracted based on the time distribution of crashes and clustered for comparison with the "high risk" clusters. Five major trends have been found in the clustering results for both high-risk and normal conditions. The study found that the traffic regimes differed in their speed trends. Based on these findings, crash likelihood estimation models can be fine-tuned based on the monitored traffic conditions with a sliding window of 30 minutes to increase the accuracy of the results and minimize false alarms.

Relevance: 100.00%

Abstract:

Crashes that occur on motorways contribute to a significant proportion (40-50%) of non-recurrent motorway congestion. Hence, reducing the frequency of crashes assists in addressing congestion issues (Meyer, 2008). Analysing traffic conditions and discovering risky traffic trends and patterns are essential to crash likelihood estimation studies and still require more attention and investigation. In this paper we will show, through data mining techniques, that there is a relationship between pre-crash traffic flow patterns and crash occurrence on motorways, compare these patterns with normal traffic trends, and show that this knowledge has the potential to improve the accuracy of existing crash likelihood estimation models and opens the path for new development approaches. The data for the analysis were extracted from records collected between 2007 and 2009 on the Shibuya and Shinjuku lines of the Tokyo Metropolitan Expressway in Japan. The dataset includes a total of 824 rear-end and sideswipe crashes that have been matched with their corresponding traffic flow data using an incident detection algorithm. Traffic trends (traffic speed time series) revealed that crashes can be clustered with regard to the dominant traffic patterns prior to the crash occurrence. The K-Means clustering algorithm was applied to determine dominant pre-crash traffic patterns. In the first phase of this research, traffic regimes were identified by analysing crashes and normal traffic situations using half an hour of speed data at locations upstream of the crashes. Then, the second phase investigated different combinations of speed risk indicators to distinguish crashes from normal traffic situations more precisely. Five major trends have been found in the first phase of this paper for both high-risk and normal conditions. The study found that the traffic regimes differed in their speed trends. Moreover, the second phase shows that the spatiotemporal difference of speed is a better risk indicator than the other combinations of speed-related risk indicators. Based on these findings, crash likelihood estimation models can be fine-tuned to increase the accuracy of estimations and minimize false alarms.

Relevance: 100.00%

Abstract:

This paper presents an investigation into event detection in crowded scenes, where the event of interest co-occurs with other activities and only binary labels at the clip level are available. The proposed approach incorporates a fast feature descriptor from the MPEG domain, and a novel multiple instance learning (MIL) algorithm using sparse approximation and random sensing. MPEG motion vectors are used to build particle trajectories that represent the motion of objects in uniform video clips, and the MPEG DCT coefficients are used to compute a foreground map to remove background particles. Trajectories are transformed into the Fourier domain, and the Fourier representations are quantized into visual words using the K-Means algorithm. The proposed MIL algorithm models the scene as a linear combination of independent events, where each event is a distribution of visual words. Experimental results show that the proposed approaches achieve promising results for event detection compared to the state-of-the-art.

Relevance: 100.00%

Abstract:

The K-means algorithm is one of the most popular techniques in clustering. Nevertheless, the performance of the K-means algorithm depends highly on initial cluster centers and converges to local minima. This paper proposes a hybrid evolutionary programming based clustering algorithm, called PSO-SA, by combining particle swarm optimization (PSO) and simulated annealing (SA). The basic idea is to search around the global solution by SA and to increase the information exchange among particles using a mutation operator to escape local optima. Three datasets, Iris, Wisconsin Breast Cancer, and Ripley’s Glass, have been considered to show the effectiveness of the proposed clustering algorithm in providing optimal clusters. The simulation results show that the PSO-SA clustering algorithm not only has a better response but also converges more quickly than the K-means, PSO, and SA algorithms.
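
The dependence on initial cluster centers mentioned in this abstract can be reproduced with a few lines of plain Lloyd-style K-means. The sketch below is illustrative only: the four-point dataset and the function name `lloyd_kmeans` are made up for this example, and it does not implement the PSO-SA hybrid the paper proposes.

```python
import numpy as np

def lloyd_kmeans(points, initial_centers, max_iter=100):
    """Plain Lloyd iteration; returns centers, labels and the sum of squared errors."""
    points = np.asarray(points, dtype=float)
    centers = np.asarray(initial_centers, dtype=float).copy()
    for _ in range(max_iter):
        # Squared distance from every point to every center.
        sq_dist = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = sq_dist.argmin(axis=1)
        new_centers = centers.copy()
        for k in range(len(centers)):
            members = points[labels == k]
            if len(members) > 0:       # keep the old center if a cluster empties
                new_centers[k] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    sq_dist = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    labels = sq_dist.argmin(axis=1)
    sse = float(sq_dist.min(axis=1).sum())
    return centers, labels, sse

# Four corners of a wide rectangle: the natural split is left versus right.
points = np.array([[0.0, 0.0], [0.0, 1.0], [4.0, 0.0], [4.0, 1.0]])

# Seeds taken from opposite sides converge to the left/right split.
_, good_labels, good_sse = lloyd_kmeans(points, points[[0, 2]])
# Seeds taken from the same side get stuck in a top/bottom split.
_, bad_labels, bad_sse = lloyd_kmeans(points, points[[0, 1]])

print(good_sse)  # 1.0
print(bad_sse)   # 16.0
```

Both runs have converged in the sense that another iteration changes nothing, yet the second solution has a sum of squared errors sixteen times larger. This is the local-minimum behaviour that global search heuristics such as PSO and SA aim to escape.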

Relevance: 100.00%

Abstract:

Long-term measurements of particle number size distribution (PNSD) produce a very large number of observations, and their analysis requires an efficient approach in order to produce results in the least possible time and with maximum accuracy. Clustering techniques are a family of sophisticated methods which have recently been employed to analyse PNSD data; however, very little information is available comparing the performance of different clustering techniques on PNSD data. This study aims to apply several clustering techniques (i.e. K-means, PAM, CLARA and SOM) to PNSD data, in order to identify and apply the optimum technique to PNSD data measured at 25 sites across Brisbane, Australia. A new method, based on the Generalised Additive Model (GAM) with a basis of penalised B-splines, was proposed to parameterise the PNSD data, and the temporal weight of each cluster was also estimated using the GAM. In addition, each cluster was associated with its possible source based on the results of this parameterisation, together with the characteristics of each cluster. The performances of the four clustering techniques were compared using the Dunn index and Silhouette width validation values, and the K-means technique was found to have the highest performance, with five clusters being the optimum. Therefore, five clusters were found within the data using the K-means technique. The diurnal occurrence of each cluster was used together with other air quality parameters, temporal trends and the physical properties of each cluster, in order to attribute each cluster to its source and origin. The five clusters were attributed to three major sources and origins, including regional background particles, photochemically induced nucleated particles and vehicle generated particles. Overall, clustering was found to be an effective technique for attributing each particle size spectrum to its source, and the GAM was suitable for parameterising the PNSD data. These two techniques can help researchers immensely in analysing PNSD data for characterisation and source apportionment purposes.
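
One of the two validation measures named in this abstract, the Silhouette width, can be computed directly from pairwise distances. The sketch below is a generic implementation with toy data, assuming Euclidean distance and at least two clusters; the function name and the points are made up for illustration and are not PNSD measurements.

```python
import numpy as np

def mean_silhouette_width(points, labels):
    """Average silhouette width over all points (Euclidean distance)."""
    points = np.asarray(points, dtype=float)
    labels = np.asarray(labels)
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    clusters = np.unique(labels)
    widths = np.zeros(len(points))
    for i in range(len(points)):
        same = labels == labels[i]
        n_same = int(same.sum())
        if n_same == 1:
            continue                   # a singleton cluster scores 0 by convention
        # Mean distance to the other members of the point's own cluster
        # (dist[i, i] is 0, so dividing by n_same - 1 excludes the point itself).
        a = dist[i, same].sum() / (n_same - 1)
        # Smallest mean distance to the members of any other cluster.
        b = min(dist[i, labels == c].mean() for c in clusters if c != labels[i])
        widths[i] = (b - a) / max(a, b)
    return float(widths.mean())

# Toy data: two tight groups that are far apart.
points = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 0.0], [10.0, 1.0]])
good = mean_silhouette_width(points, [0, 0, 1, 1])   # matches the groups
bad = mean_silhouette_width(points, [0, 1, 0, 1])    # cuts across the groups
print(good, bad)
```

Values close to 1 indicate compact, well separated clusters, while negative values indicate points that sit closer to another cluster than to their own. Scoring the partitions produced by each technique and each candidate number of clusters in this way is one route to a comparison like the one described above.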

Relevance: 100.00%

Abstract:

This paper presents visual detection and classification of light vehicles and personnel on a mine site. We capitalise on the rapid advances of ConvNet based object recognition but highlight that a naive black box approach results in a significant number of false positives. In particular, the lack of domain specific training data and the unique landscape of a mine site cause a high rate of errors. We exploit the abundance of background-only images to train a k-means classifier to complement the ConvNet. Furthermore, localisation of objects of interest and a reduction in computation are enabled through region proposals. Our system is tested on over 10 km of real mine site data and we were able to detect both light vehicles and personnel. We show that the introduction of our background model can reduce the false positive rate by an order of magnitude.

Relevance: 100.00%

Abstract:

Frog protection has become increasingly essential due to the rapid decline of frog biodiversity. Therefore, it is valuable to develop new methods for studying this biodiversity. In this paper, a novel feature extraction method is proposed based on perceptual wavelet packet decomposition for classifying frog calls in noisy environments. Pre-processing and syllable segmentation are first applied to the frog call. Then, a spectral peak track is extracted from each syllable if possible. Track duration, dominant frequency and oscillation rate are directly extracted from the track. With the k-means clustering algorithm, the calculated dominant frequencies of all frog species are clustered into k parts, which produces a frequency scale for wavelet packet decomposition. Based on the adaptive frequency scale, wavelet packet decomposition is applied to the frog calls. Using the wavelet packet decomposition coefficients, a new feature set named perceptual wavelet packet decomposition sub-band cepstral coefficients is extracted. Finally, a k-nearest neighbour (k-NN) classifier is used for the classification. The experimental results show that the proposed features can achieve an average classification accuracy of 97.45%, which outperforms syllable features (86.87%) and the Mel-frequency cepstral coefficients (MFCCs) feature (90.80%).

Relevance: 100.00%

Abstract:

Road traffic emissions are often considered the main source of ultrafine particles (UFP, diameter smaller than 100 nm) in urban environments. However, recent studies worldwide have shown that, in high-insolation urban regions at least, new particle formation events can also contribute to UFP. In order to quantify such events we systematically studied three cities located in predominantly sunny environments: Barcelona (Spain), Madrid (Spain) and Brisbane (Australia). Three long term datasets (1-2 years) of fine and ultrafine particle number size distributions (measured by SMPS, Scanning Mobility Particle Sizer) were analysed. Compared to total particle number concentrations, aerosol size distributions offer far more information on the type, origin and atmospheric evolution of the particles. By applying k-Means clustering analysis, we categorized the collected aerosol size distributions into three main categories: “Traffic” (prevailing 44-63% of the time), “Nucleation” (14-19%) and “Background pollution and Specific cases” (7-22%). Measurements from Rome (Italy) and Los Angeles (California) were also included to complement the study. The daily variation of the average UFP concentrations for a typical nucleation day at each site revealed a similar pattern for all cities, with three distinct particle bursts. A morning and an evening spike reflected traffic rush hours, whereas a third one at midday showed nucleation events. The burst of photochemically nucleated particles lasted 1-4 hours, with particles reaching sizes of 30-40 nm. On average, the occurrence of particle size spectra dominated by nucleation events was 16% of the time, showing the importance of this process as a source of UFP in urban environments exposed to high solar radiation. On average, nucleation events lasting for 2 hours or more occurred on 55% of the days, extending to more than 4 hours on 28% of the days, demonstrating that atmospheric conditions in urban environments are not favourable to the growth of photochemically nucleated particles. In summary, although traffic remains the main source of UFP in urban areas, in developed countries with high insolation, urban nucleation events are also a main source of UFP. If traffic-related particle concentrations are reduced in the future, nucleation events will likely increase in urban areas, due to the reduced urban condensation sinks.

Relevance: 100.00%

Abstract:

Motivation and personal goals play an important role in the ways in which people direct their behavior. Personal goals are closely connected with well-being, but they also relate to how people perform in different achievement domains. Many studies show that evaluating study-related goals as important, easy to attain and non-stressful predicts better academic achievement than evaluating them as unattainable and stressful (Salmela-Aro & Nurmi, 1997b). The aim of this study was to describe motivational factors among theology students. They form an interesting group in terms of exploring connections between motivation, spiritual goals and academic achievement. The average time to graduation at the Faculty of Theology is among the highest at the University of Helsinki. On the other hand, it may be assumed that many theology students have spiritual goals which affect their studies. Special attention was paid to the different evaluations of study-related personal projects and how they are related to academic achievement. The personal projects methodology (Little, 1983) was used to study what kind of personal goals theology students are engaged in during their studies. In the first part of the questionnaire the subjects (N=133) were asked to describe important personal projects. They were given four numbered lines for their written responses. In the second part the subjects were asked to rate projects concerning their studies according to 13 dimensions using a 7-point Likert scale. Three subgroups were formed by K-Means cluster analysis on the basis of evaluations of the study-related projects. The groups were named committed, self-fulfillers and non-committed according to their evaluations of their study-related projects. Academic achievement varied substantially among the groups. After two years of studying, the students in the committed group had completed on average twenty study credits more than those in the non-committed group. Self-fulfillers fell between the other two groups. Committed and self-fulfiller students also reported higher levels of intrinsic reasons for striving towards study-related goals. The results indicate that goals reported at the beginning of studies predicted academic achievement later on. The results also showed that different evaluations of goals have long-lasting connections to progress in studying. Implications for student well-being and how these results can be utilized for student counseling are discussed.

Relevance: 100.00%

Abstract:

The role of different chemical compounds, particularly organics, involved in new particle formation (NPF) and its consequent growth is not fully understood. Therefore, this study was conducted to investigate the chemical composition of aerosol particles during NPF events in an urban subtropical environment. Aerosol chemical composition was measured along with particle number size distribution (PNSD) and several other air quality parameters at five sites across an urban subtropical environment. An Aerodyne compact Time-of-Flight Aerosol Mass Spectrometer (c-TOF-AMS) and a TSI Scanning Mobility Particle Sizer (SMPS) measured aerosol chemical composition (particles above 50 nm in vacuum aerodynamic diameter) and PNSD (particles within 9-414 nm in mobility diameter), respectively. Five NPF events, with growth rates in the range 3.3-4.6 nm, were detected at two of the sites. The NPF events happened on relatively warmer days with lower condensation sink (CS). Temporal percent fractions of organics increased after the particles grew enough to contribute significantly to particle volume, while the mass fraction of ammonium and sulphate decreased. This uncovered the important role of organics in the growth of newly formed particles. Three organic markers, factors f43, f44 and f57, were calculated and the f44 vs f43 trends were compared between nucleation and non-nucleation days. K-means cluster analysis was performed on the f44 vs f43 data and it was found that they follow different patterns on nucleation days compared to non-nucleation days, whereby f43 decreased for vehicle emission generated particles, while both f44 and f43 decreased for NPF generated particles. It was found for the first time that vehicle generated and newly formed particles cluster in different locations on the f44 vs f43 plot, and this finding can potentially be used as a tool for source apportionment of measured particles.

Relevance: 100.00%

Abstract:

The K-means algorithm is a well-known nonhierarchical method for clustering data. The most important limitations of this algorithm are that: (1) it gives final clusters on the basis of the cluster centroids or the seed points chosen initially, and (2) it is appropriate only for data sets having fairly isotropic clusters. However, this algorithm has the advantage of low computation and storage requirements. On the other hand, the hierarchical agglomerative clustering algorithm, which can cluster nonisotropic (chain-like and concentric) clusters, has high storage and computation requirements. This paper suggests a new method for selecting the initial seed points, so that the K-means algorithm gives the same results for any input data order. This paper also describes a hybrid clustering algorithm, based on the concepts of multilevel theory, which is nonhierarchical at the first level and hierarchical from the second level onwards, to cluster data sets having (i) chain-like clusters and (ii) concentric clusters. It is observed that this hybrid clustering algorithm gives the same results as the hierarchical clustering algorithm, with lower computation and storage requirements.