5 results for means clustering

in Digital Commons - Michigan Tech


Relevance:

60.00%

Publisher:

Abstract:

In 1998-2001 Finland suffered the most severe insect outbreak ever recorded, covering over 500,000 hectares. The outbreak was caused by the common pine sawfly (Diprion pini L.) and has continued in the study area, Palokangas, ever since. To find a good method for monitoring this type of outbreak, this study examined the efficacy of multi-temporal ERS-2 and ENVISAT SAR imagery for estimating Scots pine (Pinus sylvestris L.) defoliation. Three methods were tested: unsupervised k-means clustering, supervised linear discriminant analysis (LDA) and logistic regression. In addition, I assessed whether harvested areas could be differentiated from the defoliated forest using the same methods. Two different speckle filters were used to determine the effect of filtering on the SAR imagery and subsequent results. Logistic regression performed best, producing a classification accuracy of 81.6% (kappa 0.62) with two classes (no defoliation, >20% defoliation). With two classes, LDA accuracy was at best 77.7% (kappa 0.54) and k-means accuracy 72.8% (kappa 0.46). In general, the largest speckle filter, a 5 x 5 image window, performed best. When additional classes were added, accuracy usually degraded step by step. The results were good, but because of the restrictions in the study they should be confirmed with independent data before they can be considered reliable. The restrictions include the small size of the field data set and, thus, problems with accuracy assessment (no separate testing data), as well as the lack of meteorological data from the imaging dates.
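The accuracy and kappa figures quoted above are standard summaries of a confusion matrix. As a minimal sketch of how the two relate, the following computes overall accuracy and Cohen's kappa; the function name and the example matrix are hypothetical and are not taken from the study.

```python
def accuracy_and_kappa(confusion):
    """Return (overall accuracy, Cohen's kappa) for a square confusion matrix.

    Rows are reference classes, columns are predicted classes.
    """
    k = len(confusion)
    total = sum(sum(row) for row in confusion)
    # Observed agreement: fraction of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(k)) / total
    # Chance agreement: product of matching row and column marginals.
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(k)) for j in range(k)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical two-class matrix (no defoliation vs. defoliation); not study data.
example = [[40, 10],
           [5, 45]]
accuracy, kappa = accuracy_and_kappa(example)
print(f"accuracy={accuracy:.3f}, kappa={kappa:.3f}")
```

Kappa discounts the agreement expected by chance, which is why it is lower than the raw accuracy for the same classification.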

Relevance:

60.00%

Publisher:

Abstract:

The primary goal of this project is to demonstrate the practical use of data mining algorithms to cluster a solved steady-state computational fluid dynamics (CFD) flow domain into a simplified lumped-parameter network. A commercial-quality code, “cfdMine”, was created using volume-weighted k-means clustering that can cluster a 20 million cell CFD domain on a single CPU in several hours or less. Additionally, agglomeration and k-means Mahalanobis were added as optional post-processing steps to further enhance the separation of the clusters. The resultant nodal network is considered a reduced-order model and can be solved transiently at minimal computational cost. The reduced-order network is then instantiated in the commercial thermal solver MuSES to perform transient conjugate heat transfer, with convection predicted by the lumped network (based on steady-state CFD). When the lumped nodal network is inserted into a MuSES model, the potential for developing a “localized heat transfer coefficient” is shown to be an improvement over existing techniques. The clustering was also found to provide a new flow visualization technique. Finally, fixing clusters near equipment demonstrates a new capability to track temperatures near specific objects (such as equipment in vehicles).
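The abstract does not describe how cfdMine applies the volume weighting. One common reading, sketched below under that assumption, is Lloyd-style k-means in which each cell contributes to its cluster centroid in proportion to its volume. The function name, the feature vectors, and the initial centroids are all hypothetical.

```python
def weighted_kmeans(points, weights, centroids, iterations=20):
    """Lloyd iterations where each point counts in proportion to its weight.

    points:    list of feature tuples (e.g. one per CFD cell; hypothetical)
    weights:   list of non-negative weights (e.g. cell volumes; assumption)
    centroids: list of initial centroid tuples
    """
    centroids = [list(c) for c in centroids]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        for i, p in enumerate(points):
            labels[i] = min(
                range(len(centroids)),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centroids[j])),
            )
        # Update step: weighted mean of the members of each cluster.
        for j in range(len(centroids)):
            members = [i for i, lab in enumerate(labels) if lab == j]
            total = sum(weights[i] for i in members)
            if total > 0:  # an empty cluster keeps its previous centroid
                dim = len(centroids[j])
                centroids[j] = [
                    sum(weights[i] * points[i][d] for i in members) / total
                    for d in range(dim)
                ]
    return labels, centroids


# Tiny one-dimensional illustration; the second cell is three times larger.
cells = [(0.0,), (1.0,), (10.0,), (11.0,)]
volumes = [1.0, 3.0, 1.0, 1.0]
labels, centers = weighted_kmeans(cells, volumes, [(0.0,), (10.0,)])
print(labels, centers)
```

With weighting, the first centroid is pulled toward the larger cell instead of sitting at the unweighted midpoint of its two members.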

Relevance:

30.00%

Publisher:

Abstract:

Virtually every sector of business and industry that uses computing, including financial analysis, search engines, and electronic commerce, incorporates Big Data analysis into its business model. Sophisticated clustering algorithms are popular for deducing the nature of data by assigning labels to unlabeled data. We address two main challenges in Big Data. First, by definition, the volume of Big Data is too large to be loaded into a computer’s memory (this volume changes based on the computer used or available, but there is always a data set that is too large for any computer). Second, in real-time applications, the velocity of new incoming data prevents historical data from being stored and future data from being accessed. Therefore, we propose our Streaming Kernel Fuzzy c-Means (stKFCM) algorithm, which significantly reduces both computational complexity and space complexity. The proposed stKFCM requires only O(n^2) memory, where n is the (predetermined) size of a data subset (or data chunk) at each time step, which makes the algorithm truly scalable (as n can be chosen based on the available memory). Furthermore, only 2n^2 elements of the full N × N kernel matrix (where N >> n) need to be calculated at each time step, reducing both the computation time for the kernel elements and the complexity of the FCM algorithm. Empirical results show that stKFCM, even with relatively small n, can provide clustering performance as accurate as kernel fuzzy c-means run on the entire data set while achieving a significant speedup.
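To put the 2n^2 figure in perspective, the arithmetic below compares the kernel entries computed in streaming mode with the size of the full N × N matrix. It is only a counting sketch: the values of n and N are hypothetical, and it assumes the data arrive as N/n non-overlapping chunks, which the abstract does not state.

```python
def kernel_elements_per_step(n):
    """Kernel entries computed at one time step, using the 2n^2 figure above."""
    return 2 * n * n


def full_kernel_elements(N):
    """Entries in the full N x N kernel matrix."""
    return N * N


n = 1_000        # hypothetical chunk size
N = 1_000_000    # hypothetical total number of points
steps = N // n   # assumption: non-overlapping chunks covering the data once

per_step = kernel_elements_per_step(n)
streamed_total = steps * per_step
full_total = full_kernel_elements(N)
print(per_step, streamed_total, full_total, full_total // streamed_total)
```

Under these assumptions the streamed total is 2nN entries, so the saving relative to the full matrix grows as N/(2n).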

Relevance:

20.00%

Publisher:

Abstract:

We present studies of the spatial clustering of inertial particles embedded in turbulent flow. A major part of the thesis is experimental, involving the technique of Phase Doppler Interferometry (PDI). The thesis also includes a significant amount of simulation work and some theoretical considerations. We describe the details of PDI and explain why it is suitable for studying particle clustering in turbulent flow with a strong mean velocity. We introduce the radial distribution function (RDF) as our chosen way of quantifying inertial particle clustering and present some original work on foundational and practical considerations related to it. These include methods of treating finite sampling size, interpretation of the magnitude of the RDF, and the possibility of isolating the RDF signature of inertial clustering from that of large-scale mixing. In the experimental work, we used PDI to observe clustering of water droplets in a turbulent wind tunnel. From that we present, in the form of a published paper, evidence of dynamical similarity (Stokes number similarity) of inertial particle clustering, together with other results in qualitative agreement with available theoretical predictions and simulation results. We next show detailed quantitative comparisons of results from our experiments, direct numerical simulation (DNS) and theory. Very promising agreement was found for like-sized (mono-disperse) particles. Theory is found to be incorrect regarding clustering of different-sized particles, and we propose an empirical correction based on the DNS and experimental results. In addition, we discovered a few interesting characteristics of inertial clustering. First, through observations, we found an intriguing possibility for modeling the RDF arising from inertial clustering with only one (sensitive) parameter. We also found that clustering becomes saturated at high Reynolds number.

Relevance:

20.00%

Publisher:

Abstract:

Volcán Pacaya is one of three currently active volcanoes in Guatemala. Volcanic activity originates from the local tectonic subduction of the Cocos plate beneath the Caribbean plate along the Pacific Guatemalan coast. Pacaya is characterized by generally strombolian-type activity with occasional larger vulcanian-type eruptions approximately every ten years. One particularly large eruption occurred on May 27, 2010. Using GPS data collected for approximately eight years before this eruption and for an additional three years afterwards, surface movement covering the period of the eruption can be measured and used as a tool to help understand activity at the volcano. Initial positions were obtained from raw data using the Automatic Precise Positioning Service provided by the NASA Jet Propulsion Laboratory. Forward modeling of observed 3-D displacements for three time periods (before, covering and after the May 2010 eruption) revealed that a plausible source of deformation is a vertical dike or planar surface trending NNW-SSE through the cone. For the three time periods, the best-fitting models describe the deformation of the volcano as follows: 0.45 m right lateral movement and 0.55 m tensile opening along the dike from October 2001 through January 2009 (pre-eruption); 0.55 m left lateral slip along the dike from January 2009 through January 2011 (covering the eruption); and -0.025 m dip slip along the dike from January 2011 through March 2013 (post-eruption). In all best-fit models the dike is oriented with a 75° westward dip. The models have RMS misfit values of 5.49 cm, 12.38 cm and 6.90 cm for the three periods, respectively. During the period that includes the eruption, the volcano most likely experienced a combination of slip and inflation below the edifice, which created a large scar at the surface down the northern flank of the volcano. All models suggest that a dipping dike may be experiencing a combination of inflation and oblique slip below the edifice, which increases the possibility of a westward collapse in the future.
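The RMS misfit values above summarize how closely each forward model reproduces the observed displacements. The abstract does not give the exact formula used, so the following is a minimal sketch of the usual root-mean-square definition; the function name and the displacement values are hypothetical, not data from the study.

```python
import math


def rms_misfit(observed, modeled):
    """Root-mean-square difference between observed and modeled displacements."""
    residuals = [o - m for o, m in zip(observed, modeled)]
    return math.sqrt(sum(r * r for r in residuals) / len(residuals))


# Hypothetical displacement components in centimetres; not data from the study.
observed_cm = [3.0, -1.0, 2.0, 6.0]
modeled_cm = [0.0, 2.0, -1.0, 3.0]
misfit_cm = rms_misfit(observed_cm, modeled_cm)
print(f"RMS misfit: {misfit_cm:.2f} cm")
```

A lower value indicates a closer fit, which is consistent with the largest misfit being reported for the period that covers the eruption.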