3 resultados para big data storage
em Digital Commons - Michigan Tech
Resumo:
Virtually every sector of business and industry that uses computing, including financial analysis, search engines, and electronic commerce, incorporate Big Data analysis into their business model. Sophisticated clustering algorithms are popular for deducing the nature of data by assigning labels to unlabeled data. We address two main challenges in Big Data. First, by definition, the volume of Big Data is too large to be loaded into a computer’s memory (this volume changes based on the computer used or available, but there is always a data set that is too large for any computer). Second, in real-time applications, the velocity of new incoming data prevents historical data from being stored and future data from being accessed. Therefore, we propose our Streaming Kernel Fuzzy c-Means (stKFCM) algorithm, which reduces both computational complexity and space complexity significantly. The proposed stKFCM only requires O(n2) memory where n is the (predetermined) size of a data subset (or data chunk) at each time step, which makes this algorithm truly scalable (as n can be chosen based on the available memory). Furthermore, only 2n2 elements of the full N × N (where N >> n) kernel matrix need to be calculated at each time-step, thus reducing both the computation time in producing the kernel elements and also the complexity of the FCM algorithm. Empirical results show that stKFCM, even with relatively very small n, can provide clustering performance as accurately as kernel fuzzy c-means run on the entire data set while achieving a significant speedup.
Resumo:
Understanding the canopy cover of an urban environment leads to better estimates of carbon storage and more informed management decisions by urban foresters. The most commonly used method for assessing urban forest cover type extent is ground surveys, which can be both timeconsuming and expensive. The analysis of aerial photos is an alternative method that is faster, cheaper, and can cover a larger number of sites, but may be less accurate. The objectives of this paper were (1) to compare three methods of cover type assessment for Los Angeles, CA: handdelineation of aerial photos in ArcMap, supervised classification of aerial photos in ERDAS Imagine, and ground-collected data using the Urban Forest Effects (UFORE) model protocol; (2) to determine how well remote sensing methods estimate carbon storage as predicted by the UFORE model; and (3) to explore the influence of tree diameter and tree density on carbon storage estimates. Four major cover types (bare ground, fine vegetation, coarse vegetation, and impervious surfaces) were determined from 348 plots (0.039 ha each) randomly stratified according to land-use. Hand-delineation was better than supervised classification at predicting ground-based measurements of cover type and UFORE model-predicted carbon storage. Most error in supervised classification resulted from shadow, which was interpreted as unknown cover type. Neither tree diameter or tree density per plot significantly affected the relationship between carbon storage and canopy cover. The efficiency of remote sensing rather than in situ data collection allows urban forest managers the ability to quickly assess a city and plan accordingly while also preserving their often-limited budget.
Resumo:
The selective catalytic reduction system is a well established technology for NOx emissions control in diesel engines. A one dimensional, single channel selective catalytic reduction (SCR) model was previously developed using Oak Ridge National Laboratory (ORNL) generated reactor data for an iron-zeolite catalyst system. Calibration of this model to fit the experimental reactor data collected at ORNL for a copper-zeolite SCR catalyst is presented. Initially a test protocol was developed in order to investigate the different phenomena responsible for the SCR system response. A SCR model with two distinct types of storage sites was used. The calibration process was started with storage capacity calculations for the catalyst sample. Then the chemical kinetics occurring at each segment of the protocol was investigated. The reactions included in this model were adsorption, desorption, standard SCR, fast SCR, slow SCR, NH3 Oxidation, NO oxidation and N2O formation. The reaction rates were identified for each temperature using a time domain optimization approach. Assuming an Arrhenius form of the reaction rates, activation energies and pre-exponential parameters were fit to the reaction rates. The results indicate that the Arrhenius form is appropriate and the reaction scheme used allows the model to fit to the experimental data and also for use in real world engine studies.