Digital Library

17 results for Multi Domain Information Model

in Digital Commons at Florida International University

3D TerraFly: Quality of service management for online interactive 3D geographic information system

Relevance:

100.00% 100.00%

Publisher:

Abstract:

3D geographic information systems (GIS) are data- and computation-intensive by nature. Internet users are usually equipped with low-end personal computers and network connections of limited bandwidth. Data reduction and performance optimization techniques are therefore of critical importance in quality of service (QoS) management for online 3D GIS. In this research, QoS management issues regarding distributed 3D GIS presentation were studied to develop 3D TerraFly, an interactive 3D GIS that supports high-quality online terrain visualization and navigation.

To tackle the QoS management challenges, a multi-resolution rendering model, adaptive level of detail (LOD) control, and mesh simplification algorithms were proposed to effectively reduce the terrain model complexity. The rendering model is adaptively decomposed into sub-regions of up to three detail levels according to viewing distance and other dynamic quality measurements. The mesh simplification algorithm was designed as a hybrid algorithm that combines edge straightening and quad-tree compression to reduce the mesh complexity by removing geometrically redundant vertices. The main advantage of this mesh simplification algorithm is that the grid mesh can be processed directly in parallel without triangulation overhead. Algorithms facilitating remote access to and distributed processing of volumetric GIS data, such as data replication, directory service, request scheduling, predictive data retrieval, and caching, were also proposed.

A prototype of the proposed 3D TerraFly implemented in this research demonstrates the effectiveness of the proposed QoS management framework in handling interactive online 3D GIS. The system implementation details and future directions of this research are also addressed in this thesis.
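
The distance-driven choice among up to three detail levels can be illustrated with a minimal Python sketch. The thresholds, the `SubRegion` type, and the `load_factor` parameter (a stand-in for the "other dynamic quality measurements") are assumptions made for illustration; the abstract does not give the actual rules used in 3D TerraFly.

```python
from dataclasses import dataclass

# Hypothetical distance thresholds in scene units; not taken from the thesis.
NEAR_LIMIT = 500.0
MID_LIMIT = 2000.0


@dataclass
class SubRegion:
    name: str
    distance: float  # distance from the viewpoint to the sub-region centre


def select_lod(distance: float, load_factor: float = 1.0) -> int:
    """Return a detail level: 0 (finest), 1, or 2 (coarsest).

    load_factor > 1.0 is an assumed stand-in for dynamic quality
    measurements such as a slow connection: it inflates the distance so
    that regions fall back to a coarser level sooner.
    """
    effective = distance * load_factor
    if effective < NEAR_LIMIT:
        return 0
    if effective < MID_LIMIT:
        return 1
    return 2


def assign_levels(regions, load_factor: float = 1.0) -> dict:
    """Map each sub-region name to its selected detail level."""
    return {r.name: select_lod(r.distance, load_factor) for r in regions}


regions = [
    SubRegion("tile_a", 120.0),
    SubRegion("tile_b", 900.0),
    SubRegion("tile_c", 5000.0),
]
print(assign_levels(regions))                   # normal conditions
print(assign_levels(regions, load_factor=3.0))  # degraded conditions
```

Under the degraded setting the middle tile drops to the coarsest level while the nearest tile keeps full detail, which is the kind of trade-off an adaptive LOD controller is meant to make.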

A generic model of execution for synthesizing domain-specific models

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Software engineering researchers are challenged to provide increasingly powerful levels of abstraction to address the rising complexity inherent in software solutions. One new development paradigm that places models as abstractions at the forefront of the development process is Model-Driven Software Development (MDSD). MDSD considers models as first-class artifacts, extending the capability for engineers to use concepts from the problem domain of discourse to specify apropos solutions. A key component in MDSD is domain-specific modeling languages (DSMLs), which are languages with focused expressiveness, targeting a specific taxonomy of problems. The de facto approach is to first transform DSML models into an intermediate artifact in a high-level language (HLL), e.g., Java or C++, and then execute the resulting code.

Our research group has developed a class of DSMLs, referred to as interpreted DSMLs (i-DSMLs), where models are directly interpreted by a specialized execution engine with semantics based on model changes at runtime. This execution engine uses a layered architecture and is referred to as a domain-specific virtual machine (DSVM). As the domain-specific model being executed descends the layers of the DSVM, the semantic gap between the user-defined model and the services being provided by the underlying infrastructure is closed. The focus of this research is the synthesis engine, the layer in the DSVM which transforms i-DSML models into executable scripts for the next lower layer to process.

The appeal of an i-DSML is constrained as it possesses unique semantics contained within the DSVM. Existing DSVMs for i-DSMLs exhibit tight coupling between the implicit model of execution and the semantics of the domain, making it difficult to develop DSVMs for new i-DSMLs without a significant investment in resources.

At the onset of this research only one i-DSML had been created, for the user-centric communication domain, using the aforementioned approach. This i-DSML is the Communication Modeling Language (CML) and its DSVM is the Communication Virtual Machine (CVM). A major problem with the CVM's synthesis engine is that the domain-specific knowledge (DSK) and the model of execution (MoE) are tightly interwoven; consequently, subsequent DSVMs would need to be developed from inception with no reuse of expertise.

This dissertation investigates how to decouple the DSK from the MoE and subsequently produce a generic model of execution (GMoE) from the remaining application logic. This GMoE can be reused to instantiate synthesis engines for DSVMs in other domains. The generalized approach to developing the model synthesis component of i-DSML interpreters utilizes a reusable framework loosely coupled to DSK as swappable framework extensions.

This approach involves first creating an i-DSML and its DSVM for a second domain, demand-side smart grid (or microgrid) energy management, and designing the synthesis engine so that the DSK and MoE are easily decoupled. To validate the utility of the approach, the synthesis engines (SEs) are instantiated using the GMoE and DSKs of the two aforementioned domains, and an empirical study to support our claim of reduced developmental effort is performed.
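
The idea of a reusable execution model with swappable domain knowledge can be sketched in Python as follows. This is only an illustration of the decoupling principle: the class names, the dictionary-based model, and the generated command strings are all hypothetical and do not reproduce the GMoE, CML, or the CVM described in the dissertation.

```python
from typing import Dict, List, Protocol

# A deliberately tiny stand-in for an i-DSML model: element name -> setting.
Model = Dict[str, str]


class DomainKnowledge(Protocol):
    """Hypothetical plug-in interface holding the domain-specific part."""

    def command_for(self, change: str, element: str, value: str) -> str: ...


class GenericSynthesisEngine:
    """Domain-independent part: diff two models, emit one command per change."""

    def __init__(self, dsk: DomainKnowledge) -> None:
        self.dsk = dsk
        self.current: Model = {}

    def synthesize(self, new_model: Model) -> List[str]:
        script = []
        for element in sorted(new_model):
            value = new_model[element]
            if element not in self.current:
                script.append(self.dsk.command_for("add", element, value))
            elif self.current[element] != value:
                script.append(self.dsk.command_for("update", element, value))
        for element in sorted(self.current):
            if element not in new_model:
                old = self.current[element]
                script.append(self.dsk.command_for("remove", element, old))
        self.current = dict(new_model)  # remember the model now running
        return script


class CommunicationDSK:
    """Assumed domain extension for a communication-style domain."""

    def command_for(self, change: str, element: str, value: str) -> str:
        return f"comm.{change}Connection({element}, media={value})"


class MicrogridDSK:
    """Assumed domain extension for an energy-management-style domain."""

    def command_for(self, change: str, element: str, value: str) -> str:
        return f"grid.{change}Device({element}, mode={value})"


comm_engine = GenericSynthesisEngine(CommunicationDSK())
print(comm_engine.synthesize({"alice-bob": "audio"}))
print(comm_engine.synthesize({"alice-bob": "video"}))

grid_engine = GenericSynthesisEngine(MicrogridDSK())
print(grid_engine.synthesize({"heater": "eco"}))
```

The same `GenericSynthesisEngine` code serves both domains; only the small plug-in class changes, which is the kind of reuse the decoupling is intended to enable.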

High performance shift invariant motion estimation and compensation in wavelet domain video compression

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The contributions of this dissertation are in the development of two new interrelated approaches to video data compression: (1) a level-refined motion estimation and subband compensation method for effective motion estimation and motion compensation, and (2) a shift-invariant sub-decimation decomposition method that overcomes the deficiency of the decimation process in estimating motion, which stems from the shift-variant property of the wavelet transform.

The enormous volume of data generated by digital video creates an intense need for efficient video compression techniques that conserve storage space and minimize bandwidth utilization. The main idea of video compression is to reduce the interpixel redundancies inside and between the video frames by applying motion estimation and motion compensation (MEMO) in combination with spatial transform coding. To locate the global minimum of the matching criterion function reasonably, hierarchical motion estimation by coarse-to-fine resolution refinements using the discrete wavelet transform is applied, owing to its intrinsic multiresolution and scalability.

Because most of the energy is concentrated in the low-resolution subbands and decreases in the high-resolution subbands, a new approach called the level-refined motion estimation and subband compensation (LRSC) method is proposed. It realizes the possible intrablocks in the subbands for lower entropy coding while keeping the low computational load of motion estimation of the level-refined method, thus achieving both temporal compression quality and computational simplicity.

Since circular convolution is applied in the wavelet transform to obtain the decomposed subframes without coefficient expansion, a symmetric-extended wavelet transform is designed for the finite-length frame signals to allow more accurate motion estimation without discontinuous boundary distortions.

Although wavelet-transformed coefficients still contain spatial domain information, motion estimation in the wavelet domain is not as straightforward as in the spatial domain because of the shift-variance property of the decimation process of the wavelet transform. A new approach called the sub-decimation decomposition method is proposed, which maintains the motion consistency between the original frame and the decomposed subframes, thereby improving wavelet domain video compression through shift-invariant motion estimation and compensation.
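
The shift variance of a decimated wavelet transform is easy to see on a toy signal. The sketch below uses a one-level Haar transform, chosen here only for simplicity (the abstract does not say which wavelet the dissertation uses), and shows that moving an edge by one sample changes the detail coefficients entirely instead of merely shifting them.

```python
import math


def haar_detail(signal):
    """One level of decimated Haar high-pass (detail) coefficients."""
    return [
        (signal[i] - signal[i + 1]) / math.sqrt(2)
        for i in range(0, len(signal) - 1, 2)
    ]


def circular_shift(signal, k):
    """Shift the signal right by k samples with wrap-around."""
    k %= len(signal)
    return signal[-k:] + signal[:-k] if k else list(signal)


# A toy "frame row": a two-sample object aligned with the decimation grid.
frame_row = [0, 0, 1, 1, 0, 0, 0, 0]
moved_row = circular_shift(frame_row, 1)  # the object moves by one pixel

print(haar_detail(frame_row))  # object aligned with sample pairs
print(haar_detail(moved_row))  # same object, one pixel later
```

In the first case every sample pair is constant, so the detail band is all zeros; after a one-pixel move the object straddles two pairs and non-zero coefficients appear. Block matching between such subbands therefore cannot rely on the coefficients simply translating with the motion, which is the problem the sub-decimation decomposition is meant to address.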

Towards next generation vertical search engines

Relevance:

100.00% 100.00%

Publisher:

Abstract:

As the Web evolves unexpectedly fast, information grows explosively. Useful resources become more and more difficult to find because of their dynamic and unstructured characteristics. A vertical search engine is designed and implemented for a specific domain. Instead of processing the giant volume of miscellaneous information distributed across the Web, a vertical search engine aims to identify relevant information in specific domains or topics and eventually to provide users with up-to-date information, highly focused insights, and actionable knowledge representation. As mobile devices become more popular, the nature of search is changing. Acquiring information on a mobile device poses unique requirements for traditional search engines, which will potentially change every feature they used to have. To summarize, users strongly expect search engines that can satisfy their individual information needs, adapt to their current situation, and present highly personalized search results.

In my research, the next generation vertical search engine is meant to utilize and enrich existing domain information to close the loop of a vertical search engine system in which knowledge discovery, actionable information extraction, and user interest modeling and recommendation mutually facilitate one another. I investigate three problems in which domain taxonomy plays an important role: taxonomy generation using a vertical search engine, actionable information extraction based on domain taxonomy, and the use of ensemble taxonomy to capture users' interests. As the fundamental theory, ultra-metrics, dendrograms, and hierarchical clustering are discussed intensively. Methods for taxonomy generation based on my research on hierarchical clustering are developed. The related vertical search engine techniques are applied in practice in the disaster management domain. In particular, three disaster information management systems are developed and presented as real use cases of my research work.

Characterization and modeling of multi-conductor transmission line using Finite-difference Time-Domain method

Relevance:

50.00% 50.00%

Publisher:

Abstract:

A two-dimensional (2D) finite-difference time-domain (FDTD) method is used to analyze two different models of multi-conductor transmission lines (MTLs). The first model is a two-conductor MTL and the second is a three-conductor MTL. Apart from the MTLs, a three-dimensional (3D) FDTD method is used to analyze a three-patch microstrip parasitic array. While the MTL analysis is entirely in the time domain, the microstrip parasitic array is a study of the scattering parameter Sn in the frequency domain. The results clearly indicate that FDTD is an efficient and accurate tool to model and analyze multi-conductor transmission lines as well as microstrip antennas and arrays.
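
To give a feel for the leapfrog update at the heart of FDTD, the following Python sketch simulates a single lossless line in one dimension using the telegrapher's equations. It is a deliberate simplification and not the 2D or 3D formulation used in the thesis; the per-unit-length values (chosen to give a 50-ohm line), the cell size, and the Gaussian source are all assumptions.

```python
import math


def simulate_line(n_cells=200, n_steps=100,
                  l_per_m=250e-9, c_per_m=100e-12, dz=0.01):
    """Leapfrog FDTD for a lossless two-conductor line (1D sketch).

    Voltages live on integer nodes and currents on the half-cells
    between them. The time step is set to the Courant limit
    dt = dz * sqrt(L * C), the largest stable step in 1D.
    """
    dt = dz * math.sqrt(l_per_m * c_per_m)
    voltage = [0.0] * n_cells
    current = [0.0] * (n_cells - 1)

    for n in range(n_steps):
        # Hard source at the left end: a Gaussian pulse peaking at step 30.
        voltage[0] = math.exp(-((n - 30) / 8.0) ** 2)

        # Update currents from the spatial difference of the voltages.
        for k in range(n_cells - 1):
            current[k] -= dt / (l_per_m * dz) * (voltage[k + 1] - voltage[k])

        # Update interior voltages from the spatial difference of currents.
        # The last node is left at 0 V (short circuit); the pulse does not
        # reach it within the default number of steps.
        for k in range(1, n_cells - 1):
            voltage[k] -= dt / (c_per_m * dz) * (current[k] - current[k - 1])

    return voltage


line_voltage = simulate_line()
peak_cell = max(range(len(line_voltage)), key=lambda k: line_voltage[k])
print("pulse peak is near cell", peak_cell)
```

At the Courant limit the pulse advances one cell per time step, so after 100 steps a pulse launched with its peak at step 30 should sit roughly 70 cells down the line. Multi-conductor and higher-dimensional versions apply the same alternating update to coupled field components.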

Efficient Storage and Domain-Specific Information Discovery on Semistructured Documents

Relevance:

50.00% 50.00%

Publisher:

Abstract:

The increasing amount of available semistructured data demands efficient mechanisms to store, process, and search an enormous corpus of data to encourage its global adoption. Current techniques to store semistructured documents either map them to relational databases, or use a combination of flat files and indexes. These two approaches result in a mismatch between the tree-structure of semistructured data and the access characteristics of the underlying storage devices. Furthermore, the inefficiency of XML parsing methods has slowed down the large-scale adoption of XML into actual system implementations. The recent development of lazy parsing techniques is a major step towards improving this situation, but lazy parsers still have significant drawbacks that undermine the massive adoption of XML.

Once the processing (storage and parsing) issues for semistructured data have been addressed, another key challenge to leverage semistructured data is to perform effective information discovery on such data. Previous works have addressed this problem in a generic (i.e., domain-independent) way, but this process can be improved if knowledge about the specific domain is taken into consideration.

This dissertation had two general goals. The first goal was to devise novel techniques to efficiently store and process semistructured documents. This goal had two specific aims: we proposed a method for storing semistructured documents that maps the physical characteristics of the documents to the geometrical layout of hard drives, and we developed a Double-Lazy Parser for semistructured documents which introduces lazy behavior in both the pre-parsing and progressive parsing phases of the standard Document Object Model's parsing mechanism.

The second goal was to construct a user-friendly and efficient engine for performing Information Discovery over domain-specific semistructured documents. This goal also had two aims: we presented a framework that exploits the domain-specific knowledge to improve the quality of the information discovery process by incorporating domain ontologies, and we proposed meaningful evaluation metrics to compare the results of search systems over semistructured documents.
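
The general principle of lazy parsing, deferring the cost of building a tree until a part of the document is actually accessed, can be sketched with Python's standard `xml.etree.ElementTree` module. This is a generic illustration only: the `LazyFragment` and `LazyDocument` classes and the sample data are hypothetical and do not represent the Double-Lazy Parser described in the dissertation.

```python
import xml.etree.ElementTree as ET


class LazyFragment:
    """Holds raw XML text and parses it only on first access."""

    def __init__(self, raw_xml: str) -> None:
        self.raw_xml = raw_xml
        self._tree = None  # nothing parsed yet

    @property
    def parsed(self) -> bool:
        return self._tree is not None

    @property
    def tree(self) -> ET.Element:
        if self._tree is None:
            # Pay the parsing cost only when someone needs this fragment.
            self._tree = ET.fromstring(self.raw_xml)
        return self._tree


class LazyDocument:
    """A document split into named fragments that are parsed on demand."""

    def __init__(self, fragments: dict) -> None:
        self.fragments = {
            name: LazyFragment(raw) for name, raw in fragments.items()
        }

    def find_text(self, fragment_name: str, path: str):
        return self.fragments[fragment_name].tree.findtext(path)


doc = LazyDocument({
    "patient": "<patient><name>Ann</name></patient>",
    "history": "<history><visit>2009-03-01</visit></history>",
})
print(doc.find_text("patient", "name"))
print(doc.fragments["history"].parsed)  # untouched fragment stays unparsed
```

A query that touches only one fragment leaves the others as raw text, so work grows with the portion of the document that is actually used.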

The effects of attribute quality information and sources of information on consumers' perceptions and behavior: A model of Pre-purchase Information Utilization in Service Physical Environments

Relevance:

40.00% 40.00%

Publisher:

Abstract:

The search-experience-credence framework from economics of information, the human-environment relations models from environmental psychology, and the consumer evaluation process from services marketing provide a conceptual basis for testing the model of "Pre-purchase Information Utilization in Service Physical Environments." The model addresses the effects of informational signs, as a dimension of the service physical environment, on consumers' perceptions (perceived veracity and perceived performance risk), emotions (pleasure) and behavior (willingness to buy). The informational signs provide attribute quality information (search and experience) through non-personal sources of information (simulated word-of-mouth and non-personal advocate sources).

This dissertation examines: (1) the hypothesized relationships addressed in the model of "Pre-purchase Information Utilization in Service Physical Environments" among informational signs, perceived veracity, perceived performance risk, pleasure, and willingness to buy, and (2) the effects of attribute quality information and sources of information on consumers' perceived veracity and perceived performance risk.

This research is the first in-depth study about the role and effects of information in service physical environments. Using a 2 x 2 between-subjects experimental research procedure, undergraduate students were exposed to the informational signs in a simulated service physical environment. The service physical environments were simulated through color photographic slides.

The results of the study suggest that: (1) the relationship between informational signs and willingness to buy is mediated by perceived veracity, perceived performance risk and pleasure, (2) experience attribute information shows higher perceived veracity and lower perceived performance risk when compared to search attribute information, and (3) information provided through simulated word-of-mouth shows higher perceived veracity and lower perceived performance risk when compared to information provided through non-personal advocate sources.

Fluorescence-enhanced optical imaging on three-dimensional phantoms using a hand-held probe based frequency-domain intensified charge coupled device (ICCD) optical imager

Relevance:

40.00% 40.00%

Publisher:

Abstract:

Fluorescence-enhanced optical imaging is an emerging non-invasive and non-ionizing modality for breast cancer diagnosis. Various optical imaging systems are currently available, although most of them are limited by bulky instrumentation or by their inability to flexibly image different tissue volumes and shapes. Hand-held optical imaging systems are a recent development offering improved portability, but they are currently limited to surface mapping. Herein, a novel optical imager, consisting primarily of a hand-held probe and a gain-modulated intensified charge coupled device (ICCD) detector, is developed for both surface and tomographic breast imaging. The unique features of this hand-held probe based optical imager are its ability to: (i) image large tissue areas (5×10 sq. cm) in a single scan, (ii) reduce overall imaging time using a unique measurement geometry, and (iii) perform tomographic imaging for three-dimensional (3-D) tumor localization. Frequency-domain experimental phantom studies have been performed on slab geometries (650 ml) under different target depths (1-2.5 cm), target volumes (0.45, 0.23 and 0.10 cc), fluorescence absorption contrast ratios (1:0, 1000:1 to 5:1), and numbers of targets (up to 3), using Indocyanine Green (ICG) as the fluorescence contrast agent. An approximate extended Kalman filter based inverse algorithm has been adapted for 3-D tomographic reconstructions. Single fluorescence targets were reconstructed when located: (i) up to 2.5 cm deep (at 1:0 contrast ratio) and 1.5 cm deep (up to 10:1 contrast ratio) for a 0.45 cc target; and (ii) 1.5 cm deep for a target as small as 0.10 cc at 1:0 contrast ratio. In the case of multiple targets, two targets as close as 0.7 cm apart were tomographically resolved when located 1.5 cm deep. It was observed that performing multi-projection (here dual) tomographic imaging using a priori target information from surface images improved the target depth recovery over single-projection imaging. From a total of 98 experimental phantom studies, the sensitivity and specificity of the imager were estimated as 81-86% and 43-50%, respectively. With 3-D tomographic imaging successfully demonstrated for the first time using a hand-held optical imager, the clinical translation of this technology is promising upon further experimental validation from in-vitro and in-vivo studies.

A domain specific modeling approach for coordinating user-centric communication services

Relevance:

40.00% 40.00%

Publisher:

Abstract:

Rapid advances in electronic communication devices and technologies have resulted in a shift in the way communication applications are being developed. These new development strategies provide abstract views of the underlying communication technologies and lead to the so-called user-centric communication applications. One user-centric communication (UCC) initiative is the Communication Virtual Machine (CVM) technology, which uses the Communication Modeling Language (CML) for modeling communication services and the CVM for realizing these services. In communication-intensive domains such as telemedicine and disaster management, there is an increasing need for user-centric communication applications that are domain-specific and that support the dynamic coordination of communication services commonly found in collaborative communication scenarios. However, UCC approaches like the CVM offer little support for the dynamic coordination of communication services resulting from inherent dependencies between individual steps of a collaboration task. Users either have to coordinate communication services manually, or rely on a process modeling technique to build customized solutions for services in a specific domain, which are usually costly, rigidly defined, and technology-specific.

This dissertation proposes a domain-specific modeling approach to address this problem by extending the CVM technology with communication-specific abstractions of workflow concepts commonly found in business processes. The extension involves (1) the definition of the Workflow Communication Modeling Language (WF-CML), a superset of CML, and (2) the extension of the functionality of CVM to process communication-specific workflows. The definition of WF-CML includes the meta-model and the dynamic semantics for control constructs and concurrency. We also extended the CVM prototype to handle the modeling and realization of WF-CML models. A comparative study of the proposed approach with other workflow environments validates the claimed benefits of WF-CML and CVM.

A multi-method approach for the assessment of composite indices and rankings

Relevance:

40.00% 40.00%

Publisher:

Abstract:

There is growing popularity in the use of composite indices and rankings for cross-organizational benchmarking. However, little attention has been paid to alternative methods and procedures for the computation of these indices and how the use of such methods may impact the resulting indices and rankings. This dissertation developed an approach for assessing composite indices and rankings based on the integration of a number of methods for aggregation, data transformation and attribute weighting involved in their computation. The integrated model developed is based on the simulation of composite indices using methods and procedures proposed in the areas of multi-criteria decision making (MCDM) and knowledge discovery in databases (KDD). The approach developed in this dissertation was automated through an IT artifact that was designed, developed and evaluated based on the framework and guidelines of the design science paradigm of information systems research. This artifact dynamically generates multiple versions of indices and rankings by considering different methodological scenarios according to user-specified parameters. The computerized implementation was done in Visual Basic for Excel 2007. Using different performance measures, the artifact produces a number of Excel outputs for the comparison and assessment of the indices and rankings. In order to evaluate the efficacy of the artifact and its underlying approach, a full empirical analysis was conducted using the World Bank's Doing Business database for the year 2010, which includes ten sub-indices (each corresponding to different areas of the business environment and regulation) for 183 countries. The output results, which were obtained using 115 methodological scenarios for the assessment of this index and its ten sub-indices, indicated that the variability of the component indicators considered in each case influenced the sensitivity of the rankings to the methodological choices. Overall, the results of our multi-method assessment were consistent with the World Bank rankings, except in cases where the indices involved cost indicators measured in per capita income, which yielded more sensitive results. Low-income countries exhibited more sensitivity in their rankings and less agreement between the benchmark rankings and our multi-method based rankings than higher-income country groups.
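
How methodological choices can reorder a ranking is easy to demonstrate on a toy data set. The Python sketch below recomputes a composite index under a few scenarios that vary the normalization method and the weights. The three units, their scores, and the scenario list are invented for illustration; they are not taken from the Doing Business data or from the 115 scenarios of the dissertation, whose artifact was implemented in Visual Basic rather than Python.

```python
from statistics import mean, pstdev

# Hypothetical data: three units scored on two indicators (higher is better).
scores = {"A": [10.0, 200.0], "B": [20.0, 100.0], "C": [30.0, 150.0]}


def min_max(column):
    """Rescale a column to the range [0, 1]."""
    lo, hi = min(column), max(column)
    return [(x - lo) / (hi - lo) for x in column]


def z_score(column):
    """Standardize a column to zero mean and unit (population) variance."""
    mu, sigma = mean(column), pstdev(column)
    return [(x - mu) / sigma for x in column]


def composite(data, normalize, weights):
    """Weighted additive index after normalizing each indicator column."""
    names = list(data)
    columns = [list(col) for col in zip(*data.values())]
    normalized = [normalize(col) for col in columns]
    return {
        name: sum(w * col[i] for w, col in zip(weights, normalized))
        for i, name in enumerate(names)
    }


def ranking(index):
    """Unit names ordered from best to worst index value."""
    return sorted(index, key=index.get, reverse=True)


scenarios = {
    "min-max, equal weights": (min_max, [0.5, 0.5]),
    "z-score, equal weights": (z_score, [0.5, 0.5]),
    "min-max, indicator 1 weighted 0.8": (min_max, [0.8, 0.2]),
}
for label, (normalize, weights) in scenarios.items():
    print(label, "->", ranking(composite(scores, normalize, weights)))
```

In this toy case the choice of normalization leaves the order unchanged, while shifting weight toward the first indicator swaps the second and third places, the kind of sensitivity a multi-method assessment is designed to expose.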

Evidence of the heterogeneous market hypothesis using wavelet multi-resolution analysis

Relevance:

40.00% 40.00%

Publisher:

Abstract:

In the finance literature many economic theories and models have been proposed to explain and estimate the relationship between risk and return. Assuming risk aversion and rational behavior on the part of the investor, models are developed which are supposed to help in forming efficient portfolios that either maximize (minimize) the expected rate of return (risk) for a given level of risk (rates of return). One of the most used models to form these efficient portfolios is Sharpe's Capital Asset Pricing Model (CAPM). In the development of this model it is assumed that investors have homogeneous expectations about the future probability distribution of the rates of return. That is, every investor assumes the same values of the parameters of the probability distribution. Likewise, financial volatility homogeneity is commonly assumed, where volatility is taken as investment risk, which is usually measured by the variance of the rates of return. Typically the square root of the variance is used to define financial volatility; furthermore, it is also often assumed that the data generating process is made of independent and identically distributed random variables. This again implies that financial volatility is measured from homogeneous time series with stationary parameters.

In this dissertation, we investigate the assumptions of homogeneity of market agents and provide evidence for heterogeneity in market participants' information, objectives, and expectations about the parameters of the probability distribution of prices, as given by the differences in the empirical distributions corresponding to different time scales, which in this study are associated with different classes of investors. We also demonstrate that the statistical properties of the underlying data generating processes, including the volatility in the rates of return, are quite heterogeneous. In other words, we provide empirical evidence against the traditional views about homogeneity using non-parametric wavelet analysis on trading data. The results show heterogeneity of financial volatility at different time scales, and time scale is one of the most important aspects in which trading behavior differs. In fact, we conclude that heterogeneity as posited by the Heterogeneous Markets Hypothesis is the norm and not the exception.
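
A scale-by-scale view of volatility can be sketched with a multi-level Haar decomposition, in which the energy of the detail coefficients at each level measures the variability attributable to that time scale. The Haar wavelet and the short return series below are assumptions chosen for brevity; the abstract does not specify the wavelet family or the data used in the dissertation.

```python
import math


def haar_step(x):
    """One decimated Haar step: returns (approximation, detail)."""
    approx = [(x[i] + x[i + 1]) / math.sqrt(2) for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / math.sqrt(2) for i in range(0, len(x), 2)]
    return approx, detail


def energy_by_scale(returns, levels):
    """Detail energy per level (finest first) plus the leftover energy.

    The series length must be divisible by 2 ** levels.
    """
    if len(returns) % (2 ** levels) != 0:
        raise ValueError("length must be divisible by 2 ** levels")
    energies = []
    approx = list(returns)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        energies.append(sum(d * d for d in detail))
    return energies, sum(a * a for a in approx)


# Hypothetical return series of length 8 (not real market data).
returns = [0.01, -0.02, 0.015, -0.005, 0.03, -0.01, 0.0, 0.02]
detail_energy, smooth_energy = energy_by_scale(returns, levels=3)
for level, energy in enumerate(detail_energy, start=1):
    print(f"scale 2^{level} samples: detail energy {energy:.6f}")
```

Because the Haar transform is orthonormal, the per-scale energies and the remaining smooth energy add up to the total energy of the series, so the breakdown shows how much of the overall variability sits at each time horizon.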

Ensemble Stream Model for Data-Cleaning in Sensor Networks

Relevance:

40.00% 40.00%

Publisher:

Abstract:

Ensemble stream modeling and data-cleaning are sensor information processing systems that have different training and testing methods by which their goals are cross-validated. This research examines a mechanism which seeks to extract novel patterns by generating ensembles from data. The main goal of label-less stream processing is to process the sensed events to eliminate uncorrelated noise and to choose the most likely model without overfitting, thus obtaining higher model confidence. Higher quality streams can be realized by combining many short streams into an ensemble which has the desired quality. The framework for the investigation is an existing data mining tool.

First, to accommodate feature extraction for events such as a bush or natural forest fire, we assume the burnt area (BA*), the sensed ground truth obtained from logs, as our target variable. Even though this is an obvious model choice, the results are disappointing, for two reasons: first, the histogram of fire activity is highly skewed; second, the measured sensor parameters are highly correlated. Since using non-descriptive features does not yield good results, we resort to temporal features. By doing so we carefully eliminate the averaging effects; the resulting histogram is more satisfactory and conceptual knowledge is learned from sensor streams.

Second is the process of feature induction by cross-validating attributes with single or multi-target variables to minimize training error. We use the F-measure score, which combines precision and recall, to determine the false alarm rate of fire events. The multi-target data-cleaning trees use information purity of the target leaf nodes to learn higher order features. A sensitive variance measure such as the F-test is performed during each node's split to select the best attribute. The ensemble stream model approach was shown to improve when using complicated features with a simpler tree classifier. The ensemble framework for data-cleaning and the enhancements to quantify quality of fitness (30% spatial, 10% temporal, and 90% mobility reduction) of sensors led to the formation of streams for sensor-enabled applications, which further motivates the novelty of stream quality labeling and its importance in handling the vast amounts of real-time mobile streams generated today.
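
For reference, the standard F-measure is the weighted harmonic mean of precision and recall. The short Python sketch below computes it from raw detection counts; the function name and the example counts are illustrative and are not taken from the study.

```python
def f_measure(true_pos: int, false_pos: int, false_neg: int,
              beta: float = 1.0) -> float:
    """F-beta score from detection counts (beta = 1 gives the usual F1).

    precision = TP / (TP + FP): share of raised alarms that were real events
    recall    = TP / (TP + FN): share of real events that raised an alarm
    """
    detected = true_pos + false_pos
    actual = true_pos + false_neg
    precision = true_pos / detected if detected else 0.0
    recall = true_pos / actual if actual else 0.0
    if precision + recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical counts: 8 fires detected, 2 false alarms, 2 fires missed.
print(f_measure(true_pos=8, false_pos=2, false_neg=2))
```

A high number of false alarms lowers precision and therefore the F-measure, which is why the score is a convenient single figure for judging event detectors on skewed data such as fire records.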
