803 resultados para Computer science, Information systems
Resumo:
A framework for developing marketing category management decision support systems (DSS) based upon the Bayesian Vector Autoregressive (BVAR) model is extended. Since the BVAR model is vulnerable to permanent and temporary shifts in purchasing patterns over time, a form that can correct for the shifts and still provide the other advantages of the BVAR is a Bayesian Vector Error-Correction Model (BVECM). We present the mechanics of extending the DSS to move from a BVAR model to the BVECM model for the category management problem. Several additional iterative steps are required in the DSS to allow the decision maker to arrive at the best forecast possible. The revised marketing DSS framework and model fitting procedures are described. Validation is conducted on a sample problem.
Resumo:
The notorious "dimensionality curse" is a well-known phenomenon for any multi-dimensional indexes attempting to scale up to high dimensions. One well-known approach to overcome degradation in performance with respect to increasing dimensions is to reduce the dimensionality of the original dataset before constructing the index. However, identifying the correlation among the dimensions and effectively reducing them are challenging tasks. In this paper, we present an adaptive Multi-level Mahalanobis-based Dimensionality Reduction (MMDR) technique for high-dimensional indexing. Our MMDR technique has four notable features compared to existing methods. First, it discovers elliptical clusters for more effective dimensionality reduction by using only the low-dimensional subspaces. Second, data points in the different axis systems are indexed using a single B+-tree. Third, our technique is highly scalable in terms of data size and dimension. Finally, it is also dynamic and adaptive to insertions. An extensive performance study was conducted using both real and synthetic datasets, and the results show that our technique not only achieves higher precision, but also enables queries to be processed efficiently. Copyright Springer-Verlag 2005
Resumo:
Workflow systems have traditionally focused on the so-called production processes which are characterized by pre-definition, high volume, and repetitiveness. Recently, the deployment of workflow systems in non-traditional domains such as collaborative applications, e-learning and cross-organizational process integration, have put forth new requirements for flexible and dynamic specification. However, this flexibility cannot be offered at the expense of control, a critical requirement of business processes. In this paper, we will present a foundation set of constraints for flexible workflow specification. These constraints are intended to provide an appropriate balance between flexibility and control. The constraint specification framework is based on the concept of pockets of flexibility which allows ad hoc changes and/or building of workflows for highly flexible processes. Basically, our approach is to provide the ability to execute on the basis of a partially specified model, where the full specification of the model is made at runtime, and may be unique to each instance. The verification of dynamically built models is essential. Where as ensuring that the model conforms to specified constraints does not pose great difficulty, ensuring that the constraint set itself does not carry conflicts and redundancy is an interesting and challenging problem. In this paper, we will provide a discussion on both the static and dynamic verification aspects. We will also briefly present Chameleon, a prototype workflow engine that implements these concepts. (c) 2004 Elsevier Ltd. All rights reserved.
Resumo:
The Bunge-Wand-Weber (BWW) representation model defines ontological constructs for information systems. According to these constructs the completeness and efficiency of a modeling technique can be defined. Ontology plays an essential role in e-commerce. Using or updating an existing ontology and providing tools to solve any semantic conflicts become essential steps before putting a system online. We use conceptual graphs (CGs) to implement ontologies. This paper evaluates CG capabilities using the BWW representation model. It finds out that CGs are ontologically complete according to Wand and Weber definition. Also it finds out that CGs have construct overload and construct redundancy which can undermine the ontological clarity of CGs. This leads us to build a meta-model to avoid some ontological-unclarity problems. We use some of the BWW constructs to build the meta-model. (c) 2004 Elsevier Ltd. All rights reserved.
Resumo:
Online communities have evolved beyond the realm of social phenomenon to become important knowledge-sharing media with real economic consequences. However, the sharing of knowledge and the communication of meaning through Internet technology presents many difficulties. This is particularly so for online finance forums where market-sensitive information and disinformation about exchange-traded stocks is regularly disseminated. The development of trust and the effect of misinformation in this environment are important in the growth of this communication medium. Forum administrators need to better understand and handle the development of trust. In this article, we analyze and discuss the communicative practices of a group of investors and members of an online community of interest. We found that conflict as a driver of knowledge sharing is an important consideration for forum administrators and designers.
Resumo:
Multiresolution Triangular Mesh (MTM) models are widely used to improve the performance of large terrain visualization by replacing the original model with a simplified one. MTM models, which consist of both original and simplified data, are commonly stored in spatial database systems due to their size. The relatively slow access speed of disks makes data retrieval the bottleneck of such terrain visualization systems. Existing spatial access methods proposed to address this problem rely on main-memory MTM models, which leads to significant overhead during query processing. In this paper, we approach the problem from a new perspective and propose a novel MTM called direct mesh that is designed specifically for secondary storage. It supports available indexing methods natively and requires no modification to MTM structure. Experiment results, which are based on two real-world data sets, show an average performance improvement of 5-10 times over the existing methods.
Resumo:
With the rapid increase in both centralized video archives and distributed WWW video resources, content-based video retrieval is gaining its importance. To support such applications efficiently, content-based video indexing must be addressed. Typically, each video is represented by a sequence of frames. Due to the high dimensionality of frame representation and the large number of frames, video indexing introduces an additional degree of complexity. In this paper, we address the problem of content-based video indexing and propose an efficient solution, called the Ordered VA-File (OVA-File) based on the VA-file. OVA-File is a hierarchical structure and has two novel features: 1) partitioning the whole file into slices such that only a small number of slices are accessed and checked during k Nearest Neighbor (kNN) search and 2) efficient handling of insertions of new vectors into the OVA-File, such that the average distance between the new vectors and those approximations near that position is minimized. To facilitate a search, we present an efficient approximate kNN algorithm named Ordered VA-LOW (OVA-LOW) based on the proposed OVA-File. OVA-LOW first chooses possible OVA-Slices by ranking the distances between their corresponding centers and the query vector, and then visits all approximations in the selected OVA-Slices to work out approximate kNN. The number of possible OVA-Slices is controlled by a user-defined parameter delta. By adjusting delta, OVA-LOW provides a trade-off between the query cost and the result quality. Query by video clip consisting of multiple frames is also discussed. Extensive experimental studies using real video data sets were conducted and the results showed that our methods can yield a significant speed-up over an existing VA-file-based method and iDistance with high query result quality. Furthermore, by incorporating temporal correlation of video content, our methods achieved much more efficient performance.
Resumo:
In this paper, we present ICICLE (Image ChainNet and Incremental Clustering Engine), a prototype system that we have developed to efficiently and effectively retrieve WWW images based on image semantics. ICICLE has two distinguishing features. First, it employs a novel image representation model called Weight ChainNet to capture the semantics of the image content. A new formula, called list space model, for computing semantic similarities is also introduced. Second, to speed up retrieval, ICICLE employs an incremental clustering mechanism, ICC (Incremental Clustering on ChainNet), to cluster images with similar semantics into the same partition. Each cluster has a summary representative and all clusters' representatives are further summarized into a balanced and full binary tree structure. We conducted an extensive performance study to evaluate ICICLE. Compared with some recently proposed methods, our results show that ICICLE provides better recall and precision. Our clustering technique ICC facilitates speedy retrieval of images without sacrificing recall and precision significantly.
Resumo:
Quantile computation has many applications including data mining and financial data analysis. It has been shown that an is an element of-approximate summary can be maintained so that, given a quantile query d (phi, is an element of), the data item at rank [phi N] may be approximately obtained within the rank error precision is an element of N over all N data items in a data stream or in a sliding window. However, scalable online processing of massive continuous quantile queries with different phi and is an element of poses a new challenge because the summary is continuously updated with new arrivals of data items. In this paper, first we aim to dramatically reduce the number of distinct query results by grouping a set of different queries into a cluster so that they can be processed virtually as a single query while the precision requirements from users can be retained. Second, we aim to minimize the total query processing costs. Efficient algorithms are developed to minimize the total number of times for reprocessing clusters and to produce the minimum number of clusters, respectively. The techniques are extended to maintain near-optimal clustering when queries are registered and removed in an arbitrary fashion against whole data streams or sliding windows. In addition to theoretical analysis, our performance study indicates that the proposed techniques are indeed scalable with respect to the number of input queries as well as the number of items and the item arrival rate in a data stream.
Resumo:
This correspondence considers block detection for blind wireless digital transmission. At high signal-to-noise ratio (SNR), block detection errors are primarily due to the received sequence having multiple possible decoded sequences with the same likelihood. We derive analytic expressions for the probability of detection ambiguity written in terms of a Dedekind zeta function, in the zero noise case with large constellations. Expressions are also provided for finite constellations, which can be evaluated efficiently, independent of the block length. Simulations demonstrate that the analytically derived error floors exist at high SNR.
Resumo:
Objectives: In this paper, we present a unified electrodynamic heart model that permits simulations of the body surface potentials generated by the heart in motion. The inclusion of motion in the heart model significantly improves the accuracy of the simulated body surface potentials and therefore also the 12-lead ECG. Methods: The key step is to construct an electromechanical heart model. The cardiac excitation propagation is simulated by an electrical heart model, and the resulting cardiac active forces are used to calculate the ventricular wall motion based on a mechanical model. The source-field point relative position changes during heart systole and diastole. These can be obtained, and then used to calculate body surface ECG based on the electrical heart-torso model. Results: An electromechanical biventricular heart model is constructed and a standard 12-lead ECG is simulated. Compared with a simulated ECG based on the static electrical heart model, the simulated ECG based on the dynamic heart model is more accordant with a clinically recorded ECG, especially for the ST segment and T wave of a V1-V6 lead ECG. For slight-degree myocardial ischemia ECG simulation, the ST segment and T wave changes can be observed from the simulated ECG based on a dynamic heart model, while the ST segment and T wave of simulated ECG based on a static heart model is almost unchanged when compared with a normal ECG. Conclusions: This study confirms the importance of the mechanical factor in the ECG simulation. The dynamic heart model could provide more accurate ECG simulation, especially for myocardial ischemia or infarction simulation, since the main ECG changes occur at the ST segment and T wave, which correspond with cardiac systole and diastole phases.
Resumo:
In this paper, we present a novel indexing technique called Multi-scale Similarity Indexing (MSI) to index image's multi-features into a single one-dimensional structure. Both for text and visual feature spaces, the similarity between a point and a local partition's center in individual space is used as the indexing key, where similarity values in different features are distinguished by different scale. Then a single indexing tree can be built on these keys. Based on the property that relevant images have similar similarity values from the center of the same local partition in any feature space, certain number of irrelevant images can be fast pruned based on the triangle inequity on indexing keys. To remove the dimensionality curse existing in high dimensional structure, we propose a new technique called Local Bit Stream (LBS). LBS transforms image's text and visual feature representations into simple, uniform and effective bit stream (BS) representations based on local partition's center. Such BS representations are small in size and fast for comparison since only bit operation are involved. By comparing common bits existing in two BSs, most of irrelevant images can be immediately filtered. To effectively integrate multi-features, we also investigated the following evidence combination techniques-Certainty Factor, Dempster Shafer Theory, Compound Probability, and Linear Combination. Our extensive experiment showed that single one-dimensional index on multi-features improves multi-indices on multi-features greatly. Our LBS method outperforms sequential scan on high dimensional space by an order of magnitude. And Certainty Factor and Dempster Shafer Theory perform best in combining multiple similarities from corresponding multiple features.
Resumo:
Summarizing topological relations is fundamental to many spatial applications including spatial query optimization. In this article, we present several novel techniques to effectively construct cell density based spatial histograms for range (window) summarizations restricted to the four most important level-two topological relations: contains, contained, overlap, and disjoint. We first present a novel framework to construct a multiscale Euler histogram in 2D space with the guarantee of the exact summarization results for aligned windows in constant time. To minimize the storage space in such a multiscale Euler histogram, an approximate algorithm with the approximate ratio 19/12 is presented, while the problem is shown NP-hard generally. To conform to a limited storage space where a multiscale histogram may be allowed to have only k Euler histograms, an effective algorithm is presented to construct multiscale histograms to achieve high accuracy in approximately summarizing aligned windows. Then, we present a new approximate algorithm to query an Euler histogram that cannot guarantee the exact answers; it runs in constant time. We also investigate the problem of nonaligned windows and the problem of effectively partitioning the data space to support nonaligned window queries. Finally, we extend our techniques to 3D space. Our extensive experiments against both synthetic and real world datasets demonstrate that the approximate multiscale histogram techniques may improve the accuracy of the existing techniques by several orders of magnitude while retaining the cost efficiency, and the exact multiscale histogram technique requires only a storage space linearly proportional to the number of cells for many popular real datasets.
Resumo:
In many advanced applications, data are described by multiple high-dimensional features. Moreover, different queries may weight these features differently; some may not even specify all the features. In this paper, we propose our solution to support efficient query processing in these applications. We devise a novel representation that compactly captures f features into two components: The first component is a 2D vector that reflects a distance range ( minimum and maximum values) of the f features with respect to a reference point ( the center of the space) in a metric space and the second component is a bit signature, with two bits per dimension, obtained by analyzing each feature's descending energy histogram. This representation enables two levels of filtering: The first component prunes away points that do not share similar distance ranges, while the bit signature filters away points based on the dimensions of the relevant features. Moreover, the representation facilitates the use of a single index structure to further speed up processing. We employ the classical B+-tree for this purpose. We also propose a KNN search algorithm that exploits the access orders of critical dimensions of highly selective features and partial distances to prune the search space more effectively. Our extensive experiments on both real-life and synthetic data sets show that the proposed solution offers significant performance advantages over sequential scan and retrieval methods using single and multiple VA-files.
Resumo:
Business environments have become exceedingly dynamic and competitive in recent times. This dynamism is manifested in the form of changing process requirements and time constraints. Workflow technology is currently one of the most promising fields of research in business process automation. However, workflow systems to date do not provide the flexibility necessary to support the dynamic nature of business processes. In this paper we primarily discuss the issues and challenges related to managing change and time in workflows representing dynamic business processes. We also present an analysis of workflow modifications and provide feasibility considerations for the automation of this process.