188 resultados para Source code (Computer science)
                                
Resumo:
A data warehouse is a data repository which collects and maintains a large amount of data from multiple distributed, autonomous and possibly heterogeneous data sources. Often the data is stored in the form of materialized views in order to provide fast access to the integrated data. One of the most important decisions in designing a data warehouse is the selection of views for materialization. The objective is to select an appropriate set of views that minimizes the total query response time with the constraint that the total maintenance time for these materialized views is within a given bound. This view selection problem is totally different from the view selection problem under the disk space constraint. In this paper the view selection problem under the maintenance time constraint is investigated. Two efficient, heuristic algorithms for the problem are proposed. The key to devising the proposed algorithms is to define good heuristic functions and to reduce the problem to some well-solved optimization problems. As a result, an approximate solution of the known optimization problem will give a feasible solution of the original problem. (C) 2001 Elsevier Science B.V. All rights reserved.
                                
Resumo:
This note gives a theory of state transition matrices for linear systems of fuzzy differential equations. This is used to give a fuzzy version of the classical variation of constants formula. A simple example of a time-independent control system is used to illustrate the methods. While similar problems to the crisp case arise for time-dependent systems, in time-independent cases the calculations are elementary solutions of eigenvalue-eigenvector problems. In particular, for nonnegative or nonpositive matrices, the problems at each level set, can easily be solved in MATLAB to give the level sets of the fuzzy solution. (C) 2002 Elsevier Science B.V. All rights reserved.
                                
Resumo:
Loss networks have long been used to model various types of telecommunication network, including circuit-switched networks. Such networks often use admission controls, such as trunk reservation, to optimize revenue or stabilize the behaviour of the network. Unfortunately, an exact analysis of such networks is not usually possible, and reduced-load approximations such as the Erlang Fixed Point (EFP) approximation have been widely used. The performance of these approximations is typically very good for networks without controls, under several regimes. There is evidence, however, that in networks with controls, these approximations will in general perform less well. We propose an extension to the EFP approximation that gives marked improvement for a simple ring-shaped network with trunk reservation. It is based on the idea of considering pairs of links together, thus making greater allowance for dependencies between neighbouring links than does the EFP approximation, which only considers links in isolation.
                                
Resumo:
Binning and truncation of data are common in data analysis and machine learning. This paper addresses the problem of fitting mixture densities to multivariate binned and truncated data. The EM approach proposed by McLachlan and Jones (Biometrics, 44: 2, 571-578, 1988) for the univariate case is generalized to multivariate measurements. The multivariate solution requires the evaluation of multidimensional integrals over each bin at each iteration of the EM procedure. Naive implementation of the procedure can lead to computationally inefficient results. To reduce the computational cost a number of straightforward numerical techniques are proposed. Results on simulated data indicate that the proposed methods can achieve significant computational gains with no loss in the accuracy of the final parameter estimates. Furthermore, experimental results suggest that with a sufficient number of bins and data points it is possible to estimate the true underlying density almost as well as if the data were not binned. The paper concludes with a brief description of an application of this approach to diagnosis of iron deficiency anemia, in the context of binned and truncated bivariate measurements of volume and hemoglobin concentration from an individual's red blood cells.
                                
Resumo:
Formulations of fuzzy integral equations in terms of the Aumann integral do not reflect the behavior of corresponding crisp models. Consequently, they are ill-adapted to describe physical phenomena, even when vagueness and uncertainty are present. A similar situation for fuzzy ODEs has been obviated by interpretation in terms of families of differential inclusions. The paper extends this formalism to fuzzy integral equations and shows that the resulting solution sets and attainability sets are fuzzy and far better descriptions of uncertain models involving integral equations. The investigation is restricted to Volterra type equations with mildly restrictive conditions, but the methods are capable of extensive generalization to other types and more general assumptions. The results are illustrated by integral equations relating to control models with fuzzy uncertainties.
                                
Resumo:
Motivation: This paper introduces the software EMMIX-GENE that has been developed for the specific purpose of a model-based approach to the clustering of microarray expression data, in particular, of tissue samples on a very large number of genes. The latter is a nonstandard problem in parametric cluster analysis because the dimension of the feature space (the number of genes) is typically much greater than the number of tissues. A feasible approach is provided by first selecting a subset of the genes relevant for the clustering of the tissue samples by fitting mixtures of t distributions to rank the genes in order of increasing size of the likelihood ratio statistic for the test of one versus two components in the mixture model. The imposition of a threshold on the likelihood ratio statistic used in conjunction with a threshold on the size of a cluster allows the selection of a relevant set of genes. However, even this reduced set of genes will usually be too large for a normal mixture model to be fitted directly to the tissues, and so the use of mixtures of factor analyzers is exploited to reduce effectively the dimension of the feature space of genes. Results: The usefulness of the EMMIX-GENE approach for the clustering of tissue samples is demonstrated on two well-known data sets on colon and leukaemia tissues. For both data sets, relevant subsets of the genes are able to be selected that reveal interesting clusterings of the tissues that are either consistent with the external classification of the tissues or with background and biological knowledge of these sets.
                                
Resumo:
This paper presents the results of my action research. I was involved in establishing and running a digital library that was founded by the government of South Korea. The process involved understanding the relationship between the national IT infrastructure and the success factors of the digital library. In building, the national IT infrastructure, a digital library system was implemented; it combines all existing digitized university libraries and can provide overseas information, such as foreign journal articles, instantly and freely to every Korean researcher. An empirical survey was made as a part of the action research; the survey determined user satisfaction in the newly established national digital library. After obtaining the survey results, I suggested that the current way of running the nationwide government-owned digital library should be retained. (C) 2002 Elsevier Science B.V. All rights reserved.
                                
Resumo:
At the core of the analysis task in the development process is information systems requirements modelling, Modelling of requirements has been occurring for many years and the techniques used have progressed from flowcharting through data flow diagrams and entity-relationship diagrams to object-oriented schemas today. Unfortunately, researchers have been able to give little theoretical guidance only to practitioners on which techniques to use and when. In an attempt to address this situation, Wand and Weber have developed a series of models based on the ontological theory of Mario Bunge-the Bunge-Wand-Weber (BWW) models. Two particular criticisms of the models have persisted however-the understandability of the constructs in the BWW models and the difficulty in applying the models to a modelling technique. This paper addresses these issues by presenting a meta model of the BWW constructs using a meta language that is familiar to many IS professionals, more specific than plain English text, but easier to understand than the set-theoretic language of the original BWW models. Such a meta model also facilitates the application of the BWW theory to other modelling techniques that have similar meta models defined. Moreover, this approach supports the identification of patterns of constructs that might be common across meta models for modelling techniques. Such findings are useful in extending and refining the BWW theory. (C) 2002 Elsevier Science Ltd. All rights reserved.
                                
                                
Resumo:
Motivation: A consensus sequence for a family of related sequences is, as the name suggests, a sequence that captures the features common to most members of the family. Consensus sequences are important in various DNA sequencing applications and are a convenient way to characterize a family of molecules. Results: This paper describes a new algorithm for finding a consensus sequence, using the popular optimization method known as simulated annealing. Unlike the conventional approach of finding a consensus sequence by first forming a multiple sequence alignment, this algorithm searches for a sequence that minimises the sum of pairwise distances to each of the input sequences. The resulting consensus sequence can then be used to induce a multiple sequence alignment. The time required by the algorithm scales linearly with the number of input sequences and quadratically with the length of the consensus sequence. We present results demonstrating the high quality of the consensus sequences and alignments produced by the new algorithm. For comparison, we also present similar results obtained using ClustalW. The new algorithm outperforms ClustalW in many cases.
                                
Resumo:
Management are keen to maximize the life span of an information system because of the high cost, organizational disruption, and risk of failure associated with the re-development or replacement of an information system. This research investigates the effects that various factors have on an information system's life span by understanding how the factors affect an information system's stability. The research builds on a previously developed two-stage model of information system change whereby an information system is either in a stable state of evolution in which the information system's functionality is evolving, or in a state of revolution, in which the information system is being replaced because it is not providing the functionality expected by its users. A case study surveyed a number of systems within one organization. The aim was to test whether a relationship existed between the base value of the volatility index (a measure of the stability of an information system) and certain system characteristics. Data relating to some 3000 user change requests covering 40 systems over a 10-year period were obtained. The following factors were hypothesized to have significant associations with the base value of the volatility index: language level (generation of language of construction), system size, system age, and the timing of changes applied to a system. Significant associations were found in the hypothesized directions except that the timing of user changes was not associated with any change in the value of the volatility index. Copyright (C) 2002 John Wiley Sons, Ltd.
                                
Resumo:
Within the information systems field, the task of conceptual modeling involves building a representation of selected phenomena in some domain. High-quality conceptual-modeling work is important because it facilitates early detection and correction of system development errors. It also plays an increasingly important role in activities like business process reengineering and documentation of best-practice data and process models in enterprise resource planning systems. Yet little research has been undertaken on many aspects of conceptual modeling. In this paper, we propose a framework to motivate research that addresses the following fundamental question: How can we model the world to better facilitate our developing, implementing, using, and maintaining more valuable information systems? The framework comprises four elements: conceptual-modeling grammars, conceptual-modeling methods, conceptual-modeling scripts, and conceptual-modeling contexts. We provide examples of the types of research that have already been undertaken on each element and illustrate research opportunities that exist.
                                
Resumo:
It is common for a real-time system to contain a nonterminating process monitoring an input and controlling an output. Hence, a real-time program development method needs to support nonterminating repetitions. In this paper we develop a general proof rule for reasoning about possibly nonterminating repetitions. The rule makes use of a Floyd-Hoare-style loop invariant that is maintained by each iteration of the repetition, a Jones-style relation between the pre- and post-states on each iteration, and a deadline specifying an upper bound on the starting time of each iteration. The general rule is proved correct with respect to a predicative semantics. In the case of a terminating repetition the rule reduces to the standard rule extended to handle real time. Other special cases include repetitions whose bodies are guaranteed to terminate, nonterminating repetitions with the constant true as a guard, and repetitions whose termination is guaranteed by the inclusion of a fixed deadline. (C) 2002 Elsevier Science B.V. All rights reserved.
                                
Resumo:
The suitable use of array antennas in cellular systems results in improvement in the signal-to-interference ratio (StR), This property is the basis for introducing smart or adaptive antenna systems. in general, the SIR depends on the array configuration and is a function of the direction of the desired user and interferers. Here, the SIR performance for linear and circular arrays is analysed and compared.
                                
Resumo:
We reinterpret the state space dimension equations for geometric Goppa codes. An easy consequence is that if deg G less than or equal to n-2/2 or deg G greater than or equal to n-2/2 + 2g then the state complexity of C-L(D, G) is equal to the Wolf bound. For deg G is an element of [n-1/2, n-3/2 + 2g], we use Clifford's theorem to give a simple lower bound on the state complexity of C-L(D, G). We then derive two further lower bounds on the state space dimensions of C-L(D, G) in terms of the gonality sequence of F/F-q. (The gonality sequence is known for many of the function fields of interest for defining geometric Goppa codes.) One of the gonality bounds uses previous results on the generalised weight hierarchy of C-L(D, G) and one follows in a straightforward way from first principles; often they are equal. For Hermitian codes both gonality bounds are equal to the DLP lower bound on state space dimensions. We conclude by using these results to calculate the DLP lower bound on state complexity for Hermitian codes.
 
                    