30 resultados para cloud-based applications
Resumo:
Data mining is the process to identify valid, implicit, previously unknown, potentially useful and understandable information from large databases. It is an important step in the process of knowledge discovery in databases, (Olaru & Wehenkel, 1999). In a data mining process, input data can be structured, seme-structured, or unstructured. Data can be in text, categorical or numerical values. One of the important characteristics of data mining is its ability to deal data with large volume, distributed, time variant, noisy, and high dimensionality. A large number of data mining algorithms have been developed for different applications. For example, association rules mining can be useful for market basket problems, clustering algorithms can be used to discover trends in unsupervised learning problems, classification algorithms can be applied in decision-making problems, and sequential and time series mining algorithms can be used in predicting events, fault detection, and other supervised learning problems (Vapnik, 1999). Classification is among the most important tasks in the data mining, particularly for data mining applications into engineering fields. Together with regression, classification is mainly for predictive modelling. So far, there have been a number of classification algorithms in practice. According to (Sebastiani, 2002), the main classification algorithms can be categorized as: decision tree and rule based approach such as C4.5 (Quinlan, 1996); probability methods such as Bayesian classifier (Lewis, 1998); on-line methods such as Winnow (Littlestone, 1988) and CVFDT (Hulten 2001), neural networks methods (Rumelhart, Hinton & Wiliams, 1986); example-based methods such as k-nearest neighbors (Duda & Hart, 1973), and SVM (Cortes & Vapnik, 1995). Other important techniques for classification tasks include Associative Classification (Liu et al, 1998) and Ensemble Classification (Tumer, 1996).
Resumo:
Interactive health communication using Internet technologies is expanding the range and flexibility of intervention and teaching options available in preventive medicine and the health sciences. Advantages of interactive health communication include the enhanced convenience, novelty, and appeal of computer-mediated communication; its flexibility and interactivity; and automated processing. We outline some of these fundamental aspects of computer-mediated communication as it applies to preventive medicine. Further, a number of key pathways of information technology evolution are creating new opportunities for the delivery of professional education in preventive medicine and other health domains, as well as for delivering automated, self-instructional health behavior-change programs through the Internet. We briefly describe several of these key evolutionary pathways, We describe some examples from work we have done in Australia. These demonstrate how we have creatively responded to the challenges of these new information environments, and how they may be pursued in the education of preventive medicine and other health care practitioners and in the development and delivery of health behavior change programs through the Internet. Innovative and thoughtful applications of this new technology can increase the consistency, reliability, and quality of information delivered.
Resumo:
The World Wide Web (WWW) is useful for distributing scientific data. Most existing web data resources organize their information either in structured flat files or relational databases with basic retrieval capabilities. For databases with one or a few simple relations, these approaches are successful, but they can be cumbersome when there is a data model involving multiple relations between complex data. We believe that knowledge-based resources offer a solution in these cases. Knowledge bases have explicit declarations of the concepts in the domain, along with the relations between them. They are usually organized hierarchically, and provide a global data model with a controlled vocabulary, We have created the OWEB architecture for building online scientific data resources using knowledge bases. OWEB provides a shell for structuring data, providing secure and shared access, and creating computational modules for processing and displaying data. In this paper, we describe the translation of the online immunological database MHCPEP into an OWEB system called MHCWeb. This effort involved building a conceptual model for the data, creating a controlled terminology for the legal values for different types of data, and then translating the original data into the new structure. The 0 WEB environment allows for flexible access to the data by both users and computer programs.
Resumo:
Computational models complement laboratory experimentation for efficient identification of MHC-binding peptides and T-cell epitopes. Methods for prediction of MHC-binding peptides include binding motifs, quantitative matrices, artificial neural networks, hidden Markov models, and molecular modelling. Models derived by these methods have been successfully used for prediction of T-cell epitopes in cancer, autoimmunity, infectious disease, and allergy. For maximum benefit, the use of computer models must be treated as experiments analogous to standard laboratory procedures and performed according to strict standards. This requires careful selection of data for model building, and adequate testing and validation. A range of web-based databases and MHC-binding prediction programs are available. Although some available prediction programs for particular MHC alleles have reasonable accuracy, there is no guarantee that all models produce good quality predictions. In this article, we present and discuss a framework for modelling, testing, and applications of computational methods used in predictions of T-cell epitopes. (C) 2004 Elsevier Inc. All rights reserved.
Resumo:
The field of protein crystallography inspires and enthrals, whether it be for the beauty and symmetry of a perfectly formed protein crystal, the unlocked secrets of a novel protein fold, or the precise atomic-level detail yielded from a protein-ligand complex. Since 1958, when the first protein structure was solved, there have been tremendous advances in all aspects of protein crystallography, from protein preparation and crystallisation through to diffraction data measurement and structure refinement. These advances have significantly reduced the time required to solve protein crystal structures, while at the same time substantially improving the quality and resolution of the resulting structures. Moreover, the technological developments have induced researchers to tackle ever more complex systems, including ribosomes and intact membrane-bound proteins, with a reasonable expectation of success. In this review, the steps involved in determining a protein crystal structure are described and the impact of recent methodological advances identified. Protein crystal structures have proved to be extraordinarily useful in medicinal chemistry research, particularly with respect to inhibitor design. The precise interaction between a drug and its receptor can be visualised at the molecular level using protein crystal structures, and this information then used to improve the complementarity and thus increase the potency and selectivity of an inhibitor. The use of protein crystal structures in receptor-based drug design is highlighted by (i) HIV protease, (ii) influenza virus neuraminidase and (iii) prostaglandin H-2-synthetase. These represent, respectively, examples of protein crystal structures that (i) influenced the design of drugs currently approved for use in the treatment of HIV infection, (ii) led to the design of compounds currently in clinical trials for the treatment of influenza infection and (iii) could enable the design of highly specific non-steroidal anti-inflammatory drugs that lack the common side-effects of this drug class.
Resumo:
In the design of lattice domes, design engineers need expertise in areas such as configuration processing, nonlinear analysis, and optimization. These are extensive numerical, iterative, and lime-consuming processes that are prone to error without an integrated design tool. This article presents the application of a knowledge-based system in solving lattice-dome design problems. An operational prototype knowledge-based system, LADOME, has been developed by employing the combined knowledge representation approach, which uses rules, procedural methods, and an object-oriented blackboard concept. The system's objective is to assist engineers in lattice-dome design by integrating all design tasks into a single computer-aided environment with implementation of the knowledge-based system approach. For system verification, results from design examples are presented.
Resumo:
Most Internet search engines are keyword-based. They are not efficient for the queries where geographical location is important, such as finding hotels within an area or close to a place of interest. A natural interface for spatial searching is a map, which can be used not only to display locations of search results but also to assist forming search conditions. A map-based search engine requires a well-designed visual interface that is intuitive to use yet flexible and expressive enough to support various types of spatial queries as well as aspatial queries. Similar to hyperlinks for text and images in an HTML page, spatial objects in a map should support hyperlinks. Such an interface needs to be scalable with the size of the geographical regions and the number of websites it covers. In spite of handling typically a very large amount of spatial data, a map-based search interface should meet the expectation of fast response time for interactive applications. In this paper we discuss general requirements and the design for a new map-based web search interface, focusing on integration with the WWW and visual spatial query interface. A number of current and future research issues are discussed, and a prototype for the University of Queensland is presented. (C) 2001 Published by Elsevier Science Ltd.
Resumo:
Dendritic cells (DC) are now recognised as a unique leukocyte type, consisting of two or more subsets. The origins and functional inter-relationships of these cells are the subject of intense basic scientific investigation. They play important roles in initiating and directing immune responses, defending the host from pathogens and maintaining self tolerance. Fundamental studies are defining new molecules and mechanisms associated with DC function. The first methods for counting these rare blood cell populations are already providing interesting new clinical data. Indeed, abnormal DC function may contribute to deficiencies in the immune response against malignancies. Phase I trial data suggests that DC-based cancer vaccination protocols may contribute an important new biological approach to cancer therapy. Manipulation of DC to facilitate allogeneic transplantation and even to manage autoimmune disease are likely developments.
Resumo:
Motivation: This paper introduces the software EMMIX-GENE that has been developed for the specific purpose of a model-based approach to the clustering of microarray expression data, in particular, of tissue samples on a very large number of genes. The latter is a nonstandard problem in parametric cluster analysis because the dimension of the feature space (the number of genes) is typically much greater than the number of tissues. A feasible approach is provided by first selecting a subset of the genes relevant for the clustering of the tissue samples by fitting mixtures of t distributions to rank the genes in order of increasing size of the likelihood ratio statistic for the test of one versus two components in the mixture model. The imposition of a threshold on the likelihood ratio statistic used in conjunction with a threshold on the size of a cluster allows the selection of a relevant set of genes. However, even this reduced set of genes will usually be too large for a normal mixture model to be fitted directly to the tissues, and so the use of mixtures of factor analyzers is exploited to reduce effectively the dimension of the feature space of genes. Results: The usefulness of the EMMIX-GENE approach for the clustering of tissue samples is demonstrated on two well-known data sets on colon and leukaemia tissues. For both data sets, relevant subsets of the genes are able to be selected that reveal interesting clusterings of the tissues that are either consistent with the external classification of the tissues or with background and biological knowledge of these sets.
Resumo:
The enormous amount of information generated through sequencing of the human genome has increased demands for more economical and flexible alternatives in genomics, proteomics and drug discovery. Many companies and institutions have recognised the potential of increasing the size and complexity of chemical libraries by producing large chemical libraries on colloidal support beads. Since colloid-based compounds in a suspension are randomly located, an encoding system such as optical barcoding is required to permit rapid elucidation of the compound structures. We describe in this article innovative methods for optical barcoding of colloids for use as support beads in both combinatorial and non-combinatorial libraries. We focus in particular on the difficult problem of barcoding extremely large libraries, which if solved, will transform the manner in which genomics, proteomics and drug discovery research is currently performed.
Resumo:
An efficient representation method for arbitrarily shaped image segments is proposed. This method includes a smart way to select wavelet basis to approximate the given image segment, with improved image quality and reduced computational load.
Resumo:
The majority of the world's population now resides in urban environments and information on the internal composition and dynamics of these environments is essential to enable preservation of certain standards of living. Remotely sensed data, especially the global coverage of moderate spatial resolution satellites such as Landsat, Indian Resource Satellite and Systeme Pour I'Observation de la Terre (SPOT), offer a highly useful data source for mapping the composition of these cities and examining their changes over time. The utility and range of applications for remotely sensed data in urban environments could be improved with a more appropriate conceptual model relating urban environments to the sampling resolutions of imaging sensors and processing routines. Hence, the aim of this work was to take the Vegetation-Impervious surface-Soil (VIS) model of urban composition and match it with the most appropriate image processing methodology to deliver information on VIS composition for urban environments. Several approaches were evaluated for mapping the urban composition of Brisbane city (south-cast Queensland, Australia) using Landsat 5 Thematic Mapper data and 1:5000 aerial photographs. The methods evaluated were: image classification; interpretation of aerial photographs; and constrained linear mixture analysis. Over 900 reference sample points on four transects were extracted from the aerial photographs and used as a basis to check output of the classification and mixture analysis. Distinctive zonations of VIS related to urban composition were found in the per-pixel classification and aggregated air-photo interpretation; however, significant spectral confusion also resulted between classes. In contrast, the VIS fraction images produced from the mixture analysis enabled distinctive densities of commercial, industrial and residential zones within the city to be clearly defined, based on their relative amount of vegetation cover. The soil fraction image served as an index for areas being (re)developed. The logical match of a low (L)-resolution, spectral mixture analysis approach with the moderate spatial resolution image data, ensured the processing model matched the spectrally heterogeneous nature of the urban environments at the scale of Landsat Thematic Mapper data.
Resumo:
A new algorithm has been developed for smoothing the surfaces in finite element formulations of contact-impact. A key feature of this method is that the smoothing is done implicitly by constructing smooth signed distance functions for the bodies. These functions are then employed for the computation of the gap and other variables needed for implementation of contact-impact. The smoothed signed distance functions are constructed by a moving least-squares approximation with a polynomial basis. Results show that when nodes are placed on a surface, the surface can be reproduced with an error of about one per cent or less with either a quadratic or a linear basis. With a quadratic basis, the method exactly reproduces a circle or a sphere even for coarse meshes. Results are presented for contact problems involving the contact of circular bodies. Copyright (C) 2002 John Wiley Sons, Ltd.
Resumo:
Motion study is an engineering technology that analyzes human body motions. During the past decade (1990-1999) a series of studies investigated the role of motion study in developmental disabilities. This article reviews the literature on the applications of motion study in the field. A historical and conceptual review of motion study leading to the current status of studies is presented followed by a review of the research literature. Two main eras of research focus were identified. The first era (1990-1995) of studies established the superior effectiveness and efficiency of tasks designed with motion study or motion study-related principles over traditional site-based task designs. The second era (1995-1999) of studies examined the interaction between motion study-based task designs and other variables such as choice, preference, and functionally equivalent and competing task designs and communicative alternatives. Our review found that applying motion study principles as an antecedent guide and practice to eliminating or reducing ineffective motions and simplifying effective motions resulted in positive task outcomes with most of the participants.
Resumo:
Motivation: A major issue in cell biology today is how distinct intracellular regions of the cell, like the Golgi Apparatus, maintain their unique composition of proteins and lipids. The cell differentially separates Golgi resident proteins from proteins that move through the organelle to other subcellular destinations. We set out to determine if we could distinguish these two types of transmembrane proteins using computational approaches. Results: A new method has been developed to predict Golgi membrane proteins based on their transmembrane domains. To establish the prediction procedure, we took the hydrophobicity values and frequencies of different residues within the transmembrane domains into consideration. A simple linear discriminant function was developed with a small number of parameters derived from a dataset of Type II transmembrane proteins of known localization. This can discriminate between proteins destined for Golgi apparatus or other locations (post-Golgi) with a success rate of 89.3% or 85.2%, respectively on our redundancy-reduced data sets.