57 results for Immunologic Databases
Abstract:
Spatial data is now used extensively in the Web environment, providing online customized maps and supporting map-based applications. The full potential of Web-based spatial applications, however, has yet to be achieved due to performance issues related to the large sizes and high complexity of spatial data. In this paper, we introduce a multiresolution approach to spatial data management and query processing such that the database server can choose spatial data at the right resolution level for different Web applications. One highly desirable property of the proposed approach is that the server-side processing cost and network traffic can be reduced when the level of resolution required by applications is low. Another advantage is that our approach pushes complex multiresolution structures and algorithms into the spatial database engine, so the developer of spatial Web applications need not be concerned with such complexity. This paper explains the basic idea, technical feasibility and applications of multiresolution spatial databases.
Abstract:
Spatial data mining has recently emerged from a number of real applications, such as real-estate marketing, urban planning, weather forecasting, medical image analysis, road traffic accident analysis, etc. It demands efficient solutions for many new, expensive, and complicated problems. In this paper, we investigate the problem of evaluating the top k distinguished “features” for a “cluster” based on weighted proximity relationships between the cluster and features. We measure proximity in an average fashion to address possible nonuniform data distribution in a cluster. Combining a standard multi-step paradigm with new lower and upper proximity bounds, we present an efficient algorithm to solve the problem. The algorithm is implemented in several different modes. Our experimental results not only give a comparison among them but also illustrate the efficiency of the algorithm.
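As a rough illustration of the ranking problem this abstract describes, the Python sketch below scores each feature by its weighted average distance to the points of a cluster and returns the k best-scoring features by brute force. Representing features as single points, the weighting convention, and all names are assumptions made for illustration; the sketch does not implement the multi-step algorithm or the lower and upper proximity bounds of the paper.

```python
import heapq
import math


def average_proximity(cluster, feature_point):
    """Mean Euclidean distance from every cluster point to one feature point."""
    return sum(math.dist(p, feature_point) for p in cluster) / len(cluster)


def top_k_features(cluster, features, k):
    """Return the k (score, name) pairs with the smallest weighted average distance.

    `features` maps a name to (point, weight). Dividing by the weight is an
    assumed convention: a larger weight makes a feature count as closer.
    """
    scored = [
        (average_proximity(cluster, point) / weight, name)
        for name, (point, weight) in features.items()
    ]
    return heapq.nsmallest(k, scored)


# Hypothetical data: a square cluster and three candidate features.
cluster = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
features = {
    "school": ((1.0, 1.0), 1.0),
    "park": ((5.0, 1.0), 2.0),
    "highway": ((10.0, 10.0), 1.0),
}
result = top_k_features(cluster, features, 2)
for score, name in result:
    print(f"{name}: {score:.3f}")
```

Averaging over all cluster points, rather than taking the minimum distance, keeps a single outlying point from dominating the score, which matches the motivation given in the abstract for nonuniform clusters.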
Abstract:
In recent years, the phrase 'genomic medicine' has increasingly been used to describe a new development in medicine that holds great promise for human health. This new approach to health care uses the knowledge of an individual's genetic make-up to identify those that are at a higher risk of developing certain diseases and to intervene at an earlier stage to prevent these diseases. Identifying genes that are involved in disease aetiology will provide researchers with tools to develop better treatments and cures. A major role within this field is attributed to 'predictive genomic medicine', which proposes screening healthy individuals to identify those who carry alleles that increase their susceptibility to common diseases, such as cancers and heart disease. Physicians could then intervene even before the disease manifests and advise individuals with a higher genetic risk to change their behaviour - for instance, to exercise or to eat a healthier diet - or offer drugs or other medical treatment to reduce their chances of developing these diseases. These promises have fallen on fertile ground among politicians, health-care providers and the general public, particularly in light of the increasing costs of health care in developed societies. Various countries have established databases on the DNA and health information of whole populations as a first step towards genomic medicine. Biomedical research has also identified a large number of genes that could be used to predict someone's risk of developing a certain disorder. But it would be premature to assume that genomic medicine will soon become reality, as many problems remain to be solved. Our knowledge about most disease genes and their roles is far from sufficient to make reliable predictions about a patient’s risk of actually developing a disease. In addition, genomic medicine will create new political, social, ethical and economic challenges that will have to be addressed in the near future.
Abstract:
Data mining is the process of identifying valid, implicit, previously unknown, potentially useful and understandable information from large databases. It is an important step in the process of knowledge discovery in databases (Olaru & Wehenkel, 1999). In a data mining process, input data can be structured, semi-structured, or unstructured. Data can be text, categorical or numerical values. One of the important characteristics of data mining is its ability to deal with data that are large in volume, distributed, time-variant, noisy, and high-dimensional. A large number of data mining algorithms have been developed for different applications. For example, association rules mining can be useful for market basket problems, clustering algorithms can be used to discover trends in unsupervised learning problems, classification algorithms can be applied in decision-making problems, and sequential and time series mining algorithms can be used in predicting events, fault detection, and other supervised learning problems (Vapnik, 1999). Classification is among the most important tasks in data mining, particularly for data mining applications in engineering fields. Together with regression, classification is mainly used for predictive modelling. So far, a number of classification algorithms have been used in practice. According to Sebastiani (2002), the main classification algorithms can be categorized as: decision tree and rule-based approaches such as C4.5 (Quinlan, 1996); probability methods such as the Bayesian classifier (Lewis, 1998); on-line methods such as Winnow (Littlestone, 1988) and CVFDT (Hulten, 2001); neural network methods (Rumelhart, Hinton & Williams, 1986); example-based methods such as k-nearest neighbors (Duda & Hart, 1973); and SVM (Cortes & Vapnik, 1995). Other important techniques for classification tasks include Associative Classification (Liu et al., 1998) and Ensemble Classification (Tumer, 1996).
Abstract:
Data describing the composition of dietary supplements are not readily available to the public health community. As a result, intake from dietary supplements is generally not considered in most dietary surveys and, hence, little is known about the significance of supplement intake in relation to total diet or disease risk. To enable a more comprehensive analysis of dietary data, a database of the composition of various dietary supplements has been compiled. Active ingredients of all dietary supplements sold in Australia are included in the Australian Register of Therapeutic Goods (ARTG), maintained by the Therapeutic Goods Administration. Products included in the database were restricted to those vitamin, mineral and other supplements identified in dietary data collected from studies conducted in southeast Queensland and New South Wales (850 supplements). Conversion factors from ingredient compounds to active elements were compiled from standard sources. No allowance has been made for bioavailability, consistent with current practice for food composition databases. The database can be queried by ARTG identification number, brand, product title, or a variety of other fields. Expected future developments include development of standard formulations for use when supplements are incompletely specified, and expansion of products included for more widespread use.
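The conversion from an ingredient compound to its active element can be illustrated with a short calculation. The Python sketch below derives a factor as the mass fraction of the element in the compound, using calcium carbonate as an assumed example. The atomic masses are standard values; the function names and the 1250 mg dose are hypothetical and are not taken from the database, which compiled its factors from standard sources.

```python
# Standard atomic masses in g/mol (only the elements needed for the example).
ATOMIC_MASS = {"Ca": 40.078, "C": 12.011, "O": 15.999}


def conversion_factor(formula, element):
    """Mass fraction of `element` in a compound given as {element: atom count}."""
    total = sum(ATOMIC_MASS[el] * count for el, count in formula.items())
    return ATOMIC_MASS[element] * formula[element] / total


def elemental_amount(compound_mg, formula, element):
    """Milligrams of the active element contained in `compound_mg` of compound."""
    return compound_mg * conversion_factor(formula, element)


# Hypothetical label entry: 1250 mg of calcium carbonate (CaCO3) per tablet.
calcium_carbonate = {"Ca": 1, "C": 1, "O": 3}
factor = conversion_factor(calcium_carbonate, "Ca")
calcium_mg = elemental_amount(1250.0, calcium_carbonate, "Ca")
print(f"factor = {factor:.4f}, elemental calcium = {calcium_mg:.1f} mg")
```

As in the abstract, the calculation takes no account of bioavailability: it reports how much of the element is present, not how much is absorbed.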
Abstract:
We used the expressed sequence tag (EST) approach to study the genome of the cattle tick Boophilus microplus. One hundred and forty-two of our 234 unique ESTs were from genes not previously identified from ticks, mites or any other arachnids. The largest class of identified ESTs (29%) was from genes involved in transcription and translation. Ninety-one ESTs (39% of all ESTs) did not match any sequences in international databases; some of these may be specific to ticks. Thirteen percent of our ESTs were from ribosomal proteins and two ESTs were for genes implicated in resistance to pesticides. (C) 1998 Chapman & Hall Ltd.
Abstract:
The cost of spatial join processing can be very high because of the large sizes of spatial objects and the computation-intensive spatial operations. While parallel processing seems a natural solution to this problem, it is not clear how spatial data can be partitioned for this purpose. Various spatial data partitioning methods are examined in this paper. A framework combining the data-partitioning techniques used by most parallel join algorithms in relational databases and the filter-and-refine strategy for spatial operation processing is proposed for parallel spatial join processing. Object duplication caused by multi-assignment in spatial data partitioning can result in extra CPU cost as well as extra communication cost. We find that the key to overcoming this problem is to preserve spatial locality in task decomposition. We show in this paper that a near-optimal speedup can be achieved for parallel spatial join processing using our new algorithms.
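To make the partitioning and duplication issue concrete, the Python sketch below partitions bounding boxes over a uniform grid, assigns each box to every cell it overlaps (multi-assignment), and joins the two inputs cell by cell. It is a minimal sequential sketch, not the algorithms of the paper: the uniform grid, the function names and the sample boxes are assumptions, only the filter step on bounding boxes is shown, and each cell would be an independent task in a parallel setting.

```python
from collections import defaultdict
from itertools import product


def cells_for(box, cell_size):
    """Grid cells overlapped by box = (xmin, ymin, xmax, ymax)."""
    xmin, ymin, xmax, ymax = box
    xs = range(int(xmin // cell_size), int(xmax // cell_size) + 1)
    ys = range(int(ymin // cell_size), int(ymax // cell_size) + 1)
    return product(xs, ys)


def intersects(a, b):
    """Filter step: do two bounding boxes overlap?"""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def partition(boxes, cell_size):
    """Multi-assignment: an object is copied into every cell it overlaps."""
    grid = defaultdict(list)
    for obj_id, box in boxes.items():
        for cell in cells_for(box, cell_size):
            grid[cell].append(obj_id)
    return grid


def spatial_join(r, s, cell_size):
    """Join cell by cell; the set removes pairs found in more than one cell."""
    grid_r = partition(r, cell_size)
    grid_s = partition(s, cell_size)
    pairs = set()
    for cell, r_ids in grid_r.items():
        for rid in r_ids:
            for sid in grid_s.get(cell, []):
                if intersects(r[rid], s[sid]):
                    pairs.add((rid, sid))
    return pairs


# Hypothetical inputs: r1 and s1 both span two cells, so the pair is found twice.
r = {"r1": (0, 0, 15, 5), "r2": (30, 30, 35, 35)}
s = {"s1": (8, 2, 18, 8), "s2": (40, 40, 45, 45)}
joined = spatial_join(r, s, cell_size=10)
print(joined)
```

The pair ("r1", "s1") is tested in two cells before the set discards the repeat, which is the extra CPU cost of object duplication that the abstract refers to.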
Abstract:
Objective: To determine the incidence of interval cancers which occurred in the first 12 months after mammographic screening at a mammographic screening service. Design: Retrospective analysis of data obtained by crossmatching the screening Service and the New South Wales Central Cancer Registry databases. Setting: The Central & Eastern Sydney Service of BreastScreen NSW. Participants: Women aged 40-69 years at first screen, who attended for their first or second screen between 1 March 1988 and 31 December 1992. Main outcome measures: Interval-cancer rates per 10 000 screens and as a proportion of the underlying incidence of breast cancer (as estimated by the underlying rate in the total NSW population). Results: The 12-month interval-cancer incidence per 10 000 screens was 4.17 for the 40-49 years age group (95% confidence interval [CI], 1.35-9.73) and 4.64 for the 50-69 years age group (95% CI, 2.47-7.94). Proportional incidence rates were 30.1% for the 40-49 years age group (95% CI, 9.8-70.3) and 22% for the 50-69 years age group (95% CI, 11.7-37.7). There was no significant difference between the proportional incidence rate for the 50-69 years age group for the Central & Eastern Sydney Service and those of major successful overseas screening trials. Conclusion: Screening quality was acceptable and should result in a significant mortality reduction in the screened population. Given the small number of cancers involved, comparison of interval-cancer statistics of mammographic screening programs with trials requires age-specific or age-adjusted data, and consideration of confidence intervals of both program and trial data.
Abstract:
The task of segmenting cell nuclei from cytoplasm in conventional Papanicolaou (Pap) stained cervical cell images is a classical image analysis problem which may prove to be crucial to the development of successful systems which automate the analysis of Pap smears for detection of cancer of the cervix. Although simple thresholding techniques will extract the nucleus in some cases, accurate unsupervised segmentation of very large image databases is elusive. Conventional active contour models as introduced by Kass, Witkin and Terzopoulos (1988) offer a number of advantages in this application, but suffer from the well-known drawbacks of initialisation and minimisation. Here we show that a Viterbi search-based dual active contour algorithm is able to overcome many of these problems and achieve over 99% accurate segmentation on a database of 20 130 Pap stained cell images. (C) 1998 Elsevier Science B.V. All rights reserved.
Abstract:
Objective: From Census data, to document the distribution of general practitioners in Australia and to estimate the number of general practitioners needed to achieve an equitable distribution accounting for community health need. Methods: Data on location of general practitioners, population size and crude mortality by statistical division (SD) were obtained from the Australian Bureau of Statistics. The number of patients per general practitioner by SD was calculated and plotted. Using crude mortality to estimate community health need, a ratio of the number of general practitioners per person:mortality was calculated for all Australia and for each SD (the Robin Hood Index). From this, the number of general practitioners needed to achieve equity was calculated. Results: In all, 26,290 general practitioners were identified in 57 SDs. The mean number of people per general practitioner is 707, ranging from 551 to 1887. Capital city SDs have the most favourable ratios. The Robin Hood Index for Australia is 1, and ranges from 0.32 (relatively under-served) to 2.46 (relatively over-served). Twelve SDs (21%), including all capital cities and 65% of all Australians, have a Robin Hood Index > 1. To achieve equity per capita, 2489 more general practitioners (10% of the current workforce) are needed. To achieve equity by the Robin Hood Index, 3351 (13% of the current workforce) are needed. Conclusions: The distribution of general practitioners in Australia is skewed. Nonmetropolitan areas are relatively under-served. Census data and the Robin Hood Index could provide a simple means of identifying areas of need in Australia.
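The index described in this abstract can be sketched under one plausible reading: general practitioners per person divided by crude mortality (deaths per person), expressed relative to the same ratio for the whole country so that the national value is 1. Population cancels in that ratio, leaving general practitioners per death. The Python sketch below follows this reading; the exact definition used in the paper may differ, and the function names and all figures are invented for illustration.

```python
def robin_hood_index(gps, deaths, national_gps, national_deaths):
    """GPs per death in a division, relative to the national GPs per death.

    Values below 1 suggest a relatively under-served division, values above 1
    a relatively over-served one (assumed reading of the abstract).
    """
    return (gps / deaths) / (national_gps / national_deaths)


def gps_needed_for_equity(gps, deaths, national_gps, national_deaths):
    """Extra GPs needed to lift a division to an index of 1 (0 if already there)."""
    target = deaths * national_gps / national_deaths
    return max(0.0, target - gps)


# Invented figures for three divisions: (number of GPs, annual deaths).
divisions = {"A": (600, 2000), "B": (150, 1000), "C": (250, 1000)}
national_gps = sum(g for g, _ in divisions.values())
national_deaths = sum(d for _, d in divisions.values())

indices = {
    name: robin_hood_index(g, d, national_gps, national_deaths)
    for name, (g, d) in divisions.items()
}
shortfall = {
    name: gps_needed_for_equity(g, d, national_gps, national_deaths)
    for name, (g, d) in divisions.items()
}
for name in divisions:
    print(f"{name}: index {indices[name]:.2f}, extra GPs needed {shortfall[name]:.0f}")
```

Summing the shortfall over under-served divisions gives one way to estimate the additional workforce needed for equity, on the assumption that over-served divisions keep their current numbers.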
Abstract:
We have isolated a family of insect-selective neurotoxins from the venom of the Australian funnel-web spider that appear to be good candidates for biopesticide engineering. These peptides, which we have named the Janus-faced atracotoxins (J-ACTXs), each contain 36 or 37 residues, with four disulfide bridges, and they show no homology to any sequences in the protein/DNA databases. The three-dimensional structure of one of these toxins reveals an extremely rare vicinal disulfide bridge that we demonstrate to be critical for insecticidal activity. We propose that J-ACTX comprises an ancestral protein fold that we refer to as the disulfide-directed beta-hairpin.