906 results for computational cost
Abstract:
Matrix decompositions, where a given matrix is represented as a product of two other matrices, are regularly used in data mining. Most matrix decompositions have their roots in linear algebra, but the needs of data mining are not always those of linear algebra. In data mining one needs results that are interpretable -- and what is considered interpretable in data mining can be very different from what is considered interpretable in linear algebra. The purpose of this thesis is to study matrix decompositions that directly address the issue of interpretability. An example is a decomposition of binary matrices where the factor matrices are assumed to be binary and the matrix multiplication is Boolean. The restriction to binary factor matrices increases interpretability -- the factor matrices are of the same type as the original matrix -- and allows the use of Boolean matrix multiplication, which is often more intuitive than normal matrix multiplication with binary matrices. Several other decomposition methods are also described, and the computational complexity of computing them is studied together with the hardness of approximating the related optimization problems. Based on these studies, algorithms for constructing the decompositions are proposed. Constructing the decompositions turns out to be computationally hard, and the proposed algorithms are mostly based on various heuristics. Nevertheless, the algorithms are shown to be capable of finding good results in empirical experiments conducted with both synthetic and real-world data.
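To make the Boolean decomposition model concrete, here is a minimal sketch of Boolean matrix multiplication and the reconstruction error it is judged against. The toy matrices and the threshold-after-multiply trick are illustrative only; they are not taken from the thesis.

```python
import numpy as np

def boolean_product(B, C):
    """Boolean matrix product: entry (i, j) is the OR over k of (B[i, k] AND C[k, j])."""
    return (B @ C > 0).astype(int)  # ordinary product, then threshold: any overlap counts

def reconstruction_error(A, B, C):
    """Number of cells where the Boolean product of the factors differs from A."""
    return int(np.sum(boolean_product(B, C) != A))

# Toy example: a 3x4 binary matrix with an exact Boolean rank-2 decomposition.
A = np.array([[1, 1, 0, 0],
              [1, 1, 1, 1],
              [0, 0, 1, 1]])
B = np.array([[1, 0],
              [1, 1],
              [0, 1]])
C = np.array([[1, 1, 0, 0],
              [0, 0, 1, 1]])
print(reconstruction_error(A, B, C))  # 0
```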
Abstract:
This thesis presents methods for locating and analyzing cis-regulatory DNA elements involved with the regulation of gene expression in multicellular organisms. The regulation of gene expression is carried out by the combined effort of several transcription factor proteins collectively binding the DNA on the cis-regulatory elements. Only sparse knowledge of the 'genetic code' of these elements exists today. An automatic tool for discovery of putative cis-regulatory elements could help their experimental analysis, which would result in a more detailed view of the cis-regulatory element structure and function. We have developed a computational model for the evolutionary conservation of cis-regulatory elements. The elements are modeled as evolutionarily conserved clusters of sequence-specific transcription factor binding sites. We give an efficient dynamic programming algorithm that locates the putative cis-regulatory elements and scores them according to the conservation model. A notable proportion of the high-scoring DNA sequences show transcriptional enhancer activity in transgenic mouse embryos. The conservation model includes four parameters whose optimal values are estimated with simulated annealing. With good parameter values the model discriminates well between the DNA sequences with evolutionarily conserved cis-regulatory elements and the DNA sequences that have evolved neutrally. In further inquiry, the set of highest scoring putative cis-regulatory elements were found to be sensitive to small variations in the parameter values. The statistical significance of the putative cis-regulatory elements is estimated with the Two Component Extreme Value Distribution. The p-values grade the conservation of the cis-regulatory elements above the neutral expectation. The parameter values for the distribution are estimated by simulating the neutral DNA evolution. The conservation of the transcription factor binding sites can be used in the upstream analysis of regulatory interactions. This approach may provide mechanistic insight to the transcription level data from, e.g., microarray experiments. Here we give a method to predict shared transcriptional regulators for a set of co-expressed genes. The EEL (Enhancer Element Locator) software implements the method for locating putative cis-regulatory elements. The software facilitates both interactive use and distributed batch processing. We have used it to analyze the non-coding regions around all human genes with respect to the orthologous regions in various other species including mouse. The data from these genome-wide analyzes is stored in a relational database which is used in the publicly available web services for upstream analysis and visualization of the putative cis-regulatory elements in the human genome.
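The abstract does not reproduce the algorithm itself. The following much-simplified sketch only conveys the general idea of scoring a cluster of conserved binding sites with dynamic programming; the linear gap penalty and the hit scores are invented and are not EEL's actual conservation model.

```python
def best_cluster_score(hits, gap_penalty=0.01):
    """hits: list of (position, score) for binding sites conserved in both species,
    sorted by position. Returns the best score of a chain of sites, where extending
    the chain costs gap_penalty per base between consecutive sites."""
    dp = []      # dp[i] = best score of a cluster ending at hit i
    best = 0.0
    for i, (pos_i, score_i) in enumerate(hits):
        dp_i = score_i                      # option 1: start a new cluster at site i
        for j in range(i):                  # option 2: extend a cluster ending at site j
            pos_j, _ = hits[j]
            dp_i = max(dp_i, dp[j] + score_i - gap_penalty * (pos_i - pos_j))
        dp.append(dp_i)
        best = max(best, dp_i)
    return best

# Example: two nearby conserved sites form the best cluster; a distant weak site does not help.
print(best_cluster_score([(100, 3.0), (130, 2.5), (900, 1.0)]))  # 5.2
```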
Abstract:
In visual object detection and recognition, classifiers have two interesting characteristics: accuracy and speed. Accuracy depends on the complexity of the image features and classifier decision surfaces. Speed depends on the hardware and the computational effort required to use the features and decision surfaces. When attempts to increase accuracy lead to increases in complexity and effort, it is necessary to ask how much we are willing to pay for increased accuracy. For example, if increased computational effort implies quickly diminishing returns in accuracy, then those designing inexpensive surveillance applications cannot aim for maximum accuracy at any cost. It becomes necessary to find trade-offs between accuracy and effort. We study efficient classification of images depicting real-world objects and scenes. Classification is efficient when a classifier can be controlled so that the desired trade-off between accuracy and effort (speed) is achieved and unnecessary computations are avoided on a per-input basis. A framework is proposed for understanding and modeling efficient classification of images. Classification is modeled as a tree-like process. In designing the framework, it is important to recognize what is essential and to avoid structures that are narrow in applicability; earlier frameworks are lacking in this regard. The overall contribution is two-fold. First, the framework is presented, subjected to experiments, and shown to be satisfactory. Second, certain unconventional approaches are experimented with, which allows the separation of the essential from the conventional. To determine whether the framework is satisfactory, three categories of questions are identified: trade-off optimization, classifier tree organization, and rules for delegation and confidence modeling. Questions and problems related to each category are addressed and empirical results are presented. For example, related to trade-off optimization, we address the problem of computational bottlenecks that limit the range of trade-offs. We also ask whether accuracy-versus-effort trade-offs can be controlled after training. For another example, regarding classifier tree organization, we first consider the task of organizing a tree in a problem-specific manner. We then ask whether problem-specific organization is necessary.
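As a toy illustration of controlling accuracy versus effort on a per-input basis, the sketch below runs a cheap stage first and delegates to a costlier one only when confidence is low. It is a linear chain of two invented stages with an arbitrary threshold, not the tree-structured framework proposed in the thesis.

```python
class Stage:
    """A classification stage with a fixed computational cost and a scoring function."""
    def __init__(self, cost, score_fn):
        self.cost = cost
        self.score_fn = score_fn  # maps an input to {label: confidence}

    def predict(self, x):
        return self.score_fn(x)

def classify(x, stages, confidence_threshold=0.8):
    """Walk along a chain of stages, stopping as soon as one is confident enough.
    Returns (label, total_effort)."""
    effort = 0
    label = None
    for stage in stages:
        effort += stage.cost
        scores = stage.predict(x)
        label, conf = max(scores.items(), key=lambda kv: kv[1])
        if conf >= confidence_threshold:
            break  # confident enough: skip the remaining, more expensive stages
    return label, effort

# A cheap stage that is confident only for "easy" inputs, and a costly fallback stage.
cheap = Stage(1, lambda x: {"cat": 0.95, "dog": 0.05} if x == "easy" else {"cat": 0.55, "dog": 0.45})
heavy = Stage(10, lambda x: {"cat": 0.30, "dog": 0.70})
print(classify("easy", [cheap, heavy]))  # ('cat', 1)  -- early exit
print(classify("hard", [cheap, heavy]))  # ('dog', 11) -- delegated to the heavy stage
```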
Abstract:
The publish/subscribe paradigm has lately received much attention. In publish/subscribe systems, a specialized event-based middleware delivers notifications of events created by producers (publishers) to consumers (subscribers) interested in that particular event. It is considered a good approach for implementing Internet-wide distributed systems, as it provides full decoupling of the communicating parties in time, space and synchronization. One flavor of the paradigm is content-based publish/subscribe, which allows subscribers to express their interests very accurately. In order to implement a content-based publish/subscribe middleware in a way suitable for Internet scale, its underlying architecture must be organized as a peer-to-peer network of content-based routers that take care of forwarding the event notifications to all interested subscribers. A communication infrastructure that provides such a service is called a content-based network, which is an application-level overlay network. Unfortunately, the expressiveness of the content-based interaction scheme comes at a price: compiling and maintaining the content-based forwarding and routing tables is very expensive when the number of nodes in the network is large. The routing tables are usually partially ordered set (poset) based data structures. In this work, we present an algorithm that aims to improve scalability in content-based networks by reducing the workload of content-based routers, offloading some of their content routing cost to clients. We also provide experimental results on the performance of the algorithm. Additionally, we give an introduction to the publish/subscribe paradigm and content-based networking and discuss alternative ways of improving scalability in content-based networks. ACM Computing Classification System (CCS): C.2.4 [Computer-Communication Networks]: Distributed Systems - Distributed applications
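The snippet below is a minimal model of content-based matching at a single router: a notification is forwarded to every subscriber whose filter it satisfies. The attribute/operator format is invented for illustration; it does not reproduce the poset-based routing tables or the offloading algorithm described in the thesis.

```python
def matches(notification, filter_):
    """filter_: dict of attribute -> (operator, value), e.g. {"price": ("<", 100)}."""
    ops = {"=": lambda a, b: a == b, "<": lambda a, b: a < b, ">": lambda a, b: a > b}
    return all(attr in notification and ops[op](notification[attr], val)
               for attr, (op, val) in filter_.items())

def forward(notification, forwarding_table):
    """forwarding_table: dict of subscriber -> filter. Returns the interested subscribers."""
    return [sub for sub, filt in forwarding_table.items() if matches(notification, filt)]

table = {
    "alice": {"type": ("=", "quote"), "price": ("<", 100)},
    "bob":   {"type": ("=", "quote"), "price": (">", 50)},
}
print(forward({"type": "quote", "price": 75}, table))  # ['alice', 'bob']
print(forward({"type": "news"}, table))                # []
```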
Abstract:
Background The objective is to estimate the incremental cost-effectiveness of the Australian National Hand Hygiene Initiative implemented between 2009 and 2012, using healthcare-associated Staphylococcus aureus bacteraemia as the outcome. Baseline comparators are the eight existing state and territory hand hygiene programmes. The setting is the Australian public healthcare system, and 1,294,656 admissions from the 50 largest Australian hospitals are included. Methods The design is a cost-effectiveness modelling study using a before-and-after quasi-experimental design. The primary outcome is cost per life year saved from reduced cases of healthcare-associated Staphylococcus aureus bacteraemia, with cost estimated as the annual ongoing maintenance costs less the costs saved from fewer infections. Data were harvested from existing sources or collected prospectively, and the time horizon for the model was 12 months, 2011–2012. Findings No usable pre-implementation Staphylococcus aureus bacteraemia data were made available from the 11 study hospitals in Victoria or the single hospital in the Northern Territory, leaving 38 hospitals across six states and territories available for cost-effectiveness analyses. Total annual costs increased by $2,851,475 for a return of 96 years of life, giving an incremental cost-effectiveness ratio (ICER) of $29,700 per life year gained. Probabilistic sensitivity analysis revealed a 100% chance that the initiative was cost-effective in the Australian Capital Territory and Queensland, with ICERs of $1,030 and $8,988 respectively. There was an 81% chance it was cost-effective in New South Wales with an ICER of $33,353, a 26% chance for South Australia with an ICER of $64,729, and a 1% chance for Tasmania and Western Australia. The 12 hospitals in Victoria and the Northern Territory incur annual ongoing maintenance costs of $1.51M; no information was available to describe cost savings or health benefits. Conclusions The Australian National Hand Hygiene Initiative was cost-effective against an Australian threshold of $42,000 per life year gained. The return on investment varied among the states and territories of Australia.
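The headline ICER follows directly from the reported figures: incremental cost divided by incremental effect. The snippet below only restates that arithmetic and the comparison with the stated threshold.

```python
# Incremental cost-effectiveness ratio from the figures reported above.
incremental_cost = 2_851_475   # AU$: additional annual cost of the national initiative
life_years_gained = 96

icer = incremental_cost / life_years_gained
print(round(icer))        # 29703, reported as ~$29,700 per life year gained
print(icer < 42_000)      # True: below the $42,000-per-life-year threshold, i.e. cost-effective
```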
Abstract:
Low-level strategic supplements constitute one of the few options for northern beef producers to increase breeder productivity and profitability. The objectives of the project were to improve the cost-effectiveness of using such supplements and to improve supplement delivery systems. Urea-based supplements fed during the dry season can substantially reduce breeder liveweight loss and increase fertility during severe dry seasons. When fed during the late wet season, these supplements also increased breeder liveweight and the fertility of breeders in low body condition. Intake of dry-lick supplements fed free choice is apparently determined primarily by the palatability of supplements relative to pasture, and training of cattle appears to be of limited importance. Siting of supplementation points has some effect on supplement intake, but little effect on grazing behaviour. Economic analysis of supplementation (urea, phosphorus or molasses) and weaning strategies was based on the relative efficacy of these strategies in maintaining breeder body condition late in the dry season. Adequate body condition of breeders at this time of the year is needed to avoid mortality from under-nutrition and to achieve satisfactory fertility during the following wet season. Supplements were highly cost-effective when they reduced mortality, but economic returns were generally low if the only benefit was increased fertility.
Abstract:
This thesis presents a highly sensitive genome-wide search method for recessive mutations. The method is suitable for distantly related samples that are divided into phenotype positives and negatives. High-throughput genotype arrays are used to identify and compare homozygous regions between the cohorts. The method is demonstrated by comparing colorectal cancer patients against unaffected references; the objective is to find homozygous regions and alleles that are more common in cancer patients. We have designed and implemented software tools to automate the data analysis from genotypes to lists of candidate genes and their properties. The programs have been designed around a pipeline architecture that allows their integration with other programs such as biological databases and copy number analysis tools. The integration of the tools is crucial, as the genome-wide analysis of cohort differences produces many candidate regions not related to the studied phenotype. CohortComparator is a genotype comparison tool that detects homozygous regions and compares their loci and allele constitutions between two sets of samples. The data are visualised in chromosome-specific graphs illustrating the homozygous regions and alleles of each sample. The genomic regions that may harbour recessive mutations are emphasised with different colours, and a scoring scheme is given for these regions. The detection of homozygous regions, the cohort comparisons and the result annotations all rest on assumptions, many of which have been parameterized in our programs. The effect of these parameters and the suitable scope of the methods have been evaluated. Samples with different resolutions can be balanced with the genotype estimates of their haplotypes and can be used within the same study.
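A toy sketch of the core step only: detecting runs of homozygous genotype calls along a chromosome. The run-length threshold and input format are invented for illustration; the scoring and cohort-comparison logic of CohortComparator is not reproduced.

```python
def homozygous_regions(genotypes, min_length=5):
    """genotypes: list of (position, call) with calls like 'AA', 'AB', 'BB', sorted by position.
    Returns (start_pos, end_pos) for runs of at least min_length consecutive homozygous calls."""
    regions, run = [], []
    for pos, call in genotypes:
        if len(call) == 2 and call[0] == call[1]:   # homozygous call extends the current run
            run.append(pos)
        else:                                       # heterozygous call breaks the run
            if len(run) >= min_length:
                regions.append((run[0], run[-1]))
            run = []
    if len(run) >= min_length:                      # close a run that reaches the end
        regions.append((run[0], run[-1]))
    return regions

calls = [(i, "AA") for i in range(1, 8)] + [(8, "AB")] + [(i, "BB") for i in range(9, 12)]
print(homozygous_regions(calls, min_length=5))  # [(1, 7)]
```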
Abstract:
The INFORMAS food prices module proposes a step-wise framework to measure the cost and affordability of population diets. The price differential and the tax component of healthy and less healthy foods, food groups, meals and diets will be benchmarked and monitored over time. Results can be used to model or assess the impact of fiscal policies, such as ‘fat taxes’ or subsidies. Key methodological challenges include: defining healthy and less healthy foods, meals, diets and commonly consumed items; including costs of alcohol, takeaways, convenience foods and time; selecting the price metric; sampling frameworks; and standardizing collection and analysis protocols. The minimal approach uses three complementary methods to measure the price differential between pairs of healthy and less healthy foods. Specific challenges include choosing policy relevant pairs and defining an anchor for the lists. The expanded approach measures the cost of a healthy diet compared to the current (less healthy) diet for a reference household. It requires dietary principles to guide the development of the healthy diet pricing instrument and sufficient information about the population’s current intake to inform the current (less healthy) diet tool. The optimal approach includes measures of affordability and requires a standardised measure of household income that can be used for different countries. The feasibility of implementing the protocol in different countries is being tested in New Zealand, Australia and Fiji. The impact of different decision points to address challenges will be investigated in a systematic manner. We will present early insights and results from this work.
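As a minimal sketch of the two headline metrics described above, the snippet below computes a price differential between a healthy diet and the current (less healthy) diet, and an affordability ratio against household income. All figures are invented for illustration; the INFORMAS pricing instruments themselves are not reproduced.

```python
def price_differential(healthy_cost, current_cost):
    """Absolute and relative difference in diet cost for a reference household."""
    diff = healthy_cost - current_cost
    return diff, diff / current_cost

def affordability(diet_cost, household_income):
    """Share of household income required to purchase the diet."""
    return diet_cost / household_income

diff, rel = price_differential(healthy_cost=570.0, current_cost=620.0)
print(diff, f"{rel:.1%}")                      # -50.0 -8.1%  (healthy diet cheaper in this toy case)
print(f"{affordability(570.0, 2000.0):.1%}")   # 28.5% of household income
```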
Developing standardized methods to assess cost of healthy and unhealthy (current) diets in Australia
Abstract:
Unhealthy diets contribute at least 14% to Australia's disease burden and are driven by ‘obesogenic’ food environments. Compliance with dietary recommendations is particularly poor amongst disadvantaged populations, including low socioeconomic groups, those living in rural/remote areas, and Aboriginal and Torres Strait Islander peoples. The perception that healthy foods are expensive is a key barrier to healthy choices and a major determinant of diet-related health inequities. Available state/regional/local data (limited and non-comparable) suggest that, despite basic healthy foods not incurring GST, the cost of healthy food is higher and has increased more rapidly than that of unhealthy food over the last 15 years in Australia. However, there were no nationally standardised tools or protocols to benchmark, compare or monitor food prices and affordability in Australia. Globally, we are leading work to develop and test approaches to assess the price differential of healthy and less healthy (current) diets under the food price module of the International Network for Food and Obesity/non-communicable diseases (NCDs) Research, Monitoring and Action Support (INFORMAS). This presentation describes the contextualisation of the INFORMAS approach to develop standardised Australian tools, survey protocols, and data collection and analysis systems. The ‘healthy diet basket’ was based on the Australian Foundation Diet.1 The ‘current diet basket’, and the specific items included in each basket, were based on recent national dietary survey data.2 Data collection methods were piloted. The final tools and protocols were then applied to measure the price and affordability of healthy and less healthy (current) diets of different household groups in diverse communities across the nation. We have compared results for different geographical locations/population subgroups in Australia and assessed these against international INFORMAS benchmarks. The results inform the development of policy and practice, including those relevant to mooted changes to the GST base, to promote nutrition and healthy weight and prevent chronic disease in Australia.
Abstract:
OBJECTIVE To report the cost-effectiveness of a tailored handheld computerized procedural preparation and distraction intervention (Ditto) used during pediatric burn wound care in comparison to standard practice. METHODS An economic evaluation was performed alongside a randomized controlled trial of 75 children aged 4 to 13 years who presented with a burn to the Royal Children's Hospital, Brisbane, Australia. Participants were randomized to either the Ditto intervention (n = 35) or standard practice (n = 40) to measure the effect of the intervention on days taken for burns to re-epithelialize. Direct medical, direct nonmedical, and indirect cost data during burn re-epithelialization were extracted from the randomized controlled trial data and combined with scar management cost data obtained retrospectively from medical charts. Nonparametric bootstrapping was used to estimate statistical uncertainty in cost and effect differences and cost-effectiveness ratios. RESULTS On average, the Ditto intervention reduced the time to re-epithelialize by 3 days at AU$194 less cost for each patient compared with standard practice. The incremental cost-effectiveness plane showed that 78% of the simulated results were within the more effective and less costly quadrant and 22% were in the more effective and more costly quadrant, suggesting a 78% probability that the Ditto intervention dominates standard practice (i.e., cost-saving). At a willingness-to-pay threshold of AU$120, there is a 95% probability that the Ditto intervention is cost-effective (or cost-saving) against standard care. CONCLUSIONS This economic evaluation showed the Ditto intervention to be highly cost-effective against standard practice at a minimal cost for the significant benefits gained, supporting the implementation of the Ditto intervention during burn wound care.
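A hedged sketch of the nonparametric bootstrap idea mentioned above: per-patient costs and days to re-epithelialize are resampled within each arm, and the share of replicates falling in the "more effective and less costly" quadrant of the cost-effectiveness plane is counted. The toy data and the absence of any covariate adjustment are assumptions; the trial's actual analysis is not reproduced.

```python
import random

def bootstrap_dominance(cost_int, days_int, cost_std, days_std, n_boot=1000, seed=1):
    """Resample per-patient costs and healing times in each arm and return the share of
    bootstrap replicates in which the intervention is both cheaper and faster."""
    random.seed(seed)
    dominant = 0
    for _ in range(n_boot):
        c_i = [random.choice(cost_int) for _ in cost_int]
        c_s = [random.choice(cost_std) for _ in cost_std]
        d_i = [random.choice(days_int) for _ in days_int]
        d_s = [random.choice(days_std) for _ in days_std]
        delta_cost = sum(c_i) / len(c_i) - sum(c_s) / len(c_s)
        delta_days = sum(d_i) / len(d_i) - sum(d_s) / len(d_s)
        if delta_cost < 0 and delta_days < 0:   # cheaper and fewer days to heal: dominant
            dominant += 1
    return dominant / n_boot

# Simulated toy arms (not trial data): the intervention arm is cheaper and heals faster.
cost_int = [400, 450, 380, 500, 420]; cost_std = [600, 550, 640, 580, 610]
days_int = [10, 12, 11, 9, 13];       days_std = [14, 15, 13, 16, 12]
print(bootstrap_dominance(cost_int, days_int, cost_std, days_std))
```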
Abstract:
Mammalian heparanase is an endo-β-glucuronidase associated with cell invasion in cancer metastasis, angiogenesis and inflammation. Heparanase cleaves heparan sulfate proteoglycans in the extracellular matrix and basement membrane, releasing heparin/heparan sulfate oligosaccharides of appreciable size. This in turn causes the release of growth factors, which accelerate tumor growth and metastasis. Heparanase has two glycosaminoglycan-binding domains; however, no three-dimensional structure information is available for human heparanase that can provide insights into how the two domains interact to degrade heparin fragments. We have constructed a new homology model of heparanase that takes into account the most recent structural and bioinformatics data available. Heparin analogs and glycosaminoglycan mimetics were computationally docked into the active site with energetically stable ring conformations and their interaction energies were compared. The resulting docked structures were used to propose a model for substrates and conformer selectivity based on the dimensions of the active site. The docking of substrates and inhibitors indicates the existence of a large binding site extending at least two saccharide units beyond the cleavage site (toward the nonreducing end) and at least three saccharides toward the reducing end (toward heparin-binding site 2). The docking of substrates suggests that heparanase recognizes the N-sulfated and O-sulfated glucosamines at subsite +1 and glucuronic acid at the cleavage site, whereas in the absence of 6-O-sulfation in glucosamine, glucuronic acid is docked at subsite +2. These findings will help us to focus on the rational design of heparanase-inhibiting molecules for anticancer drug development by targeting the two heparin/heparan sulfate recognition domains.
Abstract:
The past several years have seen significant advances in the development of computational methods for the prediction of the structure and interactions of coiled-coil peptides. These methods are generally based on pairwise correlations of amino acids, helical propensity, thermal melts and the energetics of sidechain interactions, as well as statistical patterns based on Hidden Markov Model (HMM) and Support Vector Machine (SVM) techniques. They are complemented by a number of public databases that contain sequences, motifs, domains and other details of coiled-coil structures identified by various algorithms. Some of these computational methods have been developed to predict coiled-coil structure on the basis of sequence information; however, structural prediction of the oligomerisation state of these peptides still remains largely an open question due to the dynamic behaviour of these molecules. This review focuses on existing in silico methods for the prediction of coiled-coil peptides of functional importance using sequence and/or three-dimensional structural data.
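To illustrate the sequence signal that such predictors exploit, here is a deliberately crude sketch: coiled coils show a heptad repeat (abcdefg) with hydrophobic residues preferred at the a and d positions, so scoring those positions over all seven possible registers already separates zipper-like sequences from unstructured ones. Real predictors use position-specific profiles, HMMs or SVMs; the scoring rule below is an invented simplification.

```python
HYDROPHOBIC = set("AILMVF")

def heptad_score(window, register_offset=0):
    """Fraction of a/d heptad positions (for a given register) occupied by hydrophobic residues."""
    ad = [i for i in range(len(window)) if (i + register_offset) % 7 in (0, 3)]
    if not ad:
        return 0.0
    return sum(window[i] in HYDROPHOBIC for i in ad) / len(ad)

def best_score(window):
    """Try all seven possible registers and keep the best-scoring one."""
    return max(heptad_score(window, off) for off in range(7))

print(best_score("MKQLEDKVEELLSKNYHLENEVARLKKLVGER"))  # 0.8 -- leucine-zipper-like signal
print(best_score("GGSGGSGGSGGSGGSGGSGGS"))             # 0.0 -- flexible linker, no signal
```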
Abstract:
This research develops a design support system that is able to estimate the life cycle cost of different product families at the early stage of product development. By implementing the system, a designer is able to develop various cost-effective product families in a shorter lead time and minimise the environmental impact of the product family.