979 results for horned Senmurvs, roundels, trees


Relevance:

10.00%

Publisher:

Abstract:

With the growing number of XML documents on the Web it becomes essential to effectively organise these XML documents in order to retrieve useful information from them. A possible solution is to apply clustering on the XML documents to discover knowledge that promotes effective data management, information retrieval and query processing. However, many issues arise in discovering knowledge from these types of semi-structured documents due to their heterogeneity and structural irregularity. Most of the existing research on clustering techniques focuses only on one feature of the XML documents, this being either their structure or their content, due to scalability and complexity problems. The knowledge gained in the form of clusters based on the structure or the content alone is not suitable for real-life datasets. It therefore becomes essential to include both the structure and content of XML documents in order to improve the accuracy and meaning of the clustering solution. However, the inclusion of both these kinds of information in the clustering process results in a huge overhead for the underlying clustering algorithm because of the high dimensionality of the data. The overall objective of this thesis is to address these issues by: (1) proposing methods that utilise frequent pattern mining techniques to reduce the dimensionality; (2) developing models to effectively combine the structure and content of XML documents; and (3) utilising the proposed models in clustering. This research first determines the structural similarity in the form of frequent subtrees and then uses these frequent subtrees to represent the constrained content of the XML documents in order to determine the content similarity. A clustering framework with two types of models, implicit and explicit, is developed. The implicit model uses a Vector Space Model (VSM) to combine the structure and the content information. The explicit model uses a higher-order model, namely a 3-order Tensor Space Model (TSM), to explicitly combine the structure and the content information. This thesis also proposes a novel incremental technique to decompose large-sized tensor models and to utilise the decomposed solution for clustering the XML documents. The proposed framework and its components were extensively evaluated on several real-life datasets exhibiting extreme characteristics to understand the usefulness of the proposed framework in real-life situations. Additionally, this research evaluates the outcome of the clustering process on the collection selection problem in information retrieval on the Wikipedia dataset. The experimental results demonstrate that the proposed frequent pattern mining and clustering methods outperform the related state-of-the-art approaches. In particular, the proposed framework of utilising frequent structures for constraining the content shows an improvement in accuracy over content-only and structure-only clustering results. The scalability evaluation experiments conducted on large-scale datasets clearly show the strengths of the proposed methods over state-of-the-art methods. In particular, this thesis work contributes to effectively combining the structure and the content of XML documents for clustering, in order to improve the accuracy of the clustering solution. In addition, it also contributes by addressing the research gaps in frequent pattern mining to generate efficient and concise frequent subtrees with various node relationships that could be used in clustering.
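
By way of illustration only, the following minimal Python sketch shows one way an implicit (VSM-style) combination of structure and content might look: element paths stand in for the mined frequent subtrees, content terms are keyed to the structural context that constrains them, and the combined vectors are clustered with k-means. The feature scheme, toy documents and library choices (ElementTree, scikit-learn) are assumptions made for the example, not the thesis's actual models.

```python
# Minimal sketch (not the thesis's models): combine structural and content
# features of XML documents in one vector space and cluster them. Element
# paths stand in for mined frequent subtrees; content terms are keyed to the
# structural context that constrains them.
from xml.etree import ElementTree as ET
from sklearn.feature_extraction import DictVectorizer
from sklearn.cluster import KMeans

def xml_features(xml_string):
    """Return one count dictionary of structure and constrained-content features."""
    features = {}
    root = ET.fromstring(xml_string)

    def walk(node, path):
        path = path + "/" + node.tag
        features["struct:" + path] = features.get("struct:" + path, 0) + 1
        for term in (node.text or "").split():
            key = "content:" + path + ":" + term.lower()
            features[key] = features.get(key, 0) + 1
        for child in node:
            walk(child, path)

    walk(root, "")
    return features

docs = [
    "<movie><title>alien</title><genre>scifi</genre></movie>",
    "<movie><title>solaris</title><genre>scifi</genre></movie>",
    "<book><title>dune</title><author>herbert</author></book>",
]
vectors = DictVectorizer().fit_transform([xml_features(d) for d in docs])
print(KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(vectors))
```

The explicit model described above would instead keep structure and content as separate modes of a 3-order tensor rather than concatenating them into a single vector.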

Relevance:

10.00%

Publisher:

Abstract:

Accurate and detailed road models play an important role in a number of geospatial applications, such as infrastructure planning, traffic monitoring, and driver assistance systems. In this thesis, an integrated approach for the automatic extraction of precise road features from high resolution aerial images and LiDAR point clouds is presented. A framework of road information modeling has been proposed for rural and urban scenarios respectively, and an integrated system has been developed to deal with road feature extraction using image and LiDAR analysis. For road extraction in rural regions, a hierarchical image analysis is first performed to maximize the exploitation of road characteristics at different resolutions. The rough locations and directions of roads are provided by the road centerlines detected in low resolution images, both of which can be further employed to facilitate road information generation in high resolution images. The histogram thresholding method is then chosen to classify road details in high resolution images, where color space transformation is used for data preparation. After the road surface detection, anisotropic Gaussian and Gabor filters are employed to enhance road pavement markings while suppressing other ground objects, such as vegetation and houses. Afterwards, pavement markings are obtained from the filtered image using Otsu's clustering method. The final road model is generated by superimposing the lane markings on the road surfaces, where the digital terrain model (DTM) produced from LiDAR data can also be combined to obtain the 3D road model. As the extraction of roads in urban areas is greatly affected by buildings, shadows, vehicles, and parking lots, we combine high resolution aerial images and dense LiDAR data to fully exploit the precise spectral and horizontal spatial resolution of aerial images and the accurate vertical information provided by airborne LiDAR. Object-oriented image analysis methods are employed for feature classification and road detection in aerial images. In this process, we first utilize an adaptive mean shift (MS) segmentation algorithm to segment the original images into meaningful object-oriented clusters. The support vector machine (SVM) algorithm is then applied to the MS segmented image to extract road objects. Road surface detected in LiDAR intensity images is taken as a mask to remove the effects of shadows and trees. In addition, the normalized DSM (nDSM) obtained from LiDAR is employed to filter out other above-ground objects, such as buildings and vehicles. The proposed road extraction approaches are tested using rural and urban datasets respectively. The rural road extraction method is evaluated using pan-sharpened aerial images of the Bruce Highway, Gympie, Queensland. The road extraction algorithm for urban regions is tested using the Bundaberg datasets, which combine aerial imagery and LiDAR data. Quantitative evaluation of the extracted road information for both datasets has been carried out. The experiments and the evaluation results using the Gympie datasets show that more than 96% of the road surfaces and over 90% of the lane markings are accurately reconstructed, and the false alarm rates for road surfaces and lane markings are below 3% and 2% respectively. For the urban test sites of Bundaberg, more than 93% of the road surface is correctly reconstructed, and the mis-detection rate is below 10%.
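
A minimal sketch of the marking-extraction step described above, assuming a grayscale road-surface image: a Gabor filter enhances the linear marking and Otsu's threshold separates it from the background. The synthetic image, the filter parameters and the use of scikit-image are illustrative assumptions, not the thesis's calibrated pipeline.

```python
# Minimal sketch of the pavement-marking step: a Gabor filter enhances a
# linear marking and Otsu's threshold separates it from the background.
# The synthetic image and the parameters are illustrative, not calibrated values.
import numpy as np
from skimage.filters import gabor, threshold_otsu

road = np.full((200, 200), 0.3)                       # dark road surface
road[:, 98:102] = 0.9                                 # bright longitudinal marking
road += np.random.default_rng(0).normal(0, 0.02, road.shape)

# Gabor filter oriented across the marking (intensity varies along the x-axis).
real, imag = gabor(road, frequency=0.25, theta=0)
magnitude = np.hypot(real, imag)

# Otsu's method picks a threshold separating marking response from background.
markings = magnitude > threshold_otsu(magnitude)
print(int(markings.sum()), "candidate marking pixels")
```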

Relevance:

10.00%

Publisher:

Abstract:

With the widespread adoption of Business Process Management (BPM) automation suites, the possibility of managing process-related risks arises. This paper introduces an innovative framework for process-related risk management and describes a working implementation realized by extending the YAWL system. The framework covers three aspects of risk management: risk monitoring, risk prevention, and risk mitigation. Risk monitoring functionality is provided using a sensor-based architecture, where sensors are defined at design time and used at run-time for monitoring purposes. Risk prevention functionality is provided in the form of suggestions about what should be executed, by whom, and how, through the use of decision trees. Finally, risk mitigation functionality is provided as a sequence of remedial actions (e.g. reallocating, skipping, or rolling back a work item) that should be executed to restore the process to a normal situation.
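
As a rough illustration of the sensor idea only (the RiskSensor class and the process-state dictionary below are hypothetical, not the YAWL extension's actual API), a sensor can be thought of as a design-time condition evaluated against run-time process data:

```python
# Illustrative sketch of a design-time risk sensor evaluated at run time.
# The RiskSensor class and the process-state dictionary are hypothetical,
# not the YAWL extension's actual API.
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class RiskSensor:
    name: str
    condition: Callable[[Dict], bool]   # predicate over current process data
    message: str

def monitor(state: Dict, sensors: List[RiskSensor]) -> List[str]:
    """Evaluate every sensor against the current state and collect raised risks."""
    return [s.message for s in sensors if s.condition(state)]

sensors = [
    RiskSensor("overtime", lambda st: st["elapsed_h"] > 0.8 * st["deadline_h"],
               "Case is close to its deadline: consider reallocating the work item"),
    RiskSensor("high_value", lambda st: st["claim_amount"] > 50_000,
               "High-value claim: senior approval recommended"),
]

state = {"elapsed_h": 20, "deadline_h": 24, "claim_amount": 72_000}
for alert in monitor(state, sensors):
    print(alert)
```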

Relevance:

10.00%

Publisher:

Abstract:

Divergence dating studies, which combine temporal data from the fossil record with branch length data from molecular phylogenetic trees, represent a rapidly expanding approach to understanding the history of life. The National Evolutionary Synthesis Center hosted the first Fossil Calibrations Working Group (3–6 March 2011, Durham, NC, USA), bringing together palaeontologists, molecular evolutionists and bioinformatics experts to present perspectives from the disciplines that generate, model and use fossil calibration data. Presentations and discussions focused on channels for interdisciplinary collaboration, best practices for justifying, reporting and using fossil calibrations, and roadblocks to the synthesis of palaeontological and molecular data. Bioinformatics solutions were proposed, with the primary objective being a new database for vetted fossil calibrations with linkages to existing resources, targeted for a 2012 launch.

Relevance:

10.00%

Publisher:

Abstract:

Australasian marsupials include three major radiations, the insectivorous/carnivorous Dasyuromorphia, the omnivorous bandicoots (Peramelemorphia), and the largely herbivorous diprotodontians. Morphologists have generally considered the bandicoots and diprotodontians to be closely related, most prominently because they are both syndactylous (with the 2nd and 3rd pedal digits being fused). Molecular studies have been unable to confirm or reject this Syndactyla hypothesis. Here we present new mitochondrial (mt) genomes from a spiny bandicoot (Echymipera rufescens) and two dasyurids, a fat-tailed dunnart (Sminthopsis crassicaudata) and a northern quoll (Dasyurus hallucatus). By comparing trees derived from pairwise base-frequency differences between taxa with standard (absolute, uncorrected) distance trees, we infer that composition bias among mt protein-coding and RNA sequences is sufficient to mislead tree reconstruction. This can explain incongruence between trees obtained from mt and nuclear data sets. However, after excluding major sources of compositional heterogeneity, both the “reduced-bias” mt and nuclear data sets clearly favor a bandicoot plus dasyuromorphian association, as well as a grouping of kangaroos and possums (Phalangeriformes) among diprotodontians. Notably, alternatives to these groupings could only be confidently rejected by combining the mt and nuclear data. Elsewhere on the tree, Dromiciops appears to be sister to the monophyletic Australasian marsupials, whereas the placement of the marsupial mole (Notoryctes) remains problematic. More generally, we contend that it is desirable to combine mt genome and nuclear sequences for inferring vertebrate phylogeny, but as separately modeled process partitions. This strategy depends on detecting and excluding (or accounting for) major sources of nonhistorical signal, such as from compositional nonstationarity.
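
To make the distinction concrete, a toy comparison of an ordinary uncorrected (p) distance with a base-composition distance is sketched below; the sequences are invented and the composition measure is a simple illustrative choice, not the study's method.

```python
# Illustrative comparison of an uncorrected (p) distance with a base-composition
# distance between aligned sequences; toy data, not the study's genomes.
from collections import Counter

def p_distance(a, b):
    """Proportion of aligned sites at which the two sequences differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)

def composition_distance(a, b):
    """Half the sum of absolute differences in A/C/G/T frequencies."""
    fa, fb = Counter(a), Counter(b)
    return 0.5 * sum(abs(fa[n] / len(a) - fb[n] / len(b)) for n in "ACGT")

seq1 = "ATATATATGCGCAT"
seq2 = "ATATATATGCGCCT"   # similar sequence, similar composition
seq3 = "GCGCGCGCGCATAT"   # very different sequence, GC-rich composition

for other in (seq2, seq3):
    print(round(p_distance(seq1, other), 2),
          round(composition_distance(seq1, other), 2))
```

Two taxa can thus look close under a composition-based measure while being distant under an uncorrected distance (or vice versa), which is the kind of conflict used above to diagnose compositional bias.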

Relevance:

10.00%

Publisher:

Abstract:

Sequence data often have competing signals that are detected by network programs or Lento plots. Such data can be formed by generating sequences on more than one tree and combining the results (a mixture model). We report that, with such mixture models, the estimates of edge (branch) lengths from maximum likelihood (ML) methods that assume a single tree are biased. Based on the observed number of competing signals in real data, such a bias of ML is expected to occur frequently. Because network methods can recover competing signals more accurately, there is a need for ML methods that allow a network. A fundamental problem is that mixture models can have more parameters than can be recovered from the data, so that some mixtures are not, in principle, identifiable. We recommend that network programs be incorporated into best-practice analysis, along with ML and Bayesian trees.
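
For readers unfamiliar with the setting, the likelihood under a two-tree mixture can be written as below (illustrative notation, not taken from the paper); standard ML instead maximises a single-tree likelihood, so its edge-length estimates must absorb both signals.

```latex
% Site pattern D_i generated under a two-class mixture of trees T_1, T_2
% with mixing weight w (notation chosen for this note, not the paper's):
P(D_i \mid \text{mixture}) = w \, P(D_i \mid T_1, \theta_1) + (1 - w) \, P(D_i \mid T_2, \theta_2)
% Single-tree ML maximises \prod_i P(D_i \mid T, \theta) over one tree T only,
% so the fitted edge lengths \hat{\theta} compromise between the two generating trees.
```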

Relevance:

10.00%

Publisher:

Abstract:

Background: Evolutionary biologists are often misled by convergence of morphology, and this has been common in the study of bird evolution. However, molecular data sets have their own problems, and phylogenies based on short DNA sequences have the potential to mislead us too. The relationships among clades and the timing of the evolution of modern birds (Neoaves) have not yet been well resolved, and evidence of convergence of morphology remains controversial. With six new bird mitochondrial genomes (hummingbird, swift, kagu, rail, flamingo and grebe) we test the proposed Metaves/Coronaves division within Neoaves and the parallel radiations in this primary avian clade. Results: Our mitochondrial trees did not return the Metaves clade that had been proposed based on one nuclear intron sequence. We suggest that the high number of indels within the seventh intron of the β-fibrinogen gene at this phylogenetic level, which left a dataset in which not a single site across the alignment was shared by all taxa, resulted in artifacts during analysis. With respect to the overall avian tree, we find the flamingo and grebe are sister taxa and basal to the shorebirds (Charadriiformes). Using a novel site-stripping technique for noise reduction we found this relationship to be stable. The hummingbird/swift clade is outside the large and very diverse group of raptors, shore and sea birds. Unexpectedly, the kagu is not closely related to the rail in our analysis, but because neither the kagu nor the rail has close affinity to any taxa within this dataset of 41 birds, their placement is not yet resolved. Conclusion: Our phylogenetic hypothesis based on 41 avian mitochondrial genomes (13,229 bp) rejects monophyly of the seven Metaves species, and we therefore conclude that the members of Metaves do not share a common evolutionary history within the Neoaves.
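
The general idea behind site-stripping style noise reduction can be illustrated with a toy alignment: columns whose variability exceeds a cutoff are removed before tree building. The variability measure, the cutoff and the alignment below are invented for the example and are not the paper's specific technique.

```python
# Toy illustration of site-stripping: drop the most variable alignment columns
# before tree building. The variability measure, cutoff and alignment are
# invented for the example, not the paper's procedure.
alignment = [
    "ATGCATGCAT",
    "ATGGTTGCAT",
    "ATGAATGCAT",
    "TTGTATGCAC",
]

def column_variability(column):
    """Fraction of sequences that do not match the column's most common state."""
    most_common = max(set(column), key=column.count)
    return 1 - column.count(most_common) / len(column)

columns = list(zip(*alignment))
kept = [i for i, col in enumerate(columns) if column_variability(col) <= 0.5]
stripped = ["".join(seq[i] for i in kept) for seq in alignment]
print("kept columns:", kept)
print(stripped)
```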

Relevance:

10.00%

Publisher:

Abstract:

This article examines the recently introduced Neighbourhood Disputes Resolution Act 2011 (Qld). The operation of the Act is considered as it impacts upon the responsibility of neighbours for dividing fences and trees as well as disclosure obligations associated with sale transactions. A particular focus of the article is the interrelationship of the disclosure obligations imposed by the Act with the operation of standard contractual warranties in Queensland.

Relevance:

10.00%

Publisher:

Abstract:

The Kyoto Protocol recognises trees as a sink of carbon and a valid means to offset greenhouse gas emissions and meet internationally agreed emissions targets. This study details biological carbon sequestration rates for the common plantation species Araucaria cunninghamii (hoop pine), Eucalyptus cloeziana, Eucalyptus argophloia, Pinus elliottii and Pinus caribaea var. hondurensis, and the individual land areas required in north-eastern Australia to offset greenhouse gas emissions of 1000 t CO2-e. The 3PG simulation model was used to predict above- and below-ground estimates of biomass carbon for a range of soil productivity conditions at six representative locations in agricultural regions of north-eastern Australia. The total area required to offset 1000 t CO2-e ranges from 1 ha of E. cloeziana under high productivity conditions in coastal North Queensland to 45 ha of hoop pine under low productivity conditions in inland Central Queensland. These areas must remain planted for a minimum of 30 years to meet the offset of 1000 t CO2-e.
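
The underlying arithmetic of such offset areas is simple: the target offset divided by the per-hectare sequestration accumulated over the planting period. The rates in the sketch below are hypothetical placeholders, not the 3PG outputs reported in the study, but they reproduce the same kind of spread between productive coastal and unproductive inland sites.

```python
# Back-of-envelope area calculation for a 1000 t CO2-e offset. The per-hectare
# sequestration rates below are hypothetical placeholders, not the 3PG model
# outputs reported in the study.
TARGET_OFFSET_T_CO2E = 1000
YEARS = 30

# Assumed average sequestration (t CO2-e per hectare per year) by scenario.
scenarios = {
    "high-productivity coastal stand": 35.0,
    "low-productivity inland stand": 0.8,
}

for name, rate in scenarios.items():
    area_ha = TARGET_OFFSET_T_CO2E / (rate * YEARS)
    print(f"{name}: {area_ha:.1f} ha planted for {YEARS} years")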

Relevance:

10.00%

Publisher:

Abstract:

Effective, statistically robust sampling and surveillance strategies form an integral component of large agricultural industries such as the grains industry. Intensive in-storage sampling is essential for pest detection, Integrated Pest Management (IPM), determining grain quality and satisfying importing nations' biosecurity concerns, while surveillance over broad geographic regions ensures that biosecurity risks can be excluded, monitored, eradicated or contained within an area. In the grains industry, a number of qualitative and quantitative methodologies for surveillance and in-storage sampling have been considered. Research has primarily focussed on developing statistical methodologies for in-storage sampling strategies concentrating on the detection of pest insects within a grain bulk; however, the need for effective and statistically defensible surveillance strategies has also been recognised. Interestingly, although surveillance and in-storage sampling have typically been considered independently, many techniques and concepts are common to the two fields of research. This review considers the development of statistically based in-storage sampling and surveillance strategies and identifies methods that may be useful for both surveillance and in-storage sampling. We discuss the utility of new quantitative and qualitative approaches, such as Bayesian statistics, fault trees and more traditional probabilistic methods, and show how these methods may be used in both surveillance and in-storage sampling systems.
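
As a simple example of the probabilistic reasoning involved in designing such sampling schemes (not a method taken from the review), the chance of detecting an infestation from n independent samples, and the number of samples needed for a target confidence, can be computed directly:

```python
# Simple detection-probability calculation of the kind used when designing
# grain sampling schemes; the figures are illustrative, not from the review.
def detection_probability(prevalence: float, n_samples: int) -> float:
    """P(at least one positive sample), assuming independent samples."""
    return 1 - (1 - prevalence) ** n_samples

def samples_required(prevalence: float, confidence: float) -> int:
    """Smallest n giving at least the requested detection confidence."""
    n = 1
    while detection_probability(prevalence, n) < confidence:
        n += 1
    return n

print(detection_probability(0.01, 100))   # ~0.63 for 1% prevalence, 100 samples
print(samples_required(0.01, 0.95))       # ~299 samples for 95% confidence
```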

Relevance:

10.00%

Publisher:

Abstract:

"Whe' yu' from?" The question was put to me as I wandered, camera in hand, in the old square of Spanish Town, Jamaica's former capital. The local man, lounging in the shade of one of the colonial Georgian buildings that enclose the square, was mildly curious about what he took to be a typical white tourish photgraphing the sights of the decayed historic town. At that time, my home was in Kingston where i lived with my wife and baby son. I was then working in the Jamaican Government Town Planning Department in a job that took me all over the island. Turning to my questioner, I replied, "Kingston". There was a brief pause, and then the man spoke again: "No Man! Whe' yu' really from?" I still have difficulties when asked this question. Where am I from? What does this question mean? Does it refer to where I was born, where I spent my previous life or where I live now? Does it have a broader meaning, an enquiry about my origins in terms of background and previous experience? The following chapters are my attempt to answer these questions for my own satisfaction and, I hope, for the amusement of others who may be interested in the life of an ordinary English boy whose dream to travel and see the world was realized in ways he could not possibly have imagined. Finding an appropriate title for this book was difficult. Thursday's Child, North and South and War and Peace all came to mind but, unfortunately for me, those titles had been appropriated by other writers. Thursdays's Child is quite a popular book title, presumably because people who were born on that day and, in the words of the nursery rhyme, had 'far to go', are especially likely to have travellers' tales to tell or life stories of the rags-to-riches variety. Born on a Thursday, I have travelled a lot and I suppose that I have gone far in life. Coming from a working class family, I 'got on' by 'getting a good education' and a 'good job'. I decided against adding to the list of Thursday's Children. North and South would have reflected my life in Britain, spent in both the North and South of England, and my later years, divided between the Northern and Southern Hemispheres of the globe, as well as in countries commonly referred to as the 'advanced' North and the 'underdeveloped' South. North and South has already been appropriated by Mrs Gaskell, something that did not deter one popular American writer from using the title for a book of his. My memories of World War Two and the years afterwards made War and Peace a possible candidate, but readers expectnig an epic tale of Tolstoyan proportions may have been disappointed. To my knowledge, no other book has the title "Whe' Yu' From?". I am grateful to the Jamaican man whose question lingered in my memory and provided the title of this memoir, written decades later. This book is a word picture. It is, in a sense, a self-portrait, and like all portraits, it captures something of the character, it attempts to tell the truth, but it is not the whole truth. This is because it is not my intention to write my entire life story; rather I wish to tell about some of the things in my experience of life that have seemed important or interesting to me. Unlike a painted portrait, the picture I have created is intended to suggest the passage of time. 
While, for most of us in Western society, time is linear and unidirectional, like the flight of an arrow or the trajectory of a bullet, memory rearranges things, calling up images of the past in no particular order, making connections that may link events in various patterns, circular, web-like, superimposed. The stream of consciousness is very unlike that of streams we encounter in the physical world. Connections are made in all directions; thoughts hop back and forth in time and space, from topic to topic. My book is a composition drawn from periods, events and thoughts as I remember them. Like life itself, it is made up of patches, some good, some bad, but in my experience, always fascinating. In recording my memories, I have been as accurate as possible. Little of what I have written is about spectacular sights and strange customs. Much of it focuses on my more modest explorations, including observations of everyday things that have attracted my attention. Reading through the chapters, I am struck by my childhood freedom to roam and engage in 'dangerous' activities like climbing trees and playing beside streams, things that many children today are no longer allowed to enjoy. Also noticeable is the survival of traditions and superstitions from the distant past. Obvious, too, is my preoccupation with place names, both official ones that appear on maps and sign boards and those used by locals and children, names rarely seen in print. If there is any uniting theme to be found in what I have written, it must be my education in the fields, woods and streets of my English homeland, in the various other countries in which I have lived and travelled, as well as more formally from books and in classrooms.

Much of my book is concerned with people and places. Many of the people I mention are among those who have been, and often have remained, important and close to me. Others I remember from only the briefest of encounters, but they remain in my memory because of some specific incident or circumstance that fixed a lasting image in my mind. Some of my closest friends and relatives, however, appear nowhere in these pages or receive only the slightest mention. This is not because they played unimportant roles in my life. It is because this book is not the whole story. Among those who receive little or no mention are some who are especially close to me, with whom I have shared happy and sad times and who have shown me and my family much kindness, giving support when this was needed. Some I have known since childhood, and they have popped up at various times in my life, often in different parts of the world. Although years may pass without me seeing them, in an important sense they are always with me. These people know who they are. I hope that they know how much I love and appreciate them.

When writing my memoir, I consulted a few of the people mentioned in this book, but in the main, I have relied on my own memory, aided by diary and notebook entries and old correspondence. In the preparation of this manuscript, I benefited greatly from the expert advice and encouragement of Neil Marr of BeWrite Books. My wife Anne, inspiration for this book, also contributed in the valuable role of critic. She has my undying gratitude.

Relevance:

10.00%

Publisher:

Abstract:

Purpose: This thesis is about liveability, place and ageing in the high density urban landscape of Brisbane, Australia. As with other major developed cities around the globe, Brisbane has adopted policies to increase urban residential densities to meet the main liveability and sustainability aim of decreasing car dependence and therefore pollution, as well as to minimise the loss of greenfield areas and habitats to developers. This objective hinges on urban neighbourhoods/communities being liveable places, which residents do not have to leave for everyday living. Community/neighbourhood liveability is an essential ingredient in healthy ageing in place and has a substantial impact upon the safety, independence and well-being of older adults. It is generally accepted that ageing in place is optimal for both older people and the state. The optimality of ageing in place generally assumes that there is a particular quality to environments, or standard of liveability, in which people successfully age in place. The aim of this thesis was to examine whether there are particular environmental qualities or aspects of liveability that test optimality and to better understand the key liveability factors that contribute to successful ageing in place. Method: A strength of this thesis is that it draws on two separate studies to address the research question of what makes high density liveable for older people. In Chapter 3, the two methods are identified and differentiated as Method 1 (used in Paper 1) and Method 2 (used in Papers 2, 3, 4 and 5). Method 1 involved qualitative interviews with 24 inner-city high density Brisbane residents. The major strength of this thesis is the innovative methodology outlined in the thesis as Method 2. Method 2 involved a case study approach employing qualitative and quantitative methods. Qualitative data was collected using semi-structured, in-depth interviews and time-use diaries completed by participants during the week of tracking. The quantitative data was gathered using Global Positioning Systems for tracking and Geographical Information Systems for mapping and analysis of participants' activities. The combination of quantitative and qualitative analysis captured both participants' subjective perceptions of their neighbourhoods and their patterns of movement. This enhanced understanding of how neighbourhoods and communities function and of the various liveability dimensions that contribute to active ageing and ageing in place for older people living in high density environments. Both studies' participants were inner-city high density residents of Brisbane. The study based on Method 1 drew on a wider age demographic than the study based on Method 2. Findings: The five papers presented in this thesis by publication indicate a complex inter-relationship of the factors that make a place liveable. The first three papers identify what is comparable and different between the physical and social factors of high density communities/neighbourhoods. The last two papers explore relationships between social engagement and broader community variables such as infrastructure and the physical built environments that are risk or protective factors relevant to community liveability, active ageing and ageing in place in high density. The research highlights the importance of creating and/or maintaining a barrier-free environment and liveable community for ageing adults.
Together, the papers promote liveability, social engagement and active ageing in high density neighbourhoods by identifying factors that constitute liveability and strategies that foster active ageing and ageing in place, social connections and well-being. Recommendations: There is a strong need to offer more support for active ageing and ageing in place. While the data analyses of this research provide insight into the lived experience of high density residents, further research is warranted. Further qualitative and quantitative research is needed to explore in more depth the urban experience and opinions of older people living in urban environments. In particular, more empirical research and theory-building is needed in order to expand understanding of the particular environmental qualities that enable successful ageing in place in our cities and to guide efforts aimed at meeting this objective. The results suggest that encouraging the presence of more inner city retail outlets, particularly services that are utilised frequently in people's daily lives such as supermarkets, medical services and pharmacies, would potentially help ensure residents fully engage in their local community. The connectivity of streets and footpaths and their role in facilitating the reaching of destinations are well understood as an important dimension of liveability. To encourage uptake of sustainable transport, the built environment must provide easy, accessible connections between buildings, walkways, cycle paths and public transport nodes. Wider streets, given that they take more time to cross than narrow streets, tend to compromise safety, especially for older people. Similarly, the width of footpaths, the level of buffering, the presence of trees, lighting, seating and the design of and distance between pedestrian crossings significantly affect the pedestrian experience for older people and impact upon their choice of transportation. High density neighbourhoods also require greater levels of street fixtures and furniture for everyday life to make places more useable and comfortable for regular use. The importance of making the public realm useful and habitable for older people cannot be over-emphasised. Originality/value: While older people are attracted to high density settings, there has been little empirical evidence linking liveability satisfaction with older people's use of urban neighbourhoods. The current study examined the relationships between community/neighbourhood liveability, place and ageing to better understand the implications for those adults who age in place. The five papers presented in this thesis add to the understanding of what high density liveable age-friendly communities/neighbourhoods are and what makes them so for older Australians. Neighbourhood liveability for older people is about being able to age in place and remain active. Issues of ageing in Australia and other areas of the developed world will become more critical in the coming decades. Creating liveable communities for all ages calls for partnerships across all levels of government agencies and among different sectors within communities. The increasing percentage of older people in the community will have increasing political influence, and it will be a foolish government that ignores the needs of an older society.

Relevance:

10.00%

Publisher:

Abstract:

The benefits of applying tree-based methods to the modelling of financial assets, as opposed to linear factor analysis, are increasingly being understood by market practitioners. Tree-based models such as CART (classification and regression trees) are particularly well suited to analysing stock market data, which is noisy and often contains non-linear relationships and high-order interactions. CART was originally developed in the 1980s by medical researchers disheartened by the stringent assumptions applied by traditional regression analysis (Breiman et al. [1984]). In the intervening years, CART has been successfully applied to many areas of finance such as the classification of financial distress of firms (see Frydman, Altman and Kao [1985]), asset allocation (see Sorensen, Mezrich and Miller [1996]), equity style timing (see Kao and Shumaker [1999]) and stock selection (see Sorensen, Miller and Ooi [2000])...
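
A minimal CART illustration in the spirit described above: a shallow regression tree relating synthetic stock characteristics to subsequent returns, printed as readable rules. The data, feature names and use of scikit-learn are invented for the example and carry no relation to the cited studies.

```python
# Minimal CART illustration: a shallow regression tree relating synthetic
# stock characteristics to subsequent returns, printed as readable rules.
# Data and feature names are invented for the example.
import numpy as np
from sklearn.tree import DecisionTreeRegressor, export_text

rng = np.random.default_rng(0)
n = 500
X = np.column_stack([rng.normal(size=n),      # "valuation" score
                     rng.normal(size=n)])     # "momentum" score

# A non-linear, interaction-heavy relationship of the kind CART handles well.
y = np.where((X[:, 0] > 0) & (X[:, 1] > 0), 0.02, -0.01)
y = y + rng.normal(scale=0.005, size=n)       # noise

tree = DecisionTreeRegressor(max_depth=2, random_state=0).fit(X, y)
print(export_text(tree, feature_names=["valuation", "momentum"]))
```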

Relevance:

10.00%

Publisher:

Abstract:

Exponential growth of genomic data in the last two decades has made manual analyses impractical for all but trial studies. As genomic analyses have become more sophisticated and have moved toward comparisons across large datasets, computational approaches have become essential. One of the most important biological questions is to understand the mechanisms underlying gene regulation. Genetic regulation is commonly investigated and modelled through the use of transcriptional regulatory network (TRN) structures. These model the regulatory interactions between two key components: transcription factors (TFs) and the target genes (TGs) they regulate. Transcriptional regulatory networks have proven to be invaluable scientific tools in bioinformatics. When used in conjunction with comparative genomics, they have provided substantial insights into the evolution of regulatory interactions. Current approaches to regulatory network inference, however, omit two additional key entities: promoters and transcription factor binding sites (TFBSs). In this study, we attempted to explore the relationships among these regulatory components in bacteria. Our primary goal was to identify relationships that can assist in reducing the high false positive rates associated with transcription factor binding site predictions and thereby enhance the reliability of the inferred transcription regulatory networks. In our preliminary exploration of relationships between the key regulatory components in Escherichia coli transcription, we discovered a number of potentially useful features. The combination of location score and sequence dissimilarity scores increased de novo binding site prediction accuracy by 13.6%. Another important observation concerned the relationship between transcription factors grouped by their regulatory role and the corresponding promoter strength. Our study of E. coli σ70 promoters found support at the 0.1 significance level for our hypothesis that weak promoters are preferentially associated with activator binding sites to enhance gene expression, whilst strong promoters have more repressor binding sites to repress or inhibit gene transcription. Although the observations were specific to σ70, they nevertheless strongly encourage additional investigations when more experimentally confirmed data are available. In our preliminary exploration of relationships between the key regulatory components in E. coli transcription, we discovered a number of potentially useful features, some of which proved successful in reducing the number of false positives when applied to re-evaluate binding site predictions. Of chief interest was the relationship observed between promoter strength and TFs with respect to their regulatory role. Based on the common assumption that promoter homology positively correlates with transcription rate, we hypothesised that weak promoters would have more transcription factors that enhance gene expression, whilst strong promoters would have more repressor binding sites. The t-tests assessed for E. coli σ70 promoters returned a p-value of 0.072, which at the 0.1 significance level suggested support for our (alternative) hypothesis, albeit this trend may only be present for promoters where the corresponding TFBSs are either all repressors or all activators. Nevertheless, such suggestive results strongly encourage additional investigations when more experimentally confirmed data become available.
Much of the remainder of the thesis concerns a machine learning study of binding site prediction, using the SVM and kernel methods, principally the spectrum kernel. Spectrum kernels have been successfully applied in previous studies of protein classification [91, 92], as well as in the related problem of promoter prediction [59], and we have here successfully applied the technique to refining TFBS predictions. The advantages provided by the SVM classifier were best seen in 'moderately' conserved transcription factor binding sites, as represented by our E. coli CRP case study. Inclusion of additional position feature attributes further increased accuracy by 9.1%, but more notable was the considerable decrease in the false positive rate from 0.8 to 0.5 while retaining 0.9 sensitivity. Improved prediction of transcription factor binding sites is in turn extremely valuable in improving the inference of regulatory relationships, a problem notoriously prone to false positive predictions. Here, the number of false regulatory interactions inferred using the conventional two-component model was substantially reduced when we integrated de novo transcription factor binding site predictions as an additional criterion for acceptance in a case study of inference in the Fur regulon. This initial work was extended to a comparative study of the iron regulatory system across 20 Yersinia strains. This work revealed interesting, strain-specific differences, especially between pathogenic and non-pathogenic strains. Such differences were made clear through interactive visualisations using the TRNDiff software developed as part of this work, and would have remained undetected using conventional methods. This approach led to the nomination of the Yfe iron-uptake system as a candidate for further wet-lab experimentation due to its potential active functionality in non-pathogens and its known participation in full virulence of the bubonic plague strain. Building on this work, we introduced novel structures we have labelled 'regulatory trees', inspired by the phylogenetic tree concept. Instead of using gene or protein sequence similarity, the regulatory trees were constructed based on the number of similar regulatory interactions. While common phylogenetic trees convey information regarding changes in gene repertoire, which we might regard as analogous to 'hardware', the regulatory tree informs us of changes in regulatory circuitry, in some respects analogous to 'software'. In this context, we explored the 'pan-regulatory network' for the Fur system: the entire set of regulatory interactions found for the Fur transcription factor across a group of genomes. In the pan-regulatory network, emphasis is placed on how the regulatory network for each target genome is inferred from multiple sources instead of a single source, as is the common approach. The benefit of using multiple reference networks is a more comprehensive survey of the relationships and increased confidence in the regulatory interactions predicted. In the present study, we distinguish between relationships found across the full set of genomes, the 'core-regulatory-set', and interactions found only in a subset of the genomes explored, the 'sub-regulatory-set'. We found nine Fur target gene clusters present across the four genomes studied, this core set potentially identifying basic regulatory processes essential for survival.
Species-level differences are seen at the sub-regulatory-set level; for example, the known virulence factors YbtA and PchR were found in Y. pestis and P. aeruginosa respectively, but were not present in either E. coli or B. subtilis. Such factors, and the iron-uptake systems they regulate, are ideal candidates for wet-lab investigation to determine whether or not they are pathogen-specific. In this study, we employed a broad range of approaches to address our goals and assessed these methods using the Fur regulon as our initial case study. We identified a set of promising feature attributes, demonstrated their success in increasing transcription factor binding site prediction specificity while retaining sensitivity, and showed the importance of binding site predictions in enhancing the reliability of regulatory interaction inferences. Most importantly, these outcomes led to the introduction of a range of visualisations and techniques which are applicable across the entire bacterial spectrum and can be utilised in studies beyond the understanding of transcriptional regulatory networks.
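
For concreteness, a minimal k-mer spectrum kernel of the kind used in such SVM experiments is sketched below; the toy sequences are invented and the implementation is a bare-bones illustration rather than the thesis's optimised version.

```python
# Minimal k-mer spectrum kernel of the kind used in spectrum-kernel SVM
# experiments; toy sequences, not the thesis's CRP binding-site data.
from collections import Counter

def spectrum(seq: str, k: int) -> Counter:
    """Count every overlapping k-mer in the sequence."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

def spectrum_kernel(a: str, b: str, k: int = 3) -> int:
    """Inner product of the two k-mer count vectors."""
    sa, sb = spectrum(a, k), spectrum(b, k)
    return sum(count * sb[kmer] for kmer, count in sa.items())

site1 = "TGTGATCTAGATCACA"     # roughly palindromic, CRP-like toy site
site2 = "TGTGACCTAGGTCACA"
background = "AAAAAAAAAAAAAAAA"

print(spectrum_kernel(site1, site2))       # high: many shared 3-mers
print(spectrum_kernel(site1, background))  # zero: no shared 3-mers
```

A Gram matrix of such values can be supplied to an SVM through scikit-learn's SVC(kernel='precomputed'), which is one common way of training on string kernels.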

Relevance:

10.00%

Publisher:

Abstract:

This paper proposes a technique that supports process participants in making risk-informed decisions, with the aim of reducing process risks. Risk reduction involves decreasing the likelihood of a process fault occurring and the severity of its consequences. Given a process exposed to risks, e.g. a financial process exposed to a risk of reputation loss, we enact this process and, whenever a process participant needs to provide input to the process, e.g. by selecting the next task to execute or by filling out a form, we prompt the participant with the expected risk that a given fault will occur given that particular input. These risks are predicted by traversing decision trees generated from the logs of past process executions, considering process data, involved resources, task durations and contextual information such as task frequencies. The approach has been implemented in the YAWL system and its effectiveness evaluated. The results show that the process instances executed in the tests complete with substantially fewer faults and with lower fault severities when the recommendations provided by our technique are taken into account.
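
A hypothetical sketch of the prediction step only: a decision tree learned from attributes of past process executions scores a participant's candidate input before it is committed. The attributes, log values and use of scikit-learn are assumptions made for the example, not the YAWL implementation.

```python
# Illustrative sketch of predicting fault likelihood from process attributes
# with a decision tree learned from past executions (an event log). The
# attributes and log values are invented; this is not the YAWL implementation.
from sklearn.tree import DecisionTreeClassifier

# Past process instances: [task_duration_h, claim_amount_k, resource_workload]
history = [
    [2, 10, 3], [3, 12, 4], [8, 95, 9], [7, 80, 8],
    [1, 5, 2], [9, 120, 7], [2, 15, 5], [6, 70, 9],
]
faulted = [0, 0, 1, 1, 0, 1, 0, 1]     # whether each instance ended in a fault

model = DecisionTreeClassifier(max_depth=2, random_state=0).fit(history, faulted)

# At run time, score the participant's candidate input before committing to it.
candidate_input = [[7, 85, 8]]
fault_probability = model.predict_proba(candidate_input)[0][1]
print(f"Predicted fault risk: {fault_probability:.0%}")
```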