416 results for Feature Selection
Abstract:
Exponential growth of genomic data over the last two decades has made manual analyses impractical for all but trivial studies. As genomic analyses have become more sophisticated and have moved toward comparisons across large datasets, computational approaches have become essential. One of the most important biological questions is to understand the mechanisms underlying gene regulation. Genetic regulation is commonly investigated and modelled through the use of transcriptional regulatory network (TRN) structures. These model the regulatory interactions between two key components: transcription factors (TFs) and the target genes (TGs) they regulate. Transcriptional regulatory networks have proven to be invaluable scientific tools in bioinformatics. When used in conjunction with comparative genomics, they have provided substantial insights into the evolution of regulatory interactions. Current approaches to regulatory network inference, however, omit two additional key entities: promoters and transcription factor binding sites (TFBSs). In this study, we attempted to explore the relationships among these regulatory components in bacteria. Our primary goal was to identify relationships that can assist in reducing the high false-positive rates associated with transcription factor binding site predictions and thereby enhance the reliability of the inferred transcription regulatory networks. In our preliminary exploration of relationships between the key regulatory components in Escherichia coli transcription, we discovered a number of potentially useful features. The combination of location score and sequence dissimilarity scores increased de novo binding site prediction accuracy by 13.6%. Another important observation made was with regard to the relationship between transcription factors grouped by their regulatory role and corresponding promoter strength.
Our study of E.coli σ70 promoters found support at the 0.1 significance level for our hypothesis that weak promoters are preferentially associated with activator binding sites to enhance gene expression, whilst strong promoters have more repressor binding sites to repress or inhibit gene transcription. Although the observations were specific to σ70, they nevertheless strongly encourage additional investigations when more experimentally confirmed data are available. In our preliminary exploration of relationships between the key regulatory components in E.coli transcription, we discovered a number of potentially useful features, some of which proved successful in reducing the number of false positives when applied to re-evaluate binding site predictions. Of chief interest was the relationship observed between promoter strength and TFs with respect to their regulatory role. Based on the common assumption that promoter homology positively correlates with transcription rate, we hypothesised that weak promoters would have more transcription factors that enhance gene expression, whilst strong promoters would have more repressor binding sites. The t-tests assessed for E.coli σ70 promoters returned a p-value of 0.072, which at the 0.1 significance level suggested support for our (alternative) hypothesis, albeit this trend may only be present for promoters where corresponding TFBSs are either all repressors or all activators. Nevertheless, such suggestive results strongly encourage additional investigations as more experimentally confirmed data become available. Much of the remainder of the thesis concerns a machine learning study of binding site prediction, using the SVM and kernel methods, principally the spectrum kernel.
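The promoter-strength comparison described above can be illustrated with a simple significance test. The sketch below uses a two-sided permutation test on the difference of group means rather than the thesis's t-test, and the promoter scores are invented for illustration; only the overall procedure (comparing activator-associated against repressor-associated promoters) reflects the abstract.

```python
# Sketch: permutation test comparing mean promoter homology scores between
# promoters whose TFBSs are all activators vs. all repressors.
# The score lists are invented illustrative numbers, not data from the study.
import random

def perm_test(a, b, n_iter=10000, seed=0):
    """Two-sided permutation test on the difference of group means."""
    rng = random.Random(seed)
    observed = abs(sum(a) / len(a) - sum(b) / len(b))
    pooled = a + b
    count = 0
    for _ in range(n_iter):
        rng.shuffle(pooled)
        x, y = pooled[:len(a)], pooled[len(a):]
        diff = abs(sum(x) / len(x) - sum(y) / len(y))
        if diff >= observed:
            count += 1
    return count / n_iter  # estimated p-value

# Hypothetical homology scores (higher = stronger promoter)
activator_assoc = [0.42, 0.47, 0.51, 0.40, 0.45]   # "weak" promoters
repressor_assoc = [0.55, 0.61, 0.58, 0.52, 0.60]   # "strong" promoters
p = perm_test(activator_assoc, repressor_assoc)
```

A permutation test makes no normality assumption, which is convenient when, as noted above, only small sets of experimentally confirmed promoters are available.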
Spectrum kernels have been successfully applied in previous studies of protein classification [91, 92], as well as the related problem of promoter prediction [59], and we have here successfully applied the technique to refining TFBS predictions. The advantages provided by the SVM classifier were best seen in 'moderately' conserved transcription factor binding sites, as represented by our E.coli CRP case study. Inclusion of additional position feature attributes further increased accuracy by 9.1%, but more notable was the considerable decrease in false positive rate from 0.8 to 0.5 while retaining 0.9 sensitivity. Improved prediction of transcription factor binding sites is in turn extremely valuable in improving inference of regulatory relationships, a problem notoriously prone to false positive predictions. Here, the number of false regulatory interactions inferred using the conventional two-component model was substantially reduced when we integrated de novo transcription factor binding site predictions as an additional criterion for acceptance in a case study of inference in the Fur regulon. This initial work was extended to a comparative study of the iron regulatory system across 20 Yersinia strains. This work revealed interesting, strain-specific differences, especially between pathogenic and non-pathogenic strains. Such differences were made clear through interactive visualisations using the TRNDiff software developed as part of this work, and would have remained undetected using conventional methods. This approach led to the nomination of the Yfe iron-uptake system as a candidate for further wet-lab experimentation, due to its potential active functionality in non-pathogens and its known participation in full virulence of the bubonic plague strain. Building on this work, we introduced novel structures we have labelled 'regulatory trees', inspired by the phylogenetic tree concept.
Instead of using gene or protein sequence similarity, the regulatory trees were constructed based on the number of similar regulatory interactions. While common phylogenetic trees convey information regarding changes in gene repertoire, which we might regard as analogous to 'hardware', the regulatory tree informs us of changes in regulatory circuitry, in some respects analogous to 'software'. In this context, we explored the 'pan-regulatory network' for the Fur system: the entire set of regulatory interactions found for the Fur transcription factor across a group of genomes. In the pan-regulatory network, emphasis is placed on how the regulatory network for each target genome is inferred from multiple sources instead of a single source, as is the common approach. The benefit of using multiple reference networks is a more comprehensive survey of the relationships, and increased confidence in the regulatory interactions predicted. In the present study, we distinguish between relationships found across the full set of genomes, the 'core-regulatory-set', and interactions found only in a subset of the genomes explored, the 'sub-regulatory-set'. We found nine Fur target gene clusters present across the four genomes studied, this core set potentially identifying basic regulatory processes essential for survival. Species-level differences are seen at the sub-regulatory-set level; for example, the known virulence factors YbtA and PchR were found in Y.pestis and P.aeruginosa respectively, but were absent from both E.coli and B.subtilis. Such factors, and the iron-uptake systems they regulate, are ideal candidates for wet-lab investigation to determine whether or not they are pathogen-specific. In this study, we employed a broad range of approaches to address our goals and assessed these methods using the Fur regulon as our initial case study.
We identified a set of promising feature attributes; demonstrated their success in increasing transcription factor binding site prediction specificity while retaining sensitivity, and showed the importance of binding site predictions in enhancing the reliability of regulatory interaction inferences. Most importantly, these outcomes led to the introduction of a range of visualisations and techniques, which are applicable across the entire bacterial spectrum and can be utilised in studies beyond the understanding of transcriptional regulatory networks.
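The spectrum kernel mentioned in the abstract above admits a compact illustration. The sketch below computes the k-mer count ("spectrum") representation of two DNA sequences and their kernel value as the inner product of those counts; the sequences are invented CRP-like examples, not data from the study.

```python
# Sketch of the k-mer "spectrum" representation underlying the spectrum kernel:
# each sequence maps to a vector of k-mer counts, and the kernel value is the
# inner product of two such vectors.
from collections import Counter

def spectrum(seq, k=3):
    """Count all overlapping k-mers in a DNA sequence."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

def spectrum_kernel(s1, s2, k=3):
    """Inner product of the two k-mer count vectors."""
    c1, c2 = spectrum(s1, k), spectrum(s2, k)
    return sum(c1[kmer] * c2[kmer] for kmer in c1)

site_a = "TGTGATCTAGATCACA"  # hypothetical CRP-like binding site
site_b = "TGTGACCTAGGTCACA"  # a diverged variant
k_aa = spectrum_kernel(site_a, site_a)  # self-similarity
k_ab = spectrum_kernel(site_a, site_b)  # cross-similarity
```

In an SVM, this kernel (or the explicit k-mer count vectors, optionally concatenated with position features such as the location scores mentioned above) replaces a raw-sequence comparison, which is what makes it suitable for 'moderately' conserved sites.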
Abstract:
Typical flow fields in a stormwater gross pollutant trap (GPT) with blocked retaining screens were experimentally captured and visualised. Particle image velocimetry (PIV) software was used to capture the flow field data by tracking neutrally buoyant particles with a high-speed camera. A technique was developed to apply the Image Based Flow Visualization (IBFV) algorithm to the experimental raw dataset generated by the PIV software. The dataset consisted of scattered 2D point velocity vectors, and the IBFV visualisation facilitated flow feature characterisation within the GPT. The flow features played a pivotal role in understanding gross pollutant capture and retention within the GPT. It was found that the IBFV animations revealed otherwise unnoticed flow features and experimental artefacts. For example, a circular tracer marker in the IBFV program visually highlighted streamlines, helping to investigate specific areas and identify the flow features within the GPT.
Abstract:
Creativity is an attribute of individual people, but also a feature of organizations like firms, cultural institutions and social networks. In the knowledge economy of today, creativity is of increasing value, for developing, emergent and advanced countries, and for competing cities. This book is the first to present an organized study of the key concepts that underlie and motivate the field of creative industries. Written by a world-leading team of experts, it presents readers with compact accounts of the history of terms, the debates and tensions associated with their usage, and examples of how they apply to the creative industries around the world. Crisp and relevant, this is an invaluable text for students of the creative industries across a range of disciplines, especially media, communication, economics, sociology, creative and performing arts and regional studies.
Abstract:
Background: Cancer outlier profile analysis (COPA) has proven to be an effective approach to analyzing cancer expression data, leading to the discovery of the TMPRSS2:ETS family gene fusion events in prostate cancer. However, the original COPA algorithm did not identify down-regulated outliers, and the currently available R package implementing the method is similarly restricted to the analysis of over-expressed outliers. Here we present a modified outlier detection method, mCOPA, which contains refinements to the outlier-detection algorithm, identifies both over- and under-expressed outliers, is freely available, and can be applied to any expression dataset. Results: We compare our method to other feature-selection approaches, and demonstrate that mCOPA frequently selects more-informative features than do differential expression or variance-based feature selection approaches, and is able to recover observed clinical subtypes more consistently. We demonstrate the application of mCOPA to prostate cancer expression data, and explore the use of outliers in clustering, pathway analysis, and the identification of tumour suppressors. We analyse the under-expressed outliers to identify known and novel prostate cancer tumour suppressor genes, validating these against data in Oncomine and the Cancer Gene Index. We also demonstrate how a combination of outlier analysis and pathway analysis can identify molecular mechanisms disrupted in individual tumours. Conclusions: We demonstrate that mCOPA offers advantages, compared to differential expression or variance, in selecting outlier features, and that the features so selected are better able to assign samples to clinically annotated subtypes. Further, we show that the biology explored by outlier analysis differs from that uncovered in differential expression or variance analysis.
mCOPA is an important new tool for the exploration of cancer datasets and the discovery of new cancer subtypes, and can be combined with pathway and functional analysis approaches to discover mechanisms underpinning heterogeneity in cancers.
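A COPA-style transformation of a single gene's expression profile can be sketched in a few lines. This is not the mCOPA code itself, merely the underlying idea as described above: median-centre, scale by the median absolute deviation (MAD), then flag samples that are extreme in either tail as over- or under-expressed outliers. The expression values below are invented.

```python
# Sketch of a COPA-style outlier transformation for one gene across samples.
import statistics

def copa_transform(values):
    """Median-centre and MAD-scale one gene's expression profile."""
    med = statistics.median(values)
    mad = statistics.median(abs(v - med) for v in values)
    # In practice, genes with zero MAD need separate handling.
    return [(v - med) / mad for v in values]

def outliers(values, threshold=2.0):
    """Indices of samples extreme in either tail of the transformed profile."""
    t = copa_transform(values)
    over = [i for i, v in enumerate(t) if v > threshold]
    under = [i for i, v in enumerate(t) if v < -threshold]
    return over, under

# Hypothetical log-expression values for one gene across seven tumour samples
expr = [5.1, 4.9, 5.0, 5.2, 12.0, 4.8, 1.0]
over_idx, under_idx = outliers(expr)
```

The median/MAD pair makes the statistic robust to the outliers it is trying to detect, which is why COPA-style methods can find sample-specific events that mean/variance-based feature selection misses.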
Abstract:
Dutch-born Australian director, Rolf de Heer, is Australia's most successful and unpredictable film-maker, with thirteen feature films of widely varying style and genre to his name. Arising from the author's 2006-2009 PhD research at the Queensland University of Technology (which focussed on the psychoanalytic use of sound in his films), and a fixed-term Research Fellowship at the National Film and Sound Archive in Canberra, Australia, "Dutch Tilt, Aussie Auteur: The Films of Rolf de Heer" was first published in 2009 by VDM in Saarbrücken, Germany. This second edition addresses de Heer's additional film-making since 2009 and, as with the first edition, is an auteur analysis of the thirteen feature films he has directed (and mostly written and produced). The book explores the theoretical instability of the concept of auteurism and concludes that there is a signature world view to be detected in his oeuvre, and that de Heer (quite possibly unconsciously) promotes unlikely protagonists who are non-hyper-masculine, child-like and nurturing, as opposed to the typical Hollywood hero who is macho, exploitative and hyper-masculine. Rolf de Heer was born in Heemskerk, Holland, in 1951 and migrated to Australia with his family in 1959. He spent seven years working for the ABC before gaining entry to Australia's Film, Television and Radio School, where he studied Producing and Directing. From his debut feature film after graduating, the children's story about the restoration of a Tiger Moth biplane, "Tail of a Tiger" (1984), to his breakout cult sensation "Bad Boy Bubby" (1993), which "tore Venice [Film Festival] apart", to the first Aboriginal Australian language film "Ten Canoes" (2006), which scooped the pool at the Australian Film Institute awards, de Heer has consistently proven himself unpredictable.
This analysis of his widely disparate films, however, suggests that Australia's most innovative film-maker has a signature pre-occupation with giving a voice to marginalised, non-hyper masculine protagonists. Demonstrating a propensity to write and direct in a European-like style, his 'Dutch tilt' is very much not Hollywood, but is nevertheless representative of a typically Aussie world-view.
Abstract:
This article investigates young children's interactions with their peers and teachers following the Christchurch earthquakes in New Zealand in September 2010 and February 2011. Drawing on conversation analysis and psychological literature, we focus on one outdoor excursion to visit a broken water pipe caused by the earthquake, showing how the teacher and children mutually accomplished trouble telling and storying. A particular feature of the talk was the use of pivotal utterances to transition from talking about the damaged environment to talking about reflections on actual earthquake events. This article shows how teachers initiate and prompt children's informal and spontaneous storytelling as an interactional resource for discussing traumatic events.
Abstract:
The rapid increase in the deployment of CCTV systems has led to a greater demand for algorithms that are able to process incoming video feeds. These algorithms are designed to extract information of interest for human operators. During the past several years, there has been a large effort to detect abnormal activities through computer vision techniques. Typically, the problem is formulated as a novelty detection task, where the system is trained on normal data and is required to detect events which do not fit the learned 'normal' model. Many researchers have tried various sets of features to train different learning models to detect abnormal behaviour in video footage. In this work we propose using a Semi-2D Hidden Markov Model (HMM) to model the normal activities of people. Observations with insufficient likelihood under the model are identified as abnormal activities. Our Semi-2D HMM is designed to model both the temporal and spatial causalities of crowd behaviour by assuming that the current state of the Hidden Markov Model depends not only on the previous state in the temporal direction, but also on the previous states of the adjacent spatial locations. Two different HMMs are trained to model the vertical and horizontal spatial causal information. Location features, flow features and optical flow textures are used as the features for the model. The proposed approach is evaluated using the publicly available UCSD datasets, and we demonstrate improved performance compared to other state-of-the-art methods.
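The likelihood-based novelty-detection idea described above can be sketched with an ordinary discrete HMM (the paper's Semi-2D variant adds spatial dependencies that are omitted here). All model parameters below are invented; a sequence is flagged as abnormal when its per-observation log-likelihood under the "normal" model falls below a threshold.

```python
# Sketch: novelty detection by thresholding HMM likelihood.
# Hypothetical 2-state model "trained" on normal activity; numbers invented.
import math

start = [0.6, 0.4]                              # initial state probabilities
trans = [[0.7, 0.3], [0.4, 0.6]]                # state transition matrix
emit = [[0.8, 0.1, 0.1], [0.1, 0.8, 0.1]]       # 3 discrete observation symbols

def log_likelihood(obs):
    """Forward algorithm; returns log P(obs | model)."""
    alpha = [start[s] * emit[s][obs[0]] for s in range(2)]
    for o in obs[1:]:
        alpha = [sum(alpha[p] * trans[p][s] for p in range(2)) * emit[s][o]
                 for s in range(2)]
    return math.log(sum(alpha))

def is_abnormal(obs, threshold=-1.5):
    """Flag sequences whose per-frame log-likelihood is below the threshold."""
    return log_likelihood(obs) / len(obs) < threshold

normal_seq = [0, 0, 1, 1, 0]   # resembles the modelled behaviour
odd_seq = [2, 2, 2, 2, 2]      # symbol 2 is rare in both states
```

Long sequences would use log-space or scaled forward recursions to avoid underflow; the short sequences here keep the sketch simple.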
Abstract:
“Who are you? How do you define yourself, your identity?” With these words Allan Moore opens his exhaustive new work proposing a more comprehensive approach to the musicological analysis of popular song. The last three decades have seen a huge expansion of the anthology of the sociological and cultural meanings of pop, but Moore’s book is not another exploration of this field, although some of these ideas are incorporated in this work. Rather, he addresses the limitations of conventional musicology when dealing particularly with songs: “I address popular song rather than popular music. The defining feature of popular song lies in the interaction of everyday words and music… it is how they interact that produces significance in the experience of song”.
Abstract:
Transition between epithelial and mesenchymal states is a feature of both normal development and tumor progression. We report that expression of chloride channel accessory protein hCLCA2 is a characteristic of epithelial differentiation in the immortalized MCF10A and HMLE models, while induction of epithelial-to-mesenchymal transition by cell dilution, TGFβ or mesenchymal transcription factors sharply reduces hCLCA2 levels. Attenuation of hCLCA2 expression by lentiviral small hairpin RNA caused cell overgrowth and focus formation, enhanced migration and invasion, and increased mammosphere formation in methylcellulose. These changes were accompanied by downregulation of E-cadherin and upregulation of mesenchymal markers such as vimentin and fibronectin. Moreover, hCLCA2 expression is greatly downregulated in breast cancer cells with a mesenchymal or claudin-low profile. These observations suggest that loss of hCLCA2 may promote metastasis. We find that higher-than-median expression of hCLCA2 is associated with a one-third lower rate of metastasis over an 18-year period among breast cancer patients compared with lower-than-median (n=344, unfiltered for subtype). Thus, hCLCA2 is required for epithelial differentiation, and its loss during tumor progression contributes to metastasis. Overexpression of hCLCA2 has been reported to inhibit cell proliferation and is accompanied by increases in chloride current at the plasma membrane and reduced intracellular pH (pHi). We found that knockdown cells have sharply reduced chloride current and higher pHi, both characteristics of tumor cells. These results suggest a mechanism for the effects on differentiation. Loss of hCLCA2 may allow escape from pHi homeostatic mechanisms, permitting the higher intracellular and lower extracellular pH that are characteristic of aggressive tumor cells.
Abstract:
This Foreword introduces key concepts explored in Pahl and Rowsell's Literacy and Education, including artifactual critical literacy, multimodality and design. It gives a sense of the rich narratives, sites and classroom examples which feature in the book – key elements of a New Literacy Studies approach, but here applied to classroom practice.
Abstract:
An advanced rule-based Transit Signal Priority (TSP) control method is presented in this paper. An on-line transit travel time prediction model is the key component of the proposed method, enabling the selection of the most appropriate TSP plan for the prevailing traffic and transit conditions. The new method also adopts a priority plan re-development feature that enables modifying, or even switching, the already implemented priority plan to accommodate changes in traffic conditions. The proposed method utilizes conventional green extension and red truncation strategies, as well as two new strategies: green truncation and queue clearance. The new method is evaluated in microsimulation against a typical active TSP strategy and a base case scenario assuming no TSP control. The evaluation results indicate that the proposed method can produce significant benefits in reducing bus delay time and improving service regularity, with negligible adverse impacts on non-transit street traffic.
Abstract:
The purpose of Changing Lanes was to question the identity of Brisbane laneways through the collaboration of local stakeholders by promoting design. Community partners provided design briefs for student work from Architecture and Interior Design to be included in a design competition. Shortlisted student projects were featured in the Changing Lanes event, during which the winners were announced. In addition to student work from Architecture and Interior Design, the five other disciplines from QUT's School of Design also exhibited samples of student work. The engagement of local stakeholders (an architectural practice, interior designers, engineers, and a media and publication agency) was fundamental to the success of this event. The design work on display provided creative expression for the potential of Brisbane laneways to bring communities together through the language of design. An underutilised area of Fortitude Valley was activated through a combination of media including drawings, videos, street furniture, and music.
Abstract:
This paper considers the role of CCTV (closed-circuit television) in the surveillance, policing and control of public space in urban and rural locations, specifically in relation to the use of public space by young people. The use of CCTV technology in public spaces is now an established and largely uncontested feature of everyday life in a number of countries, and the assertion that cameras are essentially there for the protection of law-abiding and consuming citizens has broadly gone unchallenged. With little or no debate in the U.K. to critique the claims made by the burgeoning security industry that CCTV protects people in the form of a 'Big Friend', the state at both central and local levels has endorsed the installation of CCTV apparatus across the nation. Some areas assert in their promotional material that the centre of the shopping and leisure zone is fully surveilled by cameras in order to reassure visitors that their personal safety is a matter of civic concern, with even small towns and villages expending monies on sophisticated and expensive-to-maintain camera systems. It is within this context of monitoring, recording and control procedures that young people's use of public space is constructed as a threat to social order, in need of surveillance and exclusion; this construction forms a major and contemporary feature in shaping thinking about urban and rural working-class young people in the U.K. As Loader (1996) notes, young people's claims on public space rarely gain legitimacy if 'colliding' with those of local residents. Davis (1990) describes the increasing 'militarization and destruction of public space', while Jacobs (1965) asserts that full participation in the 'daily life of urban streets' is essential to the development of young people and beneficial for all who live in an area.
This paper challenges the uncritical acceptance of widespread use of CCTV and identifies its oppressive and malevolent potential in forming a 'surveillance gaze' over young people (adapting Foucault's 'clinical gaze', c. 1973) which can jeopardise mental health and well-being in coping with the 'metropolis', after Simmel (1964).
Abstract:
This paper presents a graph-based method to weight medical concepts in documents for the purposes of information retrieval. Medical concepts are extracted from free-text documents using a state-of-the-art technique that maps n-grams to concepts from the SNOMED CT medical ontology. In our graph-based concept representation, concepts are vertices in a graph built from a document, and edges represent associations between concepts. This representation naturally captures dependencies between concepts, an important requirement for interpreting medical text and a feature lacking in bag-of-words representations. We apply existing graph-based term weighting methods to weight medical concepts. Using concepts rather than terms addresses vocabulary mismatch, as well as encapsulating terms belonging to a single medical entity into a single concept. In addition, we further extend previous graph-based approaches by injecting domain knowledge that estimates the importance of a concept within the global medical domain. Retrieval experiments on the TREC Medical Records collection show our method outperforms both term and concept baselines. More generally, this work provides a means of integrating background knowledge contained in medical ontologies into data-driven information retrieval approaches.
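One common family of graph-based term weighting methods that could be applied as described above is random-walk (PageRank/TextRank-style) weighting. The sketch below is a generic illustration, not the paper's exact method: concepts become vertices, co-occurrence within a small window adds edges, and a few PageRank iterations yield concept weights. The concept sequence is a hypothetical example.

```python
# Sketch: TextRank-style weighting of concepts via a co-occurrence graph.
from collections import defaultdict

def concept_graph(concepts, window=2):
    """Undirected co-occurrence graph over a concept sequence."""
    edges = defaultdict(set)
    for i, c in enumerate(concepts):
        for d in concepts[i + 1:i + 1 + window]:
            if d != c:
                edges[c].add(d)
                edges[d].add(c)
    return edges

def pagerank(edges, damping=0.85, iters=30):
    """A few power iterations of PageRank over an undirected graph."""
    nodes = list(edges)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iters):
        rank = {
            n: (1 - damping) / len(nodes)
               + damping * sum(rank[m] / len(edges[m]) for m in edges[n])
            for n in nodes
        }
    return rank

# Hypothetical concept sequence extracted from a clinical note
doc = ["hypertension", "lisinopril", "hypertension", "chest_pain", "aspirin"]
weights = pagerank(concept_graph(doc))
```

In the paper's setting the vertices would be SNOMED CT concepts rather than raw terms, and the domain-knowledge extension mentioned above could be folded in, e.g. as a non-uniform teleport distribution.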
Abstract:
Retrieving information from Twitter is always challenging due to its large volume, inconsistent writing and noise. Most existing information retrieval (IR) and text mining methods focus on term-based approaches, but suffer from problems of term variation such as polysemy and synonymy. These problems worsen when such methods are applied to Twitter because of its length limit. Over the years, people have held the hypothesis that pattern-based methods should perform better than term-based methods as they provide more context, but limited studies have been conducted to support this hypothesis, especially on Twitter. This paper presents an innovative framework to address the issue of performing IR in microblogs. The proposed framework discovers patterns in tweets as higher-level features and assigns weights to low-level features (i.e. terms) based on their distributions in the higher-level features. We present experimental results on the TREC11 microblog dataset, showing that our proposed approach significantly outperforms the term-based methods Okapi BM25 and TF-IDF, as well as pattern-based methods, in precision, recall and F-measure.
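The Okapi BM25 baseline named above has a standard closed form, sketched here over a toy tweet collection (the tweets and query are invented; parameters k1 and b take their usual defaults).

```python
# Sketch: Okapi BM25 scoring over a toy collection of tokenised tweets.
import math

def bm25_score(query, doc, docs, k1=1.2, b=0.75):
    """BM25 score of one document for a bag-of-words query."""
    N = len(docs)
    avgdl = sum(len(d) for d in docs) / N
    score = 0.0
    for term in query:
        n = sum(term in d for d in docs)              # document frequency
        idf = math.log((N - n + 0.5) / (n + 0.5) + 1)
        tf = doc.count(term)
        score += idf * tf * (k1 + 1) / (tf + k1 * (1 - b + b * len(doc) / avgdl))
    return score

tweets = [["obama", "wins", "election"],
          ["new", "phone", "release"],
          ["election", "results", "delayed", "again"]]
q = ["election", "results"]
scores = [bm25_score(q, t, tweets) for t in tweets]
```

A pattern-based framework like the one proposed would replace the per-term tf/idf statistics with weights derived from discovered higher-level patterns, while the ranking machinery stays comparable.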