178 results for candidate features
Abstract:
As increasing numbers of Chinese language learners choose to learn English online (CNNIC, 2012), there is a need to investigate popular websites and their language learning designs. This paper reports on the first stage of a study that analysed the pedagogical, linguistic and content features of 25 Chinese English Language Learning (ELL) websites ranked according to their value and importance to users. The website ranking was undertaken using a system known as PageRank. The aim of the study was to identify the features characterising popular sites as opposed to those of less popular sites for the purpose of producing a framework for ELL website design in the Chinese context. The study found that a pedagogical focus with developmental instructional materials accommodating diverse proficiency levels was a major contributor to website popularity. Chinese language use for translations and teaching directives and intermediate level English for learning materials were also significant features. Content topics included Anglophone/Western and non-Anglophone/Eastern contexts. Overall, popular websites were distinguished by their mediation of access to and scaffolded support for ELL.
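The abstract above names PageRank as the ranking system but does not say how the values were obtained. As a rough illustration only, the textbook power-iteration form of PageRank can be sketched on a toy link graph. The site names, the damping factor of 0.85 and the function name `pagerank` are assumptions for this sketch, not details from the study.

```python
def pagerank(links, damping=0.85, iterations=100, tol=1e-10):
    """Textbook PageRank by power iteration.

    links: dict mapping each page to the list of pages it links to.
    Returns a dict of page -> rank, with ranks summing to 1.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Rank held by pages without outgoing links is spread evenly over all pages.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {p: (1.0 - damping) / n + damping * dangling / n for p in pages}
        for page, targets in links.items():
            for target in targets:
                new_rank[target] += damping * rank[page] / len(targets)
        converged = sum(abs(new_rank[p] - rank[p]) for p in pages) < tol
        rank = new_rank
        if converged:
            break
    return rank


# Hypothetical link graph between four websites (illustrative only).
toy_links = {
    "site_a": ["site_b", "site_c"],
    "site_b": ["site_c"],
    "site_c": ["site_a"],
    "site_d": ["site_c"],
}
ranks = pagerank(toy_links)
ranking = sorted(ranks, key=ranks.get, reverse=True)
```

In this toy graph the most linked-to site ends up first in `ranking`, and the site that nothing links to ends up last.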
Rotorcraft collision avoidance using spherical image-based visual servoing and single point features
Abstract:
This paper presents a reactive collision avoidance method for small unmanned rotorcraft using spherical image-based visual servoing. Only a single point feature is used to guide the aircraft in a safe spiral-like trajectory around the target, whilst a spherical camera model ensures the target always remains visible. A decision strategy to stop the avoidance control is derived based on the properties of spiral-like motion, and the effect of accurate range measurements on the control scheme is discussed. We show that using a poor range estimate does not significantly degrade the collision avoidance performance, thus relaxing the need for accurate range measurements. We present simulated and experimental results using a small quadrotor to validate the approach.
Abstract:
Exponential growth of genomic data in the last two decades has made manual analyses impractical for all but trial studies. As genomic analyses have become more sophisticated, and move toward comparisons across large datasets, computational approaches have become essential. One of the most important biological questions is to understand the mechanisms underlying gene regulation. Genetic regulation is commonly investigated and modelled through the use of transcriptional regulatory network (TRN) structures. These model the regulatory interactions between two key components: transcription factors (TFs) and the target genes (TGs) they regulate. Transcriptional regulatory networks have proven to be invaluable scientific tools in Bioinformatics. When used in conjunction with comparative genomics, they have provided substantial insights into the evolution of regulatory interactions. Current approaches to regulatory network inference, however, omit two additional key entities: promoters and transcription factor binding sites (TFBSs). In this study, we attempted to explore the relationships among these regulatory components in bacteria. Our primary goal was to identify relationships that can assist in reducing the high false positive rates associated with transcription factor binding site predictions and thereupon enhance the reliability of the inferred transcription regulatory networks. In our preliminary exploration of relationships between the key regulatory components in Escherichia coli transcription, we discovered a number of potentially useful features. The combination of location score and sequence dissimilarity scores increased de novo binding site prediction accuracy by 13.6%. Another important observation made was with regards to the relationship between transcription factors grouped by their regulatory role and corresponding promoter strength. 
Our study of E. coli σ70 promoters found support at the 0.1 significance level for our hypothesis that weak promoters are preferentially associated with activator binding sites to enhance gene expression, whilst strong promoters have more repressor binding sites to repress or inhibit gene transcription. Although the observations were specific to σ70, they nevertheless strongly encourage additional investigations when more experimentally confirmed data are available. In our preliminary exploration of relationships between the key regulatory components in E. coli transcription, we discovered a number of potentially useful features, some of which proved successful in reducing the number of false positives when applied to re-evaluate binding site predictions. Of chief interest was the relationship observed between promoter strength and TFs with respect to their regulatory role. Based on the common assumption that promoter homology positively correlates with transcription rate, we hypothesised that weak promoters would have more transcription factors that enhance gene expression, whilst strong promoters would have more repressor binding sites. The t-tests for E. coli σ70 promoters returned a p-value of 0.072, which at the 0.1 significance level suggested support for our (alternative) hypothesis, although this trend may only be present for promoters where the corresponding TFBSs are either all repressors or all activators. Nevertheless, such suggestive results strongly encourage additional investigations when more experimentally confirmed data become available. Much of the remainder of the thesis concerns a machine learning study of binding site prediction, using the SVM and kernel methods, principally the spectrum kernel.
Spectrum kernels have been successfully applied in previous studies of protein classification [91, 92], as well as the related problem of promoter predictions [59], and we have here successfully applied the technique to refining TFBS predictions. The advantages provided by the SVM classifier were best seen in 'moderately'-conserved transcription factor binding sites as represented by our E. coli CRP case study. Inclusion of additional position feature attributes further increased accuracy by 9.1% but more notable was the considerable decrease in false positive rate from 0.8 to 0.5 while retaining 0.9 sensitivity. Improved prediction of transcription factor binding sites is in turn extremely valuable in improving inference of regulatory relationships, a problem notoriously prone to false positive predictions. Here, the number of false regulatory interactions inferred using the conventional two-component model was substantially reduced when we integrated de novo transcription factor binding site predictions as an additional criterion for acceptance in a case study of inference in the Fur regulon. This initial work was extended to a comparative study of the iron regulatory system across 20 Yersinia strains. This work revealed interesting, strain-specific differences, especially between pathogenic and non-pathogenic strains. Such differences were made clear through interactive visualisations using the TRNDiff software developed as part of this work, and would have remained undetected using conventional methods. This approach led to the nomination of the Yfe iron-uptake system as a candidate for further wet-lab experimentation due to its potential active functionality in non-pathogens and its known participation in full virulence of the bubonic plague strain. Building on this work, we introduced novel structures we have labelled as 'regulatory trees', inspired by the phylogenetic tree concept.
Instead of using gene or protein sequence similarity, the regulatory trees were constructed based on the number of similar regulatory interactions. While the common phylogenetic trees convey information regarding changes in gene repertoire, which we might regard as being analogous to 'hardware', the regulatory tree informs us of the changes in regulatory circuitry, in some respects analogous to 'software'. In this context, we explored the 'pan-regulatory network' for the Fur system, the entire set of regulatory interactions found for the Fur transcription factor across a group of genomes. In the pan-regulatory network, emphasis is placed on how the regulatory network for each target genome is inferred from multiple sources instead of a single source, as is the common approach. The benefit of using multiple reference networks is a more comprehensive survey of the relationships, and increased confidence in the regulatory interactions predicted. In the present study, we distinguish between relationships found across the full set of genomes as the 'core-regulatory-set', and interactions found only in a subset of genomes explored as the 'sub-regulatory-set'. We found nine Fur target gene clusters present across the four genomes studied; this core set potentially identifies basic regulatory processes essential for survival. Species-level differences are seen at the sub-regulatory-set level; for example, the known virulence factors YbtA and PchR were found in Y. pestis and P. aeruginosa respectively, but were not present in either E. coli or B. subtilis. Such factors, and the iron-uptake systems they regulate, are ideal candidates for wet-lab investigation to determine whether or not they are pathogen-specific. In this study, we employed a broad range of approaches to address our goals and assessed these methods using the Fur regulon as our initial case study.
We identified a set of promising feature attributes; demonstrated their success in increasing transcription factor binding site prediction specificity while retaining sensitivity, and showed the importance of binding site predictions in enhancing the reliability of regulatory interaction inferences. Most importantly, these outcomes led to the introduction of a range of visualisations and techniques, which are applicable across the entire bacterial spectrum and can be utilised in studies beyond the understanding of transcriptional regulatory networks.
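The spectrum kernel mentioned in the abstract above compares two sequences by counting shared k-mers. The following is a minimal sketch of the standard k-spectrum kernel, not the thesis implementation: the value of k, the normalisation step and the example sequences are assumptions chosen for illustration.

```python
from collections import Counter


def kmer_counts(sequence, k):
    """Count every overlapping substring of length k in the sequence."""
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


def spectrum_kernel(seq_a, seq_b, k=3):
    """Inner product of the two k-mer count vectors."""
    counts_a = kmer_counts(seq_a, k)
    counts_b = kmer_counts(seq_b, k)
    return sum(count * counts_b[kmer] for kmer, count in counts_a.items())


def normalised_spectrum_kernel(seq_a, seq_b, k=3):
    """Cosine-style normalisation so that identical sequences score 1.0."""
    denom = (spectrum_kernel(seq_a, seq_a, k) * spectrum_kernel(seq_b, seq_b, k)) ** 0.5
    return spectrum_kernel(seq_a, seq_b, k) / denom if denom else 0.0


# Hypothetical candidate binding-site sequences (illustrative only).
site_1 = "TGTGATCTAGATCACA"
site_2 = "TGTGACCTAGGTCACA"
similarity = normalised_spectrum_kernel(site_1, site_2, k=3)
```

A matrix of such kernel values over all pairs of candidate sites could then be handed to an SVM that accepts a precomputed kernel, which is one common way of using this kind of kernel for classification.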
Abstract:
This paper investigates the use of mel-frequency delta-phase (MFDP) features in comparison to, and in fusion with, traditional mel-frequency cepstral coefficient (MFCC) features within joint factor analysis (JFA) speaker verification. MFCC features, commonly used in speaker recognition systems, are derived purely from the magnitude spectrum, with the phase spectrum completely discarded. In this paper, we investigate whether features derived from the phase spectrum can provide additional speaker discriminant information to the traditional MFCC approach in a JFA-based speaker verification system. Results are presented which provide a comparison of MFCC-only, MFDP-only and score fusion of the two approaches within a JFA speaker verification approach. Based upon the results presented using the NIST 2008 Speaker Recognition Evaluation (SRE) dataset, we believe that, while MFDP features alone cannot compete with MFCC features, MFDP can provide complementary information that results in improved speaker verification performance when both approaches are combined in score fusion, particularly in the case of shorter utterances.
Abstract:
Background: Cancer outlier profile analysis (COPA) has proven to be an effective approach to analyzing cancer expression data, leading to the discovery of the TMPRSS2 and ETS family gene fusion events in prostate cancer. However, the original COPA algorithm did not identify down-regulated outliers, and the currently available R package implementing the method is similarly restricted to the analysis of over-expressed outliers. Here we present a modified outlier detection method, mCOPA, which contains refinements to the outlier-detection algorithm, identifies both over- and under-expressed outliers, is freely available, and can be applied to any expression dataset. Results: We compare our method to other feature-selection approaches, and demonstrate that mCOPA frequently selects more-informative features than do differential expression or variance-based feature selection approaches, and is able to recover observed clinical subtypes more consistently. We demonstrate the application of mCOPA to prostate cancer expression data, and explore the use of outliers in clustering, pathway analysis, and the identification of tumour suppressors. We analyse the under-expressed outliers to identify known and novel prostate cancer tumour suppressor genes, validating these against data in Oncomine and the Cancer Gene Index. We also demonstrate how a combination of outlier analysis and pathway analysis can identify molecular mechanisms disrupted in individual tumours. Conclusions: We demonstrate that mCOPA offers advantages, compared to differential expression or variance, in selecting outlier features, and that the features so selected are better able to assign samples to clinically annotated subtypes. Further, we show that the biology explored by outlier analysis differs from that uncovered in differential expression or variance analysis.
mCOPA is an important new tool for the exploration of cancer datasets and the discovery of new cancer subtypes, and can be combined with pathway and functional analysis approaches to discover mechanisms underpinning heterogeneity in cancers.
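The abstract above does not spell out the mCOPA refinements, so the sketch below covers only the general idea behind COPA-style scoring as it is commonly described: centre each gene on its median, scale by the median absolute deviation (MAD), then read off an upper percentile for over-expressed outliers. The lower percentile is added here to mirror the under-expressed case. The percentile choices, function names and toy data are assumptions.

```python
from statistics import median


def copa_transform(values):
    """Median-centre one gene's expression values and scale by the MAD."""
    centre = median(values)
    mad = median(abs(v - centre) for v in values)
    if mad == 0:
        raise ValueError("MAD is zero; this gene has no usable spread")
    return [(v - centre) / mad for v in values]


def percentile(sorted_values, q):
    """Linear-interpolation percentile of pre-sorted values, q in [0, 100]."""
    pos = (len(sorted_values) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (sorted_values[hi] - sorted_values[lo]) * (pos - lo)


def outlier_scores(values, upper_q=90, lower_q=10):
    """Return (over-expression score, under-expression score) for one gene."""
    transformed = sorted(copa_transform(values))
    return percentile(transformed, upper_q), percentile(transformed, lower_q)


# Hypothetical expression values for one gene across ten samples;
# the last sample is a strong over-expressed outlier.
toy_expression = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
up_score, down_score = outlier_scores(toy_expression)
```

Because the median and MAD are robust to a few extreme samples, the single outlier inflates the upper score without distorting the scaling, which is what makes this transform useful for subgroup-specific events.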
Abstract:
Optimisation of Organic Rankine Cycles (ORCs) for binary-cycle geothermal applications could play a major role in determining the competitiveness of low to moderate temperature geothermal resources. Part of this optimisation process is matching cycles to a given resource such that power output can be maximised. Two major and largely interrelated components of the cycle are the working fluid and the turbine. Both components need careful consideration: the selection of working fluid and appropriate operating conditions as well as optimisation of the turbine design for those conditions will determine the amount of power that can be extracted from a resource. In this paper, we present the rationale for the use of radial-inflow turbines for ORC applications and the preliminary design of several radial-inflow machines based on a number of promising ORC systems that use five different working fluids: R134a, R143a, R236fa, R245fa and n-Pentane. Preliminary meanline analysis led to the generation of turbine designs for the various cycles with similar efficiencies (77%) but large differences in dimensions (139–289 mm rotor diameter). The highest performing cycle, based on R134a, was found to produce 33% more net power from a 150 °C resource flowing at 10 kg/s than the lowest performing cycle, based on n-Pentane.
Abstract:
Optimisation of Organic Rankine Cycles (ORCs) for binary-cycle geothermal applications could play a major role in the competitiveness of low to moderate temperature geothermal resources. Part of this optimisation process is matching cycles to a given resource such that power output can be maximised. Two major and largely interrelated components of the cycle are the working fluid and the turbine. Both components need careful consideration. Due to the temperature differences in geothermal resources a one-size-fits-all approach to surface power infrastructure is not appropriate. Furthermore, the traditional use of steam as a working fluid does not seem practical due to the low temperatures of many resources. A variety of organic fluids with low boiling points may be utilised as ORC working fluids in binary power cycle loops. Due to differences in thermodynamic properties, certain fluids are able to extract more heat from a given resource than others over certain temperature and pressure ranges. This enables the tailoring of power cycle infrastructure to best match the geothermal resource through careful selection of the working fluid and turbine design optimisation to yield the optimum overall cycle performance. This paper presents the rationale for the use of radial-inflow turbines for ORC applications and the preliminary design of several radial-inflow turbines based on a selection of promising ORC cycles using five different high-density working fluids: R134a, R143a, R236fa, R245fa and n-Pentane at sub- or trans-critical conditions. Numerous studies published compare a variety of working fluids for various ORC configurations. However, there is little information specifically pertaining to the design and implementation of ORCs using realistic radial turbine designs in terms of pressure ratios, inlet pressure, rotor size and rotational speed. 
Preliminary 1D analysis leads to the generation of turbine designs for the various cycles with similar efficiencies (77%) but large differences in dimensions (139–289 mm rotor diameter). The highest performing cycle (R134a) was found to produce 33% more net power from a 150 °C resource flowing at 10 kg/s than the lowest performing cycle (n-Pentane).
Abstract:
Power system restoration after a large area outage involves many factors, and the procedure is usually very complicated. A decision-making support system could then be developed so as to find the optimal black-start strategy. In order to evaluate candidate black-start strategies, some indices, usually both qualitative and quantitative, are employed. However, it may not be possible to directly synthesize these indices, and different extents of interactions may exist among these indices. In the existing black-start decision-making methods, qualitative and quantitative indices cannot be well synthesized, and the interactions among different indices are not taken into account. The vague set, an extended version of the well-developed fuzzy set, could be employed to deal with decision-making problems with interacting attributes. Given this background, the vague set is first employed in this work to represent the indices for facilitating the comparisons among them. Then, a concept of the vague-valued fuzzy measure is presented, and on that basis a mathematical model for black-start decision-making is developed. Compared with the existing methods, the proposed method could deal with the interactions among indices and more reasonably represent the fuzzy information. Finally, an actual power system is used to demonstrate the basic features of the developed model and method.
Abstract:
Image representations derived from simplified models of the primary visual cortex (V1), such as HOG and SIFT, elicit good performance in a myriad of visual classification tasks including object recognition/detection, pedestrian detection and facial expression classification. A central question in the vision, learning and neuroscience communities regards why these architectures perform so well. In this paper, we offer a unique perspective to this question by subsuming the role of V1-inspired features directly within a linear support vector machine (SVM). We demonstrate that a specific class of such features in conjunction with a linear SVM can be reinterpreted as inducing a weighted margin on the Kronecker basis expansion of an image. This new viewpoint on the role of V1-inspired features allows us to answer fundamental questions on the uniqueness and redundancies of these features, and offer substantial improvements in terms of computational and storage efficiency.
Abstract:
The aim of this paper is to examine the association between a range of objectively measured neighbourhood features and the likelihood of mid-aged adults walking for transport. Increased walking for transport would bring multiple benefits, including improved population and environmental health. As part of the baseline HABITAT study, 10,745 residents of Brisbane, Australia, aged 40–65 years, from 200 neighbourhoods were asked about the time they spent walking for transport. Walking data were collected by mail survey and the physical environmental features of neighbourhoods were compiled using a geographic information systems database. Walking for transport was categorised into four levels and the association between walking and each neighbourhood characteristic was examined using multilevel multinomial models. A number of threshold trends were evident; for example, off-road bikeways were consistently associated with walking between 60 and 150 min per week. Living within 500 m of public transit was also an important predictor but only for those who walked for less than 150 min per week. Interventions targeting these neighbourhood characteristics may lead to improved environmental quality, lower rates of overweight and obesity and associated chronic disease.
Abstract:
In order to comprehend user information needs by concepts, this paper introduces a novel method to match relevance features with ontological concepts. The method first discovers relevance features from user local instances. Then, a concept matching approach is developed for matching these features to accurate concepts in a global knowledge base. This approach is significant for the transition between informative descriptors and conceptual descriptors. The proposed method is thoroughly evaluated by comparing it against three information gathering baseline models. The experimental results show that the matching approach is successful and achieves a series of remarkable improvements in search effectiveness.
Abstract:
In recent years, there has been a growing interest from the design and construction community in adopting Building Information Models (BIM). BIM provides semantically rich information models that explicitly represent both 3D geometric information (e.g., component dimensions) and non-geometric properties (e.g., material properties). While the richness of design information offered by BIM is evident, there are still tremendous challenges in getting construction-specific information out of BIM, limiting the usability of these models for construction. In this paper, we describe our approach for extracting construction-specific design conditions from a BIM model based on user-defined queries. This approach leverages an ontology of features we are developing to formalize the design conditions that affect construction. Our current implementation analyzes the component geometry and topological relationships between components in a BIM model represented using the Industry Foundation Classes (IFC) to identify construction features. We describe the reasoning process implemented to extract these construction features, and provide a critique of the IFC's ability to support the querying process. We use examples from two case studies to illustrate the construction features, the querying process, and the challenges involved in deriving construction features from an IFC model.
Abstract:
Population-wide associations between loci due to linkage disequilibrium can be used to map quantitative trait loci (QTL) with high resolution. However, spurious associations between markers and QTL can also arise as a consequence of population stratification. Statistical methods that cannot differentiate loci associations due to linkage disequilibrium from those caused in other ways can render false-positive results. The transmission-disequilibrium test (TDT) is a robust test for detecting QTL. The TDT exploits within-family associations that are not affected by population stratification. However, some TDTs are formulated in a rigid form, with reduced potential applications. In this study we generalize TDT using mixed linear models to allow greater statistical flexibility. Allelic effects are estimated with two independent parameters: one exploiting the robust within-family information and the other the potentially biased between-family information. A significant difference between these two parameters can be used as evidence for spurious association. This methodology was then used to test the effects of the fourth melanocortin receptor (MC4R) on production traits in the pig. The new analyses supported the previously reported results; i.e., the studied polymorphism is either causal or in very strong linkage disequilibrium with the causal mutation, and provided no evidence for spurious association.
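For orientation, the classical allele-based TDT that the abstract above generalises can be sketched in a few lines. This is the textbook statistic for a biallelic marker, not the mixed-linear-model formulation described in the abstract, and the transmission counts are made up for illustration.

```python
import math


def classical_tdt(transmitted, untransmitted):
    """Classical TDT for one allele of a biallelic marker.

    transmitted / untransmitted: how often heterozygous parents did or did not
    pass the allele of interest to an affected offspring.
    Returns (chi-square statistic with 1 degree of freedom, p-value).
    """
    total = transmitted + untransmitted
    if total == 0:
        raise ValueError("no informative (heterozygous-parent) transmissions")
    chi2 = (transmitted - untransmitted) ** 2 / total
    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts: the allele was transmitted 60 times and not transmitted 40 times.
chi2, p_value = classical_tdt(60, 40)
```

Only heterozygous parents contribute to the counts, which is why the test is unaffected by population stratification: each parent serves as their own control.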
Abstract:
The assembly of retroviruses is driven by oligomerization of the Gag polyprotein. We have used cryo-electron tomography together with subtomogram averaging to describe the three-dimensional structure of in vitro-assembled Gag particles from human immunodeficiency virus, Mason-Pfizer monkey virus, and Rous sarcoma virus. These represent three different retroviral genera: the lentiviruses, betaretroviruses and alpharetroviruses. Comparison of the three structures reveals the features of the supramolecular organization of Gag that are conserved between genera and therefore reflect general principles of Gag-Gag interactions and the features that are specific to certain genera. All three Gag proteins assemble to form approximately spherical hexameric lattices with irregular defects. In all three genera, the N-terminal domain of CA is arranged in hexameric rings around large holes. Where the rings meet, 2-fold densities, assigned to the C-terminal domain of CA, extend between adjacent rings, and link together at the 6-fold symmetry axis with a density, which extends toward the center of the particle into the nucleic acid layer. Although this general arrangement is conserved, differences can be seen throughout the CA and spacer peptide regions. These differences can be related to sequence differences among the genera. We conclude that the arrangement of the structural domains of CA is well conserved across genera, whereas the relationship between CA, the spacer peptide region, and the nucleic acid is more specific to each genus.