242 results for Floristic similarity
Abstract:
Appearance-based localization is increasingly used for loop closure detection in metric SLAM systems. Since it relies only upon the appearance-based similarity between images from two locations, it can perform loop closure regardless of accumulated metric error. However, the computation time and memory requirements of current appearance-based methods scale linearly not only with the size of the environment but also with the operation time of the platform. These properties impose severe restrictions on long-term autonomy for mobile robots, as loop closure performance will inevitably degrade with increased operation time. We present a set of improvements to the appearance-based SLAM algorithm CAT-SLAM to constrain computation scaling and memory usage with minimal degradation in performance over time. The appearance-based comparison stage is accelerated by exploiting properties of the particle observation update, and nodes in the continuous trajectory map are removed according to minimal information loss criteria. We demonstrate constant time and space loop closure detection in a large urban environment with recall performance exceeding FAB-MAP by a factor of 3 at 100% precision, and investigate the minimum computational and memory requirements for maintaining mapping performance.
Abstract:
Background Bactrocera dorsalis s.s. is a pestiferous tephritid fruit fly distributed from Pakistan to the Pacific, with the Thai/Malay peninsula its southern limit. Sister pest taxa, B. papayae and B. philippinensis, occur in the southeast Asian archipelago and the Philippines, respectively. The relationship among these species is unclear due to their high molecular and morphological similarity. This study analysed population structure of these three species within a southeast Asian biogeographical context to assess potential dispersal patterns and the validity of their current taxonomic status. Results Geometric morphometric results generated from 15 landmarks for wings of 169 flies revealed significant differences in wing shape between almost all sites following canonical variate analysis. For the combined data set there was a greater isolation-by-distance (IBD) effect under a ‘non-Euclidean’ scenario which used geographical distances within a biogeographical ‘Sundaland context’ (r² = 0.772, P < 0.0001) as compared to a ‘Euclidean’ scenario for which direct geographic distances between sample sites were used (r² = 0.217, P < 0.01). COI sequence data were obtained for 156 individuals and yielded 83 unique haplotypes with no correlation to current taxonomic designations via a minimum spanning network. BEAST analysis provided a root age and location of 540 kya in northern Thailand, with migration of B. dorsalis s.l. into Malaysia 470 kya and Sumatra 270 kya. Two migration events into the Philippines are inferred. Sequence data revealed a weak but significant IBD effect under the ‘non-Euclidean’ scenario (r² = 0.110, P < 0.05), with no historical migration evident between Taiwan and the Philippines. Results are consistent with those expected at the intra-specific level. Conclusions Bactrocera dorsalis s.s., B. papayae and B. philippinensis likely represent one species structured around the South China Sea, having migrated from northern Thailand into the southeast Asian archipelago and across into the Philippines. No migration is apparent between the Philippines and Taiwan. This information has implications for quarantine, trade and pest management.
Abstract:
Siamese mud carp (Henicorhynchus siamensis) is a freshwater teleost of high economic importance in the Mekong River Basin. However, genetic data relevant for delineating wild stocks for management purposes currently are limited for this species. Here, we used 454 pyrosequencing to generate a partial genome survey sequence (GSS) dataset to develop simple sequence repeat (SSR) markers from H. siamensis genomic DNA. Data generated included a total of 65,954 sequence reads with average length of 264 nucleotides, of which 2.79% contain SSR motifs. Based on GSS-BLASTx results, 10.5% of contigs and 8.1% of singletons possessed significant similarity (E value < 10⁻⁵) with the majority matching well to reported fish sequences. KEGG analysis identified several metabolic pathways that provide insights into specific potential roles and functions of sequences involved in molecular processes in H. siamensis. Top protein domains detected included reverse transcriptase and the top putative functional transcript identified was an ORF2-encoded protein. A total of 1,837 sequences containing SSR motifs were identified, of which 422 qualified for primer design, and eight polymorphic loci have been tested with average observed and expected heterozygosity estimated at 0.75 and 0.83, respectively. Regardless of their relative levels of polymorphism and heterozygosity, microsatellite loci developed here are suitable for further population genetic studies in H. siamensis and may also be applicable to other related taxa.
Abstract:
This paper outlines a novel approach for modelling semantic relationships within medical documents. Medical terminologies contain a rich source of semantic information critical to a number of techniques in medical informatics, including medical information retrieval. Recent research suggests that corpus-driven approaches are effective at automatically capturing semantic similarities between medical concepts, thus making them an attractive option for accessing semantic information. Most previous corpus-driven methods only considered syntagmatic associations. In this paper, we adapt a recent approach that explicitly models both syntagmatic and paradigmatic associations. We show that the implicit similarity between certain medical concepts can only be modelled using paradigmatic associations. In addition, the inclusion of both types of associations overcomes the sensitivity to the training corpus experienced by previous approaches, making our method both more effective and more robust. This finding may have implications for researchers in the area of medical information retrieval.
Abstract:
Genetic recombination is a fundamental evolutionary mechanism promoting biological adaptation. Using engineered recombinants of the small single-stranded DNA plant virus, Maize streak virus (MSV), we experimentally demonstrate that fragments of genetic material only function optimally if they reside within genomes similar to those in which they evolved. The degree of similarity necessary for optimal functionality is correlated with the complexity of intragenomic interaction networks within which genome fragments must function. There is a striking correlation between our experimental results and the types of MSV recombinants that are detectable in nature, indicating that obligatory maintenance of intragenome interaction networks strongly constrains the evolutionary value of recombination for this virus and probably for genomes in general.
Abstract:
Background Maize streak virus -strain A (MSV-A; Genus Mastrevirus, Family Geminiviridae), the maize-adapted strain of MSV that causes maize streak disease throughout sub-Saharan Africa, probably arose between 100 and 200 years ago via homologous recombination between two MSV strains adapted to wild grasses. MSV recombination experiments and analyses of natural MSV recombination patterns have revealed that this recombination event entailed the exchange of the movement protein - coat protein gene cassette, bounded by the two genomic regions most prone to recombination in mastrevirus genomes; the first surrounding the virion-strand origin of replication, and the second around the interface between the coat protein gene and the short intergenic region. Therefore, aside from the likely adaptive advantages presented by a modular exchange of this cassette, these specific breakpoints may have been largely predetermined by the underlying mechanisms of mastrevirus recombination. To investigate this hypothesis, we constructed artificial, low-fitness, reciprocal chimaeric MSV genomes using alternating genomic segments from two MSV strains; a grass-adapted MSV-B, and a maize-adapted MSV-A. Between them, each pair of reciprocal chimaeric genomes represented all of the genetic material required to reconstruct - via recombination - the highly maize-adapted MSV-A genotype, MSV-MatA. We then co-infected a selection of differentially MSV-resistant maize genotypes with pairs of reciprocal chimaeras to determine the efficiency with which recombination would give rise to high-fitness progeny genomes resembling MSV-MatA. Results Recombinants resembling MSV-MatA invariably arose in all of our experiments. 
However, the accuracy and efficiency with which the MSV-MatA genotype was recovered across all replicates of each experiment depended on the MSV susceptibility of the maize genotypes used and the precise positions - in relation to known recombination hotspots - of the breakpoints required to re-create MSV-MatA. Although the MSV-sensitive maize genotype gave rise to the greatest variety of recombinants, the measured fitness of each of these recombinants correlated with their similarity to MSV-MatA. Conclusions The mechanistic predispositions of different MSV genomic regions to recombination can strongly influence the accessibility of high-fitness MSV recombinants. The frequency with which the fittest recombinant MSV genomes arise also correlates directly with the escalating selection pressures imposed by increasingly MSV-resistant maize hosts.
Abstract:
Exponential growth of genomic data in the last two decades has made manual analyses impractical for all but trial studies. As genomic analyses have become more sophisticated, and move toward comparisons across large datasets, computational approaches have become essential. One of the most important biological questions is to understand the mechanisms underlying gene regulation. Genetic regulation is commonly investigated and modelled through the use of transcriptional regulatory network (TRN) structures. These model the regulatory interactions between two key components: transcription factors (TFs) and the target genes (TGs) they regulate. Transcriptional regulatory networks have proven to be invaluable scientific tools in Bioinformatics. When used in conjunction with comparative genomics, they have provided substantial insights into the evolution of regulatory interactions. Current approaches to regulatory network inference, however, omit two additional key entities: promoters and transcription factor binding sites (TFBSs). In this study, we attempted to explore the relationships among these regulatory components in bacteria. Our primary goal was to identify relationships that can assist in reducing the high false positive rates associated with transcription factor binding site predictions and thereupon enhance the reliability of the inferred transcription regulatory networks. In our preliminary exploration of relationships between the key regulatory components in Escherichia coli transcription, we discovered a number of potentially useful features. The combination of location score and sequence dissimilarity scores increased de novo binding site prediction accuracy by 13.6%. Another important observation made was with regards to the relationship between transcription factors grouped by their regulatory role and corresponding promoter strength. 
Our study of E. coli σ70 promoters found support at the 0.1 significance level for our hypothesis that weak promoters are preferentially associated with activator binding sites to enhance gene expression, whilst strong promoters have more repressor binding sites to repress or inhibit gene transcription. Although the observations were specific to σ70, they nevertheless strongly encourage additional investigations when more experimentally confirmed data are available. In our preliminary exploration of relationships between the key regulatory components in E. coli transcription, we discovered a number of potentially useful features, some of which proved successful in reducing the number of false positives when applied to re-evaluate binding site predictions. Of chief interest was the relationship observed between promoter strength and TFs with respect to their regulatory role. Based on the common assumption that promoter homology positively correlates with transcription rate, we hypothesised that weak promoters would have more transcription factors that enhance gene expression, whilst strong promoters would have more repressor binding sites. The t-tests performed for E. coli σ70 promoters returned a p-value of 0.072, which at the 0.1 significance level suggested support for our (alternative) hypothesis, although this trend may only be present for promoters where corresponding TFBSs are either all repressors or all activators. Nevertheless, such suggestive results strongly encourage additional investigations when more experimentally confirmed data become available. Much of the remainder of the thesis concerns a machine learning study of binding site prediction, using the SVM and kernel methods, principally the spectrum kernel. 
Spectrum kernels have been successfully applied in previous studies of protein classification [91, 92], as well as the related problem of promoter predictions [59], and we have here successfully applied the technique to refining TFBS predictions. The advantages provided by the SVM classifier were best seen in 'moderately' conserved transcription factor binding sites as represented by our E. coli CRP case study. Inclusion of additional position feature attributes further increased accuracy by 9.1% but more notable was the considerable decrease in false positive rate from 0.8 to 0.5 while retaining 0.9 sensitivity. Improved prediction of transcription factor binding sites is in turn extremely valuable in improving inference of regulatory relationships, a problem notoriously prone to false positive predictions. Here, the number of false regulatory interactions inferred using the conventional two-component model was substantially reduced when we integrated de novo transcription factor binding site predictions as an additional criterion for acceptance in a case study of inference in the Fur regulon. This initial work was extended to a comparative study of the iron regulatory system across 20 Yersinia strains. This work revealed interesting, strain-specific differences, especially between pathogenic and non-pathogenic strains. Such differences were made clear through interactive visualisations using the TRNDiff software developed as part of this work, and would have remained undetected using conventional methods. This approach led to the nomination of the Yfe iron-uptake system as a candidate for further wet-lab experimentation due to its potential active functionality in non-pathogens and its known participation in full virulence of the bubonic plague strain. Building on this work, we introduced novel structures we have labelled as 'regulatory trees', inspired by the phylogenetic tree concept. 
Instead of using gene or protein sequence similarity, the regulatory trees were constructed based on the number of similar regulatory interactions. While the common phylogenetic trees convey information regarding changes in gene repertoire, which we might regard as being analogous to 'hardware', the regulatory tree informs us of the changes in regulatory circuitry, in some respects analogous to 'software'. In this context, we explored the 'pan-regulatory network' for the Fur system, the entire set of regulatory interactions found for the Fur transcription factor across a group of genomes. In the pan-regulatory network, emphasis is placed on how the regulatory network for each target genome is inferred from multiple sources instead of a single source, as is the common approach. The benefit of using multiple reference networks is a more comprehensive survey of the relationships, and increased confidence in the regulatory interactions predicted. In the present study, we distinguish between relationships found across the full set of genomes as the 'core-regulatory-set', and interactions found only in a subset of genomes explored as the 'sub-regulatory-set'. We found nine Fur target gene clusters present across the four genomes studied, this core set potentially identifying basic regulatory processes essential for survival. Species-level differences are seen at the sub-regulatory-set level; for example, the known virulence factors YbtA and PchR were found in Y. pestis and P. aeruginosa respectively, but were absent from both E. coli and B. subtilis. Such factors, and the iron-uptake systems they regulate, are ideal candidates for wet-lab investigation to determine whether or not they are pathogen-specific. In this study, we employed a broad range of approaches to address our goals and assessed these methods using the Fur regulon as our initial case study. 
We identified a set of promising feature attributes; demonstrated their success in increasing transcription factor binding site prediction specificity while retaining sensitivity, and showed the importance of binding site predictions in enhancing the reliability of regulatory interaction inferences. Most importantly, these outcomes led to the introduction of a range of visualisations and techniques, which are applicable across the entire bacterial spectrum and can be utilised in studies beyond the understanding of transcriptional regulatory networks.
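The spectrum kernel named in the abstract above compares two sequences by the k-mers they share: each sequence is mapped to a vector of k-mer counts, and the kernel value is the inner product of the two vectors. The sketch below illustrates that definition only. The function names, the choice of k and the example sequences are assumptions made for illustration and are not taken from the thesis, which does not describe its implementation here.

```python
from collections import Counter
import math


def kmer_counts(seq, k):
    """Count every overlapping substring of length k in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def spectrum_kernel(a, b, k):
    """Inner product of the k-mer count vectors of sequences a and b."""
    counts_a = kmer_counts(a, k)
    counts_b = kmer_counts(b, k)
    # A Counter returns 0 for k-mers it has not seen, so only shared
    # k-mers contribute to the sum.
    return sum(n * counts_b[kmer] for kmer, n in counts_a.items())


def normalised_spectrum_kernel(a, b, k):
    """Cosine-normalised kernel, so that a sequence scores 1.0 against itself."""
    denom = math.sqrt(spectrum_kernel(a, a, k) * spectrum_kernel(b, b, k))
    return spectrum_kernel(a, b, k) / denom if denom else 0.0


if __name__ == "__main__":
    # Hypothetical short DNA fragments, used only to exercise the functions.
    print(spectrum_kernel("ACGTAC", "ACGT", 2))
    print(normalised_spectrum_kernel("ACGTAC", "ACGT", 2))
```

A kernel function of this form can be passed to any SVM implementation that accepts precomputed kernel matrices; that step is omitted here.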
Abstract:
Traditional area-based matching techniques make use of similarity metrics such as the Sum of Absolute Differences (SAD), Sum of Squared Differences (SSD) and Normalised Cross Correlation (NCC). Non-parametric matching algorithms such as the rank and census rely on the relative ordering of pixel values rather than the pixels themselves as a similarity measure. Both traditional area-based and non-parametric stereo matching techniques have an algorithmic structure which is amenable to fast hardware realisation. This investigation undertakes a performance assessment of these two families of algorithms for robustness to radiometric distortion and random noise. A generic implementation framework is presented for the stereo matching problem, and the relative hardware requirements for the various metrics are investigated.
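To make the two families of measures concrete, the following minimal sketch computes SAD, SSD and NCC on two flattened image windows, alongside a census signature compared by Hamming distance. It uses the textbook definitions of these measures; the function names, the 3x3 window and the pixel values are illustrative assumptions and do not come from the paper.

```python
import math


def sad(a, b):
    """Sum of Absolute Differences between two equal-length windows."""
    return sum(abs(x - y) for x, y in zip(a, b))


def ssd(a, b):
    """Sum of Squared Differences between two equal-length windows."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def ncc(a, b):
    """Normalised Cross Correlation (not mean-subtracted)."""
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return num / den if den else 0.0


def census(window):
    """Census signature: one bit per neighbour, set when it is darker than the centre."""
    mid = len(window) // 2
    centre = window[mid]
    return tuple(int(p < centre) for i, p in enumerate(window) if i != mid)


def hamming(c1, c2):
    """Number of differing bits between two census signatures."""
    return sum(b1 != b2 for b1, b2 in zip(c1, c2))


if __name__ == "__main__":
    # A hypothetical 3x3 window, flattened row by row, and the same window
    # after a gain of 2 and an offset of 5 (a simple radiometric distortion).
    left = [10, 20, 30, 40, 50, 60, 70, 80, 90]
    right = [2 * p + 5 for p in left]
    print(sad(left, right), ssd(left, right), ncc(left, right))
    print(hamming(census(left), census(right)))
```

In this example the gain and offset change SAD and SSD substantially, while the census signature is unchanged because the ordering of pixel values is preserved. This is the property that motivates the non-parametric measures.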
Abstract:
The authors present a qualitative and quantitative comparison of various similarity measures that form the kernel of common area-based stereo-matching systems. The authors compare classical difference and correlation measures as well as nonparametric measures based on the rank and census transforms for a number of outdoor images. For robotic applications, important considerations include robustness to image defects such as intensity variation and noise, the number of false matches, and computational complexity. In the absence of ground truth data, the authors compare the matching techniques based on the percentage of matches that pass the left-right consistency test. The authors also evaluate the discriminatory power of several match validity measures that are reported in the literature for eliminating false matches and for estimating match confidence. For guidance applications, it is essential to have an estimate of confidence in the three-dimensional points generated by stereo vision. Finally, a new validity measure, the rank constraint, is introduced that is capable of resolving ambiguous matches for rank transform-based matching.
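The left-right consistency test used as the evaluation criterion above can be sketched for a single scanline as follows. The sketch assumes the common convention that a left-image pixel at column x with disparity d corresponds to the right-image pixel at column x - d; the function names, the tolerance parameter and the disparity values are hypothetical and are not taken from the paper.

```python
def left_right_consistent(disp_left, disp_right, tol=0):
    """Flag each left-image pixel whose match is confirmed from the right image.

    disp_left[x] is the disparity found when matching left to right;
    disp_right[x] is the disparity found when matching right to left.
    """
    flags = []
    for x, d in enumerate(disp_left):
        xr = x - d  # column of the matched pixel in the right image
        ok = 0 <= xr < len(disp_right) and abs(disp_right[xr] - d) <= tol
        flags.append(ok)
    return flags


def pass_rate(flags):
    """Percentage of matches that pass the consistency test."""
    return 100.0 * sum(flags) / len(flags) if flags else 0.0


if __name__ == "__main__":
    # Hypothetical disparities for one five-pixel scanline.
    disp_left = [0, 1, 1, 2, 1]
    disp_right = [1, 1, 1, 0, 0]
    flags = left_right_consistent(disp_left, disp_right)
    print(flags, pass_rate(flags))
```

The pass rate needs no ground truth, which is why it suits the outdoor images described in the abstract.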
Abstract:
This paper introduces PartSS, a new partition-based filtering for tasks performing string comparisons under edit distance constraints. PartSS offers improvements over the state-of-the-art method NGPP with the implementation of a new partitioning scheme and also improves filtering abilities by exploiting theoretical results on shifting and scaling ranges, thus accelerating the rate of calculating edit distance between strings. PartSS filtering has been implemented within two major tasks of data integration: similarity join and approximate membership extraction under edit distance constraints. The evaluation on an extensive range of real-world datasets demonstrates major gains in efficiency over NGPP and QGrams approaches.
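The general idea behind partition-based filtering can be illustrated with the standard pigeonhole argument: if a string is cut into k + 1 disjoint pieces and lies within edit distance k of another string, at least one piece must occur unchanged in that other string. The sketch below applies this generic filter before an exact edit distance check in a naive similarity join. It is not PartSS or NGPP, whose partitioning schemes and shifting and scaling ranges are not described in the abstract; all function names and example strings are assumptions.

```python
def edit_distance(s, t):
    """Levenshtein distance computed row by row with dynamic programming."""
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, 1):
        cur = [i]
        for j, ct in enumerate(t, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (cs != ct)))   # substitution or match
        prev = cur
    return prev[-1]


def partition(s, parts):
    """Split s into `parts` contiguous pieces of near-equal length."""
    base, extra = divmod(len(s), parts)
    pieces, start = [], 0
    for i in range(parts):
        size = base + (1 if i < extra else 0)
        pieces.append(s[start:start + size])
        start += size
    return pieces


def may_match(s, t, k):
    """Cheap filter: False guarantees that edit_distance(s, t) > k."""
    if abs(len(s) - len(t)) > k:
        return False
    # Pigeonhole: k edits can damage at most k of the k + 1 pieces.
    return any(piece in t for piece in partition(s, k + 1))


def similarity_join(left, right, k):
    """All pairs within edit distance k, verifying only pairs that pass the filter."""
    return [(s, t) for s in left for t in right
            if may_match(s, t, k) and edit_distance(s, t) <= k]


if __name__ == "__main__":
    print(similarity_join(["partition", "filter"],
                          ["partitoin", "falter", "window"], 2))
```

The filter never rejects a true match, so the join result is the same as a brute-force comparison; it only avoids the quadratic edit distance computation for pairs that cannot qualify.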
Abstract:
A numerical study is carried out using large eddy simulation to examine the heat and toxic gases released from fires in real road tunnels. Owing to tunnel fire disasters over the previous decade, the design of safe and reliable ventilation systems has attracted increasing attention from researchers. In this research, a real tunnel with a 10 MW fire (approximately the heat release rate of a burning bus) at the middle of the tunnel is simulated using FDS (Fire Dynamics Simulator) for different ventilation velocities. Vertical profiles of carbon monoxide concentration and temperature are shown for various locations to explore the flow field. It is found that, as the longitudinal ventilation velocity increases, the vertical profile gradients of both CO concentration and smoke temperature are reduced. However, a relatively large longitudinal ventilation velocity leads to a high similarity between the vertical profile of CO volume concentration and that of temperature rise.
Abstract:
Modelling video sequences by subspaces has recently shown promise for recognising human actions. Subspaces are able to accommodate the effects of various image variations and can capture the dynamic properties of actions. Subspaces form a non-Euclidean and curved Riemannian manifold known as a Grassmann manifold. Inference on manifold spaces usually is achieved by embedding the manifolds in higher dimensional Euclidean spaces. In this paper, we instead propose to embed the Grassmann manifolds into reproducing kernel Hilbert spaces and then tackle the problem of discriminant analysis on such manifolds. To achieve efficient machinery, we propose graph-based local discriminant analysis that utilises within-class and between-class similarity graphs to characterise intra-class compactness and inter-class separability, respectively. Experiments on KTH, UCF Sports, and Ballet datasets show that the proposed approach obtains marked improvements in discrimination accuracy in comparison to several state-of-the-art methods, such as the kernel version of affine hull image-set distance, tensor canonical correlation analysis, spatial-temporal words and hierarchy of discriminative space-time neighbourhood features.
Abstract:
Grouping users in social networks is an important process that improves matching and recommendation activities in social networks. The data mining methods of clustering can be used in grouping the users in social networks. However, the existing general-purpose clustering algorithms perform poorly on social network data due to the special nature of users' data in social networks. One main reason is the constraints that need to be considered in grouping users in social networks. Another reason is the need to capture a large amount of information about users, which imposes computational complexity on an algorithm. In this paper, we propose a scalable and effective constraint-based clustering algorithm based on a global similarity measure that takes into consideration the users' constraints and their importance in social networks. Each constraint's importance is calculated based on the occurrence of this constraint in the dataset. Performance of the algorithm is demonstrated on a dataset obtained from an online dating website using internal and external evaluation measures. Results show that the proposed algorithm increases the accuracy of matching users in social networks by 10% in comparison to other algorithms.
Abstract:
The human kallikrein-related peptidases are a subgroup of trypsin and chymotrypsin-like serine peptidases that are characterized by their homology to tissue kallikrein or kallikrein 1 (KLK1) encoded by the KLK1 gene (reviewed in [1-4]). The human KLK locus spans an approximately 320 kb region on chromosome 19q13.3-13.4 and contains fifteen genes encoding KLK1 and fourteen other kallikrein-related peptidases, KLK2-KLK15, which have been named contiguously in the locus in the order of their discovery [5-8] (Figure 606.1). It is the largest contiguous cluster of serine protease-encoding genes in the human genome, and evolved from gene duplication of KLK1 and then subsequent reduplication of the newly evolved KLK genes [2]. The high conservation noted for KLK1-KLK3 (62-77%) reflects the proposed duplication of the KLK1 gene that produced the KLK2 gene, which further generated the KLK3 gene. In contrast, the newer KLK4-KLK15 proteases share much less similarity, from 24-66%, although strong homology between KLK4 and KLK5, KLK9 and KLK11, and KLK10 and KLK12 suggests these genes are duplications of each other [2]...
Abstract:
Cometary and interplanetary dust particles (IDPs) are compared, and the mineralogical evolution of comet nuclei is discussed. Chondritic IDPs have properties consistent with those expected for cometary dust. The complex and varied mineralogy of these particles may indicate mineral alteration processes that occur in comet nuclei. Depending on the thermal budget of a comet, the upper few meters of nucleus material may maintain temperatures within regimes of hydrocryogenic (200 to 237K) and low-temperature aqueous (274 to 400K) alteration. Thus, layer silicates, carbonates, and sulfates may be important components of cometary dust and, correspondingly, are common constituents of chondritic IDPs. Alteration of comet starting materials may be a common occurrence, and depends on the specific physical and chemical properties of each individual comet.