984 results for Experimental validation


Relevance: 60.00%

Abstract:

BACKGROUND: We present the results of EGASP, a community experiment to assess the state of the art in genome annotation within the ENCODE regions, which span 1% of the human genome sequence. The experiment had two major goals: to assess the accuracy of computational methods in predicting protein-coding genes, and to assess the overall completeness of the current human genome annotations as represented in the ENCODE regions. For the computational prediction assessment, eighteen groups contributed gene predictions. We evaluated these submissions against each other based on a 'reference set' of annotations generated as part of the GENCODE project. These annotations were not available to the prediction groups prior to the submission deadline, so their predictions were blind and an external advisory committee could perform a fair assessment. RESULTS: The best methods had at least one gene transcript correctly predicted for close to 70% of the annotated genes. Nevertheless, accuracy at the level of multiple transcripts, taking into account alternative splicing, reached only approximately 40% to 50%. At the coding-nucleotide level, the best programs reached an accuracy of 90% in both sensitivity and specificity. Programs relying on mRNA and protein sequences were the most accurate in reproducing the manually curated annotations. Experimental validation shows that only a very small percentage (3.2%) of the 221 selected computationally predicted exons outside the existing annotation could be verified. CONCLUSION: This is the first such experiment in human DNA, and we have followed the standards established in a similar experiment, GASP1, in Drosophila melanogaster. We believe the results presented here contribute to the value of ongoing large-scale annotation projects and should guide further experimental methods when scaled up to the entire human genome sequence.
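The coding-nucleotide sensitivity and specificity mentioned above can be computed by comparing the set of genomic positions a program calls coding with the annotated coding positions. A minimal sketch, using made-up position ranges rather than EGASP data:

```python
def nucleotide_sn_sp(predicted, annotated):
    """Nucleotide-level sensitivity and specificity of a gene prediction.

    predicted, annotated: sets of genomic positions called as coding.
    Sensitivity = TP / (TP + FN); specificity, as commonly defined in
    gene-prediction benchmarks, is the fraction of predicted coding
    positions that are annotated: TP / (TP + FP).
    """
    tp = len(predicted & annotated)   # coding positions correctly predicted
    fn = len(annotated - predicted)   # annotated positions missed
    fp = len(predicted - annotated)   # predicted positions not annotated
    sn = tp / (tp + fn) if tp + fn else 0.0
    sp = tp / (tp + fp) if tp + fp else 0.0
    return sn, sp

# Hypothetical example: annotation spans 100..199, prediction spans 110..219.
annotated = set(range(100, 200))
predicted = set(range(110, 220))
sn, sp = nucleotide_sn_sp(predicted, annotated)
print(f"Sn = {sn:.2f}, Sp = {sp:.2f}")  # Sn = 0.90, Sp = 0.82
```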

Relevance: 60.00%

Abstract:

Within the ENCODE Consortium, GENCODE aimed to accurately annotate all protein-coding genes, pseudogenes, and noncoding transcribed loci in the human genome through manual curation and computational methods. Annotated transcript structures were assessed, and less well-supported loci were systematically validated experimentally. Predicted exon-exon junctions were evaluated by RT-PCR amplification followed by a highly multiplexed sequencing readout, a method we call RT-PCR-seq. Seventy-nine percent of all assessed junctions were confirmed by this evaluation procedure, demonstrating the high quality of the GENCODE gene set. RT-PCR-seq was also efficient at screening gene models predicted using the Human Body Map (HBM) RNA-seq data. We validated 73% of these predictions, thus confirming 1168 novel genes, mostly noncoding, which will further complement the GENCODE annotation. Our novel experimental validation pipeline is extremely sensitive, far more so than unbiased transcriptome profiling through RNA sequencing, which is becoming the norm. For example, exon-exon junctions unique to GENCODE-annotated transcripts are five times more likely to be corroborated with our targeted approach than with extensive large-scale human transcriptome profiling. Data sets such as the HBM and ENCODE RNA-seq data fail to sample low-expressed transcripts. Our RT-PCR-seq targeted approach also has the advantage of identifying novel exons of known genes: we discovered unannotated exons in ~11% of assessed introns. We thus estimate that at least 18% of known loci have yet-unannotated exons. Our work demonstrates that cataloging all of the genic elements encoded in the human genome will necessitate a coordinated effort between unbiased and targeted approaches, like RNA-seq and RT-PCR-seq.

Relevance: 60.00%

Abstract:

BACKGROUND: The GENCODE consortium was formed to identify and map all protein-coding genes within the ENCODE regions. This was achieved by a combination of initial manual annotation by the HAVANA team, experimental validation by the GENCODE consortium, and refinement of the annotation based on these experimental results. RESULTS: The GENCODE gene features are divided into eight categories, of which only the first two (known and novel coding sequence) are confidently predicted to be protein-coding genes. 5' rapid amplification of cDNA ends (RACE) and RT-PCR were used to experimentally verify the initial annotation. Of the 420 coding loci tested, 229 RACE products have been sequenced. They supported 5' extensions of 30 loci and new splice variants in 50 loci. In addition, 46 loci without evidence for a coding sequence were validated, consisting of 31 novel and 15 putative transcripts. We assessed the comprehensiveness of the GENCODE annotation by attempting to validate all the predicted exon boundaries outside the GENCODE annotation. Of the 1,215 tested in a subset of the ENCODE regions, 14 novel exon pairs were validated, only two of them in intergenic regions. CONCLUSION: In total, 487 loci, of which 434 are coding, have been annotated as part of the GENCODE reference set available from the UCSC browser. Comparison of the GENCODE annotation with RefSeq and ENSEMBL shows that only 40% of GENCODE exons are contained within the two sets, a reflection of the high number of alternative splice forms with unique exons annotated. Over 50% of coding loci have been experimentally verified by 5' RACE for EGASP, and the GENCODE collaboration is continuing to refine its annotation of 1% of the human genome with the aid of experimental validation.

Relevance: 60.00%

Abstract:

Integration of biological data of various types and the development of adapted bioinformatics tools represent critical objectives for enabling research at the systems level. The European Network of Excellence ENFIN is engaged in developing an adapted infrastructure to connect databases and platforms, enabling both the generation of new bioinformatics tools and the experimental validation of computational predictions. With the aim of bridging the gap between standard wet laboratories and bioinformatics, the ENFIN Network runs integrative research projects to bring the latest computational techniques to bear directly on systems biology questions in the wet laboratory environment. The Network maintains close internal collaboration between experimental and computational research, enabling a continuous cycle of experimental validation and improvement of computational prediction methods. The computational work includes the development of a database infrastructure (EnCORE), bioinformatics analysis methods, and a novel platform for protein function analysis, FuncNet.

Relevance: 60.00%

Abstract:

The domestic hot water cylinder incorporates encapsulated PCM (phase change material) placed in 57 vertical pipes. The use of PCM increases the thermal energy storage capacity of the cylinder and allows the use of low-cost electricity during off-peak periods. After experimental validation, the numerical model developed in the project will be used to optimize the distribution of the PCM inside the water tank.
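For a rough sense of why PCM raises storage capacity: the latent heat absorbed at the melting point adds to the sensible heat of the water. A back-of-the-envelope sketch; every figure below (water mass, temperature swing, PCM mass, latent heat) is an illustrative assumption, not project data:

```python
# Sensible heat of the water plus latent heat of an encapsulated PCM.
# All numbers are illustrative assumptions, not values from the project.
C_WATER = 4186.0     # J/(kg.K), specific heat of water
water_mass = 150.0   # kg of water in the cylinder (assumed)
delta_t = 40.0       # K temperature swing, e.g. 25 -> 65 C (assumed)

pcm_mass = 20.0      # kg of PCM in the 57 pipes (assumed)
pcm_latent = 200e3   # J/kg latent heat of fusion, typical of paraffins (assumed)

sensible = water_mass * C_WATER * delta_t   # J stored by the water alone
latent = pcm_mass * pcm_latent              # extra J stored by PCM melting
gain = latent / sensible                    # relative capacity increase

print(f"water alone: {sensible / 3.6e6:.2f} kWh")
print(f"PCM adds:    {latent / 3.6e6:.2f} kWh (+{gain:.0%})")
```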

Relevance: 60.00%

Abstract:

Background: We use an approach based on factor analysis to analyze datasets generated for transcriptional profiling. The method groups samples into biologically relevant categories and enables the identification of the genes and pathways most significantly associated with each phenotypic group, while allowing a given gene to participate in more than one cluster. Genes assigned to each cluster are used for the detection of pathways predominantly activated in that cluster by finding statistically significant associated GO terms. We tested the approach on a published dataset of microarray experiments in yeast. Upon validation with the yeast dataset, we applied the technique to a prostate cancer dataset. Results: Two major pathways are shown to be activated in organ-confined, non-metastatic prostate cancer: those regulated by the androgen receptor and by receptor tyrosine kinases. A number of gene markers (HER3, IQGAP2 and POR1) highlighted by the software and related to the latter pathway have been validated experimentally a posteriori on independent samples. Conclusion: Using a new microarray analysis tool followed by a posteriori experimental validation of the results, we have confirmed several putative markers of malignancy associated with peptide growth factor signalling in prostate cancer and revealed others, most notably ERBB3 (HER3). Our study suggests that, in primary prostate cancer, HER3, with or without HER4, rather than receptor complexes involving HER2, could play an important role in the biology of these tumors. These results provide new evidence for the role of receptor tyrosine kinases in the establishment and progression of prostate cancer.
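The "statistically significant associated GO terms" step described above is typically a hypergeometric over-representation test: given a cluster of genes, how likely is it to contain at least k genes annotated with a given GO term by chance? A minimal sketch with made-up counts (not from this study):

```python
from math import comb

def hypergeom_pvalue(N, K, n, k):
    """P(X >= k) when drawing n genes from N, of which K carry the GO term.

    N: total genes on the array; K: genes annotated with the term;
    n: cluster size; k: term-annotated genes observed in the cluster.
    """
    return sum(
        comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1)
    ) / comb(N, n)

# Hypothetical: 6000 genes on the array, 40 carry the term,
# and a cluster of 100 genes contains 6 of them (expected ~0.67).
p = hypergeom_pvalue(6000, 40, 100, 6)
print(f"p = {p:.2e}")
```

In practice the p-values would be corrected for multiple testing across all GO terms (e.g. Bonferroni or FDR) before a term is reported as significant.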

Relevance: 60.00%

Abstract:

COD is an important parameter for estimating the concentration of organic contaminants. The closed-system technique using K2Cr2O7 is the most important one; however, it has the inconvenience of suffering positive chemical interference from inorganic compounds such as Fe2+ and H2O2 (insufficiently reported in the literature). This paper presents a statistical-experimental setup capable of validating an empirical mathematical model generated from a 2³ factorial experimental design, in the presence of Fe2+ and H2O2. The t test shows that the mathematical model has a 99.99999% confidence level, and the experimental validation test indicates a mean absolute error of 4.70%.
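In a 2³ factorial design like the one above, each factor is coded −1/+1 and the coefficients of the empirical model fall out of simple contrasts over the eight runs, since the coded design matrix is orthogonal. A sketch with fabricated responses (the real factors and COD readings are not given in the abstract):

```python
from itertools import product

# Coded 2^3 design matrix: every -1/+1 combination of factors A, B, C.
runs = list(product((-1, 1), repeat=3))

# Fabricated responses for the eight runs (illustrative only), generated
# noise-free from y = 50 + 4*A + 7*B + 2*A*B so the recovered
# coefficients can be checked by eye.
y = [50 + 4 * a + 7 * b + 2 * a * b for a, b, c in runs]

def effect(term):
    """Least-squares coefficient of a model term in the coded design.

    term is a tuple of factor indices: () intercept, (0,) A, (0, 1) AB...
    With an orthogonal -1/+1 design this reduces to the contrast
    sum(sign * y) / number_of_runs.
    """
    signs = [1] * len(runs)
    for j, run in enumerate(runs):
        for idx in term:
            signs[j] *= run[idx]
    return sum(s * yi for s, yi in zip(signs, y)) / len(runs)

print("intercept:", effect(()))  # mean response
print("A:", effect((0,)), "B:", effect((1,)), "AB:", effect((0, 1)))
print("C:", effect((2,)))        # 0: C has no effect in the fabricated data
```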

Relevance: 60.00%

Abstract:

The aim of this thesis is to propose a novel control method for teleoperated electrohydraulic servo systems that implements a reliable haptic sense in the human-manipulator interaction and ideal position control in the manipulator-environment interaction. The proposed method has the characteristics of a universal technique independent of the actual control algorithm and can be applied with other suitable control methods as a real-time control strategy. The motivation for developing this control method is the need for a reliable real-time controller for teleoperated electrohydraulic servo systems that provides highly accurate position control based on joystick inputs, with haptic capabilities. The contribution of the research is that the proposed control method combines a directed random search method with real-time simulation to develop an intelligent controller in which each generation of parameters is tested on-line by the real-time simulator before being applied to the real process. The controller was evaluated on a hydraulic position servo system. The simulator of the hydraulic system was built based on the Markov chain Monte Carlo (MCMC) method. A Particle Swarm Optimization algorithm combined with the foraging behavior of E. coli bacteria was utilized as the directed random search engine. The control strategy allows the operator to be plugged into the work environment dynamically and kinetically. This helps to ensure the system has haptic sense with high stability, without abstracting away the dynamics of the hydraulic system. The new control algorithm provides asymptotically exact tracking of both the position and the contact force. In addition, this research proposes a novel method for the re-calibration of multi-axis force/torque sensors. The method makes several improvements on traditional methods: it can be used without dismantling the sensor from its application, it requires a smaller number of standard loads for calibration, and it is more cost-efficient and faster than traditional calibration methods. The proposed method was developed in response to re-calibration issues with the force sensors utilized in teleoperated systems; the new approach aimed to avoid dismantling the sensors from their applications for calibration. A major complication with many manipulators is the difficulty of accessing them when they operate inside an inaccessible environment, especially a harsh one such as a radioactive area. The proposed technique is based on design-of-experiments methodology. It has been successfully applied to different force/torque sensors, and this research presents an experimental validation of the calibration method on one of the force sensors to which it was applied.

Relevance: 60.00%

Abstract:

Human endogenous retroviruses (HERVs) are the result of ancient infections of human germ cells by exogenous retroviruses. HERVs belong to the long terminal repeat (LTR) group of retrotransposons, which comprise ~8% of the human genome. The majority of documented HERVs have been truncated and/or have incurred lethal mutations and no longer encode functional genes; however, a very small number of HERVs seem to remain functional in making new copies by retrotransposition, as suggested by the identification of a handful of polymorphic HERV insertions in human populations. The objectives of this study were to identify novel HERV insertions via analysis of personal genomic data and to survey the polymorphism levels of new and known HERV insertions in the human genome. Specifically, this study involves the experimental validation of polymorphic HERV insertion candidates predicted by personal genome-based computational prediction, and a survey of the polymorphism level within the human population based on a set of 30 diverse human DNA samples. Based on computational analysis of a limited number of personal genome sequences, PCR genotyping aided in the identification of 15 dimorphic, 2 trimorphic, and 5 fixed full-length HERV-K insertions not previously investigated. These results suggest that the proliferation rate of HERV-Ks, and perhaps also other ERVs, in the human genome may be much higher than previously appreciated, and that the recently inserted HERVs exhibit a high level of instability. Throughout this study we observed the frequent presence of additional genotype forms for these HERV insertions, and we propose for the first time a new genotype-reporting nomenclature to reflect all possible combinations of the pre-integration site, solo-LTR, and full-length HERV alleles.
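With three possible allele states at an insertion site (pre-integration/empty site, solo-LTR, and full-length HERV), a diploid individual can carry any unordered pair of them, so a complete reporting nomenclature of the kind proposed above needs six genotype classes. A small sketch enumerating them (label strings are illustrative, not the study's notation):

```python
from itertools import combinations_with_replacement

# The three allele states a HERV insertion site can take on one chromosome.
alleles = ["pre-integration", "solo-LTR", "full-length"]

# A diploid genotype is an unordered pair of alleles, so there are
# C(3+1, 2) = 6 possible genotype classes.
genotypes = list(combinations_with_replacement(alleles, 2))
for g in genotypes:
    print("/".join(g))
print(len(genotypes), "possible genotypes")
```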

Relevance: 60.00%

Abstract:

Bioinformatics is a multidisciplinary field that uses biology, computer science, physics, and mathematics to solve problems posed by biology. One of its themes is the analysis of genomic sequences and the prediction of noncoding RNA genes. Noncoding RNAs are RNA molecules that are transcribed but not translated into protein and that have a function in the cell. Finding noncoding RNA genes by biochemistry and molecular biology techniques is quite difficult and relatively expensive, so the prediction of ncRNA genes by bioinformatics methods is an important challenge. This research describes a computational analysis to search for new ncRNAs in the pathogen Candida albicans, together with an experimental validation. Our strategy was a computational analysis combining several ncRNA identification programs. We validated a subset of the computational predictions with a DNA microarray experiment covering 1979 regions of the genome. Through this experiment we identified 62 new transcripts in Candida albicans. This work also led to the development of an analysis method for tiling-array DNA microarrays. It also presents an attempt to improve ncRNA prediction with a method based on searching for RNA motifs in sequences.

Relevance: 60.00%

Abstract:

Although some mechanisms considered crucial to the transformation of rainfall into streamflow remain poorly understood, the concept of hydrological connectivity was recently proposed to explain why certain processes are triggered episodically depending on the characteristics of rainfall events and antecedent soil moisture. Adopting this new concept in hydrology remains difficult, however, since there is no consensus on the definition of connectivity, its measurement, its integration into hydrological models, or its behavior across spatial and temporal scales. The goal of this doctoral work is therefore to refine the definition, measurement, aggregation, and prediction of processes related to hydrological connectivity, focusing on the following questions: 1) What methodological framework should be adopted for a study of hydrological connectivity? 2) How can the degree of hydrological connectivity of watersheds be assessed from field data? 3) To what extent should our knowledge of hydrological connectivity lead to changes in the assumptions of hydrological modeling? Three study approaches are distinguished: i) a "black box" approach based solely on rainfall and streamflow data, without examining the internal functioning of the watershed; ii) a "grey box" approach based on point geochemical data illustrating the internal dynamics of the watershed; and iii) a "white box" approach based on the analysis of exhaustive spatial patterns of surface topography, subsurface topography, and soil moisture. These three approaches are then validated experimentally in the Hermine watershed (Basses Laurentides, Québec).
Four types of hydrological response are distinguished according to their magnitude and synchronism, their relative occurrence depending on antecedent conditions. High flows recorded at the watershed outlet are associated with an increased contribution from certain runoff sources, indicating a stronger hydraulic link and thus a high degree of hydrological connectivity between those sources and the stream. Saturated areas larger than 0.85 ha are found to be critical for the generation of high flood flows. It is also shown that the statistical properties of soil-moisture patterns in a humid temperate forest environment differ markedly from those observed in a dry temperate grassland, hence the need for different computational methods to derive spatial connectivity metrics in the two types of environment. Finally, the coexistence of "linear" and "nonlinear" contributing sources is demonstrated at the Hermine. These results suggest revising concepts that underpin the development and execution of hydrological models. The originality of this thesis lies in its very subject: the research objectives pursued are in line with the renewed hydrological theory that calls for moving beyond small-scale case studies toward the examination of emergent watershed properties such as hydrological connectivity. The major contribution of this thesis is thus the proposal of a unified definition of connectivity, a methodological framework, field measurement approaches, technical tools, and avenues for the modeling of hydrological systems.

Relevance: 60.00%

Abstract:

This paper reports a novel region-based shape descriptor based on orthogonal Legendre moments. The preprocessing steps for invariance improvement of the proposed Improved Legendre Moment Descriptor (ILMD) are discussed. The performance of the ILMD is compared to the MPEG-7 approved region shape descriptor, the angular radial transformation descriptor (ARTD), and the widely used Zernike moment descriptor (ZMD). Set B of the MPEG-7 CE-1 contour database and all the datasets of the MPEG-7 CE-2 region database were used for experimental validation. The average normalized modified retrieval rank (ANMRR) and the precision-recall pair were employed for benchmarking the performance of the candidate descriptors. The ILMD has lower ANMRR values than ARTD for most of the datasets, and ARTD has a lower value compared to ZMD. This indicates that the overall performance of the ILMD is better than that of ARTD and ZMD. This result is confirmed by the precision-recall test, where the ILMD was found to have better precision rates for most of the datasets tested. Besides retrieval accuracy, the ILMD is more compact than ARTD and ZMD. The proposed descriptor is useful as a generic shape descriptor for content-based image retrieval (CBIR) applications.
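ANMRR, the MPEG-7 retrieval benchmark used above, penalizes relevant images ranked beyond a per-query cutoff K(q) and normalizes the result to [0, 1], with 0 meaning perfect retrieval. A sketch of the standard formula as given in the MPEG-7 experimentation model (the query data below are made up):

```python
def nmrr(ranks_of_relevant, ng, gtm):
    """Normalized Modified Retrieval Rank for one query (MPEG-7).

    ranks_of_relevant: 1-based ranks at which the ground-truth items were
    retrieved; items never retrieved are simply absent from the list.
    ng: ground-truth set size for this query; gtm: max ng over all queries.
    """
    k = min(4 * ng, 2 * gtm)                  # relevant-rank cutoff K(q)
    # Ranks beyond K (or missing items) are penalized with 1.25 * K.
    penalized = [r if r <= k else 1.25 * k for r in ranks_of_relevant]
    penalized += [1.25 * k] * (ng - len(ranks_of_relevant))
    avr = sum(penalized) / ng                 # average (penalized) rank
    mrr = avr - 0.5 - ng / 2                  # modified retrieval rank
    return mrr / (1.25 * k - 0.5 * (1 + ng))  # normalize to [0, 1]

def anmrr(queries):
    """Average NMRR over (ranks, ng, gtm) triples; lower is better."""
    return sum(nmrr(*q) for q in queries) / len(queries)

# Perfect retrieval: all 4 relevant items at ranks 1..4 -> NMRR = 0.
print(nmrr([1, 2, 3, 4], 4, 4))
# Total failure: none of the 4 relevant items retrieved -> NMRR = 1.
print(nmrr([], 4, 4))
```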

Relevance: 60.00%

Abstract:

Modal filtering is based on the capability of single-mode waveguides to transmit only one complex amplitude function, eliminating virtually any perturbation of the interfering wavefronts and thus making very high rejection ratios possible in a nulling interferometer. In the present paper we focus on the progress of Integrated Optics in the thermal infrared (6-20 μm) range, one of the two candidate technologies for the fabrication of modal filters, together with fiber optics. At the conclusion of the European Space Agency's (ESA) "Integrated Optics for Darwin" activity, etched layers of chalcogenide material deposited on chalcogenide glass substrates were selected among four candidates as the technology with the best potential to simultaneously meet the filtering efficiency, absolute and spectral transmission, and beam coupling requirements. ESA's new "Integrated Optics" activity started in mid-2007 with the purpose of improving the technology until compliant prototypes can be manufactured and validated, expected by the end of 2009. The present paper introduces the project and the component requirements and functions. The selected materials and preliminary designs, as well as the experimental validation logic and test benches, are presented. More details are provided on the progress of the main technology: vacuum deposition in co-evaporation mode and subsequent etching of chalcogenide layers. In addition, preliminary investigations of an alternative technology based on burying a chalcogenide optical fiber core into a chalcogenide substrate are presented. Specific developments of anti-reflective solutions designed to mitigate Fresnel losses at the input and output surfaces of the components are also introduced.

Relevance: 60.00%

Abstract:

This work proposes a unified neurofuzzy modelling scheme. To begin with, the initial fuzzy rule base construction method is based on fuzzy clustering utilising a Gaussian mixture model (GMM) combined with an analysis of variance (ANOVA) decomposition, in order to obtain more compact univariate and bivariate membership functions over the subspaces of the input features. The means and covariances of the Gaussian membership functions are found by the expectation maximisation (EM) algorithm, with the merit of revealing the underlying density distribution of the system inputs. The resultant set of membership functions forms the basis of the generalised fuzzy model (GFM) inference engine. The model structure and parameters of this neurofuzzy model are identified via supervised subspace orthogonal least squares (OLS) learning. Finally, instead of providing a deterministic class label as the model output, as is conventional, a logistic regression model is applied to present the classifier's output, in which the sigmoid-type logistic transfer function scales the outputs of the neurofuzzy model to class probabilities. Experimental validation results are presented to demonstrate the effectiveness of the proposed neurofuzzy modelling scheme.
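The final stage described above replaces a hard class label with a probability by passing the model's raw score through the logistic sigmoid. A minimal sketch of that output stage only; the basis activations and weights shown are illustrative, not values from the paper:

```python
from math import exp

def sigmoid(z):
    """Logistic transfer function mapping any real score to (0, 1)."""
    return 1.0 / (1.0 + exp(-z))

def class_probability(basis_outputs, weights, bias):
    """Scale a neurofuzzy model's weighted basis-function output to a
    class probability via a logistic regression output stage."""
    score = sum(w * b for w, b in zip(weights, basis_outputs)) + bias
    return sigmoid(score)

# Illustrative: three fuzzy basis-function activations, learned weights
# and bias (all made up for the example).
p = class_probability([0.8, 0.3, 0.1], [1.5, -0.7, 2.0], bias=-0.4)
print(f"P(class = 1) = {p:.3f}")
```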

Relevance: 60.00%

Abstract:

We report the results of a transcript finishing initiative, undertaken for the purpose of identifying and characterizing novel human transcripts, in which RT-PCR was used to bridge gaps between paired EST clusters mapped against the genomic sequence. Each pair of EST clusters selected for experimental validation was designated a transcript finishing unit (TFU). A total of 489 TFUs were selected for validation, and an overall efficiency of 43.1% was achieved. We generated a total of 59,975 bp of transcribed sequences organized into 432 exons, contributing to the definition of the structure of 211 human transcripts. The structure of several transcripts reported here was confirmed during the course of this project through the generation of their corresponding full-length cDNA sequences. Nevertheless, for 21% of the validated TFUs a full-length cDNA sequence is not yet available in public databases, and the structure of 69.2% of these TFUs was not correctly predicted by computer programs. The transcript finishing strategy provides a significant contribution to the definition of the complete catalog of human genes and transcripts, because it appears to be particularly useful for the identification of low-abundance transcripts expressed in a restricted set of tissues, as well as for the delineation of gene boundaries and alternatively spliced isoforms.