956 results for Sequence motif analysis
Abstract:
Computational biology is the research area that contributes to the analysis of biological data through the development of algorithms that address significant research problems. The data from molecular biology include DNA, RNA, protein and gene expression data. Gene expression data provide the expression level of genes under different conditions. Gene expression is the process of transcribing the DNA sequence of a gene into mRNA sequences, which are later translated into proteins. The number of copies of mRNA produced is called the expression level of a gene. Gene expression data are organized as a matrix: rows represent genes and columns represent experimental conditions, which can be different tissue types or time points. Entries in the gene expression matrix are real values. Through the analysis of gene expression data it is possible to determine behavioral patterns of genes, such as the similarity of their behavior, the nature of their interactions and their respective contributions to the same pathways. Genes participating in the same biological process exhibit similar expression patterns. These patterns have immense relevance and application in bioinformatics and clinical research. In the medical domain they are used to aid more accurate diagnosis, prognosis, treatment planning, drug discovery and protein network analysis. Data mining techniques are essential for identifying patterns in gene expression data. Clustering is an important data mining technique for the analysis of gene expression data. To overcome the problems associated with clustering, biclustering was introduced. Biclustering refers to the simultaneous clustering of both rows and columns of a data matrix.
Clustering is a global model, whereas biclustering is a local model. Discovering local expression patterns is essential for identifying many genetic pathways that are not otherwise apparent. It is therefore necessary to move beyond the clustering paradigm towards approaches capable of discovering local patterns in gene expression data. A bicluster is a submatrix of the gene expression data matrix. The rows and columns of the submatrix need not be contiguous in the gene expression data matrix, and biclusters need not be disjoint. Computing biclusters is costly because all combinations of rows and columns must be considered in order to find all the biclusters. The search space for the biclustering problem is 2^(m+n), where m and n are the number of genes and conditions respectively. Usually m+n is more than 3000. The biclustering problem is NP-hard. Biclustering is a powerful analytical tool for the biologist. The research reported in this thesis addresses the problem of biclustering. Ten algorithms are developed for the identification of coherent biclusters from gene expression data. All these algorithms use a measure called mean squared residue (MSR) to search for biclusters. The objective is to identify biclusters of maximum size with a mean squared residue lower than a given threshold.
All these algorithms begin the search from tightly coregulated submatrices called seeds. These seeds are generated by the K-Means clustering algorithm. The algorithms developed can be classified as constraint-based, greedy and metaheuristic. Constraint-based algorithms use one or more constraints, namely the MSR threshold and the MSR difference threshold. The greedy approach makes a locally optimal choice at each stage with the objective of finding the global optimum. In the metaheuristic approaches, Particle Swarm Optimization (PSO) and variants of the Greedy Randomized Adaptive Search Procedure (GRASP) are used for the identification of biclusters. These algorithms are applied to the Yeast and Lymphoma datasets. All of them identify biologically relevant and statistically significant biclusters, which are validated using the Gene Ontology database. All these algorithms are compared with some other biclustering algorithms. The algorithms developed in this work overcome some of the problems associated with existing algorithms. Some of them identify, from both the Yeast and Lymphoma datasets, biclusters with very high row variance, higher than that obtained by any other algorithm using mean squared residue. Such biclusters, which show significant changes in expression level, are highly relevant biologically.
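The mean squared residue and row variance mentioned above can be illustrated with a minimal sketch. It assumes the commonly used definition of MSR (the mean of the squared residues a_ij - rowmean_i - colmean_j + overallmean over the submatrix); the abstract does not spell out the formula, and the function names and toy matrix below are hypothetical, not taken from the thesis.

```python
def _submatrix(matrix, rows, cols):
    """Pick the (possibly non-contiguous) rows and columns of a bicluster."""
    return [[matrix[i][j] for j in cols] for i in rows]


def mean_squared_residue(matrix, rows, cols):
    """MSR of the bicluster given by row and column index lists.

    Assumed definition: mean over the submatrix of
    (a_ij - row_mean_i - col_mean_j + overall_mean) ** 2.
    """
    sub = _submatrix(matrix, rows, cols)
    n_r, n_c = len(sub), len(sub[0])
    row_means = [sum(r) / n_c for r in sub]
    col_means = [sum(sub[i][j] for i in range(n_r)) / n_r for j in range(n_c)]
    overall = sum(row_means) / n_r
    total = 0.0
    for i in range(n_r):
        for j in range(n_c):
            residue = sub[i][j] - row_means[i] - col_means[j] + overall
            total += residue ** 2
    return total / (n_r * n_c)


def row_variance(matrix, rows, cols):
    """Mean squared deviation of each entry from its own row mean."""
    sub = _submatrix(matrix, rows, cols)
    n_r, n_c = len(sub), len(sub[0])
    total = 0.0
    for r in sub:
        mean = sum(r) / n_c
        total += sum((x - mean) ** 2 for x in r)
    return total / (n_r * n_c)


# Toy expression matrix (hypothetical values): genes in rows, conditions in columns.
# The first three rows follow the same pattern shifted by a constant.
expr = [
    [1, 3, 4, 8],
    [2, 4, 5, 9],
    [6, 8, 9, 13],
    [5, 0, 7, 1],
]

coherent = mean_squared_residue(expr, [0, 1, 2], [0, 1, 2, 3])  # additive pattern: residue is zero
noisy = mean_squared_residue(expr, [0, 1, 2, 3], [0, 1, 2, 3])  # adding the last row raises the MSR
print(coherent, noisy, row_variance(expr, [0, 1, 2], [0, 1, 2, 3]))
```

A low MSR alone also rewards flat, uninformative submatrices, which is presumably why the abstract reports row variance alongside it: a bicluster with low MSR and high row variance changes coherently and substantially across conditions.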
Abstract:
Median filtering is a simple digital non-linear signal smoothing operation in which the median of the samples in a sliding window replaces the sample at the middle of the window. The resulting filtered sequence tends to follow polynomial trends in the original sample sequence. A median filter preserves signal edges while filtering out impulses. Because of this property, median filtering is finding applications in many areas of image and speech processing. Though median filtering is simple to realise digitally, its properties are not easily analysed with standard analysis techniques.
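The sliding-window operation described above can be sketched in a few lines. This is a minimal illustration, not the implementation studied in the work; the function name, the odd window length and the edge handling (repeating the end samples) are assumptions made for the example.

```python
from statistics import median


def median_filter(samples, window=3):
    """Replace each sample by the median of a window centred on it.

    Assumptions: `window` is odd, and the signal is extended at both
    ends by repeating the first and last samples.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    if not samples:
        return []
    half = window // 2
    padded = [samples[0]] * half + list(samples) + [samples[-1]] * half
    return [median(padded[i:i + window]) for i in range(len(samples))]


# A single impulse (9) followed by a step edge from 0 to 5.
signal = [0, 0, 0, 9, 0, 0, 5, 5, 5, 5]
print(median_filter(signal, window=3))
# prints [0, 0, 0, 0, 0, 0, 5, 5, 5, 5]: the impulse is removed, the edge stays in place
```

A moving-average filter of the same length would smear both the impulse and the step across neighbouring samples, which is the contrast the abstract draws when it says the median filter preserves edges while removing impulses.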
Abstract:
Code clones are portions of source code that are similar to the original program code. Their presence is considered a bad feature of software because it makes maintenance difficult. Methods for code clone detection have gained immense significance in the last few years, as they play an important role in engineering applications such as analysis of program code, program understanding, plagiarism detection, error detection, code compaction and many similar tasks. Despite these drawbacks, several features of code clones, if properly utilized, can make the software development process easier. In this work, we point out one such feature, which highlights the relevance of code clones in test sequence identification. Program slicing is used here for code clone detection. In addition, a classification of code clones is presented and the benefit of using program slicing in code clone detection is discussed.
Abstract:
Antimicrobial peptides (AMPs) play a major role in innate immunity. Penaeidins are a family of AMPs that appear to be expressed in all penaeid shrimps. Penaeidins are composed of an N-terminal proline-rich domain followed by a C-terminal domain containing six cysteine residues organized in two doublets. This study reports the first penaeidin AMP sequence, Fi-penaeidin (GenBank accession number HM243617), from the Indian white shrimp, Fenneropenaeus indicus. The full-length cDNA consists of 186 base pairs encoding 61 amino acids with an ORF of 42 amino acids and contains a putative signal peptide of 19 amino acids. Comparison of F. indicus penaeidin (Fi-penaeidin) with other known penaeidins showed that it shared maximum similarity with the penaeidins of Farfantepenaeus paulensis and Farfantepenaeus subtilis (96% each). Fi-penaeidin has a predicted molecular weight (MW) of 4.478 kDa and a theoretical isoelectric point (pI) of 5.3.
Abstract:
DNA sequence representation methods are used to denote a gene structure effectively and help in the analysis of similarities and dissimilarities among coding sequences. Many different kinds of representations have been proposed in the literature. They can be broadly classified into numerical, graphical, geometrical and hybrid representation methods. Graphical and geometrical representation methods make DNA structure and function analysis easier because they give a visual representation of a DNA structure. In numerical methods, numerical values are assigned to a sequence and digital signal processing methods are used to analyze it. Hybrid approaches to analyzing DNA sequences are also reported in the literature. This paper reviews the latest developments in DNA sequence representation methods. We also present a taxonomy of the various methods and compare them wherever possible.
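To make the idea of a numerical representation concrete, the sketch below maps a DNA string to four binary indicator sequences, one per base. This particular scheme (often called the Voss representation) is chosen here only as an illustration; the abstract does not say which numerical mappings the review covers, and the function name is hypothetical.

```python
def indicator_sequences(dna):
    """Map a DNA string to four 0/1 sequences, one for each of A, C, G, T.

    Position k of the sequence for base B is 1 when dna[k] == B, else 0.
    Characters other than A, C, G, T (for example N) give 0 in all four.
    """
    seq = dna.upper()
    return {base: [1 if s == base else 0 for s in seq] for base in "ACGT"}


rep = indicator_sequences("ATGCA")
for base, values in rep.items():
    print(base, values)
# A [1, 0, 0, 0, 1]
# C [0, 0, 0, 1, 0]
# G [0, 0, 1, 0, 0]
# T [0, 1, 0, 0, 0]
```

Once a sequence is in this numeric form, standard digital signal processing tools such as correlation or a discrete Fourier transform can be applied to each indicator sequence, which is the kind of analysis the abstract refers to.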
Abstract:
The pedicle screw insertion technique has revolutionized the surgical treatment of spinal fractures and spinal disorders. Although X-ray fluoroscopy based navigation is popular, it carries a risk of prolonged exposure to X-ray radiation. Systems with lower radiation risk are generally quite expensive. The position and orientation of the drill are clinically very important in pedicle screw fixation. In this paper, the position and orientation of the marker on the drill are determined using pattern recognition methods based on geometric features obtained from the input video sequence taken by a CCD camera. After preprocessing, a search is performed on the video frames to obtain the exact position and orientation of the drill. An animated graphic showing the instantaneous position and orientation of the drill is then overlaid on the processed video for real-time drill control and navigation.
Abstract:
This thesis describes a representation of gait appearance for the purpose of person identification and classification. This gait representation is based on simple localized image features, such as moments, extracted from orthogonal-view video silhouettes of human walking motion. A suite of time-integration methods, spanning a range of coarseness of time aggregation and modeling of feature distributions, is applied to these image features to create a suite of gait sequence representations. Despite their simplicity, the resulting feature vectors contain enough information to perform well on human identification and gender classification tasks. We demonstrate the accuracy of recognition on gait video sequences collected over different days and times and under varying lighting environments. Each of the integration methods is investigated for its advantages and disadvantages. An improved gait representation is built based on our experience with the initial set of gait representations. In addition, we show gender classification results using our gait appearance features, the effect of our heuristic feature selection method, and the significance of individual features.
Abstract:
When underwater vehicles navigate close to the ocean floor, computer vision techniques can be applied to obtain quite accurate motion estimates. The most crucial step in the vision-based estimation of the vehicle motion consists of detecting matches between image pairs. Here we propose the extensive use of texture analysis as a tool to ease the correspondence problem in underwater images. Once a robust set of correspondences has been found, the three-dimensional motion of the vehicle can be computed with respect to the sea bed. Finally, the motion estimates allow the construction of a map that could aid the navigation of the robot.
Abstract:
This paper focuses on the problem of locating single-phase faults in mixed electric distribution systems, with overhead lines and underground cables, using voltage and current measurements at the sending end and the sequence model of the network. Since calculating the series impedance of underground cables is not as simple as for overhead lines, the paper proposes a methodology for estimating the zero-sequence impedance of underground cables from previous single-phase faults in the system in which an electric arc occurred at the fault location. For this reason, the signal is first pretreated to eliminate its voltage peaks, so that the analysis can work with a signal as close to a sine wave as possible.
Abstract:
Human leukocyte antigen (HLA) has been described in many cases as a prognostic factor for cancer. The main characteristic of the HLA genes, located on chromosome 6 (6p21.3), is their numerous polymorphisms. Nucleotide sequence analyses show that the variation is restricted predominantly to the exons that encode the peptide-binding domains of the protein. HLA polymorphism therefore defines the repertoire of peptides that bind to HLA allotypes, and this in turn defines an individual's ability to respond to exposure to many infectious agents during their lifetime. HLA typing has become an important clinical test. Formalin-fixed, paraffin-embedded (FFPE) tissue samples are routinely collected in oncology. This procedure could be used as a good source of DNA, given that in past studies DNA collection assays were not normally carried out on almost any tissue or sample in regular clinical procedures. Considering that the most important problem with DNA from FFPE samples is fragmentation, we proposed a new method for HLA-A allele typing from FFPE samples based on the sequences of exons 2, 3 and 4. We designed a set of 12 primers: four for HLA-A exon 2, three for HLA-A exon 3 and five for HLA-A exon 4, each according to the flanking sequences of its respective exon and the sequence variation among different alleles. Seventeen FFPE samples collected at Karolinska University Hospital in Stockholm, Sweden, were subjected to PCR, and the products were sequenced. Finally, all the sequences obtained were analyzed and compared with the IMGT-HLA database. The FFPE samples had previously been HLA typed, and those results were compared with the results of this method.
According to our results, the samples could be correctly sequenced. We conclude that our study presents the first sequence-based typing method that allows the analysis of old DNA samples for which no other source is available. This study also opens the possibility of developing analyses to establish new relationships between HLA and different diseases such as cancer.
Abstract:
The tagging of proteins with ubiquitin, known as ubiquitination, serves different functions, including the regulation of several cellular processes such as protein degradation by the proteasome, DNA repair, signaling mediated by membrane receptors, and endocytosis, among others (1). Ubiquitin molecules can be removed from their substrates by the action of a large group of proteases called deubiquitinating enzymes (DUBs) (2). DUBs are essential for maintaining ubiquitin homeostasis and for regulating the ubiquitination state of different substrates. The large number and diversity of the DUBs described reflect both their specificity and their use in regulating a broad spectrum of substrates and cellular pathways. Although many DUBs have been studied in depth, the substrates and biological functions of most of them are currently unknown. This work investigated the functions of the DUBs USP19, USP4 and UCH-L1. Using several molecular and cell biology techniques, it was found that: i) USP19 is regulated by the ubiquitin ligases SIAH1 and SIAH2; ii) USP19 is important for regulating HIF-1α, a key transcription factor in the cellular response to hypoxia; iii) USP4 interacts with the proteasome; iv) the mCherry-UCH-L1 chimera partially reproduces the phenotypes that our group has previously described using other constructs of the same enzyme; and v) UCH-L1 promotes the internalization of the bacterium Yersinia pseudotuberculosis.
Abstract:
Background: The tight junction (TJ) is one of the most important structures established during merozoite invasion of host cells, and a large number of proteins stored in the apical organelles of Toxoplasma and Plasmodium parasites are involved in forming the TJ. Plasmodium falciparum and Toxoplasma gondii apical membrane antigen 1 (AMA-1) and rhoptry neck proteins (RONs) are the two main TJ components. It has been shown that RON4 plays an essential role during merozoite and sporozoite invasion of target cells. This study has focused on characterizing a novel Plasmodium vivax rhoptry protein, RON4, which is homologous to PfRON4 and PkRON4. Methods: The ron4 gene was re-annotated in the P. vivax genome using various bioinformatics tools and taking the PfRON4 and PkRON4 amino acid sequences as templates. Gene synteny, as well as identity and similarity values between open reading frames (ORFs) belonging to the three species, were assessed. The transcription of pvron4 and the expression and localization of the encoded protein were also determined in the VCG-1 strain by molecular and immunological studies. Nucleotide and amino acid sequences obtained for pvron4 in VCG-1 were compared to those from strains from different geographical areas. Results: PvRON4 is a 733 amino acid long protein encoded by three exons, with transcription and translation patterns similar to those reported for its homologue, PfRON4. Sequencing PvRON4 from the VCG-1 strain and comparing it to P. vivax strains from different geographical locations has shown two conserved regions separated by a low-complexity variable region, possibly acting as a “smokescreen”. PvRON4 contains a predicted signal sequence, a coiled-coil α-helical motif, two tandem repeats and six conserved cysteines towards the carboxy terminus, and is a soluble protein lacking predicted transmembrane domains or a GPI anchor.
Indirect immunofluorescence assays have shown that PvRON4 is expressed at the apical end of schizonts and co-localizes at the rhoptry neck with PvRON2.
Abstract:
The anxiolytic properties of ethanol (1 g/kg, 15% dose, i.p.) were studied in two experiments with rats involving incentive downshifts from a 32% to a 4% sucrose solution. In Experiment 1, alcohol administration before a downshift from 32% to 4% sucrose prevented the development of consummatory suppression (consummatory successive negative contrast, cSNC). In Experiment 2, ethanol prevented the attenuating effects of partial reinforcement (random sequence of 32% sucrose and nothing) on cSNC, causing a retardation of recovery from contrast. These effects of ethanol on cSNC are analogous to those described for the benzodiazepine anxiolytic chlordiazepoxide, suggesting that at least some of its anxiolytic effects are mediated by the same mechanisms.