982 resultados para functional prediction
Resumo:
Nowadays, Computational Fluid Dynamics (CFD) solvers are widely used within the industry to model fluid flow phenomenons. Several fluid flow model equations have been employed in the last decades to simulate and predict forces acting, for example, on different aircraft configurations. Computational time and accuracy are strongly dependent on the fluid flow model equation and the spatial dimension of the problem considered. While simple models based on perfect flows, like panel methods or potential flow models can be very fast to solve, they usually suffer from a poor accuracy in order to simulate real flows (transonic, viscous). On the other hand, more complex models such as the full Navier- Stokes equations provide high fidelity predictions but at a much higher computational cost. Thus, a good compromise between accuracy and computational time has to be fixed for engineering applications. A discretisation technique widely used within the industry is the so-called Finite Volume approach on unstructured meshes. This technique spatially discretises the flow motion equations onto a set of elements which form a mesh, a discrete representation of the continuous domain. Using this approach, for a given flow model equation, the accuracy and computational time mainly depend on the distribution of nodes forming the mesh. Therefore, a good compromise between accuracy and computational time might be obtained by carefully defining the mesh. However, defining an optimal mesh for complex flows and geometries requires a very high level expertize in fluid mechanics and numerical analysis, and in most cases a simple guess of regions of the computational domain which might affect the most the accuracy is impossible. Thus, it is desirable to have an automatized remeshing tool, which is more flexible with unstructured meshes than its structured counterpart. However, adaptive methods currently in use still have an opened question: how to efficiently drive the adaptation ? Pioneering sensors based on flow features generally suffer from a lack of reliability, so in the last decade more effort has been made in developing numerical error-based sensors, like for instance the adjoint-based adaptation sensors. While very efficient at adapting meshes for a given functional output, the latter method is very expensive as it requires to solve a dual set of equations and computes the sensor on an embedded mesh. Therefore, it would be desirable to develop a more affordable numerical error estimation method. The current work aims at estimating the truncation error, which arises when discretising a partial differential equation. These are the higher order terms neglected in the construction of the numerical scheme. The truncation error provides very useful information as it is strongly related to the flow model equation and its discretisation. On one hand, it is a very reliable measure of the quality of the mesh, therefore very useful in order to drive a mesh adaptation procedure. On the other hand, it is strongly linked to the flow model equation, so that a careful estimation actually gives information on how well a given equation is solved, which may be useful in the context of _ -extrapolation or zonal modelling. The following work is organized as follows: Chap. 1 contains a short review of mesh adaptation techniques as well as numerical error prediction. In the first section, Sec. 1.1, the basic refinement strategies are reviewed and the main contribution to structured and unstructured mesh adaptation are presented. Sec. 1.2 introduces the definitions of errors encountered when solving Computational Fluid Dynamics problems and reviews the most common approaches to predict them. Chap. 2 is devoted to the mathematical formulation of truncation error estimation in the context of finite volume methodology, as well as a complete verification procedure. Several features are studied, such as the influence of grid non-uniformities, non-linearity, boundary conditions and non-converged numerical solutions. This verification part has been submitted and accepted for publication in the Journal of Computational Physics. Chap. 3 presents a mesh adaptation algorithm based on truncation error estimates and compares the results to a feature-based and an adjoint-based sensor (in collaboration with Jorge Ponsín, INTA). Two- and three-dimensional cases relevant for validation in the aeronautical industry are considered. This part has been submitted and accepted in the AIAA Journal. An extension to Reynolds Averaged Navier- Stokes equations is also included, where _ -estimation-based mesh adaptation and _ -extrapolation are applied to viscous wing profiles. The latter has been submitted in the Proceedings of the Institution of Mechanical Engineers, Part G: Journal of Aerospace Engineering. Keywords: mesh adaptation, numerical error prediction, finite volume Hoy en día, la Dinámica de Fluidos Computacional (CFD) es ampliamente utilizada dentro de la industria para obtener información sobre fenómenos fluidos. La Dinámica de Fluidos Computacional considera distintas modelizaciones de las ecuaciones fluidas (Potencial, Euler, Navier-Stokes, etc) para simular y predecir las fuerzas que actúan, por ejemplo, sobre una configuración de aeronave. El tiempo de cálculo y la precisión en la solución depende en gran medida de los modelos utilizados, así como de la dimensión espacial del problema considerado. Mientras que modelos simples basados en flujos perfectos, como modelos de flujos potenciales, se pueden resolver rápidamente, por lo general aducen de una baja precisión a la hora de simular flujos reales (viscosos, transónicos, etc). Por otro lado, modelos más complejos tales como el conjunto de ecuaciones de Navier-Stokes proporcionan predicciones de alta fidelidad, a expensas de un coste computacional mucho más elevado. Por lo tanto, en términos de aplicaciones de ingeniería se debe fijar un buen compromiso entre precisión y tiempo de cálculo. Una técnica de discretización ampliamente utilizada en la industria es el método de los Volúmenes Finitos en mallas no estructuradas. Esta técnica discretiza espacialmente las ecuaciones del movimiento del flujo sobre un conjunto de elementos que forman una malla, una representación discreta del dominio continuo. Utilizando este enfoque, para una ecuación de flujo dado, la precisión y el tiempo computacional dependen principalmente de la distribución de los nodos que forman la malla. Por consiguiente, un buen compromiso entre precisión y tiempo de cálculo se podría obtener definiendo cuidadosamente la malla, concentrando sus elementos en aquellas zonas donde sea estrictamente necesario. Sin embargo, la definición de una malla óptima para corrientes y geometrías complejas requiere un nivel muy alto de experiencia en la mecánica de fluidos y el análisis numérico, así como un conocimiento previo de la solución. Aspecto que en la mayoría de los casos no está disponible. Por tanto, es deseable tener una herramienta que permita adaptar los elementos de malla de forma automática, acorde a la solución fluida (remallado). Esta herramienta es generalmente más flexible en mallas no estructuradas que con su homóloga estructurada. No obstante, los métodos de adaptación actualmente en uso todavía dejan una pregunta abierta: cómo conducir de manera eficiente la adaptación. Sensores pioneros basados en las características del flujo en general, adolecen de una falta de fiabilidad, por lo que en la última década se han realizado grandes esfuerzos en el desarrollo numérico de sensores basados en el error, como por ejemplo los sensores basados en el adjunto. A pesar de ser muy eficientes en la adaptación de mallas para un determinado funcional, este último método resulta muy costoso, pues requiere resolver un doble conjunto de ecuaciones: la solución y su adjunta. Por tanto, es deseable desarrollar un método numérico de estimación de error más asequible. El presente trabajo tiene como objetivo estimar el error local de truncación, que aparece cuando se discretiza una ecuación en derivadas parciales. Estos son los términos de orden superior olvidados en la construcción del esquema numérico. El error de truncación proporciona una información muy útil sobre la solución: es una medida muy fiable de la calidad de la malla, obteniendo información que permite llevar a cabo un procedimiento de adaptación de malla. Está fuertemente relacionado al modelo matemático fluido, de modo que una estimación precisa garantiza la idoneidad de dicho modelo en un campo fluido, lo que puede ser útil en el contexto de modelado zonal. Por último, permite mejorar la precisión de la solución resolviendo un nuevo sistema donde el error local actúa como término fuente (_ -extrapolación). El presenta trabajo se organiza de la siguiente manera: Cap. 1 contiene una breve reseña de las técnicas de adaptación de malla, así como de los métodos de predicción de los errores numéricos. En la primera sección, Sec. 1.1, se examinan las estrategias básicas de refinamiento y se presenta la principal contribución a la adaptación de malla estructurada y no estructurada. Sec 1.2 introduce las definiciones de los errores encontrados en la resolución de problemas de Dinámica Computacional de Fluidos y se examinan los enfoques más comunes para predecirlos. Cap. 2 está dedicado a la formulación matemática de la estimación del error de truncación en el contexto de la metodología de Volúmenes Finitos, así como a un procedimiento de verificación completo. Se estudian varias características que influyen en su estimación: la influencia de la falta de uniformidad de la malla, el efecto de las no linealidades del modelo matemático, diferentes condiciones de contorno y soluciones numéricas no convergidas. Esta parte de verificación ha sido presentada y aceptada para su publicación en el Journal of Computational Physics. Cap. 3 presenta un algoritmo de adaptación de malla basado en la estimación del error de truncación y compara los resultados con sensores de featured-based y adjointbased (en colaboración con Jorge Ponsín del INTA). Se consideran casos en dos y tres dimensiones, relevantes para la validación en la industria aeronáutica. Este trabajo ha sido presentado y aceptado en el AIAA Journal. También se incluye una extensión de estos métodos a las ecuaciones RANS (Reynolds Average Navier- Stokes), en donde adaptación de malla basada en _ y _ -extrapolación son aplicados a perfiles con viscosidad de alas. Este último trabajo se ha presentado en los Actas de la Institución de Ingenieros Mecánicos, Parte G: Journal of Aerospace Engineering. Palabras clave: adaptación de malla, predicción del error numérico, volúmenes finitos
Resumo:
La investigación para el conocimiento del cerebro es una ciencia joven, su inicio se remonta a Santiago Ramón y Cajal en 1888. Desde esta fecha a nuestro tiempo la neurociencia ha avanzado mucho en el desarrollo de técnicas que permiten su estudio. Desde la neurociencia cognitiva hoy se explican muchos modelos que nos permiten acercar a nuestro entendimiento a capacidades cognitivas complejas. Aun así hablamos de una ciencia casi en pañales que tiene un lago recorrido por delante. Una de las claves del éxito en los estudios de la función cerebral ha sido convertirse en una disciplina que combina conocimientos de diversas áreas: de la física, de las matemáticas, de la estadística y de la psicología. Esta es la razón por la que a lo largo de este trabajo se entremezclan conceptos de diferentes campos con el objetivo de avanzar en el conocimiento de un tema tan complejo como el que nos ocupa: el entendimiento de la mente humana. Concretamente, esta tesis ha estado dirigida a la integración multimodal de la magnetoencefalografía (MEG) y la resonancia magnética ponderada en difusión (dMRI). Estas técnicas son sensibles, respectivamente, a los campos magnéticos emitidos por las corrientes neuronales, y a la microestructura de la materia blanca cerebral. A lo largo de este trabajo hemos visto que la combinación de estas técnicas permiten descubrir sinergias estructurofuncionales en el procesamiento de la información en el cerebro sano y en el curso de patologías neurológicas. Más específicamente en este trabajo se ha estudiado la relación entre la conectividad funcional y estructural y en cómo fusionarlas. Para ello, se ha cuantificado la conectividad funcional mediante el estudio de la sincronización de fase o la correlación de amplitudes entre series temporales, de esta forma se ha conseguido un índice que mide la similitud entre grupos neuronales o regiones cerebrales. Adicionalmente, la cuantificación de la conectividad estructural a partir de imágenes de resonancia magnética ponderadas en difusión, ha permitido hallar índices de la integridad de materia blanca o de la fuerza de las conexiones estructurales entre regiones. Estas medidas fueron combinadas en los capítulos 3, 4 y 5 de este trabajo siguiendo tres aproximaciones que iban desde el nivel más bajo al más alto de integración. Finalmente se utilizó la información fusionada de MEG y dMRI para la caracterización de grupos de sujetos con deterioro cognitivo leve, la detección de esta patología resulta relevante en la identificación precoz de la enfermedad de Alzheimer. Esta tesis está dividida en seis capítulos. En el capítulos 1 se establece un contexto para la introducción de la connectómica dentro de los campos de la neuroimagen y la neurociencia. Posteriormente en este capítulo se describen los objetivos de la tesis, y los objetivos específicos de cada una de las publicaciones científicas que resultaron de este trabajo. En el capítulo 2 se describen los métodos para cada técnica que fue empleada: conectividad estructural, conectividad funcional en resting state, redes cerebrales complejas y teoría de grafos y finalmente se describe la condición de deterioro cognitivo leve y el estado actual en la búsqueda de nuevos biomarcadores diagnósticos. En los capítulos 3, 4 y 5 se han incluido los artículos científicos que fueron producidos a lo largo de esta tesis. Estos han sido incluidos en el formato de la revista en que fueron publicados, estando divididos en introducción, materiales y métodos, resultados y discusión. Todos los métodos que fueron empleados en los artículos están descritos en el capítulo 2 de la tesis. Finalmente, en el capítulo 6 se concluyen los resultados generales de la tesis y se discuten de forma específica los resultados de cada artículo. ABSTRACT In this thesis I apply concepts from mathematics, physics and statistics to the neurosciences. This field benefits from the collaborative work of multidisciplinary teams where physicians, psychologists, engineers and other specialists fight for a common well: the understanding of the brain. Research on this field is still in its early years, being its birth attributed to the neuronal theory of Santiago Ramo´n y Cajal in 1888. In more than one hundred years only a very little percentage of the brain functioning has been discovered, and still much more needs to be explored. Isolated techniques aim at unraveling the system that supports our cognition, nevertheless in order to provide solid evidence in such a field multimodal techniques have arisen, with them we will be able to improve current knowledge about human cognition. Here we focus on the multimodal integration of magnetoencephalography (MEG) and diffusion weighted magnetic resonance imaging. These techniques are sensitive to the magnetic fields emitted by the neuronal currents and to the white matter microstructure, respectively. The combination of such techniques could bring up evidences about structural-functional synergies in the brain information processing and which part of this synergy fails in specific neurological pathologies. In particular, we are interested in the relationship between functional and structural connectivity, and how two integrate this information. We quantify the functional connectivity by studying the phase synchronization or the amplitude correlation between time series obtained by MEG, and so we get an index indicating similarity between neuronal entities, i.e. brain regions. In addition we quantify structural connectivity by performing diffusion tensor estimation from the diffusion weighted images, thus obtaining an indicator of the integrity of the white matter or, if preferred, the strength of the structural connections between regions. These quantifications are then combined following three different approaches, from the lowest to the highest level of integration, in chapters 3, 4 and 5. We finally apply the fused information to the characterization or prediction of mild cognitive impairment, a clinical entity which is considered as an early step in the continuum pathological process of dementia. The dissertation is divided in six chapters. In chapter 1 I introduce connectomics within the fields of neuroimaging and neuroscience. Later in this chapter we describe the objectives of this thesis, and the specific objectives of each of the scientific publications that were produced as result of this work. In chapter 2 I describe the methods for each of the techniques that were employed, namely structural connectivity, resting state functional connectivity, complex brain networks and graph theory, and finally, I describe the clinical condition of mild cognitive impairment and the current state of the art in the search for early biomarkers. In chapters 3, 4 and 5 I have included the scientific publications that were generated along this work. They have been included in in their original format and they contain introduction, materials and methods, results and discussion. All methods that were employed in these papers have been described in chapter 2. Finally, in chapter 6 I summarize all the results from this thesis, both locally for each of the scientific publications and globally for the whole work.
Resumo:
Single-stranded regions in RNA secondary structure are important for RNA–RNA and RNA–protein interactions. We present a probability profile approach for the prediction of these regions based on a statistical algorithm for sampling RNA secondary structures. For the prediction of phylogenetically-determined single-stranded regions in secondary structures of representative RNA sequences, the probability profile offers substantial improvement over the minimum free energy structure. In designing antisense oligonucleotides, a practical problem is how to select a secondary structure for the target mRNA from the optimal structure(s) and many suboptimal structures with similar free energies. By summarizing the information from a statistical sample of probable secondary structures in a single plot, the probability profile not only presents a solution to this dilemma, but also reveals ‘well-determined’ single-stranded regions through the assignment of probabilities as measures of confidence in predictions. In antisense application to the rabbit β-globin mRNA, a significant correlation between hybridization potential predicted by the probability profile and the degree of inhibition of in vitro translation suggests that the probability profile approach is valuable for the identification of effective antisense target sites. Coupling computational design with DNA–RNA array technique provides a rational, efficient framework for antisense oligonucleotide screening. This framework has the potential for high-throughput applications to functional genomics and drug target validation.
Resumo:
Although much of the brain’s functional organization is genetically predetermined, it appears that some noninnate functions can come to depend on dedicated and segregated neural tissue. In this paper, we describe a series of experiments that have investigated the neural development and organization of one such noninnate function: letter recognition. Functional neuroimaging demonstrates that letter and digit recognition depend on different neural substrates in some literate adults. How could the processing of two stimulus categories that are distinguished solely by cultural conventions become segregated in the brain? One possibility is that correlation-based learning in the brain leads to a spatial organization in cortex that reflects the temporal and spatial clustering of letters with letters in the environment. Simulations confirm that environmental co-occurrence does indeed lead to spatial localization in a neural network that uses correlation-based learning. Furthermore, behavioral studies confirm one critical prediction of this co-occurrence hypothesis, namely, that subjects exposed to a visual environment in which letters and digits occur together rather than separately (postal workers who process letters and digits together in Canadian postal codes) do indeed show less behavioral evidence for segregated letter and digit processing.
Sequence similarity analysis of Escherichia coli proteins: functional and evolutionary implications.
Resumo:
A computer analysis of 2328 protein sequences comprising about 60% of the Escherichia coli gene products was performed using methods for database screening with individual sequences and alignment blocks. A high fraction of E. coli proteins--86%--shows significant sequence similarity to other proteins in current databases; about 70% show conservation at least at the level of distantly related bacteria, and about 40% contain ancient conserved regions (ACRs) shared with eukaryotic or Archaeal proteins. For > 90% of the E. coli proteins, either functional information or sequence similarity, or both, are available. Forty-six percent of the E. coli proteins belong to 299 clusters of paralogs (intraspecies homologs) defined on the basis of pairwise similarity. Another 10% could be included in 70 superclusters using motif detection methods. The majority of the clusters contain only two to four members. In contrast, nearly 25% of all E. coli proteins belong to the four largest superclusters--namely, permeases, ATPases and GTPases with the conserved "Walker-type" motif, helix-turn-helix regulatory proteins, and NAD(FAD)-binding proteins. We conclude that bacterial protein sequences generally are highly conserved in evolution, with about 50% of all ACR-containing protein families represented among the E. coli gene products. With the current sequence databases and methods of their screening, computer analysis yields useful information on the functions and evolutionary relationships of the vast majority of genes in a bacterial genome. Sequence similarity with E. coli proteins allows the prediction of functions for a number of important eukaryotic genes, including several whose products are implicated in human diseases.
Resumo:
An electronic phase with coexisting magnetic and ferroelectric order is predicted for graphene ribbons with zigzag edges. The electronic structure of the system is described with a mean-field Hubbard model that yields results very similar to those of density functional calculations. Without further approximations, the mean-field theory is recasted in terms of a BCS wave function for electron-hole pairs in the edge bands. The BCS coherence present in each spin channel is related to spin-resolved electric polarization. Although the total electric polarization vanishes, due to an internal phase locking of the BCS state, strong magnetoelectric effects are expected in this system. The formulation naturally accounts for the two gaps in the quasiparticle spectrun, Δ0 and Δ1, and relates them to the intraband and interband self-energies.
Resumo:
Abstract Mobile Edge Computing enables the deployment of services, applications, content storage and processing in close proximity to mobile end users. This highly distributed computing environment can be used to provide ultra-low latency, precise positional awareness and agile applications, which could significantly improve user experience. In order to achieve this, it is necessary to consider next-generation paradigms such as Information-Centric Networking and Cloud Computing, integrated with the upcoming 5th Generation networking access. A cohesive end-to-end architecture is proposed, fully exploiting Information-Centric Networking together with the Mobile Follow-Me Cloud approach, for enhancing the migration of content-caches located at the edge of cloudified mobile networks. The chosen content-relocation algorithm attains content-availability improvements of up to 500 when a mobile user performs a request and compared against other existing solutions. The performed evaluation considers a realistic core-network, with functional and non-functional measurements, including the deployment of the entire system, computation and allocation/migration of resources. The achieved results reveal that the proposed architecture is beneficial not only from the users’ perspective but also from the providers point-of-view, which may be able to optimize their resources and reach significant bandwidth savings.
Resumo:
The C2 domain is one of the most frequent and widely distributed calcium-binding motifs. Its structure comprises an eight-stranded beta-sandwich with two structural types as if the result of a circular permutation. Combining sequence, structural and modelling information, we have explored, at different levels of granularity, the functional characteristics of several families of C2 domains. At the coarsest level,the similarity correlates with key structural determinants of the C2 domain fold and, at the finest level, with the domain architecture of the proteins containing them, highlighting the functional diversity between the various subfamilies. The functional diversity appears as different conserved surface patches throughout this common fold. In some cases, these patches are related to substrate-binding sites whereas in others they correspond to interfaces of presumably permanent interaction between other domains within the same polypeptide chain. For those related to substrate-binding sites, the predictions overlap with biochemical data in addition to providing some novel observations. For those acting as protein-protein interfaces' our modelling analysis suggests that slight variations between families are a result of not only complementary adaptations in the interfaces involved but also different domain architecture. In the light of the sequence and structural genomic projects, the work presented here shows that modelling approaches along with careful sub-typing of protein families will be a powerful combination for a broader coverage in proteomics. (C) 2003 Elsevier Ltd. All rights reserved.
Resumo:
We present a new approach accounting for the nonadditivity of attractive parts of solid-fluid and fluidfluid potentials to improve the quality of the description of nitrogen and argon adsorption isotherms on graphitized carbon black in the framework of non-local density functional theory. We show that the strong solid-fluid interaction in the first monolayer decreases the fluid-fluid interaction, which prevents the twodimensional phase transition to occur. This results in smoother isotherm, which agrees much better with experimental data. In the region of multi-layer coverage the conventional non-local density functional theory and grand canonical Monte Carlo simulations are known to over-predict the amount adsorbed against experimental isotherms. Accounting for the non-additivity factor decreases the solid-fluid interaction with the increase of intermolecular interactions in the dense adsorbed fluid, preventing the over-prediction of loading in the region of multi-layer adsorption. Such an improvement of the non-local density functional theory allows us to describe experimental nitrogen and argon isotherms on carbon black quite accurately with mean error of 2.5 to 5.8% instead of 17 to 26% in the conventional technique. With this approach, the local isotherms of model pores can be derived, and consequently a more reliab * le pore size distribution can be obtained. We illustrate this by applying our theory against nitrogen and argon isotherms on a number of activated carbons. The fitting between our model and the data is much better than the conventional NLDFT, suggesting the more reliable PSD obtained with our approach.
Resumo:
Background Previous work suggesting a better correlation of diastolic than systolic function with exercise capacity in heart failure may reflect the -relative insensitivity and load-dependence of ejection fraction (EF). We sought the correlation of new and more sensitive methods of quantifying systolic and diastolic function and filling pressure with functional capacity. Methods We studied 155 consecutive exercise tests on 95 patients with congestive heart failure (81 male, aged 62 +/- 10 years), who underwent resting 2-climensional echocardiography and tissue Doppler imaging before and after measurement of maximum oxygen uptake (peak VO2)Results The resting EF was 3 1 % 10% and a peak VO(2)was 13 +/- 5 mL/kg/min; the majority of these patients (80%) had an ischemic cardiornyopathy. Resting EF (r 0.14, P =.09) correlated poorly with peak VO2 and mean systolic (r = 0.23, P =.004) and diastolic tissue velocities (r 0.18, P =.02). Peak EF was weakly correlated with the mean systolic (r = 0.18, P =.02) and diastolic velocities (r = 0.16, P <.04). The mean sum of systolic and diastolic velocities in both annuli (r = 0.30, P <.001) and E/Ea ratio (r 0.31, P <.001) were better correlated with peak VO2 Prediction of peak VO2 was similar with models based on models of filling pressure (R = 0.61), systolic factors (R = 0.63), and diastolic factors (R 0.59), although a composite model of filling pressure, systolic and diastolic function was a superior predictor of peak VO2 (R 0.69; all P<.001). Conclusions The reported association of diastolic rather than systolic function with functional capacity may have reflected the limitations of EF. Functional capacity appears related not only to diastolic function, but also to systolic function and filling pressure, and is most closely associated with a combination of these factors.
Resumo:
Background: Protein tertiary structure can be partly characterized via each amino acid's contact number measuring how residues are spatially arranged. The contact number of a residue in a folded protein is a measure of its exposure to the local environment, and is defined as the number of C-beta atoms in other residues within a sphere around the C-beta atom of the residue of interest. Contact number is partly conserved between protein folds and thus is useful for protein fold and structure prediction. In turn, each residue's contact number can be partially predicted from primary amino acid sequence, assisting tertiary fold analysis from sequence data. In this study, we provide a more accurate contact number prediction method from protein primary sequence. Results: We predict contact number from protein sequence using a novel support vector regression algorithm. Using protein local sequences with multiple sequence alignments (PSI-BLAST profiles), we demonstrate a correlation coefficient between predicted and observed contact numbers of 0.70, which outperforms previously achieved accuracies. Including additional information about sequence weight and amino acid composition further improves prediction accuracies significantly with the correlation coefficient reaching 0.73. If residues are classified as being either contacted or non-contacted, the prediction accuracies are all greater than 77%, regardless of the choice of classification thresholds. Conclusion: The successful application of support vector regression to the prediction of protein contact number reported here, together with previous applications of this approach to the prediction of protein accessible surface area and B-factor profile, suggests that a support vector regression approach may be very useful for determining the structure-function relation between primary sequence and higher order consecutive protein structural and functional properties.
Resumo:
MULTIPRED is a web-based computational system for the prediction of peptide binding to multiple molecules ( proteins) belonging to human leukocyte antigens (HLA) class I A2, A3 and class II DR supertypes. It uses hidden Markov models and artificial neural network methods as predictive engines. A novel data representation method enables MULTIPRED to predict peptides that promiscuously bind multiple HLA alleles within one HLA supertype. Extensive testing was performed for validation of the prediction models. Testing results show that MULTIPRED is both sensitive and specific and it has good predictive ability ( area under the receiver operating characteristic curve A(ROC) > 0.80). MULTIPRED can be used for the mapping of promiscuous T-cell epitopes as well as the regions of high concentration of these targets termed T-cell epitope hotspots. MULTIPRED is available at http:// antigen.i2r.a-star.edu.sg/ multipred/.
Resumo:
Background: There is a recognized need to move from mortality to morbidity outcome predictions following traumatic injury. However, there are few morbidity outcome prediction scoring methods and these fail to incorporate important comorbidities or cofactors. This study aims to develop and evaluate a method that includes such variables. Methods: This was a consecutive case series registered in the Queensland Trauma Registry that consented to a prospective 12-month telephone conducted follow-up study. A multivariable statistical model was developed relating Trauma Registry data to trichotomized 12-month post-injury outcome (categories: no limitations, minor limitations and major limitations). Cross-validation techniques using successive single hold-out samples were then conducted to evaluate the model's predictive capabilities. Results: In total, 619 participated, with 337 (54%) experiencing no limitations, 101 (16%) experiencing minor limitations and 181 (29%) experiencing major limitations 12 months after injury. The final parsimonious multivariable statistical model included whether the injury was in the lower extremity body region, injury severity, age, length of hospital stay, pulse at admission and whether the participant was admitted to an intensive care unit. This model explained 21% of the variability in post-injury outcome. Predictively, 64% of those with no limitations, 18% of those with minor limitations and 37% of those with major limitations were correctly identified. Conclusion: Although carefully developed, this statistical model lacks the predictive power necessary for its use as a basis of a useful prognostic tool. Further research is required to identify variables other than those routinely used in the Trauma Registry to develop a model with the necessary predictive utility.
Resumo:
Motivation: While processing of MHC class II antigens for presentation to helper T-cells is essential for normal immune response, it is also implicated in the pathogenesis of autoimmune disorders and hypersensitivity reactions. Sequence-based computational techniques for predicting HLA-DQ binding peptides have encountered limited success, with few prediction techniques developed using three-dimensional models. Methods: We describe a structure-based prediction model for modeling peptide-DQ3.2 beta complexes. We have developed a rapid and accurate protocol for docking candidate peptides into the DQ3.2 beta receptor and a scoring function to discriminate binders from the background. The scoring function was rigorously trained, tested and validated using experimentally verified DQ3.2 beta binding and non-binding peptides obtained from biochemical and functional studies. Results: Our model predicts DQ3.2 beta binding peptides with high accuracy [area under the receiver operating characteristic (ROC) curve A(ROC) > 0.90], compared with experimental data. We investigated the binding patterns of DQ3.2 beta peptides and illustrate that several registers exist within a candidate binding peptide. Further analysis reveals that peptides with multiple registers occur predominantly for high-affinity binders.
Resumo:
Background: Determination of the subcellular location of a protein is essential to understanding its biochemical function. This information can provide insight into the function of hypothetical or novel proteins. These data are difficult to obtain experimentally but have become especially important since many whole genome sequencing projects have been finished and many resulting protein sequences are still lacking detailed functional information. In order to address this paucity of data, many computational prediction methods have been developed. However, these methods have varying levels of accuracy and perform differently based on the sequences that are presented to the underlying algorithm. It is therefore useful to compare these methods and monitor their performance. Results: In order to perform a comprehensive survey of prediction methods, we selected only methods that accepted large batches of protein sequences, were publicly available, and were able to predict localization to at least nine of the major subcellular locations (nucleus, cytosol, mitochondrion, extracellular region, plasma membrane, Golgi apparatus, endoplasmic reticulum (ER), peroxisome, and lysosome). The selected methods were CELLO, MultiLoc, Proteome Analyst, pTarget and WoLF PSORT. These methods were evaluated using 3763 mouse proteins from SwissProt that represent the source of the training sets used in development of the individual methods. In addition, an independent evaluation set of 2145 mouse proteins from LOCATE with a bias towards the subcellular localization underrepresented in SwissProt was used. The sensitivity and specificity were calculated for each method and compared to a theoretical value based on what might be observed by random chance. Conclusion: No individual method had a sufficient level of sensitivity across both evaluation sets that would enable reliable application to hypothetical proteins. All methods showed lower performance on the LOCATE dataset and variable performance on individual subcellular localizations was observed. Proteins localized to the secretory pathway were the most difficult to predict, while nuclear and extracellular proteins were predicted with the highest sensitivity.