Abstract:
The genetic variability of milk protein genes may influence the nutritive value or the processing and functional properties of milk. While numerous protein variants are known in ruminants, knowledge about milk protein variability in horses is still limited. Mare's milk is, however, produced for human consumption in many countries. Beta-lactoglobulin, a major whey protein, belongs to the lipocalin family, whose members are known as common food- and airborne allergens. It is absent from human milk and is thus a key agent in provoking cow's milk protein allergy; mare's milk, however, is usually better tolerated by most affected people. Several functions of β-lactoglobulin have been discussed, but its ultimate physiological role remains unclear. In the current study, the open reading frames of the two equine β-lactoglobulin paralogues LGB1 and LGB2 were re-sequenced in 249 horses belonging to 14 different breeds in order to predict the existence of protein variants at the DNA level. Only a single signal peptide variant of LGB1, but 10 different putative protein variants of LGB2, were identified. Both genes are expressed in horses, so this difference in genetic variability between the two genes is striking and previously unknown. It can be assumed that LGB1 is the ancestral paralogue and has an essential function causing high selection pressure. As horse milk has a very low fat content, this unknown function might well be related to vitamin uptake. Further studies are, however, needed to elucidate the properties of the different gene products.
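The variant-prediction step described above (translating re-sequenced open reading frames and collecting the distinct predicted protein sequences per paralogue) can be illustrated with a minimal sketch. The function name, the example ORF sequences and the per-horse dictionary below are hypothetical illustrations, not the study's data or pipeline.

```python
# Minimal sketch: translate each re-sequenced ORF and group animals by the
# distinct predicted protein variant. Sequences and IDs are hypothetical.
from Bio.Seq import Seq

def predict_protein_variants(orfs):
    """Translate each ORF; return {protein sequence: [carrier IDs]}."""
    variants = {}
    for horse_id, dna in orfs.items():
        protein = str(Seq(dna).translate(to_stop=True))
        variants.setdefault(protein, []).append(horse_id)
    return variants

# Hypothetical input: one dict like this per paralogue (LGB1, LGB2).
orf_by_horse = {
    "horse_001": "ATGAAGTGCCTCCTGCTTGCACTGGGCCTGGCACTCGCTTGTGGCGTCCAGGCTTAA",
    "horse_002": "ATGAAGTGCCTCCTGCTTGCACTGGGCCTGGCACTCGCTTGTGGCGTCCAGGCTTAA",
}
for protein, carriers in predict_protein_variants(orf_by_horse).items():
    print(f"{len(carriers)} animal(s) carry variant {protein}")
```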
Abstract:
The genomic era brought about by recent advances in next-generation sequencing technology makes genome-wide scans for natural selection a reality. Currently, almost all statistical tests and analytical methods for identifying genes under selection are performed on an individual-gene basis. Although these methods have the power to identify genes subject to strong selection, they have limited power for discovering genes targeted by moderate or weak selective forces, which are crucial for understanding the molecular mechanisms of complex phenotypes and diseases. The recent availability and rapidly growing completeness of many gene network and protein-protein interaction databases open avenues for enhancing the power of discovering genes under natural selection. The aim of this thesis is to explore and develop normal-mixture-model-based methods that leverage gene network information to enhance the power of discovering the targets of natural selection. The results show that the developed statistical method, which combines the posterior log odds of the standard normal mixture model and the Guilt-By-Association score of the gene network in a naïve Bayes framework, has the power to discover genes under moderate or weak selection that bridge the genes under strong selection, and that it helps our understanding of the biology underlying complex diseases and related naturally selected phenotypes.
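A minimal sketch of the combination described above, under stated assumptions: per-gene selection statistics are fitted with a two-component normal mixture, the posterior log odds of the higher-mean component serve as selection evidence, and a simple Guilt-By-Association score (here assumed to be the mean log odds of network neighbours) is added in naïve Bayes fashion. The variable names and the neighbour-average score are illustrative, not the thesis's exact formulation.

```python
import numpy as np
import networkx as nx
from sklearn.mixture import GaussianMixture

def selection_log_odds(stats):
    """Fit a 2-component normal mixture; return posterior log odds that each
    gene belongs to the higher-mean ('selected') component."""
    gm = GaussianMixture(n_components=2, random_state=0).fit(stats.reshape(-1, 1))
    post = gm.predict_proba(stats.reshape(-1, 1))
    sel = int(np.argmax(gm.means_))            # component with the larger mean
    p = np.clip(post[:, sel], 1e-12, 1 - 1e-12)
    return np.log(p / (1 - p))

def combined_score(graph, stats, genes):
    log_odds = dict(zip(genes, selection_log_odds(stats)))
    scores = {}
    for g in genes:
        nbrs = list(graph.neighbors(g))
        # GBA evidence: mean log odds of network neighbours (0 if isolated).
        gba = np.mean([log_odds[n] for n in nbrs]) if nbrs else 0.0
        scores[g] = log_odds[g] + gba          # naive-Bayes style addition
    return scores

# Example on a toy network of five genes:
genes = ["g1", "g2", "g3", "g4", "g5"]
g = nx.Graph([("g1", "g2"), ("g2", "g3"), ("g4", "g5")])
stats = np.array([4.1, 1.2, 3.8, 0.3, 0.5])    # per-gene selection statistics
print(combined_score(g, stats, genes))
```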
Abstract:
The biggest problem when analyzing the brain is that its synaptic connectivity is extremely complex. Generally, the billions of neurons making up the brain exchange information through two types of highly specialized structures: chemical synapses (the vast majority) and so-called gap junctions (a substrate of one class of electrical synapse). Here we are interested in exploring the three-dimensional spatial distribution of chemical synapses in the cerebral cortex. Recent research has shown that the three-dimensional spatial distribution of synapses in layer III of the neocortex can be modeled by a random sequential adsorption (RSA) point process, i.e., synapses are distributed in space almost randomly, with the only constraint that they cannot overlap. In this study we hypothesize that RSA processes can also explain the distribution of synapses in all cortical layers. We also investigate whether there are differences in both the synaptic density and the spatial distribution of synapses between layers. Using combined focused ion beam milling and scanning electron microscopy (FIB/SEM), we obtained three-dimensional samples from the six layers of the rat somatosensory cortex and identified and reconstructed the synaptic junctions. A total tissue volume of approximately 4500 μm³ and around 4000 synapses from three different animals were analyzed. Different samples, layers and/or animals were aggregated and compared using replicated RSA spatial point processes. The results showed no significant differences in the synaptic distribution across the different rats used in the study. We found that RSA processes described the spatial distribution of synapses in all samples of each layer. We also found that the synaptic distribution in layers II to VI conforms to a common underlying RSA process with different densities per layer. Interestingly, the results showed that synapses in layer I had a slightly different spatial distribution from the other layers.
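The RSA model named above is simple to state in code: candidate points are proposed uniformly at random and accepted only if they keep a hard-core exclusion distance from all previously accepted points. The sketch below uses illustrative box dimensions, exclusion radius and counts, not the study's measured values.

```python
# Minimal sketch of a random sequential adsorption (RSA) process in 3D.
import numpy as np

def rsa_3d(n_target, box, r_excl, max_attempts=200_000, seed=0):
    rng = np.random.default_rng(seed)
    pts = np.empty((0, 3))
    attempts = 0
    while len(pts) < n_target and attempts < max_attempts:
        attempts += 1
        cand = rng.uniform(0, box, size=3)
        # Accept only candidates that do not overlap an already-placed point.
        if len(pts) == 0 or np.min(np.linalg.norm(pts - cand, axis=1)) >= r_excl:
            pts = np.vstack([pts, cand])
    return pts

# Example: ~500 non-overlapping points in a 10x10x10 (arbitrary units) volume.
points = rsa_3d(n_target=500, box=10.0, r_excl=0.5)
print(len(points), "synapse centres placed")
```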
Abstract:
Accuracy analysis of global digital elevation models. ABSTRACT: Terrain-based analysis produces derived products from an input DEM, and these products are needed to perform various analyses. To use these products efficiently in decision-making, their accuracies must be estimated systematically. This paper proposes a procedure to assess the accuracy of such derived products by calculating the accuracy of the slope dataset and its significance, taking the accuracy of the DEM as input. Building on previously published research on modeling the relative accuracy of a DEM, specifically the ASTER and SRTM DEMs with Lebanon coverage as the study area, the analysis showed that ASTER has low significance over the majority of the area: only 2% of the modeled terrain has a significance of 50% or more. SRTM, on the other hand, showed better significance, with 37% of the modeled terrain at 50% or more. Statistical analysis showed that the accuracy of the slope dataset, calculated on a cell-by-cell basis, is highly correlated with the accuracy of the input DEM. This correlation is lower between the slope accuracy and the slope significance, whereas it is much higher between the modeled slope and the slope significance.
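The cell-by-cell propagation of DEM accuracy into slope accuracy can be sketched as follows, assuming a standard central-difference slope and first-order (delta-method) error propagation; the grid, cell size and DEM standard error below are illustrative, not the paper's procedure or the Lebanon datasets.

```python
# Minimal sketch: per-cell slope and a first-order slope standard error grid.
import numpy as np

def slope_and_error(dem, cellsize, sigma_dem):
    """Return slope (degrees) and its propagated standard error per cell."""
    dz_dy, dz_dx = np.gradient(dem, cellsize)        # central differences
    grad2 = dz_dx**2 + dz_dy**2
    slope = np.degrees(np.arctan(np.sqrt(grad2)))
    # Each central difference (z[i+1]-z[i-1])/(2h) has variance
    # 2*sigma^2/(2h)^2 = sigma^2/(2h^2); pushing this through the arctan of
    # the gradient magnitude gives sigma_slope = sigma_grad / (1 + g^2).
    sigma_grad = np.sqrt(sigma_dem**2 / (2 * cellsize**2))
    slope_err = np.degrees(sigma_grad / (1 + grad2))
    return slope, slope_err

dem = np.random.default_rng(0).normal(800, 50, (100, 100))   # synthetic DEM
slope, slope_err = slope_and_error(dem, cellsize=30.0, sigma_dem=8.0)
```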
Abstract:
Using fixed-point arithmetic is one of the most common design choices for systems where area, power or throughput are heavily constrained. In order to produce implementations where the cost is minimized without negatively impacting the accuracy of the results, a careful assignment of word-lengths is required. Finding the optimal combination of fixed-point word-lengths for a given system is a combinatorial NP-hard problem to which developers devote between 25 and 50% of the design-cycle time. Reconfigurable hardware platforms such as FPGAs also benefit from the advantages of fixed-point arithmetic, as it compensates for the slower clock frequencies and less efficient area utilization of these platforms with respect to ASICs. As FPGAs become commonly used for scientific computation, designs constantly grow larger and more complex, up to the point where they cannot be handled efficiently by current signal and quantization-noise modelling and word-length optimization methodologies. In this Ph.D. Thesis we explore different aspects of the quantization problem and present new methodologies for each of them.

Techniques based on interval extensions have yielded accurate models of signal and quantization-noise propagation in systems with non-linear operations. We take this approach a step further by introducing elements of Multi-Element Generalized Polynomial Chaos (ME-gPC) and combining them with a state-of-the-art statistical Modified Affine Arithmetic (MAA) methodology in order to model systems that contain control-flow structures. Our methodology produces the different execution paths automatically, determines the regions of the input domain that exercise them, and extracts the system's statistical moments from the partial results. We use this technique to estimate both the dynamic range and the round-off noise in systems with the aforementioned control-flow structures, and we show the accuracy of our approach, which in some case studies with non-linear operators deviates by only 0.04% from the simulation-based reference values.

A known drawback of techniques based on interval extensions is the combinatorial explosion of terms as the size of the targeted system grows, which leads to scalability problems. To address this issue we present a clustered noise-injection technique that groups the signals in the system, introduces the noise terms of each group independently, and then combines the results. In this way the number of noise sources active at any given time is controlled and, because of this, the combinatorial explosion is minimized. We also present a multi-way partitioning algorithm aimed at minimizing the deviation of the results due to the loss of correlation between noise terms, in order to keep the results as accurate as possible.

This Ph.D. Thesis also covers the development of word-length optimization methodologies based on Monte-Carlo simulations that run in reasonable times. We present two novel techniques that reduce execution time from different angles. First, the interpolative method applies a simple but precise interpolator to estimate the sensitivity of each signal, which is later used to guide the optimization effort. Second, the incremental method builds on the fact that, although we must guarantee a given confidence level for the final results of the optimization process, we can use more relaxed confidence levels, and hence considerably fewer samples per simulation, in the initial stages of the search, when we are still far from the optimized solution. Through these two approaches we demonstrate that the execution time of classical greedy search algorithms can be accelerated by factors of up to x240 for small/medium-sized problems.

Finally, this book introduces HOPLITE, an automated, flexible and modular quantization framework that implements the previous techniques and is publicly available. Its aim is to offer developers and researchers a common ground for easily prototyping and verifying new techniques for system modelling and word-length optimization. We describe its workflow, justify the design decisions taken, explain its public API, and give a step-by-step demonstration of its execution. We also show, through a simple example, how new extensions should be connected to the existing interfaces in order to expand and improve the capabilities of HOPLITE.
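The incremental idea lends itself to a compact sketch: a greedy word-length search whose Monte-Carlo error estimates use few samples (a relaxed confidence level) early on and progressively more samples as the search converges. The toy three-signal datapath, the uniform quantization-noise model and all parameters below are illustrative assumptions; this is not HOPLITE's actual API.

```python
# Minimal sketch of an incremental Monte-Carlo word-length search. Early
# iterations use few samples (loose confidence); the sample count is tightened
# only when the search stalls, and the result is accepted at the tight level.
import numpy as np

rng = np.random.default_rng(0)

def mc_error(wordlengths, n_samples):
    """RMS output error of a toy datapath y = x0*x1 + x2 under quantization."""
    x = rng.uniform(-1, 1, size=(n_samples, 3))
    steps = 2.0 ** (-np.asarray(wordlengths))     # quantization step per signal
    q = x + rng.uniform(-0.5, 0.5, size=x.shape) * steps
    ref = x[:, 0] * x[:, 1] + x[:, 2]
    out = q[:, 0] * q[:, 1] + q[:, 2]
    return np.sqrt(np.mean((out - ref) ** 2))

def greedy_minimize(max_err, wl=(16, 16, 16)):
    wl, n_samples = list(wl), 200                 # start with relaxed confidence
    while True:
        # Try removing one bit from each signal; keep the cheapest feasible move.
        feasible = []
        for i in range(len(wl)):
            if wl[i] > 1:
                trial = wl.copy()
                trial[i] -= 1
                if mc_error(trial, n_samples) <= max_err:
                    feasible.append(trial)
        if feasible:
            wl = min(feasible, key=sum)
        elif n_samples < 20_000:
            n_samples *= 10                       # stalled: tighten confidence
        else:
            return wl                             # no move passes the tight check

print(greedy_minimize(max_err=1e-3))
```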
Abstract:
We introduce a method of functionally classifying genes by using gene expression data from DNA microarray hybridization experiments. The method is based on the theory of support vector machines (SVMs). SVMs are considered a supervised computer learning method because they exploit prior knowledge of gene function to identify unknown genes of similar function from expression data. SVMs avoid several problems associated with unsupervised clustering methods, such as hierarchical clustering and self-organizing maps. SVMs have many mathematical features that make them attractive for gene expression analysis, including their flexibility in choosing a similarity function, sparseness of solution when dealing with large data sets, the ability to handle large feature spaces, and the ability to identify outliers. We test several SVMs that use different similarity metrics, as well as some other supervised learning methods, and find that the SVMs best identify sets of genes with a common function using expression data. Finally, we use SVMs to predict functional roles for uncharacterized yeast ORFs based on their expression data.
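The supervised setup is straightforward to sketch: train an SVM on the expression profiles of genes whose functional class is known, then score uncharacterized ORFs. The random expression matrix, labels and RBF kernel below are illustrative stand-ins for the paper's data and similarity functions.

```python
# Minimal sketch: SVM classification of gene function from expression data.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X_known = rng.normal(size=(200, 79))      # 200 genes x 79 expression conditions
y_known = rng.integers(0, 2, size=200)    # 1 = member of the functional class
X_unknown = rng.normal(size=(30, 79))     # uncharacterized ORFs

clf = SVC(kernel="rbf", C=1.0, probability=True).fit(X_known, y_known)
membership = clf.predict_proba(X_unknown)[:, 1]   # predicted class membership
print("top candidate ORFs:", np.argsort(membership)[::-1][:5])
```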
Abstract:
Computer models were used to examine whether, and under what conditions, the formation of a multimeric protein complex is inhibited by high concentrations of one of its components, an effect analogous to the prozone phenomenon in precipitin tests. A series of idealized simple "ball-and-stick" structures representing small oligomeric complexes of protein molecules formed by reversible binding reactions were analyzed to determine the binding steps leading to each structure. The equilibrium state of each system was then determined over a range of starting concentrations and Kd values, and the steady-state concentration of the structurally complete oligomer was calculated for each situation. A strong inhibitory effect at high concentrations was shown by any protein molecule forming a bridge between two or more separable parts of the complex. By contrast, proteins linked to the outside of the complex by a single bond showed no inhibition at any concentration. Nonbridging, multivalent proteins in the body of the complex could show an inhibitory effect or not, depending on the structure of the complex and the strength of its bonds. On the basis of this study, we suggest that the prozone phenomenon occurs widely in living cells and that it could be a crucial factor in the regulation of protein complex formation.
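The simplest bridging geometry (B links A and C, so the complete complex is ABC) already reproduces the prozone effect, as the following mass-action equilibrium sketch shows: sweeping the total amount of B gives a bell-shaped yield of ABC, because excess B sequesters A and C into incomplete AB and BC pairs. The species, Kd values and totals are illustrative, not the paper's models.

```python
# Minimal sketch: mass-action equilibrium of A + B <-> AB, B + C <-> BC,
# AB + C <-> ABC (equivalently A + BC <-> ABC), solved over a sweep of total B.
import numpy as np
from scipy.optimize import fsolve

KD = 1.0  # one Kd for every bond, arbitrary concentration units

def abc_at_equilibrium(b_total, a_total=1.0, c_total=1.0):
    def residuals(log_free):
        a, b, c = np.exp(log_free)          # free species, kept positive
        ab, bc = a * b / KD, b * c / KD
        abc = ab * c / KD                   # = a*b*c/KD^2, path-independent
        return (a + ab + abc - a_total,
                b + ab + bc + abc - b_total,
                c + bc + abc - c_total)
    a, b, c = np.exp(fsolve(residuals, [-1.0, -1.0, -1.0]))
    return a * b * c / KD**2                # equilibrium [ABC]

for bt in (0.1, 1.0, 10.0, 100.0):
    print(f"[B]_total = {bt:6.1f}  ->  [ABC] = {abc_at_equilibrium(bt):.4f}")
```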
Abstract:
Background: Celiac disease (CD) has a negative impact on the health-related quality of life (HRQL) of affected patients. Although HRQL and its determinants have been examined in Spanish CD patients recruited specifically in hospital settings, these aspects of CD have not been assessed in the general Spanish population. Methods: An observational, cross-sectional study of a non-randomized, representative sample of adult celiac patients across all of Spain's Autonomous Regions. Subjects were recruited through celiac patient associations. A Spanish version of the self-administered Celiac Disease-Quality of Life (CD-QOL) questionnaire was used. Determinants of HRQL were assessed with multivariate analysis to control for confounding factors. Results: We analyzed the responses of 1,230 patients, 1,092 (89.2%) of whom were women. The overall mean CD-QOL index was 56.3 ± 18.27 points. The highest-scoring dimension was dysphoria, with 81.3 ± 19.56 points, followed by limitations, with 52.3 ± 23.43 points; health problems, with 51.6 ± 26.08 points; and inadequate treatment, with 36.1 ± 21.18 points. Patient age and sex, along with time to diagnosis and length of time on a gluten-free diet, were independent determinants of certain dimensions of HRQL: women aged 31 to 40 reported poorer HRQL, while time to diagnosis and length of time on a gluten-free diet were determinants of better HRQL scores. Conclusions: The HRQL of adult Spanish celiac subjects is moderate and improves with the length of time patients remain on a gluten-free diet.
Abstract:
Many multifactorial biological effects, particularly in the context of complex human diseases, are still poorly understood. At the same time, the systematic acquisition of multivariate data has become increasingly easy. Using such data to analyze and model complex phenotypes, however, remains a challenge. Here, a new analytic approach is described, termed coreferentiality, together with an appropriate statistical test. Coreferentiality is the indirect relation of two variables of functional interest with respect to whether they parallel each other in their respective relatedness to multivariate reference data, which can be informative for a complex effect or phenotype. It is shown that the power of coreferentiality testing is comparable to that of multiple regression analysis, sufficient even when the reference data are informative only to a relatively small extent of 2.5%, and clearly exceeds the power of simple bivariate correlation testing. Coreferentiality testing thus harnesses the increased power of multivariate analysis while addressing a more straightforwardly interpretable bivariate relatedness. Systematic application of this approach could substantially improve the analysis and modeling of complex phenotypes, particularly in human studies, where addressing functional hypotheses by direct experimentation is often difficult.
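One illustrative reading of coreferentiality testing (not necessarily the authors' exact statistic) is sketched below: each variable is correlated against every column of the multivariate reference data, the two resulting relatedness profiles are correlated with each other, and a permutation test supplies the significance.

```python
# Minimal sketch of a coreferentiality-style test with a permutation null.
import numpy as np

def coreferentiality(x, y, R, n_perm=1000, seed=0):
    rng = np.random.default_rng(seed)
    def profile(v):
        # relatedness profile: correlation of v with each reference variable
        return np.array([np.corrcoef(v, R[:, j])[0, 1] for j in range(R.shape[1])])
    stat = np.corrcoef(profile(x), profile(y))[0, 1]
    # Null distribution: shuffle x's observations to break its link to R.
    null = np.array([np.corrcoef(profile(rng.permutation(x)), profile(y))[0, 1]
                     for _ in range(n_perm)])
    p = np.mean(np.abs(null) >= abs(stat))
    return stat, p

rng = np.random.default_rng(1)
R = rng.normal(size=(100, 20))                   # 100 subjects, 20 references
x = R[:, :5].sum(axis=1) + rng.normal(size=100)  # both weakly informed by R
y = R[:, :5].sum(axis=1) + rng.normal(size=100)
print(coreferentiality(x, y, R))
```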
Abstract:
This paper presents a corpus-based descriptive analysis of the most prevalent transfer effects and connected-speech processes observed in a comparison of 11 Vietnamese English speakers (6 females, 5 males) and 12 Australian English speakers (6 males, 6 females) over 24 grammatical paraphrase items. The phonetic processes are segmentally labelled in terms of IPA diacritic features using the EMU speech database system, with the aim of labelling departures from native-speaker pronunciation. Prosodic features were analysed using the ToBI framework. The results show many phonetic and prosodic processes that make non-native speakers' speech distinct from that of native speakers. This corpus-based methodology for analysing foreign accent may have implications for the evaluation of non-native accent, accented-speech recognition, and computer-assisted pronunciation learning.