914 results for High-dimensional index structure
Abstract:
There is an emerging interest in modeling spatially correlated survival data in biomedical and epidemiological studies. In this paper, we propose a new class of semiparametric normal transformation models for right-censored spatially correlated survival data. This class of models assumes that survival outcomes marginally follow a Cox proportional hazards model with an unspecified baseline hazard, and that their joint distribution is obtained by transforming the survival outcomes to normal random variables, whose joint distribution is assumed to be multivariate normal with a spatial correlation structure. A key feature of the class of semiparametric normal transformation models is that it provides a rich class of spatial survival models in which the regression coefficients have a population-average interpretation and the spatial dependence of survival times is conveniently modeled on the transformed variables by flexible normal random fields. We study the relationship between the spatial correlation structure of the transformed normal variables and the dependence measures of the original survival times. Direct nonparametric maximum likelihood estimation in such models is practically prohibited by the high-dimensional, intractable integration in the likelihood function and the infinite-dimensional nuisance baseline hazard parameter. We hence develop a class of spatial semiparametric estimating equations, which conveniently estimate the population-level regression coefficients and the dependence parameters simultaneously. We study the asymptotic properties of the proposed estimators and show that they are consistent and asymptotically normal. The proposed method is illustrated with an analysis of data from the East Boston Asthma Study, and its performance is evaluated using simulations.
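The core transformation of this model class, Z = Φ⁻¹{F(T)}, can be sketched numerically. Since the paper leaves the marginal F unspecified (a Cox model with unknown baseline hazard), the exponential marginal with rate 0.5 used below is purely an illustrative assumption:

```python
import math
import random
from statistics import NormalDist

# Illustrative sketch only: the paper leaves the marginal F unspecified;
# an exponential marginal (rate 0.5) is assumed here just to demonstrate
# the transformation Z = Phi^{-1}(F(T)).
random.seed(0)
t = [random.expovariate(0.5) for _ in range(2000)]   # simulated survival times
u = [1 - math.exp(-0.5 * x) for x in t]              # F(T) is Uniform(0, 1)
z = [NormalDist().inv_cdf(v) for v in u]             # transformed normal scores

mean = sum(z) / len(z)
var = sum((v - mean) ** 2 for v in z) / len(z)
# The scores are marginally standard normal, so spatial dependence can be
# modeled on them with a multivariate normal random field.
print(round(mean, 2), round(var, 2))
```

On the transformed scale, a spatial correlation matrix for the z-scores fully specifies the joint law, which is what makes the dependence modeling tractable.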
Abstract:
Several studies have examined the association between high glycemic index (GI) and glycemic load (GL) diets and the risk for coronary heart disease (CHD). However, most of these studies were conducted primarily on white populations. The primary aim of this study was to examine whether high-GI and high-GL diets are associated with increased risk for developing CHD in whites and African Americans, non-diabetics and diabetics, and within stratifications of body mass index (BMI) and hypertension (HTN). Baseline and 17-year follow-up data from the ARIC (Atherosclerosis Risk in Communities) study were used. The study population (13,051) consisted of 74% whites, 26% African Americans, 89% non-diabetics, 11% diabetics, and 43% males and 57% females, aged 44 to 66 years at baseline. Data from the ARIC food frequency questionnaire at baseline were analyzed to provide GI and GL indices for each subject. Increases of 25 and 30 units for GI and GL, respectively, were used to describe relationships with incident CHD risk. Hazard ratios adjusted for propensity score, with 95% confidence intervals (CI), were used to assess associations. During 17 years of follow-up (1987 to 2004), 1,683 cases of CHD were recorded. Glycemic index was associated with a 2.12-fold (95% CI: 1.05, 4.30) increase in incident CHD risk for all African Americans, and GL was associated with a 1.14-fold (95% CI: 1.04, 1.25) increase in CHD risk for all whites. In addition, GL was also an important CHD risk factor for white non-diabetics (HR=1.59; 95% CI: 1.33, 1.90). Furthermore, within the BMI 23.0 to 29.9 stratum of non-diabetics, GI was associated with an increased hazard ratio of 11.99 (95% CI: 2.31, 62.18) for CHD in African Americans, and GL was associated with a 1.23-fold (1.08, 1.39) increase in CHD risk in whites. Body mass index modified the effect of GI and GL on CHD risk in all whites and white non-diabetics.
For HTN, both systolic and diastolic blood pressure modified the effect of GI and GL on CHD risk in all whites and African Americans, in white and African American non-diabetics, and in white diabetics. Further studies should examine other factors that could influence the effects of GI and GL on CHD risk, including dietary factors, physical activity, and diet-gene interactions.
Abstract:
Authigenic carbonates associated with cold seeps provide valuable archives of changes in long-term seepage activity. To investigate the role of shallow-buried hydrates in seepage strength and fluid composition, we analysed methane-derived carbonate precipitates from a high-flux hydrocarbon seepage area (the "Batumi seep area") located on the south-eastern Black Sea slope at ca. 850 m water depth. In a novel approach, we combined computerized X-ray tomography (CT) with mineralogical and isotope-geochemical methods to gain additional insights into the three-dimensional internal structure of the carbonate build-ups. X-ray diffractometry revealed the presence of two different authigenic carbonate phases, i.e. pure aragonitic rims associated with living microbial mats, and high-Mg calcite cementing the hemipelagic sediment. As indicated by the CT images, the initial sediment was strongly deformed, first in a plastic and then in a brittle manner, leading to brecciation of the progressively cemented sediment. The aragonitic rims, on the other hand, represent a presumably recent carbonate growth phase, since they cover the already deformed sediment. The stable oxygen isotope signature indicates that the high-Mg calcite cement incorporated pore water mixed with substantial amounts of hydrate water. This points to a dominant role of high gas/fluid flux from decomposing gas hydrates in the deformation and cementation of the overlying sediment. In contrast, the aragonitic rims show no influence of 18O-enriched hydrate water. The differences in δ18O between the presumably recent aragonite precipitates and the older high-Mg cements suggest that periods of hydrate dissociation and vigorous fluid discharge alternated with times of hydrate stability and moderate fluid flow. These results indicate that shallow-buried gas hydrates are prone to episodic decomposition with associated vigorous fluid flow. This might have a profound impact on seafloor morphology, resulting e.g. in the formation of carbonate pavements and pockmark-like structures, but might also affect the local carbon cycle.
Abstract:
The Self-Organizing Map (SOM) is a neural network model that performs an ordered projection of a high-dimensional input space onto a low-dimensional topological structure. The process by which such a mapping is formed is defined by the SOM algorithm, which is a competitive, unsupervised and nonparametric method, since it does not make any assumption about the input data distribution. The feature maps provided by this algorithm have been successfully applied to vector quantization, clustering and high-dimensional data visualization. However, the initialization of the network topology and the selection of the SOM training parameters are two difficult tasks, owing to the unknown distribution of the input signals. A misconfiguration of these parameters can generate a low-quality feature map, so it is necessary to have some measure of the degree of adaptation of the SOM network to the input data model. Topology preservation is the concept most commonly used to implement this measure. Several qualitative and quantitative methods have been proposed for measuring the degree of SOM topology preservation, particularly for Kohonen's model. In this work, two methods for measuring the topology preservation of the Growing Cell Structures (GCS) model are proposed: the topographic function and the topology-preserving map.
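As an illustration of quantitative topology-preservation measures of this kind, here is a minimal sketch of the topographic error, a measure related to (but simpler than) the topographic function named above; the toy grid, codebook, and Chebyshev adjacency rule are illustrative assumptions:

```python
import numpy as np

def topographic_error(weights, grid, data):
    """Fraction of inputs whose two best-matching units are NOT
    adjacent on the map grid (Chebyshev grid distance > 1)."""
    errors = 0
    for x in data:
        d = np.linalg.norm(weights - x, axis=1)   # distance to each codebook unit
        bmu1, bmu2 = np.argsort(d)[:2]            # best and second-best units
        if np.abs(grid[bmu1] - grid[bmu2]).max() > 1:
            errors += 1
    return errors / len(data)

# Toy 2x2 map whose codebook vectors tile the unit square
grid = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
weights = grid.astype(float)
data = np.array([[0.1, 0.1], [0.9, 0.9], [0.1, 0.9]])
print(topographic_error(weights, grid, data))   # 0.0: topology preserved
```

A value near 0 indicates that neighboring regions of input space map to neighboring units, which is exactly the property such measures are designed to check.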
Abstract:
Pragmatism is the leading motivation for regularization. We can understand regularization as a modification of the maximum-likelihood estimator so that a reasonable answer can be given in an unstable or ill-posed situation. To mention some typical examples, this happens when fitting parametric or non-parametric models with more parameters than data, or when estimating large covariance matrices. Regularization is also commonly used to improve the bias-variance tradeoff of an estimator. The definition of regularization is therefore quite general, and, although the introduction of a penalty is probably the most popular type, it is just one of many forms of regularization. In this dissertation, we focus on the applications of regularization for obtaining sparse or parsimonious representations, where only a subset of the inputs is used. A particular form of regularization, L1-regularization, plays a key role in achieving sparsity. Most of the contributions presented here revolve around L1-regularization, although other forms of regularization are explored (also pursuing sparsity in some sense). In addition to presenting a compact review of L1-regularization and its applications in statistics and machine learning, we devise methodology for regression, supervised classification and structure induction of graphical models. Within the regression paradigm, we focus on kernel smoothing, proposing techniques for kernel design that are suitable for high-dimensional settings and sparse regression functions. We also present an application of regularized regression techniques to modeling the response of biological neurons. The advances in supervised classification deal, on the one hand, with the application of regularization to obtaining a naïve Bayes classifier and, on the other, with a novel algorithm for brain-computer interface design that uses group regularization in an efficient manner.
Finally, we present a heuristic for inducing the structure of Gaussian Bayesian networks using L1-regularization as a filter.
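The mechanism by which L1-regularization produces sparsity can be seen in the soft-thresholding operator, the proximal map of the L1 penalty that appears in lasso-type coordinate descent; a minimal numpy sketch:

```python
import numpy as np

# Soft-thresholding, the proximal operator of the L1 penalty underlying
# lasso-type sparse estimation: coefficients whose magnitude falls below
# the threshold lam are set exactly to zero, yielding a sparse solution.
def soft_threshold(b, lam):
    return np.sign(b) * np.maximum(np.abs(b) - lam, 0.0)

beta = np.array([2.5, -0.3, 0.8, -1.6, 0.05])
print(soft_threshold(beta, 0.5))   # small entries zeroed -> sparse vector
```

This exact zeroing of small coefficients is what distinguishes L1 from L2 penalties, which only shrink coefficients toward zero without ever reaching it.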
Abstract:
Probabilistic modeling is the defining characteristic of estimation of distribution algorithms (EDAs), determining their behavior and performance in optimization. Regularization is a well-known statistical technique used for obtaining an improved model by reducing the generalization error of estimation, especially in high-dimensional problems. L1-regularization is a variant of this technique with the appealing variable-selection property, which results in sparse model estimations. In this thesis, we study the use of regularization techniques for model learning in EDAs. Several methods for regularized model estimation in continuous domains based on a Gaussian distribution assumption are presented, and analyzed from different aspects when used for optimization in a high-dimensional setting, where the population size of the EDA scales logarithmically with the number of variables. The optimization results obtained for a number of continuous problems with an increasing number of variables show that the proposed EDA based on regularized model estimation performs a more robust optimization, and is able to achieve significantly better results for larger dimensions than other Gaussian-based EDAs. We also propose a method for learning a marginally factorized Gaussian Markov random field model using regularization techniques and a clustering algorithm. The experimental results show notable optimization performance on continuous additively decomposable problems when using this model estimation method. Our study also covers multi-objective optimization, and we propose joint probabilistic modeling of variables and objectives in EDAs based on Bayesian networks, specifically models inspired by multi-dimensional Bayesian network classifiers. It is shown that with this approach to modeling, two new types of relationships are encoded in the estimated models in addition to the variable relationships captured in other EDAs: objective-variable and objective-objective relationships.
An extensive experimental study shows the effectiveness of this approach for multi- and many-objective optimization. With the proposed joint variable-objective modeling, in addition to the Pareto set approximation, the algorithm is also able to obtain an estimation of the multi-objective problem structure. Finally, the study of multi-objective optimization based on joint probabilistic modeling is extended to noisy domains, where the noise in objective values is represented by intervals. A new version of the Pareto dominance relation for ordering the solutions in these problems, namely α-degree Pareto dominance, is introduced and its properties are analyzed. We show that ranking methods based on this dominance relation can result in competitive performance of EDAs with respect to the quality of the approximated Pareto sets. This dominance relation is then used together with a method for joint probabilistic modeling based on L1-regularization for multi-objective feature subset selection in classification, where six different measures of accuracy are considered as objectives with interval values. The individual assessment of the proposed joint probabilistic modeling and solution ranking methods on datasets of small to medium dimensionality, using two different Bayesian classifiers, shows that Pareto sets of feature subsets comparable or superior to those of standard methods are approximated.
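The small-population regime described above (population size logarithmic in the number of variables) is exactly where an unregularized Gaussian search model breaks down; a minimal sketch, with plain diagonal shrinkage standing in for the L1/graphical regularizers studied in the thesis:

```python
import numpy as np

# When the EDA population n is smaller than the dimension d, the sample
# covariance has rank <= n - 1 and is singular, so it cannot parameterize
# a Gaussian search distribution. Shrinking it toward its diagonal
# restores positive definiteness. (Diagonal shrinkage is an illustrative
# stand-in for the thesis's regularization methods.)
rng = np.random.default_rng(0)
n, d = 20, 50                        # population below dimensionality
pop = rng.standard_normal((n, d))
s = np.cov(pop, rowvar=False)        # singular: rank <= n - 1 < d

lam = 0.3                            # shrinkage intensity (assumed value)
s_reg = (1 - lam) * s + lam * np.diag(np.diag(s))

print(np.linalg.matrix_rank(s) < d)           # True: sample cov singular
print(np.linalg.eigvalsh(s_reg).min() > 0)    # True: regularized cov usable
```

Since `s_reg` is positive definite, it can be plugged into a multivariate normal sampler at each EDA generation, which is infeasible with the raw sample covariance in this regime.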
Abstract:
Multi-dimensional classification (MDC) is the supervised learning problem where an instance is associated with multiple classes, rather than with a single class, as in traditional classification problems. Since these classes are often strongly correlated, modeling the dependencies between them allows MDC methods to improve their performance – at the expense of an increased computational cost. In this paper we focus on the classifier chains (CC) approach for modeling dependencies, one of the most popular and highest-performing methods for multi-label classification (MLC), a particular case of MDC which involves only binary classes (i.e., labels). The original CC algorithm makes a greedy approximation, and is fast but tends to propagate errors along the chain. Here we present novel Monte Carlo schemes, both for finding a good chain sequence and performing efficient inference. Our algorithms remain tractable for high-dimensional data sets and obtain the best predictive performance across several real data sets.
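The greedy chaining that CC performs can be sketched as follows; the 1-nearest-neighbour base learner and the toy data are illustrative assumptions, and the paper's Monte Carlo search over chain orders is not shown:

```python
import numpy as np

# Minimal classifier-chain (CC) sketch: the j-th label is predicted from
# the input features augmented with the labels earlier in the chain
# (true labels at training time, predicted labels at test time).
class NN1:
    """Illustrative 1-nearest-neighbour base classifier."""
    def fit(self, X, y):
        self.X, self.y = np.asarray(X, float), np.asarray(y)
        return self

    def predict(self, X):
        d = np.linalg.norm(np.asarray(X, float)[:, None, :] - self.X[None, :, :], axis=2)
        return self.y[d.argmin(axis=1)]

def chain_fit_predict(X, Y, X_test):
    Xa, Xt, preds = X.copy(), X_test.copy(), []
    for j in range(Y.shape[1]):
        p = NN1().fit(Xa, Y[:, j]).predict(Xt)
        preds.append(p)
        Xa = np.hstack([Xa, Y[:, j:j + 1]])   # augment with true label j
        Xt = np.hstack([Xt, p[:, None]])      # augment with predicted label j
    return np.stack(preds, axis=1)

# Toy data: the second label is the XOR of the first label and the
# second feature, so it depends on the first label through the chain.
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
Y = np.array([[0, 0], [0, 1], [1, 1], [1, 0]])
print(chain_fit_predict(X, Y, X))   # reproduces Y on the training points
```

The error-propagation problem the paper addresses is visible in the last line of the loop: a wrong prediction for label j is fed to every later classifier in the chain.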
Abstract:
The preservation of bibliographic and documentary heritage on paper is one of the biggest challenges that libraries and archives around the world face. The search for solutions to the problem of degraded paper has historically followed two predominant lines of work: the conservation of these documents by neutralizing the acids present in them with alkaline agents, and their restoration, fundamentally by lining them with cellulose of plant origin. However, the possibility of strengthening the damaged cellulose itself has not been successfully explored, and the problem still lacks a satisfactory solution.
To date, the development of biotechnology-based treatments in documentary heritage conservation has been scarce, although the ability of certain bacteria to produce cellulose suggests its use in the field of paper conservation and restoration. Bacterial cellulose (BC) is chemically identical to plant cellulose, but its macroscopic organization is different. Its unique properties (high degree of crystallinity, durability, strength and biocompatibility) make it an excellent resource in different fields. This thesis studies the use of high-quality BC generated by Gluconacetobacter sucrofermentans CECT 7291 to restore damaged documents and to consolidate those at risk of degradation, thereby preventing their destruction and providing the restored paper with good mechanical, optical and structural properties. Protocols allowing the application of BC as a reinforcing material were also developed. First of all, in order to select a culture medium providing a cellulose suitable for use in restoration, the effect of the carbon and nitrogen sources of the culture medium on the generated BC was evaluated, keeping the temperature and the initial pH of the medium as fixed parameters and performing the cultures under static conditions. The effect on BC properties of adding 1% ethanol to the culture medium was also evaluated. The cellulose layers were collected at four different times, characterizing at each time the culture medium (pH and carbon-source consumption) and the BC sheets (pH, dry weight, and optical and mechanical properties). The best combination of carbon and nitrogen sources proved to be fructose plus yeast extract and corn steep liquor, with or without ethanol, which provided a good balance between cellulose production and carbon-source consumption and generated homogeneous, resistant BC sheets.
The addition of ethanol to the culture medium increased productivity but caused a noticeable decrease in pH. The BC layers generated with these optimized culture media were characterized in terms of tear and burst indices, optical properties, scanning electron microscopy (SEM), X-ray diffraction, Fourier transform infrared spectroscopy (FTIR), degree of polymerization, static and dynamic contact angles, and mercury intrusion porosimetry. Moreover, it must be kept in mind that the restored materials should be stable over time; therefore, the same characterization was performed after subjecting the BC layers to an accelerated aging process. The results showed that the BC sheets obtained have a high crystallinity index, low internal porosity, good mechanical properties, and high stability over time. To develop working protocols for using this optimized BC in paper restoration, the first step was to select the samples to restore. Three types of model papers, made from mechanical pulp, chemical pulp and filter paper (before and after an accelerated aging process), and three old books purchased on the second-hand market were chosen. These specimens were also characterized in terms of their mechanical and physicochemical properties. The first restoration protocol with BC to be evaluated is called lining. It consists of applying a reinforcing material to the document using an adhesive. The BC produced in the optimized culture medium with 1% ethanol was selected. An alkaline purification method (1 hour at 90 °C in 1% NaOH) was applied, and wheat starch was selected as the adhesive. The lining process was also carried out with Japanese paper (JP), a material commonly used in conservation, in order to compare both materials. It was concluded that there are no significant differences between the two types of reinforcing material in the characteristics studied. The reinforced materials were characterized before and after undergoing accelerated aging.
Papers lined with BC showed more marked differences in optical properties, relative to the originals, than papers restored with JP. However, the text was more readable when BC was the reinforcing material. Wettability decreased with both types of reinforcement, although more markedly, and independently of the sample to be restored, in the papers lined with BC. This is due to the closed structure of BC, which also leads to a decrease in air permeance. This study suggests that BC improves the quality of deteriorated paper without altering the information on it, and that this improvement is maintained over time. Therefore, BC may be used as a reinforcing material for lining, and may be more suitable than JP for restoring certain types of paper. The other restoration method evaluated was the in situ generation of BC on the paper to be restored. For this purpose, the culture medium without ethanol was selected, since the pH decrease caused by its presence could damage the document to be restored. As the purification method, a heat treatment (24 hours at 65 °C) was chosen, which is less aggressive to the material than the alkaline treatment. The culture medium with the bacteria was applied onto the material with a brush. The restored material was characterized before and after an accelerated aging process. It was concluded that there was no substantial change in any characteristic except air permeance, which decreases very sharply after the generation of BC, yielding a practically air-impermeable material. In general, it can be concluded that the ability of the cellulose produced by the bacterium Gluconacetobacter sucrofermentans CECT 7291 to serve as a reinforcing material in the restoration of paper documentary heritage has been demonstrated. In addition, two application methods, one ex situ and one in situ, have been developed to carry out this restoration task.
Abstract:
The normal function of human intercellular adhesion molecule-1 (ICAM-1) is to provide adhesion between endothelial cells and leukocytes after injury or stress. ICAM-1 binds to leukocyte function-associated antigen (LFA-1) or macrophage-1 antigen (Mac-1). However, ICAM-1 is also used as a receptor by the major group of human rhinoviruses and is a catalyst for the subsequent viral uncoating during cell entry. The three-dimensional atomic structure of the two amino-terminal domains (D1 and D2) of ICAM-1 has been determined to 2.2-Å resolution and fitted into a cryoelectron microscopy reconstruction of a rhinovirus–ICAM-1 complex. Rhinovirus attachment is confined to the BC, CD, DE, and FG loops of the amino-terminal Ig-like domain (D1) at the end distal to the cellular membrane. The loops are considerably different in structure to those of human ICAM-2 or murine ICAM-1, which do not bind rhinoviruses. There are extensive charge interactions between ICAM-1 and human rhinoviruses, which are mostly conserved in both major and minor receptor groups of rhinoviruses. The interaction of ICAMs with LFA-1 is known to be mediated by a divalent cation bound to the insertion (I)-domain on the α chain of LFA-1 and the carboxyl group of a conserved glutamic acid residue on ICAMs. Domain D1 has been docked with the known structure of the I-domain. The resultant model is consistent with mutational data and provides a structural framework for the adhesion between these molecules.
Abstract:
The function of a protein generally is determined by its three-dimensional (3D) structure. Thus, it would be useful to know the 3D structure of the thousands of protein sequences that are emerging from the many genome projects. To this end, fold assignment, comparative protein structure modeling, and model evaluation were automated completely. As an illustration, the method was applied to the proteins in the Saccharomyces cerevisiae (baker's yeast) genome. It resulted in all-atom 3D models for substantial segments of 1,071 (17%) of the yeast proteins, only 40 of which have had their 3D structure determined experimentally. Of the 1,071 modeled yeast proteins, 236 were related clearly to a protein of known structure for the first time; 41 of these had not previously been characterized at all.
Resumo:
The full sequence of the genome-linked viral protein (VPg) cistron located in the central part of potato virus Y (common strain) genome has been identified. The VPg gene codes for a protein of 188 amino acids, with significant homology to other known potyviral VPg polypeptides. A three-dimensional model structure of VPg is proposed on the basis of similarity of hydrophobic-hydrophilic residue distribution to the sequence of malate dehydrogenase of known crystal structure. The 5' end of the viral RNA can be fitted to interact with the protein through the exposed hydroxyl group of Tyr-64, in agreement with experimental data. The complex favors stereochemically the formation of a phosphodiester bond [5'-(O4-tyrosylphospho)adenylate] typical for representatives of picornavirus-like viruses. The chemical mechanisms of viral RNA binding to VPg are discussed on the basis of the model structure of protein-RNA complex.
Resumo:
Efficient and reliable classification of visual stimuli requires that their representations reside in a low-dimensional and, therefore, computationally manageable feature space. We investigated the ability of the human visual system to derive such representations from the sensory input, a highly nontrivial task, given the million or so dimensions of the visual signal at its entry point to the cortex. In a series of experiments, subjects were presented with sets of parametrically defined shapes; the points in the common high-dimensional parameter space corresponding to the individual shapes formed regular planar (two-dimensional) patterns such as a triangle, a square, etc. We then used multidimensional scaling to arrange the shapes in planar configurations, dictated by their experimentally determined perceived similarities. The resulting configurations closely resembled the original arrangements of the stimuli in the parameter space. This achievement of the human visual system was replicated by a computational model derived from a theory of object representation in the brain, according to which similarities between objects, and not the geometry of each object, need to be faithfully represented.
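The recovery step described above can be sketched with classical (Torgerson) multidimensional scaling: points lying on a plane embedded in a high-dimensional parameter space are mapped back to a planar configuration from their pairwise distances alone. This is a minimal numpy sketch, not the authors' stimuli or implementation; the triangle coordinates and the random 50-dimensional embedding are illustrative assumptions.

```python
import numpy as np

def classical_mds(X, k=2):
    """Classical (Torgerson) MDS: recover a k-D configuration from
    pairwise Euclidean distances among the rows of X."""
    D2 = np.square(np.linalg.norm(X[:, None] - X[None, :], axis=-1))
    n = D2.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n    # centering matrix
    B = -0.5 * J @ D2 @ J                  # double-centered Gram matrix
    w, V = np.linalg.eigh(B)               # eigenvalues in ascending order
    idx = np.argsort(w)[::-1][:k]          # keep the top-k eigenpairs
    return V[:, idx] * np.sqrt(np.maximum(w[idx], 0.0))

rng = np.random.default_rng(0)
# Six "shapes" forming a triangle pattern in parameter space
# (illustrative coordinates, embedded in 50 dimensions).
triangle_2d = np.array([[0.0, 0.0], [1.0, 0.0], [0.5, 0.9],
                        [0.5, 0.0], [0.25, 0.45], [0.75, 0.45]])
basis = rng.standard_normal((2, 50))       # random planar embedding
shapes = triangle_2d @ basis               # points in 50-D parameter space

layout = classical_mds(shapes, k=2)        # recovered planar configuration
```

Because the points lie exactly on a two-dimensional subspace, the recovered layout reproduces all pairwise distances, matching the original arrangement up to rotation and reflection.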
Resumo:
The basement membrane (BM) extracellular matrix induces differentiation and suppresses apoptosis in mammary epithelial cells, whereas cells lacking BM lose their differentiated phenotype and undergo apoptosis. Addition of purified BM components, which are known to induce beta-casein expression, did not prevent apoptosis, indicating that a more complex BM was necessary. A comparison of culture conditions where apoptosis would or would not occur allowed us to relate inhibition of apoptosis to a complete withdrawal from the cell cycle, which was observed only when cells acquired a three-dimensional alveolar structure in response to BM. In the absence of this morphology, both the G1 cyclin kinase inhibitor p21/WAF-1 and positive proliferative signals including c-myc and cyclin D1 were expressed and the retinoblastoma protein (Rb) continued to be hyperphosphorylated. When we overexpressed either c-myc in quiescent cells or p21 when cells were still cycling, apoptosis was induced. In the absence of three-dimensional alveolar structures, mammary epithelial cells secrete a number of factors including transforming growth factor alpha and tenascin, which when added exogenously to quiescent cells induced expression of c-myc and interleukin-1beta-converting enzyme (ICE) mRNA and led to apoptosis. These experiments demonstrate that a correct tissue architecture is crucial for long-range homeostasis, suppression of apoptosis, and maintenance of differentiated phenotype.
Resumo:
Shc is a widely expressed adapter protein that plays an important role in signaling via a variety of cell surface receptors and has been implicated in coupling the stimulation of growth factor, cytokine, and antigen receptors to the Ras signaling pathway. Shc interacts with several tyrosine-phosphorylated receptors through its C-terminal SH2 domain, and one of the mechanisms of T-cell receptor-mediated Ras activation involves the interaction of the Shc SH2 domain with the tyrosine-phosphorylated zeta chain of the T-cell receptor. Here we describe a high-resolution NMR structure of the Shc SH2 domain complexed to a phosphopeptide (GHDGLpYQGLSTATK) corresponding to a portion of the zeta chain of the T-cell receptor. Although the overall architecture of the protein is similar to other SH2 domains, distinct structural differences were observed in the smaller beta-sheet, BG loop, (pY + 3) phosphopeptide-binding site, and relative position of the bound phosphopeptide.
Resumo:
The FANOVA (or “Sobol’-Hoeffding”) decomposition of multivariate functions has been used for high-dimensional model representation and global sensitivity analysis. When the objective function f has no simple analytic form and is costly to evaluate, computing FANOVA terms may be unaffordable due to numerical integration costs. Several approximate approaches relying on Gaussian random field (GRF) models have been proposed to alleviate these costs, where f is substituted by a (kriging) predictor or by conditional simulations. Here we focus on FANOVA decompositions of GRF sample paths, and we notably introduce an associated kernel decomposition into 4^d terms called KANOVA. An interpretation in terms of tensor product projections is obtained, and it is shown that projected kernels control both the sparsity of GRF sample paths and the dependence structure between FANOVA effects. Applications on simulated data show the relevance of the approach for designing new classes of covariance kernels dedicated to high-dimensional kriging.
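The FANOVA decomposition referred to above writes f as a constant term plus main effects plus interactions, each obtained by conditional averaging over the other inputs. A minimal Monte Carlo sketch for a toy function on [0, 1]^2 is shown below; the test function, grid, and sample size are illustrative assumptions, not taken from the paper, and the paper's GRF-based approximations are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy objective on [0, 1]^2 with an interaction term (illustrative).
def f(x):
    return x[:, 0] + 2.0 * x[:, 1] + x[:, 0] * x[:, 1]

n = 200_000
x = rng.random((n, 2))
f0 = f(x).mean()                          # constant FANOVA term mu = E[f]

# Main effects f_i(x_i) = E[f | x_i] - mu, estimated on a grid of x_i
# values by averaging over fresh draws of the other input.
grid = np.linspace(0.0, 1.0, 11)
z = rng.random(n)
f1 = np.array([f(np.column_stack([np.full(n, g), z])).mean()
               for g in grid]) - f0
f2 = np.array([f(np.column_stack([z, np.full(n, g)])).mean()
               for g in grid]) - f0
```

For this f the terms are available in closed form, which makes the sketch easy to check: mu = 1.75, f_1(x1) = 1.5*x1 - 0.75, and f_2(x2) = 2.5*x2 - 1.25; the Monte Carlo estimates should agree with these up to sampling error.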