913 results for structure, analysis, modeling
Abstract:
Single nucleotide polymorphisms (SNPs) may be used in biodiversity studies and commercial tasks like traceability, paternity testing and selection for suitable genotypes. Twenty-seven SNPs were characterized and genotyped on 250 individuals belonging to eight Italian goat breeds. Multilocus genotype data were used to infer population structure and assign individuals to populations. To estimate the number of groups (K) to test in population structure analysis we used likelihood values and variance of the bootstrap samples, deriving optimal K from a drop in the likelihood and a rise in the variance plots against K.
Abstract:
Multiple outcomes data are commonly used to characterize treatment effects in medical research, for instance, multiple symptoms to characterize potential remission of a psychiatric disorder. Often a global, i.e. symptom-invariant, treatment effect is evaluated. Such a treatment effect may overgeneralize the effect across the outcomes. On the other hand, individual treatment effects, varying across all outcomes, are complicated to interpret, and their estimation may lose precision relative to a global summary. An effective compromise to summarize the treatment effect may be through patterns of the treatment effects, i.e. "differentiated effects." In this paper we propose a two-category model to differentiate treatment effects into two groups. A model fitting algorithm and simulation study are presented, and several methods are developed to analyze heterogeneity present in the treatment effects. The method is illustrated using an analysis of schizophrenia symptom data.
Abstract:
Mitochondrial F(1)F(o)-ATP synthase is a molecular motor that couples the energy generated by oxidative metabolism to the synthesis of ATP. Direct visualization of the rotary action of the bacterial ATP synthase has been well characterized. However, direct observation of rotation of the mitochondrial enzyme has not been reported yet. Here, we describe two methods to reconstitute mitochondrial F(1)F(o)-ATP synthase into lipid bilayers suitable for structure analysis by electron and atomic force microscopy (AFM). Proteoliposomes densely packed with bovine heart mitochondria F(1)F(o)-ATP synthase were obtained upon detergent removal from ternary mixtures (lipid, detergent and protein). Two-dimensional crystals of recombinant hexahistidine-tagged yeast F(1)F(o)-ATP synthase were grown using the supported monolayer technique. Because the hexahistidine-tag is located at the F(1) catalytic subcomplex, ATP synthases were oriented unidirectionally in such two-dimensional crystals, exposing F(1) to the lipid monolayer and the F(o) membrane region to the bulk solution. This configuration opens a new avenue for the determination of the c-ring stoichiometry of unknown hexahistidine-tagged ATP synthases and the organization of the membrane intrinsic subunits within F(o) by electron microscopy and AFM.
Abstract:
ABSTRACT. Ontologies and Methods for Interoperability of Engineering Analysis Models (EAMs) in an E-Design Environment. September 2007. Neelima Kanuri, B.S., Birla Institute of Technology and Sciences, Pilani, India; M.S., University of Massachusetts Amherst. Directed by: Professor Ian Grosse.
Interoperability is the ability of two or more systems to exchange and reuse information efficiently. This thesis presents new techniques for interoperating engineering tools using ontologies as the basis for representing, visualizing, reasoning about, and securely exchanging abstract engineering knowledge between software systems. The specific engineering domain that is the primary focus of this report is the modeling knowledge associated with the development of engineering analysis models (EAMs). This abstract modeling knowledge has been used to support integration of analysis and optimization tools in iSIGHT-FD, a commercial engineering environment. ANSYS, a commercial FEA tool, has been wrapped as an analysis service available inside iSIGHT-FD. An engineering analysis modeling (EAM) ontology has been developed and instantiated to form a knowledge base for representing analysis modeling knowledge. The instances of the knowledge base are the analysis models of real-world applications. To illustrate how abstract modeling knowledge can be exploited for useful purposes, a cantilever I-beam design optimization problem has been used as a proof-of-concept test bed. Two distinct finite element models of the I-beam are available to analyze a given beam design: a beam-element finite element model with potentially lower accuracy but significantly reduced computational cost, and a high-fidelity, high-cost shell-element finite element model. The goal is to obtain an optimized I-beam design at minimum computational expense. An intelligent KB tool was developed and implemented in FiPER.
This tool reasons about the modeling knowledge to shift between the beam and the shell element models during an optimization process, selecting the best analysis model for a given optimization design state. In addition to improved interoperability and design optimization, methods are developed and presented that demonstrate the ability to operate on ontological knowledge bases to perform important engineering tasks. One such method is automatic technical report generation, which converts the modeling knowledge associated with an analysis model into a flat technical report. The second is a secure knowledge sharing method, which allocates permissions to portions of knowledge to control knowledge access and sharing. Acting together, the two methods enable recipient-specific, fine-grained control of knowledge viewing and sharing in an engineering workflow integration environment such as iSIGHT-FD. Together they help reduce the large-scale inefficiencies that exist in current product design and development cycles due to poor knowledge sharing and reuse between people and software engineering tools. This work is a significant advance in both the understanding and the application of knowledge integration in a distributed engineering design framework.
Abstract:
Morphometric investigations of the lung that use a point and intersection counting strategy often fail to reveal the full set of morphologic changes. This happens particularly when structural modifications are not expressed as volume density changes and when rough and fine surface density alterations cancel each other at different magnifications. Making use of digital image processing, we present a methodological approach that allows changes in the geometrical properties of the parenchymal lung structure to be quantified easily and quickly, and that closely reflects the visual appreciation of the changes. Randomly sampled digital images from light microscopic sections of lung parenchyma are filtered, binarized, and skeletonized. The lung septa are thus represented as a single-pixel-wide line network with nodal points and end points and the corresponding internodal and end segments. By automatically counting the number of points and measuring the lengths of the skeletal segments, the lung architecture can be characterized and very subtle structural changes can be detected. This new methodological approach to lung structure analysis is highly sensitive to morphological changes in the parenchyma: it detected highly significant quantitative alterations in the structure of lungs of rats treated with a glucocorticoid hormone, where classical morphometry had partly failed.
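The point-counting step on a skeleton can be illustrated with a minimal sketch. This is not the authors' pipeline: it assumes the skeleton is already available as a binary grid (1 = skeleton pixel), uses 8-connectivity, and applies the common convention that a pixel with exactly one skeleton neighbour is an end point and a pixel with three or more is a nodal point. The function name `classify_skeleton_points` is hypothetical.

```python
from typing import List, Tuple

Grid = List[List[int]]
Point = Tuple[int, int]


def classify_skeleton_points(skel: Grid) -> Tuple[List[Point], List[Point]]:
    """Return (end_points, nodal_points) of a one-pixel-wide binary skeleton.

    Assumption: 8-connectivity; 1 neighbour -> end point, >= 3 -> nodal point.
    """
    rows, cols = len(skel), len(skel[0])
    end_points: List[Point] = []
    nodal_points: List[Point] = []
    for r in range(rows):
        for c in range(cols):
            if not skel[r][c]:
                continue
            # Count skeleton pixels among the (up to) eight neighbours.
            neighbours = 0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    if dr == 0 and dc == 0:
                        continue
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols and skel[rr][cc]:
                        neighbours += 1
            if neighbours == 1:
                end_points.append((r, c))
            elif neighbours >= 3:
                nodal_points.append((r, c))
    return end_points, nodal_points


# An X-shaped skeleton: four arms meeting at one junction.
x_shape = [
    [1, 0, 0, 0, 1],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 0],
    [0, 1, 0, 1, 0],
    [1, 0, 0, 0, 1],
]
ends, nodes = classify_skeleton_points(x_shape)
```

With plain neighbour counting, pixels directly adjacent to a junction can also reach three neighbours (for example in a plus-shaped crossing), so a practical implementation would merge adjacent nodal candidates into a single node before counting.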
Abstract:
A class of potent nonpeptidic inhibitors of human immunodeficiency virus protease has been designed by using the three-dimensional structure of the enzyme as a guide. By employing iterative protein cocrystal structure analysis, design, and synthesis, the binding affinity of the lead compound was incrementally improved by over four orders of magnitude. An inversion in inhibitor binding mode was observed crystallographically, providing information critical for subsequent design and highlighting the utility of structural feedback in inhibitor optimization. These inhibitors are selective for the viral protease enzyme, possess good antiviral activity, and are orally available in three species.
Abstract:
To help understand how scientific collaboration arises and develops within a research institution, specifically IPEN, this work used two methodological approaches. The first used social network analysis (SNA) to map IPEN's scientific collaboration networks in R&D. The data used in the SNA were extracted from IPEN's digital database of technical and scientific publications with the aid of a computer program, and were based on co-authorship over the period 2001 to 2010. These data were grouped into consecutive two-year intervals, yielding five biennial networks. This first approach revealed several structural characteristics of the collaboration networks, notably the most prominent authors, the distribution of components, density, boundary spanners, and the distance and clustering properties used to define a small-world network state. The second used partial least squares, a variant of structural equation modeling, to evaluate and test a conceptual model based on personal, social, cultural, and circumstantial factors, in order to identify those that best explain the propensity of an IPEN author to establish collaborative ties in R&D environments. Starting from the consolidated model, the work assessed how well it explains the structural position an author occupies in the network, based on SNA indicators. In this second part, the data were collected through a survey using a questionnaire. The results showed that the model explains approximately 41% of the propensity of an IPEN author to collaborate with other authors, while for an author's structural position in the network the explanatory power ranged from 3% to 3.6%.
Other results showed that collaboration among IPEN authors has a moderate positive correlation with productivity and, likewise, that the most central authors in the network tend to increase their visibility. Finally, several other bibliometric statistical indicators of IPEN's R&D collaboration network were determined and reported, such as the mean number of authors per publication, the mean number of publications per IPEN author, the total number of publications, and the total numbers of IPEN and non-IPEN authors, among others. This work thus provides a theoretical and empirical contribution to studies of scientific collaboration and of the process of knowledge transfer and preservation, as well as several inputs that support decision-making in R&D environments.
Abstract:
Much research has been devoted over the years to investigating and advancing the techniques and tools used by analysts when they model. In contrast to what academics, software providers and their resellers claim should be happening, the aim of this research was to determine whether practitioners still take conceptual modeling seriously. In addition, what are the most popular techniques and tools used for conceptual modeling? What are the major purposes for which conceptual modeling is used? The study found that the six most frequently used modeling techniques and methods were ER diagramming, data flow diagramming, systems flowcharting, workflow modeling, UML, and structured charts. Modeling technique use was found to decrease significantly from smaller to medium-sized organizations, but then to increase significantly in larger organizations (proxying for large, complex projects). Technique use was also found to significantly follow an inverted U-shaped curve, contrary to some prior explanations. Additionally, an important contribution of this study was the identification of the factors that uniquely influence the decision of analysts to continue to use modeling, viz., communication (using diagrams) to/from stakeholders, internal knowledge (or lack of knowledge) of techniques, user expectations management, understanding models' integration into the business, and tool/software deficiencies. The highest-ranked purposes for which modeling was undertaken were database design and management, business process documentation, business process improvement, and software development. (c) 2005 Elsevier B.V. All rights reserved.
Abstract:
Many modern applications fall into the category of "large-scale" statistical problems, in which both the number of observations n and the number of features or parameters p may be large. Many existing methods focus on point estimation, despite the continued relevance of uncertainty quantification in the sciences, where the number of parameters to estimate often exceeds the sample size, despite huge increases in the value of n typically seen in many fields. Thus, the tendency in some areas of industry to dispense with traditional statistical analysis on the basis that "n=all" is of little relevance outside of certain narrow applications. The main result of the Big Data revolution in most fields has instead been to make computation much harder without reducing the importance of uncertainty quantification. Bayesian methods excel at uncertainty quantification, but often scale poorly relative to alternatives. This conflict between the statistical advantages of Bayesian procedures and their substantial computational disadvantages is perhaps the greatest challenge facing modern Bayesian statistics, and is the primary motivation for the work presented here.
Two general strategies for scaling Bayesian inference are considered. The first is the development of methods that lend themselves to faster computation, and the second is design and characterization of computational algorithms that scale better in n or p. In the first instance, the focus is on joint inference outside of the standard problem of multivariate continuous data that has been a major focus of previous theoretical work in this area. In the second area, we pursue strategies for improving the speed of Markov chain Monte Carlo algorithms, and characterizing their performance in large-scale settings. Throughout, the focus is on rigorous theoretical evaluation combined with empirical demonstrations of performance and concordance with the theory.
One topic we consider is modeling the joint distribution of multivariate categorical data, often summarized in a contingency table. Contingency table analysis routinely relies on log-linear models, with latent structure analysis providing a common alternative. Latent structure models lead to a reduced rank tensor factorization of the probability mass function for multivariate categorical data, while log-linear models achieve dimensionality reduction through sparsity. Little is known about the relationship between these notions of dimensionality reduction in the two paradigms. In Chapter 2, we derive several results relating the support of a log-linear model to nonnegative ranks of the associated probability tensor. Motivated by these findings, we propose a new collapsed Tucker class of tensor decompositions, which bridge existing PARAFAC and Tucker decompositions, providing a more flexible framework for parsimoniously characterizing multivariate categorical data. Taking a Bayesian approach to inference, we illustrate empirical advantages of the new decompositions.
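The reduced-rank factorization mentioned above can be written, for a latent class (PARAFAC-type) model with classes h, as P(y_1, ..., y_p) = sum_h w_h * prod_j P(y_j | h). The following is a minimal sketch of that formula only, not of the collapsed Tucker decomposition proposed in the thesis; the function name `latent_class_pmf` and the nested-list layout of the inputs are assumptions made for illustration.

```python
from itertools import product
from typing import Dict, Sequence, Tuple


def latent_class_pmf(
    weights: Sequence[float],
    cond_probs: Sequence[Sequence[Sequence[float]]],
) -> Dict[Tuple[int, ...], float]:
    """Joint pmf of p categorical variables under a latent class model.

    weights[h]          : probability of latent class h
    cond_probs[h][j][c] : P(variable j = c | class h)
    """
    n_vars = len(cond_probs[0])
    levels = [range(len(cond_probs[0][j])) for j in range(n_vars)]
    pmf: Dict[Tuple[int, ...], float] = {}
    for cell in product(*levels):
        total = 0.0
        for h, w in enumerate(weights):
            # Within a class the variables are independent, so multiply.
            term = w
            for j, c in enumerate(cell):
                term *= cond_probs[h][j][c]
            total += term
        pmf[cell] = total
    return pmf


# Two binary variables, two classes that each pin both variables to one level.
pmf = latent_class_pmf(
    weights=[0.5, 0.5],
    cond_probs=[
        [[1.0, 0.0], [1.0, 0.0]],  # class 0: both variables equal 0
        [[0.0, 1.0], [0.0, 1.0]],  # class 1: both variables equal 1
    ],
)
```

In this toy example the two variables are perfectly dependent, yet the table is a mixture of only two product (rank-one) terms, which is the sense in which latent structure models reduce dimensionality through low rank rather than sparsity.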
Latent class models for the joint distribution of multivariate categorical data, such as the PARAFAC decomposition, play an important role in the analysis of population structure. In this context, the number of latent classes is interpreted as the number of genetically distinct subpopulations of an organism, an important factor in the analysis of evolutionary processes and conservation status. Existing methods focus on point estimates of the number of subpopulations, and lack robust uncertainty quantification. Moreover, whether the number of latent classes in these models is even an identified parameter is an open question. In Chapter 3, we show that when the model is properly specified, the correct number of subpopulations can be recovered almost surely. We then propose an alternative method for estimating the number of latent subpopulations that provides good quantification of uncertainty, and provide a simple procedure for verifying that the proposed method is consistent for the number of subpopulations. The performance of the model in estimating the number of subpopulations and other common population structure inference problems is assessed in simulations and a real data application.
In contingency table analysis, sparse data is frequently encountered for even modest numbers of variables, resulting in non-existence of maximum likelihood estimates. A common solution is to obtain regularized estimates of the parameters of a log-linear model. Bayesian methods provide a coherent approach to regularization, but are often computationally intensive. Conjugate priors ease computational demands, but the conjugate Diaconis-Ylvisaker priors for the parameters of log-linear models do not give rise to closed form credible regions, complicating posterior inference. In Chapter 4 we derive the optimal Gaussian approximation to the posterior for log-linear models with Diaconis-Ylvisaker priors, and provide convergence rate and finite-sample bounds for the Kullback-Leibler divergence between the exact posterior and the optimal Gaussian approximation. We demonstrate empirically in simulations and a real data application that the approximation is highly accurate, even in relatively small samples. The proposed approximation provides a computationally scalable and principled approach to regularized estimation and approximate Bayesian inference for log-linear models.
Another challenging and somewhat non-standard joint modeling problem is inference on tail dependence in stochastic processes. In applications where extreme dependence is of interest, data are almost always time-indexed. Existing methods for inference and modeling in this setting often cluster extreme events or choose window sizes with the goal of preserving temporal information. In Chapter 5, we propose an alternative paradigm for inference on tail dependence in stochastic processes with arbitrary temporal dependence structure in the extremes, based on the idea that the information on strength of tail dependence and the temporal structure in this dependence are both encoded in waiting times between exceedances of high thresholds. We construct a class of time-indexed stochastic processes with tail dependence obtained by endowing the support points in de Haan's spectral representation of max-stable processes with velocities and lifetimes. We extend Smith's model to these max-stable velocity processes and obtain the distribution of waiting times between extreme events at multiple locations. Motivated by this result, a new definition of tail dependence is proposed that is a function of the distribution of waiting times between threshold exceedances, and an inferential framework is constructed for estimating the strength of extremal dependence and quantifying uncertainty in this paradigm. The method is applied to climatological, financial, and electrophysiology data.
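The raw quantity this paradigm builds on, the waiting times between exceedances of a high threshold, is simple to extract from a series. The sketch below shows only that empirical step, not the max-stable velocity processes or the inferential framework of the thesis; it assumes observations on a regular unit-spaced time grid and strict exceedance, and the function name `exceedance_waiting_times` is hypothetical.

```python
from typing import List, Sequence


def exceedance_waiting_times(series: Sequence[float], threshold: float) -> List[int]:
    """Gaps, in time steps, between successive exceedances of `threshold`."""
    # Time indices at which the series strictly exceeds the threshold.
    times = [t for t, x in enumerate(series) if x > threshold]
    # Differences between consecutive exceedance times.
    return [later - earlier for earlier, later in zip(times, times[1:])]


waits = exceedance_waiting_times([0, 5, 0, 0, 6, 7, 0, 9], threshold=4)
```

Short waiting times indicate clustered extremes, which is the kind of temporal structure that fixed window sizes or declustering rules tend to discard.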
The remainder of this thesis focuses on posterior computation by Markov chain Monte Carlo (MCMC), the dominant paradigm for posterior computation in Bayesian analysis. It has long been common to control computation time by making approximations to the Markov transition kernel. Comparatively little attention has been paid to convergence and estimation error in these approximating Markov chains. In Chapter 6, we propose a framework for assessing when to use approximations in MCMC algorithms, and how much error in the transition kernel should be tolerated to obtain optimal estimation performance with respect to a specified loss function and computational budget. The results require only ergodicity of the exact kernel and control of the kernel approximation accuracy. The theoretical framework is applied to approximations based on random subsets of data, low-rank approximations of Gaussian processes, and a novel approximating Markov chain for discrete mixture models.
Data augmentation Gibbs samplers are arguably the most popular class of algorithm for approximately sampling from the posterior distribution for the parameters of generalized linear models. The truncated Normal and Polya-Gamma data augmentation samplers are standard examples for probit and logit links, respectively. Motivated by an important problem in quantitative advertising, in Chapter 7 we consider the application of these algorithms to modeling rare events. We show that when the sample size is large but the observed number of successes is small, these data augmentation samplers mix very slowly, with a spectral gap that converges to zero at a rate at least proportional to the reciprocal of the square root of the sample size up to a log factor. In simulation studies, moderate sample sizes result in high autocorrelations and small effective sample sizes. Similar empirical results are observed for related data augmentation samplers for multinomial logit and probit models. When applied to a real quantitative advertising dataset, the data augmentation samplers mix very poorly. Conversely, Hamiltonian Monte Carlo and a type of independence chain Metropolis algorithm show good mixing on the same dataset.
Abstract:
The marine environment seems, at first sight, to be a homogeneous medium lacking barriers to species dispersal. Nevertheless, populations of marine species show varying levels of gene flow and population differentiation, so barriers to gene flow can often be detected. We aim to elucidate the role of oceanographical factors in generating connectivity among populations and shaping phylogeographical patterns in the marine realm, a topic of considerable interest not only for understanding the evolution of marine biodiversity but also for the management and conservation of marine life. For this purpose, we investigate the genetic structure and connectivity between continental and insular populations of white seabream in the North East Atlantic (NEA) and Mediterranean Sea (MS), as well as the influence of historical and contemporary factors in this scenario, using mitochondrial (cytochrome b) and nuclear (a set of 9 microsatellites) molecular markers. The Azores population appeared genetically differentiated in a single cluster using Structure analysis. This result was corroborated by Principal Component Analysis (PCA) and the Monmonier algorithm, which suggested a boundary to gene flow isolating this locality. The Azorean population also shows the highest significant values of FST and genetic distances for both molecular markers (microsatellites and mtDNA). We suggest that the breakdown of effective genetic exchange between the Azores and the other samples could be explained simultaneously by hydrographic (deep water) and hydrodynamic (isolating current regimes) factors acting as barriers to the free dispersal of white seabream (adults and larvae) and by historical factors that could have favoured the survival of the Azorean white seabream population during the last glaciation. Mediterranean islands show similar genetic diversity to the neighbouring continental samples and nonsignificant genetic differences.
Proximity to continental coasts and the current system could promote an optimal larval dispersion among Mediterranean islands (Mallorca and Castellamare) and coasts with high gene flow.
Abstract:
The molecular structure of the uranyl mineral rutherfordine has been investigated by measuring its NIR and Raman spectra, complemented with infrared spectra and their interpretation. The spectra of rutherfordine show the presence of both water and hydroxyl units in the structure, as evidenced by IR bands at 3562 and 3465 cm-1 (OH) and 3343, 3185 and 2980 cm-1 (H2O). Raman spectra show the presence of four sharp bands at 3511, 3460, 3329 and 3151 cm-1. Corresponding molecular water bending vibrations were observed in both the Raman and infrared spectra of only one of the two rutherfordine samples studied. The second rutherfordine sample contained only hydroxyl ions in the equatorial uranyl plane and did not contain any molecular water. The infrared spectra of the (CO3)2- units in the antisymmetric stretching region show complexity, with three sets of carbonate bands observed. This, combined with the observation of multiple bands in the (CO3)2- bending region in both the Raman and IR spectra, suggests that both monodentate and bidentate (CO3)2- units may be present in the structure. This cannot be exactly proved or inferred from the spectra; however, it is in accordance with the X-ray crystallographic studies. Complexity is also observed in the IR spectra of the (UO2)2+ antisymmetric stretching region and is attributed to non-identical U-O bonds. U-O bond lengths were calculated using wavenumbers of the ν3 and ν1 (UO2)2+ vibrations and compared with data from X-ray single crystal structure analysis of rutherfordine. The existence of a solid solution with the general formula (UO2)(CO3)1-x(OH)2x.yH2O (x, y ≥ 0) is supported by the crystal structure of the rutherfordine samples studied.
Abstract:
Structural health monitoring (SHM) is the term applied to the procedure of monitoring a structure’s performance, assessing its condition and carrying out appropriate retrofitting so that it performs reliably, safely and efficiently. Bridges form an important part of a nation’s infrastructure. They deteriorate due to age and changing load patterns, and hence early detection of damage helps in prolonging their lives and preventing catastrophic failures. Monitoring of bridges has traditionally been done by means of visual inspection. With recent developments in sensor technology and the availability of advanced computing resources, newer techniques have emerged for SHM. Acoustic emission (AE) is one such technology that is attracting the attention of engineers and researchers all around the world. This paper discusses the use of AE technology in health monitoring of bridge structures, with a special focus on analysis of recorded data. AE waves are stress waves generated by mechanical deformation of material and can be recorded by means of sensors attached to the surface of the structure. Analysis of the AE signals provides vital information regarding the nature of the source of emission. Signal processing of the AE waveform data can be carried out in several ways and is predominantly based on the time and frequency domains. Short time Fourier transform and wavelet analysis have proved to be superior alternatives to traditional frequency based analysis in extracting information from recorded waveforms. Some of the preliminary results of the application of these analysis tools in signal processing of recorded AE data will be presented in this paper.
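To illustrate why a short time Fourier transform can recover information that a single whole-record spectrum loses, the sketch below computes a magnitude spectrogram with a Hann window and a direct DFT per frame. It is a teaching sketch under stated assumptions, not the paper's processing chain: the test signal is synthetic rather than recorded AE data, the function name `stft_magnitudes` is hypothetical, and the per-frame DFT is the slow O(N^2) form (a real application would use an FFT routine).

```python
import cmath
import math
from typing import List, Sequence


def stft_magnitudes(signal: Sequence[float], frame_len: int, hop: int) -> List[List[float]]:
    """Magnitude spectrogram: one list of frequency-bin magnitudes per frame."""
    # Periodic Hann window to reduce spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len) for n in range(frame_len)]
    frames: List[List[float]] = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        chunk = [signal[start + n] * window[n] for n in range(frame_len)]
        spectrum: List[float] = []
        # Direct DFT for the non-negative frequency bins 0 .. frame_len / 2.
        for k in range(frame_len // 2 + 1):
            acc = sum(
                chunk[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                for n in range(frame_len)
            )
            spectrum.append(abs(acc))
        frames.append(spectrum)
    return frames


# Synthetic signal whose frequency content changes halfway through.
frame_len = 32
low = [math.sin(2 * math.pi * 4 * n / frame_len) for n in range(64)]
high = [math.sin(2 * math.pi * 10 * n / frame_len) for n in range(64)]
frames = stft_magnitudes(low + high, frame_len=frame_len, hop=frame_len)
peak_bins = [max(range(len(f)), key=f.__getitem__) for f in frames]
```

The peak bin moves from 4 to 10 between the second and third frames, so the time at which the frequency content changes is visible, whereas a single Fourier transform of the full record would show both components without indicating when each occurred.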
Abstract:
Raman spectra of two well-defined types of koritnigite crystals from the Jáchymov ore district, Czech Republic, were recorded and interpreted. No substantial differences were observed between the two crystal types. Observed Raman bands were attributed to the (AsO3OH)2- stretching and bending vibrations and to the stretching and bending vibrations of water molecules and hydroxyl ions. Non-interpreted Raman spectra of koritnigite from the RRUFF database and published infrared spectra of cobaltkoritnigite were used for comparison. The O-H...O hydrogen bond lengths in the crystal structure of koritnigite were inferred from the Raman spectra and compared with those derived from the X-ray single crystal refinement. The presence of (AsO3OH)2- units in the crystal structure of koritnigite was confirmed by the Raman spectra, which supports the conclusions of the X-ray structure analysis.
Abstract:
Automatic call recognition is vital for environmental monitoring. Pattern recognition has been applied in automatic species recognition for years. However, few studies have applied formal syntactic methods to species call structure analysis. This paper introduces a novel method that adopts timed and probabilistic automata for automatic species recognition, using acoustic components as the primitives. We demonstrate the method on one Australian bird species, the Eastern Yellow Robin.
Abstract:
Standard Monte Carlo (sMC) simulation models have been widely used in AEC industry research to address system uncertainties. Although the benefits of probabilistic simulation analyses over deterministic methods are well documented, the sMC simulation technique is quite sensitive to the probability distributions of the input variables. This phenomenon becomes highly pronounced when the region of interest within the joint probability distribution (a function of the input variables) is small. In such cases, the standard Monte Carlo approach is often impractical from a computational standpoint. In this paper, a comparative analysis of standard Monte Carlo simulation to Markov Chain Monte Carlo with subset simulation (MCMC/ss) is presented. The MCMC/ss technique constitutes a more complex simulation method (relative to sMC), wherein a structured sampling algorithm is employed in place of completely randomized sampling. Consequently, gains in computational efficiency can be made. The two simulation methods are compared via theoretical case studies.
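The computational burden of standard Monte Carlo for small regions of interest can be made concrete with a standard result: when a probability p is estimated as the fraction of N independent samples that fall in the region, the estimator has coefficient of variation sqrt((1 - p) / (N p)). The sketch below evaluates that formula and inverts it for N. It is a back-of-the-envelope illustration rather than the paper's case studies, it says nothing about the cost of MCMC with subset simulation, and both function names are hypothetical.

```python
import math


def smc_coefficient_of_variation(p: float, n_samples: int) -> float:
    """Coefficient of variation of the crude Monte Carlo estimate of p."""
    return math.sqrt((1.0 - p) / (n_samples * p))


def smc_samples_needed(p: float, target_cov: float) -> int:
    """Smallest N giving a coefficient of variation of at most target_cov."""
    return math.ceil((1.0 - p) / (p * target_cov ** 2))


# Sample sizes for a 10% coefficient of variation at decreasing p.
needed = {p: smc_samples_needed(p, target_cov=0.1) for p in (1e-2, 1e-4, 1e-6)}
```

The required N grows roughly like 1/p, so a region with probability around 1e-6 already calls for on the order of 1e8 model evaluations at this accuracy, which is the regime where structured sampling schemes become attractive.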