3 results for Continuous Variable Systems
at Duke University
Abstract:
Continuous variables are one of the major data types collected by survey organizations. They can be incomplete, in which case the data collectors need to fill in the missing values, or they can contain sensitive information that needs protection from re-identification. One approach to protecting continuous microdata is to sum the values within cells defined by different features. In this thesis, I present novel methods of multiple imputation (MI) that can be applied to impute missing values and to synthesize confidential values for continuous and magnitude data.
The first method limits the disclosure risk of continuous microdata whose marginal sums are fixed. The motivation for developing such a method comes from the magnitude tables of non-negative integer values in economic surveys. I present approaches based on a mixture of Poisson distributions to describe the multivariate distribution so that the marginals of the synthetic data are guaranteed to sum to the original totals. I also present methods for assessing disclosure risks in releasing such synthetic magnitude microdata. An illustration on a survey of manufacturing establishments shows that the disclosure risks are low while the information loss is acceptable.
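The fixed-total property can be illustrated with a standard fact: independent Poisson counts, conditioned on their sum, follow a multinomial distribution with probabilities proportional to the Poisson rates. The sketch below is a minimal illustration under stated assumptions, not the model developed in the thesis: it uses a single set of rates rather than a mixture, and the function name `synthesize_fixed_total`, the example values, and the smoothing constant are all hypothetical.

```python
import random


def synthesize_fixed_total(rates, total, rng):
    """Draw non-negative integer counts that sum exactly to `total`.

    Relies on the fact that independent Poisson(rate_i) counts, conditioned
    on their sum, are Multinomial(total, rate_i / sum(rates)).
    """
    if total < 0 or any(r <= 0 for r in rates):
        raise ValueError("total must be >= 0 and all rates must be > 0")
    counts = [0] * len(rates)
    # Assign each of the `total` units to a cell with probability
    # proportional to that cell's rate.
    for cell in rng.choices(range(len(rates)), weights=rates, k=total):
        counts[cell] += 1
    return counts


rng = random.Random(0)
original = [12, 3, 40, 5]            # hypothetical confidential magnitudes
rates = [v + 0.5 for v in original]  # hypothetical smoothed Poisson rates
synthetic = synthesize_fixed_total(rates, sum(original), rng)
```

Whatever values are drawn, `sum(synthetic)` equals `sum(original)` by construction, which is the constraint the abstract describes for the marginals.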
The second method releases synthetic continuous microdata through a nonstandard MI method. Traditionally, MI fits a model to the confidential values and then generates multiple synthetic datasets from this model. The disclosure risk of this approach tends to be high, especially when the original data contain extreme values. I present a nonstandard MI approach conditioned on protective intervals. Its basic idea is to estimate the model parameters from these intervals rather than from the confidential values. The encouraging results of simple simulation studies suggest the potential of this new approach for limiting the posterior disclosure risk.
The third method imputes missing values in continuous and categorical variables. It extends a hierarchically coupled mixture model with local dependence. The new method, however, separates the variables into non-focused (e.g., almost-fully-observed) and focused (e.g., missing-a-lot) ones. The sub-model structure of the focused variables is more complex than that of the non-focused ones. Their cluster indicators are linked together by tensor factorization, and the focused continuous variables depend locally on non-focused values. The model properties suggest that moving strongly associated non-focused variables to the side of the focused ones can help to improve estimation accuracy, which is examined in several simulation studies. The method is also applied to data from the American Community Survey.
Abstract:
Dynamics of biomolecules over various spatial and time scales are essential for biological functions such as molecular recognition, catalysis and signaling. However, reconstruction of biomolecular dynamics from experimental observables requires the determination of a conformational probability distribution. Unfortunately, these distributions cannot be fully constrained by the limited information from experiments, making the problem an ill-posed one in the terminology of Hadamard. The ill-posed nature of the problem comes from the fact that it has no unique solution. Multiple or even an infinite number of solutions may exist. To avoid the ill-posed nature, the problem needs to be regularized by making assumptions, which inevitably introduce biases into the result.
Here, I present two continuous probability density function approaches to solve an important inverse problem called the RDC trigonometric moment problem. By focusing on interdomain orientations, we reduced the problem to the determination of a distribution on the 3D rotational space from residual dipolar couplings (RDCs). We derived an analytical equation that relates the alignment tensors of adjacent domains, which serves as the foundation of the two methods. In the first approach, the ill-posed nature of the problem was avoided by introducing a continuous distribution model, which imposes a smoothness assumption. To find the optimal solution for the distribution, we also designed an efficient branch-and-bound algorithm that exploits the mathematical structure of the analytical solutions. The algorithm is guaranteed to find the distribution that best satisfies the analytical relationship. We observed good performance of the method when tested under various levels of experimental noise and when applied to two protein systems. The second approach avoids the use of any model by employing maximum entropy principles. This 'model-free' approach delivers the least biased result, one that represents our state of knowledge. In this approach, the solution is an exponential function of Lagrange multipliers. To determine the multipliers, a convex objective function is constructed. Consequently, the maximum entropy solution can be found easily by gradient descent methods. Both algorithms can be applied to biomolecular RDC data in general, including data from RNA and DNA molecules.
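The maximum entropy construction can be sketched on a much simpler toy problem than the one treated in the thesis. The example below assumes a distribution over a single angle on a discretized circle, constrained only by its first trigonometric moments (the means of cos θ and sin θ), rather than a distribution on the 3D rotational space constrained by RDC data. The function names, grid size, step size, and target moments are hypothetical. The solution takes the form p(θ) ∝ exp(λ₁ cos θ + λ₂ sin θ), and the multipliers are found by gradient descent on the convex dual objective log Z(λ) − λ·m, whose gradient is the difference between the model moments and the target moments m.

```python
import math


def _distribution(lam, feats):
    """Return the exponential-family probabilities and their feature means."""
    weights = [math.exp(lam[0] * c + lam[1] * s) for c, s in feats]
    z = sum(weights)
    probs = [w / z for w in weights]
    mean = [sum(p * f[j] for p, f in zip(probs, feats)) for j in range(2)]
    return probs, mean


def maxent_on_circle(target, n_grid=360, step=1.0, n_iter=500):
    """Toy maximum entropy fit on a discretized circle.

    Minimizes the convex dual log Z(lam) - lam . target by gradient descent.
    The gradient is (model moments - target moments).
    """
    thetas = [2 * math.pi * k / n_grid for k in range(n_grid)]
    feats = [(math.cos(t), math.sin(t)) for t in thetas]
    lam = [0.0, 0.0]  # Lagrange multipliers; zero gives the uniform distribution
    for _ in range(n_iter):
        _, mean = _distribution(lam, feats)
        lam = [lam[j] - step * (mean[j] - target[j]) for j in range(2)]
    probs, mean = _distribution(lam, feats)
    return lam, probs, mean


target_moments = (0.3, 0.2)  # hypothetical target values for E[cos] and E[sin]
lam, probs, fitted = maxent_on_circle(target_moments)
```

In this toy setting the Hessian of the dual is the covariance of (cos θ, sin θ), whose eigenvalues cannot exceed 1, so a unit step size is a safe choice. A real application would need features and step sizes suited to the actual constraints.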
Abstract:
Background: Organophosphate (OP) pesticides are well-known developmental neurotoxicants that have been linked to abnormal cognitive and behavioral endpoints through both epidemiological studies and animal models of behavioral teratology, and are implicated in the dysfunction of multiple neurotransmitters, including dopamine. Chemical similarities between OP pesticides and organophosphate flame retardants (OPFRs), a class of compounds growing in use and environmental relevance, have produced concern regarding whether developmental exposures to OPFRs and OP pesticides may share behavioral outcomes, impacts on dopaminergic systems, or both. Methods: Using the zebrafish animal model, we exposed developing fish to two OPFRs, TDCIPP and TPHP, as well as the OP pesticide chlorpyrifos, during the first 5 days following fertilization. From there, the exposed fish were assayed for behavioral abnormalities and effects on monoamine neurochemistry as both larvae and adults. An experiment conducted in parallel examined how antagonism of the dopamine system during an identical window of development could alter later life behavior in the same assays. Finally, we investigated the interaction between developmental exposure to an OPFR and acute dopamine antagonism in larval behavior. Results: Developmental exposure to all three OP compounds altered zebrafish behavior, with effects persisting into adulthood. Additionally, exposure to an OPFR decreased the behavioral response to acute D2 receptor antagonism in larvae. However, the pattern of behavioral effects diverged substantially from those seen following developmental dopamine antagonism, and the investigations into dopamine neurochemistry were too variable to be conclusive. Thus, although the results support the hypothesis that OPFRs, as with OP pesticides such as chlorpyrifos, may present a risk to normal behavioral development, we were unable to directly link these effects to any dopaminergic dysfunction.