11 results for Data-Driven Behavior Modeling
in DRUM (Digital Repository at the University of Maryland)
Abstract:
The goal of this study is to provide a framework for future researchers to understand and use the FARSITE wildfire-forecasting model with data assimilation. Current wildfire models cannot provide accurate predictions of fire front position faster than real time. Coupling FARSITE with a recursive ensemble filter improves the forecast through data assimilation. The scope includes an explanation of the standalone FARSITE application, technical details on FARSITE integration with a parallel program coupler called OpenPALM, and a model demonstration of the FARSITE-Ensemble Kalman Filter software using the FireFlux I experiment by Craig Clements. The results show that the fire front forecast is better with the proposed data-driven methodology than with the standalone FARSITE model.
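To make the coupling described in this abstract concrete, the following is a minimal sketch of one analysis step of a stochastic (perturbed-observation) ensemble Kalman filter for a scalar state, which could stand in for a single fire-front coordinate. It is an illustration under stated assumptions, not the FARSITE or OpenPALM interface: the function name `enkf_update` and its parameters are hypothetical, and the abstract does not say which ensemble filter variant the thesis uses.

```python
import random
from statistics import variance


def enkf_update(ensemble, observation, obs_var, rng=None):
    """One scalar EnKF analysis step with perturbed observations.

    ensemble    -- forecast values, one per ensemble member (needs >= 2 members)
    observation -- the measured value of the same quantity
    obs_var     -- assumed observation error variance (>= 0)
    Hypothetical helper for illustration; not part of FARSITE or OpenPALM.
    """
    rng = rng or random.Random(0)
    prior_var = variance(ensemble)            # sample variance of the forecast
    gain = prior_var / (prior_var + obs_var)  # scalar Kalman gain, between 0 and 1
    obs_std = obs_var ** 0.5
    # Each member is nudged toward its own noisy copy of the observation.
    return [
        x + gain * (observation + rng.gauss(0.0, obs_std) - x)
        for x in ensemble
    ]
```

A gain near 1 (trusted observation) pulls every member onto the measurement, while a gain near 0 (noisy observation) leaves the forecast ensemble almost unchanged. The sketch assumes the forecast members are not all identical, since the gain is undefined when both variances are zero.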
Abstract:
Cancer and cardiovascular diseases are the leading causes of death worldwide. Caused by systemic genetic and molecular disruptions in cells, these disorders are the manifestation of profound disturbance of normal cellular homeostasis. People suffering from or at high risk for these disorders need early diagnosis and personalized therapeutic intervention. Successful implementation of such clinical measures can significantly improve global health. However, development of effective therapies is hindered by the challenges in identifying genetic and molecular determinants of the onset of diseases; and in cases where therapies already exist, the main challenge is to identify molecular determinants that drive resistance to the therapies. Due to the progress in sequencing technologies, access to large genome-wide biological data now extends far beyond a few experimental labs to the global research community. The unprecedented availability of the data has revolutionized the capabilities of computational researchers, enabling them to collaboratively address long-standing problems from many different perspectives. Likewise, this thesis tackles the two main public health related challenges using data-driven approaches. Numerous association studies have been proposed to identify genomic variants that determine disease. However, their clinical utility remains limited due to their inability to distinguish causal variants from associated variants. In the presented thesis, we first propose a simple scheme that improves association studies in a supervised fashion and has shown its applicability in identifying genomic regulatory variants associated with hypertension. Next, we propose a coupled Bayesian regression approach -- eQTeL, which leverages epigenetic data to estimate regulatory and gene interaction potential, and identifies combinations of regulatory genomic variants that explain the gene expression variance. 
On human heart data, eQTeL not only explains a significantly greater proportion of expression variance in samples, but also predicts gene expression more accurately than other methods. We demonstrate that eQTeL accurately detects causal regulatory SNPs by simulation, particularly those with small effect sizes. Using various functional data, we show that SNPs detected by eQTeL are enriched for allele-specific protein binding and histone modifications, which potentially disrupt binding of core cardiac transcription factors and are spatially proximal to their target. eQTeL SNPs capture a substantial proportion of genetic determinants of expression variance and we estimate that 58% of these SNPs are putatively causal. The challenge of identifying molecular determinants of cancer resistance so far could only be dealt with labor intensive and costly experimental studies, and in case of experimental drugs such studies are infeasible. Here we take a fundamentally different data driven approach to understand the evolving landscape of emerging resistance. We introduce a novel class of genetic interactions termed synthetic rescues (SR) in cancer, which denotes a functional interaction between two genes where a change in the activity of one vulnerable gene (which may be a target of a cancer drug) is lethal, but subsequently altered activity of its partner rescuer gene restores cell viability. Next we describe a comprehensive computational framework --termed INCISOR-- for identifying SR underlying cancer resistance. Applying INCISOR to mine The Cancer Genome Atlas (TCGA), a large collection of cancer patient data, we identified the first pan-cancer SR networks, composed of interactions common to many cancer types. We experimentally test and validate a subset of these interactions involving the master regulator gene mTOR. We find that rescuer genes become increasingly activated as breast cancer progresses, testifying to pervasive ongoing rescue processes. 
We show that SRs can be utilized to successfully predict patients' survival and response to the majority of current cancer drugs, and importantly, for predicting the emergence of drug resistance from the initial tumor biopsy. Our analysis suggests a potential new strategy for enhancing the effectiveness of existing cancer therapies by targeting their rescuer genes to counteract resistance. The thesis provides statistical frameworks that can harness ever increasing high-throughput genomic data to address challenges in determining the molecular underpinnings of hypertension, cardiovascular disease and cancer resistance. We discover novel molecular mechanistic insights that will advance the progress in early disease prevention and personalized therapeutics. Our analyses shed light on the fundamental biological understanding of gene regulation and interaction, and open up exciting avenues of translational applications in risk prediction and therapeutics.
Abstract:
Americans are accustomed to a wide range of data collection in their lives: census, polls, surveys, user registrations, and disclosure forms. When logging onto the Internet, users’ actions are being tracked everywhere: clicking, typing, tapping, swiping, searching, and placing orders. All of this data is stored to create data-driven profiles of each user. Social network sites, furthermore, set the voluntary sharing of personal data as the default mode of engagement. But people’s time and energy devoted to creating this massive amount of data, on paper and online, are taken for granted. Few people would consider their time and energy spent on data production as labor. Even if some people do acknowledge their labor for data, they believe it is accessory to the activities at hand. In the face of pervasive data collection and the rising time spent on screens, why do people keep ignoring their labor for data? How has labor for data become invisible, as something that is disregarded by many users? What does invisible labor for data imply for everyday cultural practices in the United States? Invisible Labor for Data addresses these questions. I argue that three intertwined forces contribute to framing data production as being void of labor: data production institutions throughout history, the Internet’s technological infrastructure (especially with the implementation of algorithms), and the multiplication of virtual spaces. There is a common tendency in the framework of human interactions with computers to deprive data and bodies of their materiality. My Introduction and Chapter 1 offer theoretical interventions by reinstating embodied materiality and redefining labor for data as an ongoing process. The middle Chapters present case studies explaining how labor for data is pushed to the margin of the narratives about data production. I focus on a nationwide debate in the 1960s on whether the U.S. should build a databank, contemporary Big Data practices in the data broker and the Internet industries, and the group of people who are hired to produce data for other people’s avatars in virtual games. I conclude with a discussion on how the new development of crowdsourcing projects may usher in a new chapter in exploiting invisible and discounted labor for data.
Abstract:
Human relationships have long been studied by scientists from domains like sociology, psychology, literature, etc. for understanding people's desires, goals, actions and expected behaviors. In this dissertation we study inter-personal relationships as expressed in natural language text. Modeling inter-personal relationships from text finds application in general natural language understanding, as well as real-world domains such as social networks, discussion forums, intelligent virtual agents, etc. We propose that the study of relationships should incorporate not only linguistic cues in text, but also the contexts in which these cues appear. Our investigations, backed by empirical evaluation, support this thesis, and demonstrate that the task benefits from using structured models that incorporate both types of information. We present such structured models to address the task of modeling the nature of relationships between any two given characters from a narrative. To begin with, we assume that relationships are of two types: cooperative and non-cooperative. We first describe an approach to jointly infer relationships between all characters in the narrative, and demonstrate how the task of characterizing the relationship between two characters can benefit from including information about their relationships with other characters in the narrative. We next formulate the relationship-modeling problem as a sequence prediction task to acknowledge the evolving nature of human relationships, and demonstrate the need to model the history of a relationship in predicting its evolution. Thereafter, we present a data-driven method to automatically discover various types of relationships such as familial, romantic, hostile, etc. Like before, we address the task of modeling evolving relationships but don't restrict ourselves to two types of relationships. We also demonstrate the need to incorporate not only local historical but also global context while solving this problem. 
Lastly, we demonstrate a practical application of modeling inter-personal relationships in the domain of online educational discussion forums. Such forums offer opportunities for their users to interact and form deeper relationships. With this view, we address the task of identifying initiation of such deeper relationships between a student and the instructor. Specifically, we analyze contents of the forums to automatically suggest threads to the instructors that require their intervention. By highlighting scenarios that need direct instructor-student interactions, we alleviate the need for the instructor to manually peruse all threads of the forum and also assist students who have limited avenues for communicating with instructors. We do this by incorporating the discourse structure of the thread through latent variables that abstractly represent contents of individual posts and model the flow of information in the thread. Such latent structured models that incorporate the linguistic cues without losing their context can be helpful in other related natural language understanding tasks as well. We demonstrate this by using the model for a very different task: identifying if a stated desire has been fulfilled by the end of a story.
Abstract:
This dissertation investigates customer behavior modeling in service outsourcing and revenue management in the service sector (i.e., airline and hotel industries). In particular, it focuses on a common theme of improving firms’ strategic decisions through the understanding of customer preferences. Decisions concerning degrees of outsourcing, such as firms’ capacity choices, are important to performance outcomes. These choices are especially important in high-customer-contact services (e.g., airline industry) because of the characteristics of services: simultaneity of consumption and production, and intangibility and perishability of the offering. Essay 1 estimates how outsourcing affects customer choices and market share in the airline industry, and consequently the revenue implications from outsourcing. However, outsourcing decisions are typically endogenous. A firm may choose whether to outsource or not based on what a firm expects to be the best outcome. Essay 2 contributes to the literature by proposing a structural model which could capture a firm’s profit-maximizing decision-making behavior in a market. This makes possible the prediction of consequences (i.e., performance outcomes) of future strategic moves. Another emerging area in service operations management is revenue management. Choice-based revenue systems incorporate discrete choice models into traditional revenue management algorithms. To successfully implement a choice-based revenue system, it is necessary to estimate customer preferences as a valid input to optimization algorithms. The third essay investigates how to estimate customer preferences when part of the market is consistently unobserved. This issue is especially prominent in choice-based revenue management systems. Normally a firm only has its own observed purchases, while those customers who purchase from competitors or do not make purchases are unobserved. 
Most current estimation procedures depend on unrealistic assumptions about customer arrivals. This study proposes a new estimation methodology, which does not require any prior knowledge about the customer arrival process and allows for arbitrary demand distributions. Compared with previous methods, this model performs better when the true demand is highly variable.
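The role of a discrete choice model in this abstract can be illustrated with a short sketch. The abstract does not name the choice model it uses, so the multinomial logit (MNL) form below is an assumption, and the function `mnl_choice_probabilities` with its parameters is hypothetical. The outside option stands for the customers the firm never observes: those who buy from a competitor or do not buy at all.

```python
import math


def mnl_choice_probabilities(utilities, no_purchase_utility=0.0):
    """Multinomial logit choice probabilities with an outside option.

    utilities           -- mean utility of each offered product
    no_purchase_utility -- utility of the unobserved outside option
    Returns (list of purchase probabilities, no-purchase probability).
    Hypothetical helper; the MNL form is an assumption, not the thesis model.
    """
    # Subtract the largest utility before exponentiating to avoid overflow.
    max_u = max(list(utilities) + [no_purchase_utility])
    weights = [math.exp(u - max_u) for u in utilities]
    outside = math.exp(no_purchase_utility - max_u)
    total = sum(weights) + outside
    return [w / total for w in weights], outside / total
```

Only the purchase probabilities correspond to data a firm can see in its own sales records; the no-purchase share has to be inferred, which is the estimation difficulty the abstract points to.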
Abstract:
Experimental and analytical studies were conducted to explore thermo-acoustic coupling during the onset of combustion instability in various air-breathing combustor configurations. These include a laboratory-scale 200-kW dump combustor and a 100-kW augmentor featuring a v-gutter flame holder. They were used to simulate main combustion chambers and afterburners in aero engines, respectively. The three primary themes of this work include: 1) modeling heat release fluctuations for stability analysis, 2) conducting active combustion control with alternative fuels, and 3) demonstrating practical active control for augmentor instability suppression. The phenomenon of combustion instabilities remains an unsolved problem in propulsion engines, mainly because of the difficulty in predicting the fluctuating component of heat release without extensive testing. A hybrid model was developed to describe both the temporal and spatial variations in dynamic heat release, using a separation of variables approach that requires only a limited amount of experimental data. The use of sinusoidal basis functions further reduced the amount of data required. When the mean heat release behavior is known, the only experimental data needed for detailed stability analysis is one instantaneous picture of heat release at the peak pressure phase. This model was successfully tested in the dump combustor experiments, reproducing the correct sign of the overall Rayleigh index as well as a remarkably accurate spatial distribution pattern of fluctuating heat release. Active combustion control was explored for fuel-flexible combustor operation using twelve different jet fuels including bio-synthetic and Fischer-Tropsch types. Analysis done using an actuated spray combustion model revealed that the combustion response times of these fuels were similar. Combined with experimental spray characterizations, this suggested that controller performance should remain effective with various alternative fuels. 
Active control experiments validated this analysis while demonstrating a 50-70% reduction in the peak spectral amplitude. A new model augmentor was built and tested for combustion dynamics using schlieren and chemiluminescence techniques. Novel active control techniques including pulsed air injection were implemented and the results were compared with the pulsed fuel injection approach. The pulsed injection of secondary air worked just as effectively for suppressing the augmentor instability, setting up the possibility of a more efficient actuation strategy.
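The sign of the Rayleigh index mentioned in this abstract can be illustrated with a small sketch. It assumes the common time-averaged form, the mean over one acoustic cycle of the product of the pressure fluctuation and the heat release fluctuation, evaluated at a single location; the abstract does not give the exact definition or normalization used in the thesis, and the function name `rayleigh_index` is hypothetical.

```python
def rayleigh_index(pressure_fluct, heat_release_fluct):
    """Cycle-averaged product of pressure and heat release fluctuations.

    Both inputs are equally spaced samples covering whole acoustic cycles.
    A positive value means heat release adds energy to the acoustic field
    (destabilizing); a negative value means it damps the oscillation.
    Hypothetical helper using an assumed, unnormalized definition.
    """
    if not pressure_fluct or len(pressure_fluct) != len(heat_release_fluct):
        raise ValueError("inputs must be non-empty and of equal length")
    products = [p * q for p, q in zip(pressure_fluct, heat_release_fluct)]
    return sum(products) / len(products)
```

For unit-amplitude sinusoids, in-phase signals give an index of about 0.5 and anti-phase signals about -0.5, which is why the phase between heat release and pressure, and therefore the combustion response time, matters for control.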
Abstract:
Symbolic execution is a powerful program analysis technique, but it is very challenging to apply to programs built using event-driven frameworks, such as Android. The main reason is that the framework code itself is too complex to symbolically execute. The standard solution is to manually create a framework model that is simpler and more amenable to symbolic execution. However, developing and maintaining such a model by hand is difficult and error-prone. We claim that we can leverage program synthesis to introduce a high degree of automation to the process of framework modeling. To support this thesis, we present three pieces of work. First, we introduced SymDroid, a symbolic executor for Android. While Android apps are written in Java, they are compiled to Dalvik bytecode format. Instead of analyzing an app’s Java source, which may not be available, or decompiling from Dalvik back to Java, which requires significant engineering effort and introduces yet another source of potential bugs in an analysis, SymDroid works directly on Dalvik bytecode. Second, we introduced Pasket, a new system that takes a first step toward automatically generating Java framework models to support symbolic execution. Pasket takes as input the framework API and tutorial programs that exercise the framework. From these artifacts and Pasket's internal knowledge of design patterns, Pasket synthesizes an executable framework model by instantiating design patterns, such that the behavior of a synthesized model on the tutorial programs matches that of the original framework. Lastly, in order to scale program synthesis to framework models, we devised adaptive concretization, a novel program synthesis algorithm that combines the best of the two major synthesis strategies: symbolic search, i.e., using SAT or SMT solvers, and explicit search, e.g., stochastic enumeration of possible solutions. 
Adaptive concretization parallelizes multiple sub-synthesis problems by partially concretizing highly influential unknowns in the original synthesis problem. Thanks to adaptive concretization, Pasket can generate a large-scale model, e.g., thousands of lines of code. In addition, we have used an Android model synthesized by Pasket and found that the model is sufficient to allow SymDroid to execute a range of apps.
Abstract:
Deficits in social communication and interaction have been identified as distinguishing impairments for individuals with an autism spectrum disorder (ASD). As a pivotal skill, the successful development of social communication and interaction in individuals with ASD is a lifelong objective. Point-of-view video modeling has the potential to address these deficits. This type of video involves filming the completion of a targeted skill or behavior from a first-person perspective. By presenting only what a person might see from his or her viewpoint, it has been identified to be more effective in limiting irrelevant stimuli by providing a clear frame of reference to facilitate imitation. The current study investigated the use of point-of-view video modeling in teaching social initiations (e.g., greetings). Using a multiple baseline across participants design, five kindergarten participants were taught social initiations using point-of-view video modeling and video priming. Immediately before and after viewing the entire point-of-view video model, the participants were evaluated on their social initiations with a trained, typically developing peer serving as a communication partner. Specifically, the social initiations involved participants’ abilities to shift their attention toward the peer who entered the classroom, maintain attention toward the peer, and engage in an appropriate social initiation (e.g., hi, hello). Both generalization and maintenance were tested. Overall, the data suggest point-of-view video modeling is an effective intervention for increasing social initiations in young students with ASD. However, retraining was necessary for acquisition of skills in the classroom environment. Generalization in novel environments and with a novel communication partner, and generalization to other social initiation skills, were limited. Additionally, maintenance of gained social initiation skills only occurred in the intervention room. 
Despite the limitations of the study and variable results, there are a number of implications moving forward for both practitioners and future researchers examining point-of-view modeling and its potential impact on the social initiation skills of individuals with ASD.
Abstract:
Terrestrial planets produce crusts as they differentiate. The Earth’s bi-modal crust, with a high-standing granitic continental crust and a low-standing basaltic oceanic crust, is unique in our solar system and links the evolution of the interior and exterior of this planet. Here I present geochemical observations to constrain processes accompanying crustal formation and evolution. My approach includes geochemical analyses, quantitative modeling, and experimental studies. The Archean crustal evolution project represents my perspective on when Earth’s continental crust began forming. In this project, I utilized critical element ratios in sedimentary records to track the evolution of the MgO content in the upper continental crust as a function of time. The early Archean subaerial crust had >11 wt. % MgO, whereas by the end of Archean its composition had evolved to about 4 wt. % MgO, suggesting a transition of the upper crust from a basalt-like to a more granite-like bulk composition. Driving this fundamental change of the upper crustal composition is the widespread operation of subduction processes, suggesting the onset of global plate tectonics at ~ 3 Ga (Abstract figure). Three of the chapters in this dissertation leverage the use of Eu anomalies to track the recycling of crustal materials back into the mantle, where Eu anomaly is a sensitive measure of the element’s behavior relative to neighboring lanthanoids (Sm and Gd) during crustal differentiation. My compilation of Sm-Eu-Gd data for the continental crust shows that the average crust has a net negative Eu anomaly. This result requires recycling of Eu-enriched lower continental crust to the mantle. Mass balance calculations require that about three times the mass of the modern continental crust was returned into the mantle over Earth history, possibly via density-driven recycling. 
High precision measurements of Eu/Eu* in selected primitive glasses of mid-ocean ridge basalt (MORB) from global MORs, combined with numerical modeling, suggest that the recycled lower crustal materials are not found within the MORB source and may have at least partially sunk into the lower mantle where they can be sampled by hot spot volcanoes. The Lesser Antilles Li isotope project provides insights into the Li systematics of this young island arc, a representative section of proto-continental crust. Martinique Island lavas, to my knowledge, represent the only clear case in which crustal Li is recycled back into their mantle source, as documented by the isotopically light Li isotopes in Lesser Antilles sediments that feed into the fore arc subduction trench. By corollary, the mantle-like Li signal in global arc lavas is likely the result of broadly similar Li isotopic compositions between the upper mantle and bulk subducting sediments in most arcs. My PhD project on Li diffusion mechanism in zircon is being carried out in extensive collaboration with multiple institutes and employs analytical, experimental and modeling studies. This ongoing project finds that REE and Y play an important role in controlling Li diffusion in natural zircons, with Li partially coupling to REE and Y to maintain charge balance. Access to state-of-the-art instrumentation presented critical opportunities to identify the mechanisms that cause elemental fractionation during laser ablation inductively coupled plasma mass spectrometry (LA-ICP-MS) analysis. My work here elucidates the elemental fractionation associated with plasma plume condensation during laser ablation and particle-ion conversion in the ICP.
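The Eu/Eu* quantity used in this abstract compares measured Eu with the value interpolated from its neighbors Sm and Gd. A minimal sketch follows, assuming the widely used geometric-mean convention on chondrite-normalized concentrations; the abstract does not state which convention or normalization the dissertation uses, and the function name `eu_anomaly` is hypothetical.

```python
import math


def eu_anomaly(sm_n, eu_n, gd_n):
    """Eu/Eu* from chondrite-normalized Sm, Eu and Gd concentrations.

    Eu* is taken as the geometric mean of the normalized neighbors,
    sqrt(Sm_N * Gd_N). Values below 1 are negative anomalies (Eu depleted),
    values above 1 are positive anomalies (Eu enriched).
    Hypothetical helper; the geometric-mean convention is an assumption.
    """
    if sm_n <= 0 or eu_n <= 0 or gd_n <= 0:
        raise ValueError("normalized concentrations must be positive")
    return eu_n / math.sqrt(sm_n * gd_n)
```

With this definition, a reservoir with a net negative anomaly (Eu/Eu* below 1) must be balanced by a complementary Eu-enriched reservoir elsewhere, which is the mass-balance argument the abstract makes for recycling of lower continental crust.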
Abstract:
This dissertation focuses on gaining understanding of cell migration and collective behavior through a combination of experiment, analysis, and modeling techniques. Cell migration is a ubiquitous process that plays an important role during embryonic development and wound healing as well as in diseases like cancer, which is a particular focus of this work. As cancer cells become increasingly malignant, they acquire the ability to migrate away from the primary tumor and spread throughout the body to form metastatic tumors. During this process, changes in gene expression and the surrounding tumor environment can lead to changes in cell migration characteristics. In this thesis, I analyze how cells are guided by the texture of their environment and how cells cooperate with their neighbors to move collectively. The emergent properties of collectively moving groups are a particular focus of this work as collective cell dynamics are known to change in diseases such as cancer. The internal machinery for cell migration involves polymerization of the actin cytoskeleton to create protrusions that---in coordination with retraction of the rear of the cell---lead to cell motion. This actin machinery has been previously shown to respond to the topography of the surrounding surface, leading to guided migration of amoeboid cells. Here we show that epithelial cells on nanoscale ridge structures also show changes in the morphology of their cytoskeletons; actin is found to align with the ridge structures. The migration of the cells is also guided preferentially along the ridge length. These ridge structures are on length scales similar to those found in tumor microenvironments and as such provide a system for studying the response of the cells' internal migration machinery to physiologically relevant topographical cues. In addition to sensing surface topography, individual cells can also be influenced by the pushing and pulling of neighboring cells. 
The emergent properties of collectively migrating cells show interesting dynamics and are relevant for cancer progression, but have been less studied than the motion of individual cells. We use Particle Image Velocimetry (PIV) to extract the motion of a collectively migrating cell sheet from time lapse images. The resulting flow fields allow us to analyze collective behavior over multiple length and time scales. To analyze the connection between individual cell properties and collective migration behavior, we compare experimental flow fields with the migration of simulated cell groups. Our collective migration metrics allow for a quantitative comparison between experimental and simulated results. This comparison shows that tissue-scale decreases in collective behavior can result from changes in individual cell activity without the need to postulate the existence of subpopulations of leader cells or global gradients. In addition to tissue-scale trends in collective behavior, the migration of cell groups includes localized dynamic features such as cell rearrangements. An individual cell may smoothly follow the motion of its neighbors (affine motion) or move in a more individualistic manner (non-affine motion). By decomposing individual motion into both affine and non-affine components, we measure cell rearrangements within a collective sheet. Finally, finite-time Lyapunov exponent (FTLE) values capture the stretching of the flow field and reflect its chaotic character. Applying collective migration analysis techniques to experimental data on both malignant and non-malignant human breast epithelial cells reveals differences in collective behavior that are not found from analyzing migration speeds alone. Non-malignant cells show increased cooperative motion on long time scales whereas malignant cells remain uncooperative as time progresses. 
Combining multiple analysis techniques also shows that these two cell types differ in their response to a perturbation of cell-cell adhesion through the molecule E-cadherin. Non-malignant MCF10A cells use E-cadherin for short time coordination of collective motion, yet even with decreased E-cadherin expression, the cells remain coordinated over long time scales. In contrast, the migration behavior of malignant and invasive MCF10CA1a cells, which already shows decreased collective dynamics on both time scales, is insensitive to the change in E-cadherin expression.
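The affine versus non-affine decomposition described in this abstract can be sketched as a local least-squares fit. The abstract does not give the exact procedure, so the approach below is an assumption: fit the best 2x2 linear map from neighbor positions (relative to a reference cell) to neighbor displacements (relative to the reference cell's displacement), and treat the squared residual as the non-affine part. The function name `affine_nonaffine` and its inputs are hypothetical.

```python
def affine_nonaffine(rel_positions, rel_displacements):
    """Split neighbor motion into an affine fit and a non-affine residual.

    rel_positions     -- (x, y) of each neighbor relative to the reference cell
    rel_displacements -- (dx, dy) of each neighbor relative to the reference cell
    Returns (A, residual): A is the 2x2 map minimizing sum |d - A r|^2 and
    residual is that minimum (zero for perfectly affine motion).
    Hypothetical helper; an assumed method, not the thesis code.
    """
    pairs = list(zip(rel_positions, rel_displacements))
    # X = sum of r r^T (symmetric 2x2 matrix).
    sxx = sum(rx * rx for rx, ry in rel_positions)
    sxy = sum(rx * ry for rx, ry in rel_positions)
    syy = sum(ry * ry for rx, ry in rel_positions)
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-12:
        raise ValueError("need at least two non-collinear neighbors")
    # Y = sum of d r^T.
    yxx = sum(dx * rx for (rx, ry), (dx, dy) in pairs)
    yxy = sum(dx * ry for (rx, ry), (dx, dy) in pairs)
    yyx = sum(dy * rx for (rx, ry), (dx, dy) in pairs)
    yyy = sum(dy * ry for (rx, ry), (dx, dy) in pairs)
    # Least-squares solution A = Y X^{-1}.
    a = [
        [(yxx * syy - yxy * sxy) / det, (yxy * sxx - yxx * sxy) / det],
        [(yyx * syy - yyy * sxy) / det, (yyy * sxx - yyx * sxy) / det],
    ]
    residual = sum(
        (dx - (a[0][0] * rx + a[0][1] * ry)) ** 2
        + (dy - (a[1][0] * rx + a[1][1] * ry)) ** 2
        for (rx, ry), (dx, dy) in pairs
    )
    return a, residual
```

A uniform shear of the neighborhood is captured entirely by A and leaves a residual of zero, whereas a single neighbor moving on its own produces a positive residual, which is the kind of signal a rearrangement measure would pick up.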