816 results for Pattern Analysis Statistical Modeling and Computational Learning (PASCAL)
Abstract:
Synthetic Biology is a relatively new discipline, born at the beginning of the new millennium, that brings the typical engineering approach (abstraction, modularity and standardization) to biotechnology. These principles aim to tame the extreme complexity of the various components and aid the construction of artificial biological systems with specific functions, usually by means of synthetic genetic circuits implemented in bacteria or simple eukaryotes like yeast. The cell becomes a programmable machine and its low-level programming language is made of strings of DNA. This work was performed in collaboration with researchers of the Department of Electrical Engineering of the University of Washington in Seattle and also with a student of the Corso di Laurea Magistrale in Ingegneria Biomedica at the University of Bologna: Marilisa Cortesi. During the collaboration I contributed to a Synthetic Biology project already started in the Klavins Laboratory. In particular, I modeled and subsequently simulated a synthetic genetic circuit that was designed to implement a multicellular behavior in a growing bacterial microcolony. The first chapter introduces the foundations of molecular biology: structure of the nucleic acids, transcription, translation and methods to regulate gene expression. An introduction to Synthetic Biology completes the section. The second chapter describes the synthetic genetic circuit, which was conceived so that two different groups of cells, termed leaders and followers, emerge spontaneously from an isogenic microcolony of bacteria. The circuit exploits the intrinsic stochasticity of gene expression and intercellular communication via small molecules to break the symmetry in the phenotype of the microcolony. The four modules of the circuit (coin flipper, sender, receiver and follower) and their interactions are then illustrated.
The third chapter derives the mathematical representation of the various components of the circuit and makes the several simplifying assumptions explicit. Transcription and translation are modeled as a single step, and gene expression is a function of the intracellular concentration of the various transcription factors that act on the different promoters of the circuit. A list of the various parameters and a justification for their values closes the chapter. The fourth chapter describes the main characteristics of the gro simulation environment, developed by the Self Organizing Systems Laboratory of the University of Washington. Then, a sensitivity analysis performed to pinpoint the desirable characteristics of the various genetic components is detailed. The sensitivity analysis makes use of a cost function based on the fraction of cells in each of the possible states at the end of the simulation and on the desired outcome. The parameters are ranked by means of a particular kind of scatter plot. Starting from an initial condition in which all the parameters assume their nominal value, the ranking suggests which parameter to tune in order to reach the goal. Obtaining a microcolony in which almost all the cells are in the follower state and only a few in the leader state seems to be the most difficult task. A small number of leader cells struggle to produce enough signal to turn the rest of the microcolony to the follower state. It is possible to obtain a microcolony in which the majority of cells are followers by increasing the production of signal as much as possible. Reaching the goal of a microcolony that is split in half between leaders and followers is comparatively easy. The best strategy seems to be to slightly increase the production of the enzyme. To end up with a majority of leaders, instead, it is advisable to increase the basal expression of the coin flipper module.
At the end of the chapter, a possible future application of the leader election circuit, the spontaneous formation of spatial patterns in a microcolony, is modeled with the finite state machine formalism. The gro simulations provide insights into the genetic components that are needed to implement the behavior. In particular, since both the examples of pattern formation rely on a local version of Leader Election, a short-range communication system is essential. Moreover, new synthetic components need to be developed that allow the growth rate to be reliably downregulated in specific cells without side effects. The appendix lists the gro code used to simulate the model of the circuit, a Python script that was used to split the simulations across a Linux cluster, and the Matlab code developed to analyze the data.
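The cost function used in the sensitivity analysis of the abstract above is described only in words. As a rough illustration, it could look like the Python sketch below, which scores a simulation by the squared distance between the observed and desired fractions of cells in each state. The state names (in particular "undecided"), the squared-error form, and the function names are assumptions made for this example and are not taken from the thesis.

```python
from collections import Counter

# Hypothetical state labels; the thesis names leaders and followers,
# "undecided" is an assumption standing in for any other possible state.
STATES = ("leader", "follower", "undecided")


def state_fractions(final_states, states=STATES):
    """Return the fraction of cells found in each state at the end of a run."""
    total = len(final_states)
    if total == 0:
        raise ValueError("the simulation produced no cells")
    counts = Counter(final_states)
    return {s: counts.get(s, 0) / total for s in states}


def cost(final_states, target, states=STATES):
    """Squared distance between observed and desired state fractions.

    `target` maps each state to its desired fraction; states missing
    from `target` are treated as having a desired fraction of 0.
    """
    observed = state_fractions(final_states, states)
    return sum((observed[s] - target.get(s, 0.0)) ** 2 for s in states)


if __name__ == "__main__":
    # Desired outcome: a colony split in half between leaders and followers.
    half_split = {"leader": 0.5, "follower": 0.5}
    colony = ["leader"] * 5 + ["follower"] * 5
    print(cost(colony, half_split))
```

A lower value means the simulated colony is closer to the desired composition, so ranking parameters by how strongly they move this value is one way to decide which one to tune first.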
Abstract:
In this thesis, we extend some ideas of statistical physics to describe the properties of human mobility. By using a database containing GPS measures of individual paths (position, velocity and covered space at a spatial scale of 2 km or a time scale of 30 s), which includes 2% of the private vehicles in Italy, we succeed in determining some statistical empirical laws pointing out "universal" characteristics of human mobility. By developing simple stochastic models that suggest possible explanations of the empirical observations, we are able to indicate the key quantities and cognitive features that rule individuals' mobility. To understand the features of individual dynamics, we have studied different aspects of urban mobility from a physical point of view. We discuss the implications of Benford's law emerging from the distribution of times elapsed between successive trips. We observe how the daily travel-time budget is related to many aspects of the urban environment, and describe how the daily mobility budget is then spent. We link the scaling properties of individual mobility networks to the inhomogeneous average durations of the activities that are performed, and those of the networks describing people's common use of space to the fractional dimension of the urban territory. We study entropy measures of individual mobility patterns, showing that they carry almost the same information as the related mobility networks, but are also influenced by a hierarchy among the activities performed. We discover that Wardrop's principles are violated, as drivers have only incomplete information on the traffic state and therefore rely on knowledge of the average travel times. We propose an assimilation model to solve the intrinsic scattering of GPS data on the street network, permitting the real-time reconstruction of the traffic state at an urban scale.
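Benford's law, mentioned in the abstract above, predicts that the leading digit d of a quantity occurs with probability log10(1 + 1/d). The Python sketch below shows how the leading-digit frequencies of a list of positive values, such as elapsed times between trips, could be compared with that prediction. The function names and the sample values are illustrative assumptions and do not come from the thesis or its GPS database.

```python
import math
from collections import Counter


def benford_expected():
    """Benford probabilities for leading digits 1..9: log10(1 + 1/d)."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def first_digit(x):
    """Leading significant digit of a positive number."""
    if x <= 0:
        raise ValueError("Benford's law applies to positive values")
    # Scientific notation with 17 significant digits puts the leading
    # digit first and avoids rounding a value like 9.9999999 up to 10.
    return int(f"{x:.16e}"[0])


def first_digit_frequencies(values):
    """Observed relative frequency of each leading digit 1..9."""
    digits = [first_digit(v) for v in values]
    counts = Counter(digits)
    return {d: counts.get(d, 0) / len(digits) for d in range(1, 10)}


if __name__ == "__main__":
    # Made-up waiting times in minutes, for illustration only.
    waits = [12.0, 1.5, 180.0, 34.0, 2.2, 15.0, 110.0, 7.5, 19.0, 0.8]
    observed = first_digit_frequencies(waits)
    expected = benford_expected()
    for d in range(1, 10):
        print(d, round(observed[d], 3), round(expected[d], 3))
```

With a real sample, the observed and expected columns could then be compared with a goodness-of-fit statistic of the analyst's choice.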
Abstract:
For crime scene investigation in cases of homicide, the pattern of bloodstains at the incident site is of critical importance. The morphology of the bloodstain pattern serves to determine the approximate blood source locations, the minimum number of blows and the positioning of the victim. In the present work, the benefits of three-dimensional bloodstain pattern analysis, including the ballistic approximation of the trajectories of the blood drops, are demonstrated using two illustrative cases. The crime scenes were documented in 3D using the non-contact methods of digital photogrammetry, tachymetry and laser scanning. Accurate, true-to-scale 3D models of the crime scenes, including the bloodstain pattern and the traces, were created. For the determination of the areas of origin of the bloodstain pattern, the trajectories of up to 200 well-defined bloodstains were analysed in CAD and photogrammetry software. The ballistic determination of the trajectories was performed using ballistics software. The advantages of this method are the short preparation time on site, the non-contact measurement of the bloodstains and the high accuracy of the bloodstain analysis. This method can be expected to deliver accurate results regarding the number and position of the areas of origin of bloodstains; in particular, the vertical component is determined more precisely than with conventional methods. In both cases, the ballistic bloodstain pattern analysis enabled relevant forensic conclusions regarding the course of events.
Abstract:
The hydraulic fracturing of the Marcellus Formation creates a byproduct known as frac water. Five frac water samples were collected in Bradford County, PA. Inorganic chemical analysis, field parameters analysis, alkalinity titrations, total dissolved solids (TDS), total suspended solids (TSS), biological oxygen demand (BOD), and chemical oxygen demand (COD) were conducted on each sample to characterize frac water. A database of frac water chemistry results from across the state of Pennsylvania from multiple sources was compiled in order to provide the public and research community with an accurate characterization of frac water. Four geochemical models were created to model the reactions between frac water and the Marcellus Formation, Purcell Limestone, and the oil field brines presumed present in the formations. The average concentrations of chloride and TDS in the five frac water samples were 1.1 ± 0.5 × 10^5 mg/L (5.5X average seawater) and 140,000 mg/L (4X average seawater). BOD values for frac water immediately upon flow back were over 10X greater than the BOD of typical wastewater, but decreased into the range of typical wastewater after a short period of time. The COD of frac water decreases dramatically with an increase in elapsed time from flow back, but remains considerably higher than that of typical wastewater. Different alkalinity calculation methods produced a range of alkalinity values for frac water: this result is most likely due to high concentrations of aliphatic acid anions present in the samples. Laboratory analyses indicate that the frac water composition is quite variable depending on the companies from which the water was collected, the geology of the local area, and the number of fracturing jobs in which the frac water was used, but that frac water will require more treatment than typical wastewater regardless of the precise composition of each sample.
The geochemical models created suggest that the presence of organic complexes in an oil field brine and the Marcellus Formation aids the dissolution of ions such as barium and strontium into the solution. Although equilibration reactions between the Marcellus Formation and the slickwater account for some of the final frac water composition, the predominant control of frac water composition appears to be the ratio of the mixture between the oil field brine and slickwater. The high concentration of barium in the frac water is likely due to the abundance of barite nodules in the Purcell Limestone, and the lack of sulfate in the frac water samples is due to the reducing, anoxic conditions in the earth's subsurface that allow for the degassing of H2S(g).
Abstract:
For various reasons, it is important, if not essential, to integrate the computations and code used in data analyses, methodological descriptions, simulations, etc. with the documents that describe and rely on them. This integration allows readers to both verify and adapt the statements in the documents. Authors can easily reproduce them in the future, and they can present the document's contents in a different medium, e.g. with interactive controls. This paper describes a software framework for authoring and distributing these integrated, dynamic documents that contain text, code, data, and any auxiliary content needed to recreate the computations. The documents are dynamic in that the contents, including figures, tables, etc., can be recalculated each time a view of the document is generated. Our model treats a dynamic document as a master or ``source'' document from which one can generate different views in the form of traditional, derived documents for different audiences. We introduce the concept of a compendium as both a container for the different elements that make up the document and its computations (i.e. text, code, data, ...), and as a means for distributing, managing and updating the collection. The step from disseminating analyses via a compendium to reproducible research is a small one. By reproducible research, we mean research papers with accompanying software tools that allow the reader to directly reproduce the results and employ the methods that are presented in the research paper. Some of the issues involved in paradigms for the production, distribution and use of such reproducible research are discussed.
Abstract:
The Mobile Mesh Network based In-Transit Visibility (MMN-ITV) system provides global real-time tracking capability for the logistics system. In-transit containers form a multi-hop mesh network to forward the tracking information to the nearby sinks, which further deliver the information to the remote control center via satellite. The fundamental challenge to the MMN-ITV system is the energy constraint of the battery-operated containers. Given also the unique mobility pattern, the cross-MMN behavior, and the large area spanned, it is necessary to investigate the energy-efficient communication of the MMN-ITV system thoroughly. First, this dissertation models energy-efficient routing under the unique pattern of the cross-MMN behavior. A new modeling approach, the pseudo-dynamic modeling approach, is proposed to measure the energy efficiency of the routing methods in the presence of the cross-MMN behavior. With this approach, it is found that shortest-path routing and load-balanced routing are energy-efficient in mobile networks and static networks, respectively. For the MMN-ITV system with both mobile and static MMNs, an energy-efficient routing method, energy-threshold routing, is proposed to achieve the best tradeoff between them. Second, due to the cross-MMN behavior, neighbor discovery is executed frequently to help new containers join the MMN and hence consumes a similar amount of energy to data communication. By exploiting the unique pattern of the cross-MMN behavior, this dissertation proposes energy-efficient neighbor discovery wakeup schedules that save up to 60% of the energy for neighbor discovery. Inter-vehicle communication based on Vehicular Ad Hoc Networks (VANETs) is increasingly believed to enhance traffic safety and transportation management at low cost. The end-to-end delay is critical for the time-sensitive safety applications in VANETs, and can be a decisive performance metric for VANETs.
This dissertation presents a complete analytical model to evaluate the end-to-end delay against the transmission range and the packet arrival rate. This model illustrates a significant end-to-end delay increase from non-saturated networks to saturated networks. It hence suggests that the distributed power control and admission control protocols for VANETs should aim at improving the real-time capacity (the maximum packet generation rate without causing saturation), instead of the delay itself. Based on the above model, it could be determined that adopting uniform transmission range for every vehicle may hinder the delay performance improvement, since it does not allow the coexistence of the short path length and the low interference. Clusters are proposed to configure non-uniform transmission range for the vehicles. Analysis and simulation confirm that such configuration can enhance the real-time capacity. In addition, it provides an improved trade off between the end-to-end delay and the network capacity. A distributed clustering protocol with minimum message overhead is proposed, which achieves low convergence time.
Abstract:
We present a program (Ragu; Randomization Graphical User interface) for statistical analyses of multichannel event-related EEG and MEG experiments. Based on measures of scalp field differences including all sensors, and using powerful, assumption-free randomization statistics, the program yields robust, physiologically meaningful conclusions based on the entire, untransformed, and unbiased set of measurements. Ragu accommodates up to two within-subject factors and one between-subject factor with multiple levels each. Significance is computed as a function of time and can be controlled for type II errors with overall analyses. Results are displayed in an intuitive visual interface that allows further exploration of the findings. A sample analysis of an ERP experiment illustrates the different possibilities offered by Ragu. The aim of Ragu is to maximize statistical power while minimizing the need for a-priori choices of models and parameters (like inverse models or sensors of interest) that interact with and bias statistics.
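To make the idea of randomization statistics on whole scalp fields concrete, the Python sketch below runs a paired randomization test for a single time point and two conditions. This is not Ragu's code or interface: the effect measure (root mean square across sensors of the difference between the condition mean maps), the within-subject label swapping, and all function names are assumptions chosen for illustration.

```python
import math
import random


def map_difference(maps_a, maps_b):
    """Root mean square, across sensors, of the mean map difference A - B.

    maps_a and maps_b hold one list of sensor values per subject.
    """
    n_subjects = len(maps_a)
    n_sensors = len(maps_a[0])
    mean_diff = [
        sum(a[s] - b[s] for a, b in zip(maps_a, maps_b)) / n_subjects
        for s in range(n_sensors)
    ]
    return math.sqrt(sum(d * d for d in mean_diff) / n_sensors)


def randomization_test(maps_a, maps_b, n_perm=1000, seed=0):
    """Paired randomization test using all sensors at once.

    Condition labels are swapped at random within each subject; the
    p-value is the share of shuffles whose effect is at least as large
    as the observed one (the observed data counts as one shuffle).
    """
    rng = random.Random(seed)
    observed = map_difference(maps_a, maps_b)
    exceed = 0
    for _ in range(n_perm):
        perm_a, perm_b = [], []
        for a, b in zip(maps_a, maps_b):
            if rng.random() < 0.5:
                a, b = b, a
            perm_a.append(a)
            perm_b.append(b)
        if map_difference(perm_a, perm_b) >= observed:
            exceed += 1
    return observed, (exceed + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Toy data: 8 subjects, 4 sensors, opposite topographies per condition.
    cond_a = [[1.0, -1.0, 1.0, -1.0] for _ in range(8)]
    cond_b = [[-1.0, 1.0, -1.0, 1.0] for _ in range(8)]
    print(randomization_test(cond_a, cond_b))
```

Repeating such a test at every time point would give significance as a function of time, which then calls for an overall control step of the kind the abstract mentions.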
Abstract:
Stemmatology, or the reconstruction of the transmission history of texts, is a field that stands particularly to gain from digital methods. Many scholars already take stemmatic approaches that rely heavily on computational analysis of the collated text (e.g. Robinson and O’Hara 1996; Salemans 2000; Heikkilä 2005; Windram et al. 2008 among many others). Although there is great value in computationally assisted stemmatology, providing as it does a reproducible result and allowing access to the relevant methodological process in related fields such as evolutionary biology, computational stemmatics is not without its critics. The current state-of-the-art effectively forces scholars to choose between a preconceived judgment of the significance of textual differences (the Lachmannian or neo-Lachmannian approach, and the weighted phylogenetic approach) or to make no judgment at all (the unweighted phylogenetic approach). Some basis for judgment of the significance of variation is sorely needed for medieval text criticism in particular. By this, we mean that there is a need for a statistical empirical profile of the text-genealogical significance of the different sorts of variation in different sorts of medieval texts. The rules that apply to copies of Greek and Latin classics may not apply to copies of medieval Dutch story collections; the practices of copying authoritative texts such as the Bible will most likely have been different from the practices of copying the Lives of local saints and other commonly adapted texts. It is nevertheless imperative that we have a consistent, flexible, and analytically tractable model for capturing these phenomena of transmission. In this article, we present a computational model that captures most of the phenomena of text variation, and a method for analysis of one or more stemma hypotheses against the variation model. We apply this method to three ‘artificial traditions’ (i.e. texts copied under laboratory conditions by scholars to study the properties of text variation) and four genuine medieval traditions whose transmission history is known or deduced in varying degrees. Although our findings are necessarily limited by the small number of texts at our disposal, we demonstrate here some of the wide variety of calculations that can be made using our model. Certain of our results call sharply into question the utility of excluding ‘trivial’ variation such as orthographic and spelling changes from stemmatic analysis.
Abstract:
The attentional blink (AB) is a fundamental limitation of the ability to select relevant information from irrelevant information. It can be observed with the detection rate in an AB task as well as with the corresponding P300 amplitude of the event-related potential. In previous research, however, correlations between these two levels of observation were weak and rather inconsistent. A possible explanation of this finding might be that multiple processes underlie the AB and, thus, obscure a possible relationship between AB-related detection rate and the corresponding P300 amplitude. The present study investigated this assumption by applying a fixed-links modeling approach to represent behavioral individual differences in the AB as a latent variable. Concurrently, this approach enabled us to control for additional sources of variance in AB performance by deriving two additional latent variables. The correlation between the latent variable reflecting behavioral individual differences in AB magnitude and a corresponding latent variable derived from the P300 amplitude was high (r=.70). Furthermore, this correlation was considerably stronger than the correlations of other behavioral measures of the AB magnitude with their psychophysiological counterparts (all rs<.40). Our findings clearly indicate that the systematic disentangling of various sources of variance by utilizing the fixed-links modeling approach is a promising tool to investigate behavioral individual differences in the AB and possible psychophysiological correlates of these individual differences.
Abstract:
Colorectal cancer is the fourth most commonly diagnosed cancer in the United States. Every year about a hundred forty-seven thousand people are diagnosed with colorectal cancer and fifty-six thousand people lose their lives to this disease. Most cases of hereditary nonpolyposis colorectal cancer (HNPCC) and 12% of sporadic colorectal cancers show microsatellite instability. Colorectal cancer is a multistep progressive disease. It starts from a mutation in a normal colorectal cell and grows into a clone of cells that further accumulates mutations and finally develops into a malignant tumor. In terms of molecular evolution, the process of colorectal tumor progression represents the acquisition of sequential mutations. Clinical studies use biomarkers such as microsatellites or single nucleotide polymorphisms (SNPs) to study mutation frequencies in colorectal cancer. Microsatellite data obtained from single genome equivalent PCR or small pool PCR can be used to infer tumor progression. Since tumor progression is similar to population evolution, we used an approach known as the coalescent, which is well established in population genetics, to analyze this type of data. Coalescent theory is known to allow a sample's evolutionary path to be inferred through the analysis of microsatellite data. The simulation results indicate that the constant population size pattern and the rapid tumor growth pattern have different genetic polymorphic patterns. The simulation results were compared with experimental data collected from HNPCC patients. The preliminary result shows that the mutation rate in 6 HNPCC patients ranges from 0.001 to 0.01. The patients' polymorphic patterns are similar to the constant population size pattern, which implies that the tumor progression is through multilineage persistence instead of clonal sequential evolution. The results should be further verified using a larger dataset.
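The constant-population-size scenario mentioned in the abstract above can be illustrated with the standard Kingman coalescent, in which the waiting time while k lineages remain is exponentially distributed with rate k(k-1)/2 in coalescent time units. The Python sketch below simulates these waiting times and estimates the mean time to the most recent common ancestor. It is a textbook illustration only: it does not reproduce the authors' simulation, it ignores microsatellite mutation and tumor growth, and the function names are assumptions.

```python
import random


def coalescent_waiting_times(n_samples, rng):
    """Waiting times between coalescent events for a constant-size population.

    While k lineages remain, the time to the next coalescence is drawn
    from an exponential distribution with rate k * (k - 1) / 2.
    """
    times = []
    for k in range(n_samples, 1, -1):
        rate = k * (k - 1) / 2
        times.append(rng.expovariate(rate))
    return times


def mean_tmrca(n_samples, n_rep=20000, seed=1):
    """Monte Carlo estimate of the mean time to the most recent common ancestor."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_rep):
        total += sum(coalescent_waiting_times(n_samples, rng))
    return total / n_rep


if __name__ == "__main__":
    n = 10
    # Theory for the constant-size model: E[TMRCA] = 2 * (1 - 1/n).
    print(mean_tmrca(n), 2 * (1 - 1 / n))
```

A growth scenario would change the distribution of these waiting times, which is one reason the two patterns can leave different polymorphism signatures.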