6 results for dynamic time warping
at Duke University
Abstract:
With the popularization of GPS-enabled devices such as mobile phones, location data are becoming available at an unprecedented scale. The locations may be collected from many different sources such as vehicles moving around a city, user check-ins in social networks, and geo-tagged micro-blogging photos or messages. Besides the longitude and latitude, each location record may also have a timestamp and additional information such as the name of the location. Time-ordered sequences of these locations form trajectories, which together contain useful high-level information about people's movement patterns.
The first part of this thesis focuses on a few geometric problems motivated by the matching and clustering of trajectories. We first give a new algorithm for computing a matching between a pair of curves under existing models such as dynamic time warping (DTW). The algorithm is more efficient than standard dynamic programming algorithms both theoretically and practically. We then propose a new matching model for trajectories that avoids the drawbacks of existing models. For trajectory clustering, we present an algorithm that computes clusters of subtrajectories, which correspond to common movement patterns. We also consider trajectories of check-ins, and propose a statistical generative model, which identifies check-in clusters as well as the transition patterns between the clusters.
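For reference, the standard dynamic programming baseline for DTW mentioned above can be sketched as follows. This is the textbook quadratic-time recurrence, not the faster algorithm the thesis proposes. The function name `dtw_distance` and the choice of Euclidean distance between points are assumptions made for illustration.

```python
import math


def dtw_distance(p, q):
    """Textbook O(len(p) * len(q)) dynamic program for DTW.

    p and q are non-empty sequences of points (tuples of coordinates).
    Hypothetical helper: Euclidean point distance is an assumption here.
    """
    n, m = len(p), len(q)
    # cost[i][j] = minimum total cost of matching p[:i] against q[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(p[i - 1], q[j - 1])
            # Extend the cheapest of the three allowed predecessor matchings.
            cost[i][j] = d + min(cost[i - 1][j],      # advance on p only
                                 cost[i][j - 1],      # advance on q only
                                 cost[i - 1][j - 1])  # advance on both
    return cost[n][m]
```

Because a point on one curve may be matched to several consecutive points on the other, two trajectories that trace the same route at different sampling rates can still have DTW distance zero.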
The second part of the thesis considers the problem of covering shortest paths in a road network, motivated by an EV charging station placement problem. More specifically, a subset of vertices in the road network is selected to place charging stations so that every shortest path contains enough charging stations and can be traveled by an EV without draining the battery. We first introduce a general technique for the geometric set cover problem. This technique leads to near-linear-time approximation algorithms, which are the state-of-the-art algorithms for this problem in either running time or approximation ratio. We then use this technique to develop a near-linear-time algorithm for this shortest-path cover problem.
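To make the covering formulation concrete, here is a minimal sketch of the classical greedy heuristic for set cover. It is a well-known baseline, not the near-linear-time technique developed in the thesis. The mapping from the abstract to this setting is an assumption: elements stand for path segments that must contain a station, and each candidate vertex is described by the set of segments it would cover. The name `greedy_set_cover` is hypothetical.

```python
def greedy_set_cover(universe, candidates):
    """Greedy set cover: repeatedly pick the candidate covering the most
    still-uncovered elements.

    universe:   iterable of elements that must be covered
    candidates: dict mapping a candidate name to the set of elements it covers
    """
    uncovered = set(universe)
    chosen = []
    while uncovered:
        # Candidate with the largest marginal gain on what is still uncovered.
        best = max(candidates, key=lambda name: len(candidates[name] & uncovered))
        gained = candidates[best] & uncovered
        if not gained:
            raise ValueError("some elements cannot be covered by any candidate")
        chosen.append(best)
        uncovered -= gained
    return chosen
```

For example, with segments 1 to 5 and candidate vertices `a`, `b`, `c` covering `{1, 2, 3}`, `{3, 4}` and `{4, 5}`, the heuristic selects `a` and then `c`.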
Abstract:
Human use of the oceans is increasingly in conflict with conservation of endangered species. Methods for managing the spatial and temporal placement of industries such as military, fishing, transportation and offshore energy have historically been post hoc; i.e. the time and place of human activity is often already determined before assessment of environmental impacts. In this dissertation, I build robust species distribution models in two case study areas, US Atlantic (Best et al. 2012) and British Columbia (Best et al. 2015), predicting presence and abundance respectively, from scientific surveys. These models are then applied to novel decision frameworks for preemptively suggesting optimal placement of human activities in space and time to minimize ecological impacts: siting for offshore wind energy development, and routing ships to minimize risk of striking whales. Both decision frameworks relate the tradeoff between conservation risk and industry profit with synchronized variable and map views as online spatial decision support systems.
For siting offshore wind energy development (OWED) in the U.S. Atlantic (chapter 4), bird density maps are combined across species with weights of OWED sensitivity to collision and displacement, and 10 km² sites are compared against OWED profitability based on average annual wind speed at 90 m hub heights and distance to transmission grid. A spatial decision support system enables toggling between the map and tradeoff plot views by site. A selected site can be inspected for sensitivity to cetaceans throughout the year, so as to capture months of the year which minimize episodic impacts of pre-operational activities such as seismic airgun surveying and pile driving.
Routing ships to avoid whale strikes (chapter 5) can be similarly viewed as a tradeoff, but is a different problem spatially. A cumulative cost surface is generated from density surface maps and conservation status of cetaceans, and is then applied as a resistance surface to calculate least-cost routes between start and end locations, i.e. ports and entrance locations to study areas. Varying a multiplier to the cost surface enables calculation of multiple routes with different costs to conservation of cetaceans versus cost to transportation industry, measured as distance. Similar to the siting chapter, a spatial decision support system enables toggling between the map and tradeoff plot view of proposed routes. The user can also input arbitrary start and end locations to calculate the tradeoff on the fly.
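The routing idea can be illustrated with a small sketch: Dijkstra's algorithm over a raster, where each step costs one unit of distance plus a multiplier times the conservation risk of the cell entered. The grid representation, the 4-connected neighborhood, the additive cost form and the name `least_cost_route` are all assumptions made for illustration; the dissertation's actual tooling is not described in the abstract.

```python
import heapq


def least_cost_route(risk, start, goal, multiplier):
    """Dijkstra over a 4-connected grid of non-negative risk values.

    Hypothetical cost model: entering a cell costs
    1 (distance) + multiplier * risk[row][col].
    Returns the list of (row, col) cells from start to goal.
    """
    rows, cols = len(risk), len(risk[0])
    best = {start: 0.0}
    parent = {}
    heap = [(0.0, start)]
    while heap:
        cost, cell = heapq.heappop(heap)
        if cell == goal:
            break
        if cost > best.get(cell, float("inf")):
            continue  # stale queue entry
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols:
                new_cost = cost + 1.0 + multiplier * risk[nr][nc]
                if new_cost < best.get((nr, nc), float("inf")):
                    best[(nr, nc)] = new_cost
                    parent[(nr, nc)] = cell
                    heapq.heappush(heap, (new_cost, (nr, nc)))
    if goal not in best:
        return None
    # Walk parent links back from the goal to recover the route.
    path = [goal]
    while path[-1] != start:
        path.append(parent[path[-1]])
    path.reverse()
    return path
```

With a multiplier of zero the route is simply the shortest one; raising the multiplier trades extra distance for lower exposure, which yields the family of routes plotted in a tradeoff view.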
Essential to the input of these decision frameworks are distributions of the species. The two preceding chapters comprise species distribution models from two case study areas, U.S. Atlantic (chapter 2) and British Columbia (chapter 3), predicting presence and density, respectively. Although density is preferred to estimate potential biological removal, per Marine Mammal Protection Act requirements in the U.S., all the necessary parameters, especially distance and angle of observation, are less readily available across publicly mined datasets.
In the case of predicting cetacean presence in the U.S. Atlantic (chapter 2), I extracted datasets from the online OBIS-SEAMAP geo-database, and integrated scientific surveys conducted by ship (n=36) and aircraft (n=16), weighting a Generalized Additive Model by minutes surveyed within space-time grid cells to harmonize effort between the two survey platforms. For each of 16 cetacean species guilds, I predicted the probability of occurrence from static environmental variables (water depth, distance to shore, distance to continental shelf break) and time-varying conditions (monthly sea-surface temperature). To generate maps of presence vs. absence, Receiver Operator Characteristic (ROC) curves were used to define the optimal threshold that minimizes false positive and false negative error rates. I integrated model outputs, including tables (species in guilds, input surveys) and plots (fit of environmental variables, ROC curve), into an online spatial decision support system, allowing for easy navigation of models by taxon, region, season, and data provider.
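The thresholding step can be sketched as follows. The abstract does not state the exact criterion, so this example assumes the cutoff is chosen to minimize the sum of the false positive rate and the false negative rate (equivalent to maximizing Youden's J statistic). The function name `best_threshold` and the convention that a score at or above the cutoff means "present" are illustrative assumptions.

```python
def best_threshold(scores, labels):
    """Pick the cutoff on predicted probabilities that minimizes
    false positive rate + false negative rate.

    scores: predicted probabilities of occurrence
    labels: 1 for observed presence, 0 for observed absence
    Returns (cutoff, combined_error_rate).
    """
    positives = sum(1 for y in labels if y == 1)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one presence and one absence")
    best_cut, best_err = None, float("inf")
    # Only observed score values need to be tried as candidate cutoffs.
    for cut in sorted(set(scores)):
        fp = sum(1 for s, y in zip(scores, labels) if s >= cut and y == 0)
        fn = sum(1 for s, y in zip(scores, labels) if s < cut and y == 1)
        err = fp / negatives + fn / positives
        if err < best_err:
            best_cut, best_err = cut, err
    return best_cut, best_err
```

Cells whose predicted probability is at or above the returned cutoff would then be mapped as presence, and the rest as absence.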
For predicting cetacean density within the inner waters of British Columbia (chapter 3), I calculated density from systematic, line-transect marine mammal surveys over multiple years and seasons (summer 2004, 2005, 2008, and spring/autumn 2007) conducted by Raincoast Conservation Foundation. Abundance estimates were calculated using two different methods: Conventional Distance Sampling (CDS) and Density Surface Modelling (DSM). CDS generates a single density estimate for each stratum, whereas DSM explicitly models spatial variation and offers potential for greater precision by incorporating environmental predictors. Although DSM yields a more relevant product for the purposes of marine spatial planning, CDS has proven to be useful in cases where there are fewer observations available for seasonal and inter-annual comparison, particularly for the scarcely observed elephant seal. Abundance estimates are provided on a stratum-specific basis. Steller sea lions and harbour seals are further differentiated by ‘hauled out’ and ‘in water’. This analysis updates previous estimates (Williams & Thomas 2007) by including additional years of effort, providing greater spatial precision with the DSM method over CDS, novel reporting for spring and autumn seasons (rather than summer alone), and providing new abundance estimates for Steller sea lion and northern elephant seal. In addition to providing a baseline of marine mammal abundance and distribution, against which future changes can be compared, this information offers the opportunity to assess the risks posed to marine mammals by existing and emerging threats, such as fisheries bycatch, ship strikes, and increased oil spill and ocean noise issues associated with increases of container ship and oil tanker traffic in British Columbia’s continental shelf waters.
Starting with marine animal observations at specific coordinates and times, I combine these data with environmental data, often satellite derived, to produce seascape predictions generalizable in space and time. These habitat-based models enable prediction of encounter rates and, in the case of density surface models, abundance that can then be applied to management scenarios. Specific human activities, OWED and shipping, are then compared within a tradeoff decision support framework, enabling interchangeable map and tradeoff plot views. These products make complex processes transparent for gaming conservation, industry and stakeholders towards optimal marine spatial management, fundamental to the tenets of marine spatial planning, ecosystem-based management and dynamic ocean management.
Abstract:
A class of multi-process models is developed for collections of time indexed count data. Autocorrelation in counts is achieved with dynamic models for the natural parameter of the binomial distribution. In addition to modeling binomial time series, the framework includes dynamic models for multinomial and Poisson time series. Markov chain Monte Carlo (MCMC) and Pólya-Gamma data augmentation (Polson et al., 2013) are critical for fitting multi-process models of counts. To facilitate computation when the counts are high, a Gaussian approximation to the Pólya-Gamma random variable is developed.
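One plausible form of such an approximation, sketched here under stated assumptions, is moment matching: draw from a normal distribution with the same mean and variance as PG(b, c). The closed-form moments below are the standard ones for the Pólya-Gamma distribution; whether the dissertation uses exactly this construction is not stated in the abstract. The function names `pg_moments` and `pg_gaussian_draw` are hypothetical.

```python
import math
import random


def pg_moments(b, c):
    """Mean and variance of a Polya-Gamma PG(b, c) random variable."""
    if abs(c) < 1e-3:
        # Limiting values as c -> 0; avoids cancellation in sinh(c) - c.
        return b / 4.0, b / 24.0
    mean = b / (2.0 * c) * math.tanh(c / 2.0)
    var = b / (4.0 * c ** 3) * (math.sinh(c) - c) / math.cosh(c / 2.0) ** 2
    return mean, var


def pg_gaussian_draw(b, c, rng=random):
    """Moment-matched normal draw standing in for a PG(b, c) draw.

    Assumption: intended only for large b, where PG(b, c), being a sum of
    b independent PG(1, c) variables for integer b, is close to normal.
    """
    mean, var = pg_moments(b, c)
    return rng.gauss(mean, math.sqrt(var))
```

In a binomial model the first argument equals the number of trials, so the approximation would apply precisely in the high-count regime the abstract mentions.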
Three applied analyses are presented to explore the utility and versatility of the framework. The first analysis develops a model for complex dynamic behavior of themes in collections of text documents. Documents are modeled as a “bag of words”, and the multinomial distribution is used to characterize uncertainty in the vocabulary terms appearing in each document. State-space models for the natural parameters of the multinomial distribution induce autocorrelation in themes and their proportional representation in the corpus over time.
The second analysis develops a dynamic mixed membership model for Poisson counts. The model is applied to a collection of time series which record neuron-level firing patterns in rhesus monkeys. The monkey is exposed to two sounds simultaneously, and Gaussian processes are used to smoothly model the time-varying rate at which the neuron’s firing pattern fluctuates between features associated with each sound in isolation.
The third analysis presents a switching dynamic generalized linear model for the time-varying home run totals of professional baseball players. The model endows each player with an age-specific latent natural ability class and a performance-enhancing drug (PED) use indicator. As players age, they randomly transition through a sequence of ability classes in a manner consistent with traditional aging patterns. When the performance of the player significantly deviates from the expected aging pattern, he is identified as a player whose performance is consistent with PED use.
All three models provide a mechanism for sharing information across related series locally in time. The models are fit with variations on the Pólya-Gamma Gibbs sampler, MCMC convergence diagnostics are developed, and reproducible inference is emphasized throughout the dissertation.
Abstract:
Urban problems have several features that make them inherently dynamic. Large transaction costs all but guarantee that homeowners will do their best to consider how a neighborhood might change before buying a house. Similarly, stores face large sunk costs when opening, and want to be sure that their investment will pay off in the long run. In line with those concerns, different areas of Economics have made recent advances in modeling those questions within a dynamic framework. This dissertation contributes to those efforts.
Chapter 2 discusses how to model an agent’s location decision when the agent must learn about an exogenous amenity that may be changing over time. The model is applied to estimating the marginal willingness to pay to avoid crime, in which agents are learning about the crime rate in a neighborhood, and the crime rate can change in predictable (Markovian) ways.
Chapters 3 and 4 concentrate on location decision problems when there are externalities between decision makers. Chapter 3 focuses on the decision of business owners to open a store, when its demand is a function of other nearby stores, either through competition, or through spillovers on foot traffic. It uses a dynamic model in continuous time to model agents’ decisions. A particular challenge is isolating the contribution of spillovers from the contribution of other unobserved neighborhood attributes that could also lead to agglomeration. A key contribution of this chapter is showing how we can use information on storefront ownership to help separately identify spillovers.
Finally, chapter 4 focuses on a class of models in which families prefer to live close to similar neighbors. This chapter provides the first simulation of such a model in which agents are forward looking, and shows that this leads to more segregation than would be observed with myopic agents, which is the standard in this literature. The chapter also discusses several extensions of the model that can be used to investigate relevant questions such as the arrival of a large contingent of high-skilled tech workers in San Francisco, the immigration of Hispanic families to several southern American cities, large changes in local amenities, such as the construction of magnet schools or metro stations, and the flight of wealthy residents from cities in the Rust Belt, such as Detroit.
Abstract:
Urinary tract infections (UTIs) are typically caused by bacteria that colonize different regions of the urinary tract, mainly the bladder and the kidney. Approximately 25% of women who suffer from UTIs experience a recurrent infection within 6 months of the initial bout, making UTIs a serious economic burden resulting in more than 10 million hospital visits and $3.5 billion in healthcare costs in the United States alone. Type-1 fimbriated Uropathogenic E. coli (UPEC) is the major causative agent of UTIs, accounting for almost 90% of bacterial UTIs. The unique ability of UPEC to bind and invade the superficial bladder epithelium allows the bacteria to persist inside epithelial niches and survive antibiotic treatment. Persistent, intracellular UPEC are retained in the bladder epithelium for long periods, making them a source of recurrent UTIs. Hence, the ability of UPEC to persist in the bladder is a matter of major health and economic concern, making studies exploring the underlying mechanism of UPEC persistence highly relevant.
In my thesis, I will describe how intracellular Uropathogenic E. coli (UPEC) evade host defense mechanisms in the superficial bladder epithelium. I will also describe some of the unique traits of persistent UPEC and explore strategies to induce their clearance from the bladder. I have discovered that the UPEC virulence factor Alpha-hemolysin (HlyA) plays a key role in the survival and persistence of UPEC in the superficial bladder epithelium. In-vitro and in-vivo studies comparing intracellular survival of wild type (WT) and hemolysin-deficient UPEC suggested that HlyA is vital for UPEC persistence in the superficial bladder epithelium. Further in-vitro studies revealed that hemolysin helped UPEC persist intracellularly by evading the bacterial expulsion actions of the bladder cells and, remarkably, this virulence factor also helped bacteria avoid degradation in lysosomes.
To elucidate the mechanistic basis for how hemolysin promotes UPEC persistence in the urothelium, we initially focused on how hemolysin facilitates the evasion of UPEC expulsion from bladder cells. We found that upon entry, UPEC were encased in “exocytic vesicles” but as a result of HlyA expression these bacteria escaped these vesicles and entered the cytosol. Consequently, these bacteria were able to avoid expulsion by the cellular export machinery.
Since bacteria found in the cytosol of host cells are typically recognized by the cellular autophagy pathway and transported to the lysosomes where they are degraded, we explored why this was not the case here. We observed that although cytosolic HlyA-expressing UPEC were recognized and encased by the autophagy system and transported to lysosomes, the bacteria appeared to avoid degradation in these normally degradative compartments. A closer examination of the bacteria-containing lysosomes revealed that they lacked V-ATPase. V-ATPase is a well-known proton pump essential for the acidification of mammalian intracellular degradative compartments, allowing for the proper functioning of degradative proteases. The absence of V-ATPase appeared to be due to hemolysin-mediated alteration of the bladder cell F-actin network. From these studies, it is clear that UPEC hemolysin facilitates UPEC persistence in the superficial bladder epithelium by helping bacteria avoid expulsion by the exocytic machinery of the cell and at the same time enabling the bacteria to avoid degradation when they are shuttled into the lysosomes.
Interestingly, even though UPEC appear to avoid elimination from the bladder cell, their ability to multiply in bladder cells seems limited. Our in-vitro and in-vivo experiments reveal that UPEC survive in the superficial bladder epithelium for extended periods of time without a significant change in CFU numbers. Indeed, we observed that these bacteria appeared quiescent in nature. This observation was supported by the finding that UPEC genetically unable to enter a quiescent phase exhibited limited ability to persist in bladder cells in vitro and in vivo, in the mouse bladder.
The studies described in this thesis reveal how the UPEC toxin Alpha-hemolysin plays a significant role in promoting UPEC persistence via the modulation of the vesicular compartmentalization of UPEC at two different stages of the infection in the superficial bladder epithelium. These results highlight the importance of UPEC Alpha-hemolysin as an essential determinant of UPEC persistence in the urinary bladder.
Abstract:
Into the Bends of Time is a 40-minute work in seven movements for a large chamber orchestra with electronics, utilizing real-time computer-assisted processing of music performed by live musicians. The piece explores various combinations of interactive relationships between players and electronics, ranging from relatively basic processing effects to musical gestures achieved through stages of computer analysis, in which resulting sounds are crafted according to parameters of the incoming musical material. Additionally, some elements of interaction are multi-dimensional, in that they rely on the participation of two or more performers fulfilling distinct roles in the interactive process with the computer in order to generate musical material. Through processes of controlled randomness, several electronic effects induce elements of chance into their realization so that no two performances of this work are exactly alike. The piece gets its name from the notion that real-time computer-assisted processing, in which sound pressure waves are transduced into electrical energy, converted to digital data, artfully modified, converted back into electrical energy and transduced into sound waves, represents a “bending” of time.
The Bill Evans Trio featuring bassist Scott LaFaro and drummer Paul Motian is widely regarded as one of the most important and influential piano trios in the history of jazz, lauded for its unparalleled level of group interaction. Most analyses of Bill Evans’ recordings, however, focus on his playing alone and fail to take group interaction into account. This paper examines one performance in particular, of Victor Young’s “My Foolish Heart” as recorded in a live performance by the Bill Evans Trio in 1961. In Part One, I discuss Steve Larson’s theory of musical forces (expanded by Robert S. Hatten) and its applicability to jazz performance. I examine other recordings of ballads by this same trio in order to draw observations about normative ballad performance practice. I discuss meter and phrase structure and show how the relationship between the two is fixed in a formal structure of repeated choruses. I then develop a model of perpetual motion based on the musical forces inherent in this structure. In Part Two, I offer a full transcription and close analysis of “My Foolish Heart,” showing how elements of group interaction work with and against the musical forces inherent in the model of perpetual motion to achieve an unconventional, dynamic use of double-time. I explore the concept of a unified agential persona and discuss its role in imparting the song’s inherent rhetorical tension to the instrumental musical discourse.