8 resultados para Database search Evidential value Bayesian decision theory Influence diagrams
em Duke University
Resumo:
The supplementary eye fields (SEFs) are located in dorsomedial frontal cortex and contribute to high-level control of eye movements. Recordings in the SEF reveal neural activity related to vision, saccades, and fixations, and electrical stimulation in the SEF evokes saccades and fixations. Inactivations and lesions of the SEF, however, cause minimal oculomotor deficits. The SEF thus processes information relevant to eye movements and influences critical oculomotor centers but seems unnecessary for generating action. Instead, the SEF has overarching, subtle functions that include limb-eye coordination, the timing and sequencing of actions, learning, monitoring conflict, prediction, supervising behavior, value-based decision making, and the monitoring of decisions.
Resumo:
The supplementary eye fields (SEFs) are located in dorsomedial frontal cortex and contribute to high-level control of eye movements. Recordings in the SEF reveal neural activity related to vision, saccades, and fixations, and electrical stimulation in the SEF evokes saccades and fixations. Inactivations and lesions of the SEF, however, cause minimal oculomotor deficits. The SEF thus processes information relevant to eye movements and influences critical oculomotor centers but seems unnecessary for generating action. Instead, the SEF has overarching, subtle functions that include limb-eye coordination, the timing and sequencing of actions, learning, monitoring conflict, prediction, supervising behavior, value-based decision making, and the monitoring of decisions.
Resumo:
As the world population continues to grow past seven billion people and global challenges continue to persist including resource availability, biodiversity loss, climate change and human well-being, a new science is required that can address the integrated nature of these challenges and the multiple scales on which they are manifest. Sustainability science has emerged to fill this role. In the fifteen years since it was first called for in the pages of Science, it has rapidly matured, however its place in the history of science and the way it is practiced today must be continually evaluated. In Part I, two chapters address this theoretical and practical grounding. Part II transitions to the applied practice of sustainability science in addressing the urban heat island (UHI) challenge wherein the climate of urban areas are warmer than their surrounding rural environs. The UHI has become increasingly important within the study of earth sciences given the increased focus on climate change and as the balance of humans now live in urban areas.
In Chapter 2 a novel contribution to the historical context of sustainability is argued. Sustainability as a concept characterizing the relationship between humans and nature emerged in the mid to late 20th century as a response to findings used to also characterize the Anthropocene. Emerging from the human-nature relationships that came before it, evidence is provided that suggests Sustainability was enabled by technology and a reorientation of world-view and is unique in its global boundary, systematic approach and ambition for both well being and the continued availability of resources and Earth system function. Sustainability is further an ambition that has wide appeal, making it one of the first normative concepts of the Anthropocene.
Despite its widespread emergence and adoption, sustainability science continues to suffer from definitional ambiguity within the academe. In Chapter 3, a review of efforts to provide direction and structure to the science reveals a continuum of approaches anchored at either end by differing visions of how the science interfaces with practice (solutions). At one end, basic science of societally defined problems informs decisions about possible solutions and their application. At the other end, applied research directly affects the options available to decision makers. While clear from the literature, survey data further suggests that the dichotomy does not appear to be as apparent in the minds of practitioners.
In Chapter 4, the UHI is first addressed at the synoptic, mesoscale. Urban climate is the most immediate manifestation of the warming global climate for the majority of people on earth. Nearly half of those people live in small to medium sized cities, an understudied scale in urban climate research. Widespread characterization would be useful to decision makers in planning and design. Using a multi-method approach, the mesoscale UHI in the study region is characterized and the secular trend over the last sixty years evaluated. Under isolated ideal conditions the findings indicate a UHI of 5.3 ± 0.97 °C to be present in the study area, the magnitude of which is growing over time.
Although urban heat islands (UHI) are well studied, there remain no panaceas for local scale mitigation and adaptation methods, therefore continued attention to characterization of the phenomenon in urban centers of different scales around the globe is required. In Chapter 5, a local scale analysis of the canopy layer and surface UHI in a medium sized city in North Carolina, USA is conducted using multiple methods including stationary urban sensors, mobile transects and remote sensing. Focusing on the ideal conditions for UHI development during an anticyclonic summer heat event, the study observes a range of UHI intensity depending on the method of observation: 8.7 °C from the stationary urban sensors; 6.9 °C from mobile transects; and, 2.2 °C from remote sensing. Additional attention is paid to the diurnal dynamics of the UHI and its correlation with vegetation indices, dewpoint and albedo. Evapotranspiration is shown to drive dynamics in the study region.
Finally, recognizing that a bridge must be established between the physical science community studying the Urban Heat Island (UHI) effect, and the planning community and decision makers implementing urban form and development policies, Chapter 6 evaluates multiple urban form characterization methods. Methods evaluated include local climate zones (LCZ), national land cover database (NCLD) classes and urban cluster analysis (UCA) to determine their utility in describing the distribution of the UHI based on three standard observation types 1) fixed urban temperature sensors, 2) mobile transects and, 3) remote sensing. Bivariate, regression and ANOVA tests are used to conduct the analyses. Findings indicate that the NLCD classes are best correlated to the UHI intensity and distribution in the study area. Further, while the UCA method is not useful directly, the variables included in the method are predictive based on regression analysis so the potential for better model design exists. Land cover variables including albedo, impervious surface fraction and pervious surface fraction are found to dominate the distribution of the UHI in the study area regardless of observation method.
Chapter 7 provides a summary of findings, and offers a brief analysis of their implications for both the scientific discourse generally, and the study area specifically. In general, the work undertaken does not achieve the full ambition of sustainability science, additional work is required to translate findings to practice and more fully evaluate adoption. The implications for planning and development in the local region are addressed in the context of a major light-rail infrastructure project including several systems level considerations like human health and development. Finally, several avenues for future work are outlined. Within the theoretical development of sustainability science, these pathways include more robust evaluations of the theoretical and actual practice. Within the UHI context, these include development of an integrated urban form characterization model, application of study methodology in other geographic areas and at different scales, and use of novel experimental methods including distributed sensor networks and citizen science.
Resumo:
Many modern applications fall into the category of "large-scale" statistical problems, in which both the number of observations n and the number of features or parameters p may be large. Many existing methods focus on point estimation, despite the continued relevance of uncertainty quantification in the sciences, where the number of parameters to estimate often exceeds the sample size, despite huge increases in the value of n typically seen in many fields. Thus, the tendency in some areas of industry to dispense with traditional statistical analysis on the basis that "n=all" is of little relevance outside of certain narrow applications. The main result of the Big Data revolution in most fields has instead been to make computation much harder without reducing the importance of uncertainty quantification. Bayesian methods excel at uncertainty quantification, but often scale poorly relative to alternatives. This conflict between the statistical advantages of Bayesian procedures and their substantial computational disadvantages is perhaps the greatest challenge facing modern Bayesian statistics, and is the primary motivation for the work presented here.
Two general strategies for scaling Bayesian inference are considered. The first is the development of methods that lend themselves to faster computation, and the second is design and characterization of computational algorithms that scale better in n or p. In the first instance, the focus is on joint inference outside of the standard problem of multivariate continuous data that has been a major focus of previous theoretical work in this area. In the second area, we pursue strategies for improving the speed of Markov chain Monte Carlo algorithms, and characterizing their performance in large-scale settings. Throughout, the focus is on rigorous theoretical evaluation combined with empirical demonstrations of performance and concordance with the theory.
One topic we consider is modeling the joint distribution of multivariate categorical data, often summarized in a contingency table. Contingency table analysis routinely relies on log-linear models, with latent structure analysis providing a common alternative. Latent structure models lead to a reduced rank tensor factorization of the probability mass function for multivariate categorical data, while log-linear models achieve dimensionality reduction through sparsity. Little is known about the relationship between these notions of dimensionality reduction in the two paradigms. In Chapter 2, we derive several results relating the support of a log-linear model to nonnegative ranks of the associated probability tensor. Motivated by these findings, we propose a new collapsed Tucker class of tensor decompositions, which bridge existing PARAFAC and Tucker decompositions, providing a more flexible framework for parsimoniously characterizing multivariate categorical data. Taking a Bayesian approach to inference, we illustrate empirical advantages of the new decompositions.
Latent class models for the joint distribution of multivariate categorical, such as the PARAFAC decomposition, data play an important role in the analysis of population structure. In this context, the number of latent classes is interpreted as the number of genetically distinct subpopulations of an organism, an important factor in the analysis of evolutionary processes and conservation status. Existing methods focus on point estimates of the number of subpopulations, and lack robust uncertainty quantification. Moreover, whether the number of latent classes in these models is even an identified parameter is an open question. In Chapter 3, we show that when the model is properly specified, the correct number of subpopulations can be recovered almost surely. We then propose an alternative method for estimating the number of latent subpopulations that provides good quantification of uncertainty, and provide a simple procedure for verifying that the proposed method is consistent for the number of subpopulations. The performance of the model in estimating the number of subpopulations and other common population structure inference problems is assessed in simulations and a real data application.
In contingency table analysis, sparse data is frequently encountered for even modest numbers of variables, resulting in non-existence of maximum likelihood estimates. A common solution is to obtain regularized estimates of the parameters of a log-linear model. Bayesian methods provide a coherent approach to regularization, but are often computationally intensive. Conjugate priors ease computational demands, but the conjugate Diaconis--Ylvisaker priors for the parameters of log-linear models do not give rise to closed form credible regions, complicating posterior inference. In Chapter 4 we derive the optimal Gaussian approximation to the posterior for log-linear models with Diaconis--Ylvisaker priors, and provide convergence rate and finite-sample bounds for the Kullback-Leibler divergence between the exact posterior and the optimal Gaussian approximation. We demonstrate empirically in simulations and a real data application that the approximation is highly accurate, even in relatively small samples. The proposed approximation provides a computationally scalable and principled approach to regularized estimation and approximate Bayesian inference for log-linear models.
Another challenging and somewhat non-standard joint modeling problem is inference on tail dependence in stochastic processes. In applications where extreme dependence is of interest, data are almost always time-indexed. Existing methods for inference and modeling in this setting often cluster extreme events or choose window sizes with the goal of preserving temporal information. In Chapter 5, we propose an alternative paradigm for inference on tail dependence in stochastic processes with arbitrary temporal dependence structure in the extremes, based on the idea that the information on strength of tail dependence and the temporal structure in this dependence are both encoded in waiting times between exceedances of high thresholds. We construct a class of time-indexed stochastic processes with tail dependence obtained by endowing the support points in de Haan's spectral representation of max-stable processes with velocities and lifetimes. We extend Smith's model to these max-stable velocity processes and obtain the distribution of waiting times between extreme events at multiple locations. Motivated by this result, a new definition of tail dependence is proposed that is a function of the distribution of waiting times between threshold exceedances, and an inferential framework is constructed for estimating the strength of extremal dependence and quantifying uncertainty in this paradigm. The method is applied to climatological, financial, and electrophysiology data.
The remainder of this thesis focuses on posterior computation by Markov chain Monte Carlo. The Markov Chain Monte Carlo method is the dominant paradigm for posterior computation in Bayesian analysis. It has long been common to control computation time by making approximations to the Markov transition kernel. Comparatively little attention has been paid to convergence and estimation error in these approximating Markov Chains. In Chapter 6, we propose a framework for assessing when to use approximations in MCMC algorithms, and how much error in the transition kernel should be tolerated to obtain optimal estimation performance with respect to a specified loss function and computational budget. The results require only ergodicity of the exact kernel and control of the kernel approximation accuracy. The theoretical framework is applied to approximations based on random subsets of data, low-rank approximations of Gaussian processes, and a novel approximating Markov chain for discrete mixture models.
Data augmentation Gibbs samplers are arguably the most popular class of algorithm for approximately sampling from the posterior distribution for the parameters of generalized linear models. The truncated Normal and Polya-Gamma data augmentation samplers are standard examples for probit and logit links, respectively. Motivated by an important problem in quantitative advertising, in Chapter 7 we consider the application of these algorithms to modeling rare events. We show that when the sample size is large but the observed number of successes is small, these data augmentation samplers mix very slowly, with a spectral gap that converges to zero at a rate at least proportional to the reciprocal of the square root of the sample size up to a log factor. In simulation studies, moderate sample sizes result in high autocorrelations and small effective sample sizes. Similar empirical results are observed for related data augmentation samplers for multinomial logit and probit models. When applied to a real quantitative advertising dataset, the data augmentation samplers mix very poorly. Conversely, Hamiltonian Monte Carlo and a type of independence chain Metropolis algorithm show good mixing on the same dataset.
Resumo:
The advances in three related areas of state-space modeling, sequential Bayesian learning, and decision analysis are addressed, with the statistical challenges of scalability and associated dynamic sparsity. The key theme that ties the three areas is Bayesian model emulation: solving challenging analysis/computational problems using creative model emulators. This idea defines theoretical and applied advances in non-linear, non-Gaussian state-space modeling, dynamic sparsity, decision analysis and statistical computation, across linked contexts of multivariate time series and dynamic networks studies. Examples and applications in financial time series and portfolio analysis, macroeconomics and internet studies from computational advertising demonstrate the utility of the core methodological innovations.
Chapter 1 summarizes the three areas/problems and the key idea of emulating in those areas. Chapter 2 discusses the sequential analysis of latent threshold models with use of emulating models that allows for analytical filtering to enhance the efficiency of posterior sampling. Chapter 3 examines the emulator model in decision analysis, or the synthetic model, that is equivalent to the loss function in the original minimization problem, and shows its performance in the context of sequential portfolio optimization. Chapter 4 describes the method for modeling the steaming data of counts observed on a large network that relies on emulating the whole, dependent network model by independent, conjugate sub-models customized to each set of flow. Chapter 5 reviews those advances and makes the concluding remarks.
Resumo:
Humans are natural politicians. We obsessively collect social information that is both observable (e.g., about third-party relationships) and unobservable (e.g., about others’ psychological states), and we strategically employ that information to manage our cooperative and competitive relationships. To what extent are these abilities unique to our species, and how did they evolve? The present dissertation seeks to contribute to these two questions. To do so, I take a comparative perspective, investigating social decision-making in humans’ closest living relatives, bonobos and chimpanzees. In Chapter 1, I review existing literature on theory of mind—or the ability to understand others’ psychological states—in these species. I also present a theoretical framework to guide further investigation of social cognition in bonobos and chimpanzees based on hypotheses about the proximate and ultimate origins of their species differences. In Chapter 2, I experimentally investigate differences in the prosocial behavior of bonobos and chimpanzees, revealing species-specific prosocial motivations that appear to be less flexible than those exhibited by humans. In Chapter 3, I explore through decision-making experiments bonobos’ ability to evaluate others based on their prosocial or antisocial behavior during third-party interactions. Bonobos do track the interactions of third-parties and evaluate actors based on these interactions. However, they do not exhibit the human preference for those who are prosocial towards others, instead consistently favoring an antisocial individual. The motivation to prefer those who demonstrate a prosocial disposition may be a unique feature of human psychology that contributes to our ultra-cooperative nature. In Chapter 4, I investigate the adaptive value of social cognition in wild primates. I show that the recruitment behavior of wild chimpanzees at Gombe National Park, Tanzania is consistent with the use of third-party knowledge, and that those who appear to use third-party knowledge receive immediate proximate benefits. They escape further aggression from their opponents. These findings directly support the social intelligence hypothesis that social cognition has evolved in response to the demands of competing with one’s own group-mates. Thus, the studies presented here help to better characterize the features of social decision-making that are unique to humans, and how these abilities evolved.
Resumo:
Marketers have long looked for observables that could explain differences in consumer behavior. Initial attempts have centered on demographic factors, such as age, gender, and race. Although such variables are able to provide some useful information for segmentation (Bass, Tigert, and Longdale 1968), more recent studies have shown that variables that tap into consumers’ social classes and personal values have more predictive accuracy and also provide deeper insights into consumer behavior. I argue that one demographic construct, religion, merits further consideration as a factor that has a profound impact on consumer behavior. In this dissertation, I focus on two types of religious guidance that may influence consumer behaviors: religious teachings (being content with one’s belongings), and religious problem-solving styles (reliance on God).
Essay 1 focuses on the well-established endowment effect and introduces a new moderator (religious teachings on contentment) that influences both owner and buyers’ pricing behaviors. Through fifteen experiments, I demonstrate that when people are primed with religion or characterized by stronger religious beliefs, they tend to value their belongings more than people who are not primed with religion or who have weaker religious beliefs. These effects are caused by religious teachings on being content with one’s belongings, which lead to the overvaluation of one’s own possessions.
Essay 2 focuses on self-control behaviors, specifically healthy eating, and introduces a new moderator (God’s role in the decision-making process) that determines the relationship between religiosity and the healthiness of food choices. My findings demonstrate that consumers who indicate that they defer to God in their decision-making make unhealthier food choices as their religiosity increases. The opposite is true for consumers who rely entirely on themselves. Importantly, this relationship is mediated by the consumer’s consideration of future consequences. This essay provides an explanation to the existing mixed findings on the relationship between religiosity and obesity.
Resumo:
Periods of drought and low streamflow can have profound impacts on both human and natural systems. People depend on a reliable source of water for numerous reasons including potable water supply and to produce economic value through agriculture or energy production. Aquatic ecosystems depend on water in addition to the economic benefits they provide to society through ecosystem services. Given that periods of low streamflow may become more extreme and frequent in the future, it is important to study the factors that control water availability during these times. In the absence of precipitation the slower hydrological response of groundwater systems will play an amplified role in water supply. Understanding the variability of the fraction of streamflow contribution from baseflow or groundwater during periods of drought provides insight into what future water availability may look like and how it can best be managed. The Mills River Basin in North Carolina is chosen as a case-study to test this understanding. First, obtaining a physically meaningful estimation of baseflow from USGS streamflow data via computerized hydrograph analysis techniques is carried out. Then applying a method of time series analysis including wavelet analysis can highlight signals of non-stationarity and evaluate the changes in variance required to better understand the natural variability of baseflow and low flows. In addition to natural variability, human influence must be taken into account in order to accurately assess how the combined system reacts to periods of low flow. Defining a combined demand that consists of both natural and human demand allows us to be more rigorous in assessing the level of sustainable use of a shared resource, in this case water. The analysis of baseflow variability can differ based on regional location and local hydrogeology, but it was found that baseflow varies from multiyear scales such as those associated with ENSO (3.5, 7 years) up to multi decadal time scales, but with most of the contributing variance coming from decadal or multiyear scales. It was also found that the behavior of baseflow and subsequently water availability depends a great deal on overall precipitation, the tracks of hurricanes or tropical storms and associated climate indices, as well as physiography and hydrogeology. Evaluating and utilizing the Duke Combined Hydrology Model (DCHM), reasonably accurate estimates of streamflow during periods of low flow were obtained in part due to the model’s ability to capture subsurface processes. Being able to accurately simulate streamflow levels and subsurface interactions during periods of drought can be very valuable to water suppliers, decision makers, and ultimately impact citizens. Knowledge of future droughts and periods of low flow in addition to tracking customer demand will allow for better management practices on the part of water suppliers such as knowing when they should withdraw more water during a surplus so that the level of stress on the system is minimized when there is not ample water supply.