Digital Library

963 results for count data models

Adaptation and gain pool summation:Alternative models and masking data

Relevance:

40.00%

Abstract:

Foley [J. Opt. Soc. Am. A 11 (1994) 1710] has proposed an influential psychophysical model of masking in which mask components in a contrast gain pool are raised to an exponent before summation and divisive inhibition. We tested this summation rule in experiments in which contrast detection thresholds were measured for a vertical 1 c/deg (or 2 c/deg) sine-wave component in the presence of a 3 c/deg (or 6 c/deg) mask that had either a single component oriented at -45° or a pair of components oriented at ±45°. Contrary to the predictions of Foley's model 3, we found that for masks of moderate contrast and above, threshold elevation was predicted by linear summation of the mask components in the inhibitory stage of the contrast gain pool. We built this feature into two new models, referred to as the early adaptation model and the hybrid model. In the early adaptation model, contrast adaptation controls a threshold-like nonlinearity on the output of otherwise linear pathways that provide the excitatory and inhibitory inputs to a gain control stage. The hybrid model involves nonlinear and nonadaptable routes to excitatory and inhibitory stages as well as an adaptable linear route. With only six free parameters, both models provide excellent fits to the masking and adaptation data of Foley and Chen [Vision Res. 37 (1997) 2779] but unlike Foley and Chen's model, are able to do so with only one adaptation parameter. However, only the hybrid model is able to capture the features of Foley's (1994) pedestal plus orthogonal fixed mask data. We conclude that (1) linear summation of inhibitory components is a feature of contrast masking, and (2) that the main aftereffect of spatial adaptation on contrast increment thresholds can be assigned to a single site. © 2002 Elsevier Science Ltd. All rights reserved.
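The difference between the two summation rules discussed in this abstract can be illustrated with a minimal sketch. It assumes a generic divisive gain-control form (excitation divided by a constant plus an inhibitory pool); the exponents `p` and `q`, the constant `z`, and all function names are hypothetical and are not the fitted models or parameters of the paper.

```python
def pool_exponent_then_sum(mask_contrasts, q=2.0):
    """Raise each mask component to the exponent q, then sum (exponent before summation)."""
    return sum(c ** q for c in mask_contrasts)


def pool_linear_sum(mask_contrasts, q=2.0):
    """Sum the mask components linearly first, then apply the exponent q."""
    return sum(mask_contrasts) ** q


def response(target_contrast, mask_contrasts, pool, p=2.4, q=2.0, z=1.0):
    """Generic divisive inhibition: excitation divided by (constant + inhibitory pool)."""
    return target_contrast ** p / (z + pool(mask_contrasts, q=q))


single_mask = [10.0]        # one mask component (e.g. oriented at -45 degrees)
paired_mask = [10.0, 10.0]  # two mask components (e.g. oriented at +/-45 degrees)

# With one component the two rules coincide; with two they diverge.
print(pool_exponent_then_sum(single_mask), pool_linear_sum(single_mask))
print(pool_exponent_then_sum(paired_mask), pool_linear_sum(paired_mask))
```

Under these assumptions the two rules only make different predictions when the mask has more than one component, which is why a comparison of single-component and paired-component masks can separate them.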

Modelling data and voice traffic over IP networks using continuous-time Markov models

Relevance:

40.00%

Abstract:

Common approaches to IP-traffic modelling have featured the use of stochastic models, based on the Markov property, which can be classified into black box and white box models based on the approach used for modelling traffic. White box models are simple to understand, transparent and have a physical meaning attributed to each of the associated parameters. To exploit this key advantage, this thesis explores the use of simple classic continuous-time Markov models based on a white box approach, to model not only the network traffic statistics but also the source behaviour with respect to the network and application. The thesis is divided into two parts. The first part focuses on the use of simple Markov and Semi-Markov traffic models, starting from the simplest two-state model and moving upwards to n-state models with Poisson and non-Poisson statistics. The thesis then introduces the convenient-to-use, mathematically derived Gaussian Markov models, which are used to model the measured network IP traffic statistics. As one of the most significant contributions, the thesis establishes the significance of the second-order density statistics, as it reveals that, in contrast to first-order density, they carry much more unique information on traffic sources and behaviour. The thesis then exploits the use of Gaussian Markov models to model these unique features and finally shows how the use of simple classic Markov models coupled with the use of second-order density statistics provides an excellent tool for capturing maximum traffic detail, which in itself is the essence of good traffic modelling. The second part of the thesis studies the ON-OFF characteristics of VoIP traffic with reference to accurate measurements of the ON and OFF periods, made from a large multi-lingual database of over 100 hours' worth of VoIP call recordings.
The impact of the language, prosodic structure and speech rate of the speaker on the statistics of the ON-OFF periods is analysed and relevant conclusions are presented. Finally, an ON-OFF VoIP source model with log-normal transitions is contributed as an ideal candidate to model VoIP traffic and the results of this model are compared with those of previously published work.
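As an illustration of the kind of source model described here, the following is a minimal sketch of a two-state ON-OFF source. It assumes that "log-normal transitions" means the ON and OFF period durations are drawn from log-normal distributions; the function names and all parameter values are hypothetical and are not taken from the thesis.

```python
import random


def simulate_on_off(n_cycles, mu_on, sigma_on, mu_off, sigma_off, seed=0):
    """Return an alternating list of (state, duration_seconds) periods.

    Durations are log-normal: exp(N(mu, sigma^2)). Parameters are illustrative.
    """
    rng = random.Random(seed)
    periods = []
    for _ in range(n_cycles):
        periods.append(("ON", rng.lognormvariate(mu_on, sigma_on)))    # talk spurt
        periods.append(("OFF", rng.lognormvariate(mu_off, sigma_off)))  # silence gap
    return periods


def activity_factor(periods):
    """Fraction of total time the source spends in the ON state."""
    on_time = sum(duration for state, duration in periods if state == "ON")
    total_time = sum(duration for _, duration in periods)
    return on_time / total_time


periods = simulate_on_off(1000, mu_on=0.0, sigma_on=0.8, mu_off=-0.3, sigma_off=1.0)
print(f"activity factor: {activity_factor(periods):.3f}")
```

The activity factor is one simple summary that could be compared across languages or speech rates; the thesis's own statistics and fitted parameters are not reproduced here.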

A comparison of sales response predictions from demand models applied to store-level versus panel data

Relevance:

40.00%

Abstract:

To generate sales promotion response predictions, marketing analysts estimate demand models using either disaggregated (consumer-level) or aggregated (store-level) scanner data. Comparison of predictions from these demand models is complicated by the fact that models may accommodate different forms of consumer heterogeneity depending on the level of data aggregation. This study shows via simulation that demand models with various heterogeneity specifications do not produce more accurate sales response predictions than a homogeneous demand model applied to store-level data, with one major exception: a random coefficients model designed to capture within-store heterogeneity using store-level data produced significantly more accurate sales response predictions (as well as better fit) compared to other model specifications. An empirical application to the paper towel product category provides additional insights. This article has supplementary material online.

General and multiplicative non-parametric corporate performance models with interval ratio data

Relevance:

40.00%

Abstract:

The increasing intensity of global competition has led organizations to utilize various types of performance measurement tools for improving the quality of their products and services. Data envelopment analysis (DEA) is a methodology for evaluating and measuring the relative efficiencies of a set of decision making units (DMUs) that use multiple inputs to produce multiple outputs. All the data in the conventional DEA with input and/or output ratios assumes the form of crisp numbers. However, the observed values of data in real-world problems are sometimes expressed as interval ratios. In this paper, we propose two new models: general and multiplicative non-parametric ratio models for DEA problems with interval data. The contributions of this paper are fourfold: (1) we consider input and output data expressed as interval ratios in DEA; (2) we address the gap in DEA literature for problems not suitable or difficult to model with crisp values; (3) we propose two new DEA models for evaluating the relative efficiencies of DMUs with interval ratios, and (4) we present a case study involving 20 banks with three interval ratios to demonstrate the applicability and efficacy of the proposed models where the traditional indicators are mostly financial ratios. © 2011 Elsevier Inc.

On the boundedness of the SORM DEA models with negative data

Relevance:

40.00%

Abstract:

Emrouznejad et al. (2010) proposed a Semi-Oriented Radial Measure (SORM) model for assessing the efficiency of Decision Making Units (DMUs) by Data Envelopment Analysis (DEA) with negative data. This paper provides a necessary and sufficient condition for boundedness of the input and output oriented SORM models.

Semisupervised learning of hierarchical latent trait models for data visualization

Relevance:

40.00%

Abstract:

Recently, we have developed the hierarchical Generative Topographic Mapping (HGTM), an interactive method for visualization of large high-dimensional real-valued data sets. In this paper, we propose a more general visualization system by extending HGTM in three ways, which allows the user to visualize a wider range of data sets and better support the model development process. 1) We integrate HGTM with noise models from the exponential family of distributions. The basic building block is the Latent Trait Model (LTM). This enables us to visualize data of inherently discrete nature, e.g., collections of documents, in a hierarchical manner. 2) We give the user a choice of initializing the child plots of the current plot in either interactive, or automatic mode. In the interactive mode, the user selects "regions of interest," whereas in the automatic mode, an unsupervised minimum message length (MML)-inspired construction of a mixture of LTMs is employed. The unsupervised construction is particularly useful when high-level plots are covered with dense clusters of highly overlapping data projections, making it difficult to use the interactive mode. Such a situation often arises when visualizing large data sets. 3) We derive general formulas for magnification factors in latent trait models. Magnification factors are a useful tool to improve our understanding of the visualization plots, since they can highlight the boundaries between data clusters. We illustrate our approach on a toy example and evaluate it on three more complex real data sets. © 2005 IEEE.

Comparison of Multivariate and Univariate Models for Genetic Evaluation of Milk Yield based on Test Day Data

Relevance:

40.00%

Abstract:

2000 Mathematics Subject Classification: 62H12, 62P99

Development of prediction models for freeway incident durations using data mining techniques

Relevance:

40.00%

Abstract:

The nation's freeway systems are becoming increasingly congested. Traffic incidents are a major contributor to congestion on freeways. Traffic incidents are non-recurring events such as accidents or stranded vehicles that cause a temporary roadway capacity reduction, and they can account for as much as 60 percent of all traffic congestion on freeways. One major freeway incident management strategy involves diverting traffic to avoid incident locations by relaying timely information through Intelligent Transportation Systems (ITS) devices such as dynamic message signs or real-time traveler information systems. The decision to divert traffic depends foremost on the expected duration of an incident, which is difficult to predict. In addition, the duration of an incident is affected by many contributing factors. Determining and understanding these factors can help the process of identifying and developing better strategies to reduce incident durations and alleviate traffic congestion. A number of research studies have attempted to develop models to predict incident durations, yet with limited success.

This dissertation research attempts to improve on this previous effort by applying data mining techniques to a comprehensive incident database maintained by the District 4 ITS Office of the Florida Department of Transportation (FDOT). Two categories of incident duration prediction models were developed: "offline" models designed for use in the performance evaluation of incident management programs, and "online" models for real-time prediction of incident duration to aid in the decision making of traffic diversion in the event of an ongoing incident. Multiple data mining analysis techniques were applied and evaluated in the research. Multiple linear regression analysis and a decision tree based method were applied to develop the offline models, and a rule-based method and a tree algorithm called M5P were used to develop the online models.

The results show that the models in general can achieve high prediction accuracy within acceptable time intervals of the actual durations. The research also identifies some new contributing factors that have not been examined in past studies. As part of the research effort, software code was developed to implement the models in the existing software system of District 4 FDOT for actual applications.

Improving resource management in virtualized data centers using application performance models

Relevance:

40.00%

Abstract:

The rapid growth of virtualized data centers and cloud hosting services is making the management of physical resources such as CPU, memory, and I/O bandwidth in data center servers increasingly important. Server management now involves dealing with multiple dissimilar applications with varying Service-Level-Agreements (SLAs) and multiple resource dimensions. The multiplicity and diversity of resources and applications are rendering administrative tasks more complex and challenging. This thesis aimed to develop a framework and techniques that would help substantially reduce data center management complexity.

We specifically addressed two crucial data center operations. First, we precisely estimated capacity requirements of client virtual machines (VMs) while renting server space in a cloud environment. Second, we proposed a systematic process to efficiently allocate physical resources to hosted VMs in a data center. To realize these dual objectives, accurately capturing the effects of resource allocations on application performance is vital. The benefits of accurate application performance modeling are multifold. Cloud users can size their VMs appropriately and pay only for the resources that they need; service providers can also offer a new charging model based on the VMs' performance instead of their configured sizes. As a result, clients will pay exactly for the performance they are actually experiencing; on the other hand, administrators will be able to maximize their total revenue by utilizing application performance models and SLAs.

This thesis made the following contributions. First, we identified resource control parameters crucial for distributing physical resources and characterizing contention for virtualized applications in a shared hosting environment. Second, we explored several modeling techniques and confirmed the suitability of two machine learning tools, Artificial Neural Network and Support Vector Machine, to accurately model the performance of virtualized applications. Moreover, we suggested and evaluated modeling optimizations necessary to improve prediction accuracy when using these modeling tools. Third, we presented an approach to optimal VM sizing by employing the performance models we created. Finally, we proposed a revenue-driven resource allocation algorithm which maximizes the SLA-generated revenue for a data center.

Learning Data-Driven Models of Non-Verbal Behaviors for Building Rapport Using an Intelligent Virtual Agent

Relevance:

40.00%

Abstract:

There is a growing societal need to address the increasing prevalence of behavioral health issues, such as obesity, alcohol or drug use, and general lack of treatment adherence for a variety of health problems. The statistics, worldwide and in the USA, are daunting. Excessive alcohol use is the third leading preventable cause of death in the United States (with 79,000 deaths annually), and is responsible for a wide range of health and social problems. On the positive side though, these behavioral health issues (and associated possible diseases) can often be prevented with relatively simple lifestyle changes, such as losing weight with a diet and/or physical exercise, or learning how to reduce alcohol consumption. Medicine has therefore started to move toward finding ways of preventively promoting wellness, rather than solely treating already established illness. Evidence-based patient-centered Brief Motivational Interviewing (BMI) interventions have been found particularly effective in helping people find intrinsic motivation to change problem behaviors after short counseling sessions, and to maintain healthy lifestyles over the long term. Lack of locally available personnel well-trained in BMI, however, often limits access to successful interventions for people in need. To fill this accessibility gap, Computer-Based Interventions (CBIs) have started to emerge. Success of the CBIs, however, critically relies on ensuring engagement and retention of CBI users so that they remain motivated to use these systems and come back to use them over the long term as necessary. Because of their text-only interfaces, current CBIs can only express limited empathy and rapport, which are the most important factors of health interventions. Fortunately, in the last decade, computer science research has progressed in the design of simulated human characters with anthropomorphic communicative abilities.
Virtual characters interact using humans’ innate communication modalities, such as facial expressions, body language, speech, and natural language understanding. By advancing research in Artificial Intelligence (AI), we can improve the ability of artificial agents to help us solve CBI problems. To facilitate successful communication and social interaction between artificial agents and human partners, it is essential that aspects of human social behavior, especially empathy and rapport, be considered when designing human-computer interfaces. Hence, the goal of the present dissertation is to provide a computational model of rapport to enhance an artificial agent’s social behavior, and to provide an experimental tool for the psychological theories shaping the model. Parts of this thesis were already published in [LYL+12, AYL12, AL13, ALYR13, LAYR13, YALR13, ALY14].

Surface velocity fields, digital elevation models, ice front positions and grounding line derived from remote sensing data at Dinsmoor-Bombardier-Edgeworth glacier system, Antarctic Peninsula (1992-2014)

Relevance:

40.00%

Abstract:

The northern Antarctic Peninsula is one of the fastest changing regions on Earth. The disintegration of the Larsen-A Ice Shelf in 1995 caused tributary glaciers to adjust by speeding up, surface lowering, and overall increased ice-mass discharge. In this study, we investigate the temporal variation of these changes at the Dinsmoor-Bombardier-Edgeworth glacier system by analyzing dense time series from various spaceborne and airborne Earth observation missions. Precollapse ice shelf conditions and subsequent adjustments through 2014 were covered. Our results show a response of the glacier system some months after the breakup, reaching maximum surface velocities at the glacier front of up to 8.8 m/d in 1999 and a subsequent decrease to ~1.5 m/d in 2014. Using a dense time series of interferometrically derived TanDEM-X digital elevation models and photogrammetric data, an exponential function was fitted for the decrease in surface elevation. Elevation changes in areas below 1000 m a.s.l. amounted to at least 130±15 m between 1995 and 2014, with change rates of ~3.15 m/a between 2003 and 2008. Current change rates (2010-2014) are in the range of 1.7 m/a. Mass imbalances were computed with different scenarios of boundary conditions. The most plausible results amount to -40.7±3.9 Gt. The contribution to sea level rise was estimated to be 18.8±1.8 Gt, corresponding to a 0.052±0.005 mm sea level equivalent, for the period 1995-2014. Our analysis and scenario considerations revealed that major uncertainties still exist due to insufficiently accurate ice-thickness information. The second largest uncertainty in the computations was the glacier surface mass balance, which is still poorly known. Our time series analysis facilitates an improved comparison with GRACE data and as input to modeling of glacio-isostatic uplift in this region.
The study contributed to a better understanding of how glacier systems adjust to ice shelf disintegration.
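The conversion from an ice-mass contribution in gigatonnes to a sea level equivalent in millimetres can be checked with a short calculation. This is a sketch under two stated assumptions that do not come from the abstract: a global ocean area of about 3.618 × 10^8 km², and 1 Gt of water occupying about 1 km³. The function name is hypothetical.

```python
OCEAN_AREA_KM2 = 3.618e8  # assumed global ocean surface area, km^2
KM_TO_MM = 1.0e6          # 1 km = 1,000,000 mm


def gt_to_mm_sea_level(mass_gt, ocean_area_km2=OCEAN_AREA_KM2):
    """Spread a water mass (Gt) evenly over the ocean and return the layer height in mm."""
    volume_km3 = mass_gt  # assumption: 1 Gt of water occupies about 1 km^3
    return volume_km3 / ocean_area_km2 * KM_TO_MM


print(f"{gt_to_mm_sea_level(18.8):.3f} mm")  # central estimate
print(f"{gt_to_mm_sea_level(1.8):.3f} mm")   # uncertainty
```

Under these assumptions, 18.8 Gt and 1.8 Gt come out at roughly 0.052 mm and 0.005 mm, consistent with the values quoted in the abstract.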

Multilevel structural equation models for longitudinal data where predictors are measured more frequently than outcomes : an application to the effects of stress on the cognitive function of nurses

Relevance:

40.00%

Abstract:

Funded by Chief Scientist Office, Scotland. Grant Number: CZH/4/394. Economic and Social Research Council grant as part of the National Centre for Research Methods. Grant Number: RES-576-25-0032.

Using individual tracking data to validate the predictions of species distribution models

Relevance:

40.00%

Abstract:

The authors would like to thank the College of Life Sciences of Aberdeen University and Marine Scotland Science which funded CP's PhD project. Skate tagging experiments were undertaken as part of Scottish Government project SP004. We thank Ian Burrett for help in catching the fish and the other fishermen and anglers who returned tags. We thank José Manuel Gonzalez-Irusta for extracting and making available the environmental layers used as environmental covariates in the environmental suitability modelling procedure. We also thank Jason Matthiopoulos for insightful suggestions on habitat utilization metrics as well as Stephen C.F. Palmer, and three anonymous reviewers for useful suggestions to improve the clarity and quality of the manuscript.

Models and empirical data for the production of referring expressions

Relevance:

40.00%

Abstract:

Article Accepted Date: 29 May 2014. Acknowledgements: The authors gratefully acknowledge the support of the Cognitive Science Society for the organisation of the Workshop on Production of Referring Expressions: Bridging the Gap between Cognitive and Computational Approaches to Reference, from which this special issue originated. Funding: Emiel Krahmer and Albert Gatt thank The Netherlands Organisation for Scientific Research (NWO) for VICI grant Bridging the Gap between Computational Linguistics and Psycholinguistics: The Case of Referring Expressions (grant number 277-70-007).

Data to Decision in a Dynamic Ocean: Robust Species Distribution Models and Spatial Decision Frameworks

Relevance:

40.00%

Abstract:

Human use of the oceans is increasingly in conflict with conservation of endangered species. Methods for managing the spatial and temporal placement of industries such as military, fishing, transportation and offshore energy, have historically been post hoc; i.e. the time and place of human activity is often already determined before assessment of environmental impacts. In this dissertation, I build robust species distribution models in two case study areas, US Atlantic (Best et al. 2012) and British Columbia (Best et al. 2015), predicting presence and abundance respectively, from scientific surveys. These models are then applied to novel decision frameworks for preemptively suggesting optimal placement of human activities in space and time to minimize ecological impacts: siting for offshore wind energy development, and routing ships to minimize risk of striking whales. Both decision frameworks relate the tradeoff between conservation risk and industry profit with synchronized variable and map views as online spatial decision support systems.

For siting offshore wind energy development (OWED) in the U.S. Atlantic (chapter 4), bird density maps are combined across species with weights of OWED sensitivity to collision and displacement, and 10 km² sites are compared against OWED profitability based on average annual wind speed at 90 m hub heights and distance to the transmission grid. A spatial decision support system enables toggling between the map and tradeoff plot views by site. A selected site can be inspected for sensitivity to cetaceans throughout the year, so as to capture months of the year which minimize episodic impacts of pre-operational activities such as seismic airgun surveying and pile driving.

Routing ships to avoid whale strikes (chapter 5) can be similarly viewed as a tradeoff, but is a different problem spatially. A cumulative cost surface is generated from density surface maps and conservation status of cetaceans, and is then applied as a resistance surface to calculate least-cost routes between start and end locations, i.e. ports and entrance locations to study areas. Varying a multiplier to the cost surface enables calculation of multiple routes with different costs to conservation of cetaceans versus cost to the transportation industry, measured as distance. Similar to the siting chapter, a spatial decision support system enables toggling between the map and tradeoff plot view of proposed routes. The user can also input arbitrary start and end locations to calculate the tradeoff on the fly.
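The routing idea in this paragraph can be sketched as a shortest-path search over a gridded cost surface. The following is a minimal illustration, assuming a 4-connected grid, a unit distance cost per step, and a step cost of `1 + multiplier * risk`; the grid values, the function name, and this cost formula are hypothetical and are not the dissertation's implementation.

```python
import heapq


def least_cost_route(risk, start, end, multiplier):
    """Dijkstra search on a grid; returns (path, distance_in_steps, conservation_exposure)."""
    rows, cols = len(risk), len(risk[0])
    best = {start: 0.0}
    parent = {}
    queue = [(0.0, start)]
    while queue:
        cost, cell = heapq.heappop(queue)
        if cell == end:
            break
        if cost > best[cell]:
            continue  # stale queue entry
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols:
                # Each step costs one unit of distance plus the weighted risk of the entered cell.
                new_cost = cost + 1.0 + multiplier * risk[nr][nc]
                if new_cost < best.get((nr, nc), float("inf")):
                    best[(nr, nc)] = new_cost
                    parent[(nr, nc)] = cell
                    heapq.heappush(queue, (new_cost, (nr, nc)))
    # Walk back from the end cell to recover the route.
    path = [end]
    while path[-1] != start:
        path.append(parent[path[-1]])
    path.reverse()
    exposure = sum(risk[r][c] for r, c in path[1:])
    return path, len(path) - 1, exposure


# Hypothetical risk surface: a band of high cetacean density across the direct route.
risk_surface = [
    [0, 0, 0, 0, 0],
    [0, 5, 5, 5, 0],
    [0, 0, 0, 0, 0],
]
for m in (0.0, 1.0):
    _, distance, exposure = least_cost_route(risk_surface, (1, 0), (1, 4), m)
    print(f"multiplier={m}: distance={distance}, exposure={exposure}")
```

Sweeping the multiplier over a range of values yields a set of (distance, exposure) pairs, which is the kind of tradeoff curve the paragraph describes.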

Essential to the input of these decision frameworks are distributions of the species. The two preceding chapters comprise species distribution models from two case study areas, U.S. Atlantic (chapter 2) and British Columbia (chapter 3), predicting presence and density, respectively. Although density is preferred to estimate potential biological removal, per Marine Mammal Protection Act requirements in the U.S., all the necessary parameters, especially distance and angle of observation, are less readily available across publicly mined datasets.

In the case of predicting cetacean presence in the U.S. Atlantic (chapter 2), I extracted datasets from the online OBIS-SEAMAP geo-database, and integrated scientific surveys conducted by ship (n=36) and aircraft (n=16), weighting a Generalized Additive Model by minutes surveyed within space-time grid cells to harmonize effort between the two survey platforms. For each of 16 cetacean species guilds, I predicted the probability of occurrence from static environmental variables (water depth, distance to shore, distance to continental shelf break) and time-varying conditions (monthly sea-surface temperature). To generate maps of presence vs. absence, Receiver Operating Characteristic (ROC) curves were used to define the optimal threshold that minimizes false positive and false negative error rates. I integrated model outputs, including tables (species in guilds, input surveys) and plots (fit of environmental variables, ROC curve), into an online spatial decision support system, allowing for easy navigation of models by taxon, region, season, and data provider.
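The thresholding step can be sketched as follows. The abstract does not state the exact criterion, so this example assumes one common choice: pick the threshold that minimizes the sum of the false positive rate and the false negative rate. The function name and the toy data are hypothetical.

```python
def optimal_threshold(probabilities, observed):
    """Return (threshold, error), where error = false negative rate + false positive rate.

    A cell is predicted 'present' when its probability is >= threshold.
    """
    positives = sum(observed)
    negatives = len(observed) - positives
    best_threshold, best_error = None, float("inf")
    for threshold in sorted(set(probabilities)):
        false_neg = sum(1 for p, y in zip(probabilities, observed) if y == 1 and p < threshold)
        false_pos = sum(1 for p, y in zip(probabilities, observed) if y == 0 and p >= threshold)
        error = false_neg / positives + false_pos / negatives
        if error < best_error:
            best_threshold, best_error = threshold, error
    return best_threshold, best_error


# Toy predicted probabilities and observed presence (1) or absence (0).
predicted = [0.1, 0.2, 0.3, 0.6, 0.7, 0.9]
presence = [0, 0, 0, 1, 1, 1]
print(optimal_threshold(predicted, presence))
```

Predicted probabilities at or above the chosen threshold would be mapped as presence and the rest as absence; other criteria would give different thresholds when the two classes overlap.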

For predicting cetacean density within the inner waters of British Columbia (chapter 3), I calculated density from systematic, line-transect marine mammal surveys over multiple years and seasons (summer 2004, 2005, 2008, and spring/autumn 2007) conducted by Raincoast Conservation Foundation. Abundance estimates were calculated using two different methods: Conventional Distance Sampling (CDS) and Density Surface Modelling (DSM). CDS generates a single density estimate for each stratum, whereas DSM explicitly models spatial variation and offers potential for greater precision by incorporating environmental predictors. Although DSM yields a more relevant product for the purposes of marine spatial planning, CDS has proven to be useful in cases where there are fewer observations available for seasonal and inter-annual comparison, particularly for the scarcely observed elephant seal. Abundance estimates are provided on a stratum-specific basis. Steller sea lions and harbour seals are further differentiated by ‘hauled out’ and ‘in water’. This analysis updates previous estimates (Williams & Thomas 2007) by including additional years of effort, providing greater spatial precision with the DSM method over CDS, novel reporting for spring and autumn seasons (rather than summer alone), and providing new abundance estimates for Steller sea lion and northern elephant seal. In addition to providing a baseline of marine mammal abundance and distribution, against which future changes can be compared, this information offers the opportunity to assess the risks posed to marine mammals by existing and emerging threats, such as fisheries bycatch, ship strikes, and increased oil spill and ocean noise issues associated with increases of container ship and oil tanker traffic in British Columbia’s continental shelf waters.

Starting with marine animal observations at specific coordinates and times, I combine these data with environmental data, often satellite derived, to produce seascape predictions generalizable in space and time. These habitat-based models enable prediction of encounter rates and, in the case of density surface models, abundance that can then be applied to management scenarios. Specific human activities, OWED and shipping, are then compared within a tradeoff decision support framework, enabling interchangeable map and tradeoff plot views. These products make complex processes transparent for gaming conservation, industry and stakeholders towards optimal marine spatial management, fundamental to the tenets of marine spatial planning, ecosystem-based management and dynamic ocean management.
