875 resultados para modeling of data sources
Resumo:
Mode of access: Internet.
Resumo:
We model nongraphitized carbon black surfaces and investigate adsorption of argon on these surfaces by using the grand canonical Monte Carlo simulation. In this model, the nongraphitized surface is modeled as a stack of graphene layers with some carbon atoms of the top graphene layer being randomly removed. The percentage of the surface carbon atoms being removed and the effective size of the defect ( created by the removal) are the key parameters to characterize the nongraphitized surface. The patterns of adsorption isotherm and isosteric heat are particularly studied, as a function of these surface parameters as well as pressure and temperature. It is shown that the adsorption isotherm shows a steplike behavior on a perfect graphite surface and becomes smoother on nongraphitized surfaces. Regarding the isosteric heat versus loading, we observe for the case of graphitized thermal carbon black the increase of heat in the submonolayer coverage and then a sharp decline in the heat when the second layer is starting to form, beyond which it increases slightly. On the other hand, the isosteric heat versus loading for a highly nongraphitized surface shows a general decline with respect to loading, which is due to the energetic heterogeneity of the surface. It is only when the fluid-fluid interaction is greater than the surface energetic factor that we see a minimum-maximum in the isosteric heat versus loading. These simulation results of isosteric heat agree well with the experimental results of graphitization of Spheron 6 (Polley, M. H.; Schaeffer, W. D.; Smith, W. R. J. Phys. Chem. 1953, 57, 469; Beebe, R. A.; Young, D. M. J. Phys. Chem. 1954, 58, 93). Adsorption isotherms and isosteric heat in pores whose walls have defects are also studied from the simulation, and the pattern of isotherm and isosteric heat could be used to identify the fingerprint of the surface.
Resumo:
Objective To compare mortality burden estimates based on direct measurement of levels and causes in communities with indirect estimates based on combining health facility cause-specific mortality structures with community measurement of mortality levels. Methods. Data from sentinel vital registration (SVR) with verbal autopsy (VA) were used to determine the cause-specific mortality burden at the community level in two areas of the United Republic of Tanzania. Proportional cause-specific mortality structures from health facilities were applied to counts of deaths obtained by SVR to produce modelled estimates. The burden was expressed in years of life lost. Findings. A total of 2884 deaths were recorded from health facilities and 2167 recorded from SVR/VAs. In the perinatal and neonatal age group cause-specific mortality rates were dominated by perinatal conditions and stillbirths in both the community and the facility data. The modelled estimates for chronic causes were very similar to those from SVR/VA. Acute febrile illnesses were coded more specifically in the facility data than in the VA. Injuries were more prevalent in the SVR/VA data than in that from the facilities. Conclusion. In this setting, improved International classification of diseases and health related problems, tenth revision (ICD-10) coding practices and applying facility-based cause structures to counts of deaths from communities, derived from SVR, appears to produce reasonable estimates of the cause-specific mortality burden in those aged 5 years and older determined directly from VA. For the perinatal and neonatal age group, VA appears to be required. Use of this approach in a nationally representative sample of facilities may produce reliable national estimates of the cause-specific mortality burden for leading causes of death in adults.
Resumo:
Retrieving large amounts of information over wide area networks, including the Internet, is problematic due to issues arising from latency of response, lack of direct memory access to data serving resources, and fault tolerance. This paper describes a design pattern for solving the issues of handling results from queries that return large amounts of data. Typically these queries would be made by a client process across a wide area network (or Internet), with one or more middle-tiers, to a relational database residing on a remote server. The solution involves implementing a combination of data retrieval strategies, including the use of iterators for traversing data sets and providing an appropriate level of abstraction to the client, double-buffering of data subsets, multi-threaded data retrieval, and query slicing. This design has recently been implemented and incorporated into the framework of a commercial software product developed at Oracle Corporation.
Resumo:
Use of modern object-oriented methods of designing of information systems (IS) both descriptions of interrelations IS and automated with its help business-processes of the enterprises leads to necessity of construction uniform complete IS on the basis of set of local models of such system. As a result of use of such approach there are the contradictions caused by inconsistency of actions of separate developers IS with each other and that is much more important, inconsistency of the points of view of separate users IS. Besides similar contradictions arise while in service IS at the enterprise because of constant change separate business- processes of the enterprise. It is necessary to note also, that now overwhelming majority IS is developed and maintained as set of separate functional modules. Each of such modules can function as independent IS. However the problem of integration of separate functional modules in uniform system can lead to a lot of problems. Among these problems it is possible to specify, for example, presence in modules of functions which are not used by the enterprise to destination, to complexity of information and program integration of modules of various manufacturers, etc. In most cases these contradictions and the reasons, their caused, are consequence of primary representation IS as equilibrium steady system. In work [1] representation IS as dynamic multistable system which is capable to carry out following actions has been considered:
Resumo:
This paper is dedicated to modelling of network maintaining based on live example – maintaining ATM banking network, where any problems are mean money loss. A full analysis is made in order to estimate valuable and not-valuable parameters based on complex analysis of available data. Correlation analysis helps to estimate provided data and to produce a complex solution of increasing network maintaining effectiveness.
Resumo:
Groundwater systems of different densities are often mathematically modeled to understand and predict environmental behavior such as seawater intrusion or submarine groundwater discharge. Additional data collection may be justified if it will cost-effectively aid in reducing the uncertainty of a model's prediction. The collection of salinity, as well as, temperature data could aid in reducing predictive uncertainty in a variable-density model. However, before numerical models can be created, rigorous testing of the modeling code needs to be completed. This research documents the benchmark testing of a new modeling code, SEAWAT Version 4. The benchmark problems include various combinations of density-dependent flow resulting from variations in concentration and temperature. The verified code, SEAWAT, was then applied to two different hydrological analyses to explore the capacity of a variable-density model to guide data collection. ^ The first analysis tested a linear method to guide data collection by quantifying the contribution of different data types and locations toward reducing predictive uncertainty in a nonlinear variable-density flow and transport model. The relative contributions of temperature and concentration measurements, at different locations within a simulated carbonate platform, for predicting movement of the saltwater interface were assessed. Results from the method showed that concentration data had greater worth than temperature data in reducing predictive uncertainty in this case. Results also indicated that a linear method could be used to quantify data worth in a nonlinear model. ^ The second hydrological analysis utilized a model to identify the transient response of the salinity, temperature, age, and amount of submarine groundwater discharge to changes in tidal ocean stage, seasonal temperature variations, and different types of geology. The model was compared to multiple kinds of data to (1) calibrate and verify the model, and (2) explore the potential for the model to be used to guide the collection of data using techniques such as electromagnetic resistivity, thermal imagery, and seepage meters. Results indicated that the model can be used to give insight to submarine groundwater discharge and be used to guide data collection. ^
Resumo:
Managed lane strategies are innovative road operation schemes for addressing congestion problems. These strategies operate a lane (lanes) adjacent to a freeway that provides congestion-free trips to eligible users, such as transit or toll-payers. To ensure the successful implementation of managed lanes, the demand on these lanes need to be accurately estimated. Among different approaches for predicting this demand, the four-step demand forecasting process is most common. Managed lane demand is usually estimated at the assignment step. Therefore, the key to reliably estimating the demand is the utilization of effective assignment modeling processes. ^ Managed lanes are particularly effective when the road is functioning at near-capacity. Therefore, capturing variations in demand and network attributes and performance is crucial for their modeling, monitoring and operation. As a result, traditional modeling approaches, such as those used in static traffic assignment of demand forecasting models, fail to correctly predict the managed lane demand and the associated system performance. The present study demonstrates the power of the more advanced modeling approach of dynamic traffic assignment (DTA), as well as the shortcomings of conventional approaches, when used to model managed lanes in congested environments. In addition, the study develops processes to support an effective utilization of DTA to model managed lane operations. ^ Static and dynamic traffic assignments consist of demand, network, and route choice model components that need to be calibrated. These components interact with each other, and an iterative method for calibrating them is needed. In this study, an effective standalone framework that combines static demand estimation and dynamic traffic assignment has been developed to replicate real-world traffic conditions. ^ With advances in traffic surveillance technologies collecting, archiving, and analyzing traffic data is becoming more accessible and affordable. The present study shows how data from multiple sources can be integrated, validated, and best used in different stages of modeling and calibration of managed lanes. Extensive and careful processing of demand, traffic, and toll data, as well as proper definition of performance measures, result in a calibrated and stable model, which closely replicates real-world congestion patterns, and can reasonably respond to perturbations in network and demand properties.^
Resumo:
Visual cluster analysis provides valuable tools that help analysts to understand large data sets in terms of representative clusters and relationships thereof. Often, the found clusters are to be understood in context of belonging categorical, numerical or textual metadata which are given for the data elements. While often not part of the clustering process, such metadata play an important role and need to be considered during the interactive cluster exploration process. Traditionally, linked-views allow to relate (or loosely speaking: correlate) clusters with metadata or other properties of the underlying cluster data. Manually inspecting the distribution of metadata for each cluster in a linked-view approach is tedious, specially for large data sets, where a large search problem arises. Fully interactive search for potentially useful or interesting cluster to metadata relationships may constitute a cumbersome and long process. To remedy this problem, we propose a novel approach for guiding users in discovering interesting relationships between clusters and associated metadata. Its goal is to guide the analyst through the potentially huge search space. We focus in our work on metadata of categorical type, which can be summarized for a cluster in form of a histogram. We start from a given visual cluster representation, and compute certain measures of interestingness defined on the distribution of metadata categories for the clusters. These measures are used to automatically score and rank the clusters for potential interestingness regarding the distribution of categorical metadata. Identified interesting relationships are highlighted in the visual cluster representation for easy inspection by the user. We present a system implementing an encompassing, yet extensible, set of interestingness scores for categorical metadata, which can also be extended to numerical metadata. Appropriate visual representations are provided for showing the visual correlations, as well as the calculated ranking scores. Focusing on clusters of time series data, we test our approach on a large real-world data set of time-oriented scientific research data, demonstrating how specific interesting views are automatically identified, supporting the analyst discovering interesting and visually understandable relationships.
Resumo:
Microturbines are among the most successfully commercialized distributed energy resources, especially when they are used for combined heat and power generation. However, the interrelated thermal and electrical system dynamic behaviors have not been fully investigated. This is technically challenging due to the complex thermo-fluid-mechanical energy conversion processes which introduce multiple time-scale dynamics and strong nonlinearity into the analysis. To tackle this problem, this paper proposes a simplified model which can predict the coupled thermal and electric output dynamics of microturbines. Considering the time-scale difference of various dynamic processes occuring within microturbines, the electromechanical subsystem is treated as a fast quasi-linear process while the thermo-mechanical subsystem is treated as a slow process with high nonlinearity. A three-stage subspace identification method is utilized to capture the dominant dynamics and predict the electric power output. For the thermo-mechanical process, a radial basis function model trained by the particle swarm optimization method is employed to handle the strong nonlinear characteristics. Experimental tests on a Capstone C30 microturbine show that the proposed modeling method can well capture the system dynamics and produce a good prediction of the coupled thermal and electric outputs in various operating modes.
Resumo:
The protein lysate array is an emerging technology for quantifying the protein concentration ratios in multiple biological samples. It is gaining popularity, and has the potential to answer questions about post-translational modifications and protein pathway relationships. Statistical inference for a parametric quantification procedure has been inadequately addressed in the literature, mainly due to two challenges: the increasing dimension of the parameter space and the need to account for dependence in the data. Each chapter of this thesis addresses one of these issues. In Chapter 1, an introduction to the protein lysate array quantification is presented, followed by the motivations and goals for this thesis work. In Chapter 2, we develop a multi-step procedure for the Sigmoidal models, ensuring consistent estimation of the concentration level with full asymptotic efficiency. The results obtained in this chapter justify inferential procedures based on large-sample approximations. Simulation studies and real data analysis are used to illustrate the performance of the proposed method in finite-samples. The multi-step procedure is simpler in both theory and computation than the single-step least squares method that has been used in current practice. In Chapter 3, we introduce a new model to account for the dependence structure of the errors by a nonlinear mixed effects model. We consider a method to approximate the maximum likelihood estimator of all the parameters. Using the simulation studies on various error structures, we show that for data with non-i.i.d. errors the proposed method leads to more accurate estimates and better confidence intervals than the existing single-step least squares method.
Resumo:
This dissertation comprises three chapters. The first chapter motivates the use of a novel data set combining survey and administrative sources for the study of internal labor migration. By following a sample of individuals from the American Community Survey (ACS) across their employment outcomes over time according to the Longitudinal Employer-Household Dynamics (LEHD) database, I construct a measure of geographic labor mobility that allows me to exploit information about individuals prior to their move. This enables me to explore aspects of the migration decision, such as homeownership and employment status, in ways that have not previously been possible. In the second chapter, I use this data set to test the theory that falling home prices affect a worker’s propensity to take a job in a different metropolitan area from where he is currently located. Employing a within-CBSA and time estimation that compares homeowners to renters in their propensities to relocate for jobs, I find that homeowners who have experienced declines in the nominal value of their homes are approximately 12% less likely than average to take a new job in a location outside of the metropolitan area where they currently reside. This evidence is consistent with the hypothesis that housing lock-in has contributed to the decline in labor mobility of homeowners during the recent housing bust. The third chapter focuses on a sample of unemployed workers in the same data set, in order to compare the unemployment durations of those who find subsequent employment by relocating to a new metropolitan area, versus those who find employment in their original location. Using an instrumental variables strategy to address the endogeneity of the migration decision, I find that out-migrating for a new job significantly reduces the time to re-employment. These results stand in contrast to OLS estimates, which suggest that those who move have longer unemployment durations. This implies that those who migrate for jobs in the data may be particularly disadvantaged in their ability to find employment, and thus have strong short-term incentives to relocate.
Resumo:
Secondary microseism sources are pressure fluctuations close to the ocean surface. They generate acoustic P-waves that propagate in water down to the ocean bottom where they are partly reflected, and partly transmitted into the crust to continue their propagation through the Earth. We present the theory for computing the displacement power spectral density of secondary microseism P-waves recorded by receivers in the far field. In the frequency domain, the P-wave displacement can be modeled as the product of (1) the pressure source, (2) the source site effect that accounts for the constructive interference of multiply reflected P-waves in the ocean, (3) the propagation from the ocean bottom to the stations, (4) the receiver site effect. Secondary microseism P-waves have weak amplitudes, but they can be investigated by beamforming analysis. We validate our approach by analyzing the seismic signals generated by Typhoon Ioke (2006) and recorded by the Southern California Seismic Network. Back projecting the beam onto the ocean surface enables to follow the source motion. The observed beam centroid is in the vicinity of the pressure source derived from the ocean wave model WAVEWATCH IIIR. The pressure source is then used for modeling the beam and a good agreement is obtained between measured and modeled beam amplitude variation over time. This modeling approach can be used to invert P-wave noise data and retrieve the source intensity and lateral extent.
Resumo:
Managed lane strategies are innovative road operation schemes for addressing congestion problems. These strategies operate a lane (lanes) adjacent to a freeway that provides congestion-free trips to eligible users, such as transit or toll-payers. To ensure the successful implementation of managed lanes, the demand on these lanes need to be accurately estimated. Among different approaches for predicting this demand, the four-step demand forecasting process is most common. Managed lane demand is usually estimated at the assignment step. Therefore, the key to reliably estimating the demand is the utilization of effective assignment modeling processes. Managed lanes are particularly effective when the road is functioning at near-capacity. Therefore, capturing variations in demand and network attributes and performance is crucial for their modeling, monitoring and operation. As a result, traditional modeling approaches, such as those used in static traffic assignment of demand forecasting models, fail to correctly predict the managed lane demand and the associated system performance. The present study demonstrates the power of the more advanced modeling approach of dynamic traffic assignment (DTA), as well as the shortcomings of conventional approaches, when used to model managed lanes in congested environments. In addition, the study develops processes to support an effective utilization of DTA to model managed lane operations. Static and dynamic traffic assignments consist of demand, network, and route choice model components that need to be calibrated. These components interact with each other, and an iterative method for calibrating them is needed. In this study, an effective standalone framework that combines static demand estimation and dynamic traffic assignment has been developed to replicate real-world traffic conditions. With advances in traffic surveillance technologies collecting, archiving, and analyzing traffic data is becoming more accessible and affordable. The present study shows how data from multiple sources can be integrated, validated, and best used in different stages of modeling and calibration of managed lanes. Extensive and careful processing of demand, traffic, and toll data, as well as proper definition of performance measures, result in a calibrated and stable model, which closely replicates real-world congestion patterns, and can reasonably respond to perturbations in network and demand properties.
Resumo:
In recent years, IoT technology has radically transformed many crucial industrial and service sectors such as healthcare. The multi-facets heterogeneity of the devices and the collected information provides important opportunities to develop innovative systems and services. However, the ubiquitous presence of data silos and the poor semantic interoperability in the IoT landscape constitute a significant obstacle in the pursuit of this goal. Moreover, achieving actionable knowledge from the collected data requires IoT information sources to be analysed using appropriate artificial intelligence techniques such as automated reasoning. In this thesis work, Semantic Web technologies have been investigated as an approach to address both the data integration and reasoning aspect in modern IoT systems. In particular, the contributions presented in this thesis are the following: (1) the IoT Fitness Ontology, an OWL ontology that has been developed in order to overcome the issue of data silos and enable semantic interoperability in the IoT fitness domain; (2) a Linked Open Data web portal for collecting and sharing IoT health datasets with the research community; (3) a novel methodology for embedding knowledge in rule-defined IoT smart home scenarios; and (4) a knowledge-based IoT home automation system that supports a seamless integration of heterogeneous devices and data sources.