901 resultados para = least-squares fit to flow-through data
Resumo:
We study the problem of testing the error distribution in a multivariate linear regression (MLR) model. The tests are functions of appropriately standardized multivariate least squares residuals whose distribution is invariant to the unknown cross-equation error covariance matrix. Empirical multivariate skewness and kurtosis criteria are then compared to simulation-based estimate of their expected value under the hypothesized distribution. Special cases considered include testing multivariate normal, Student t; normal mixtures and stable error models. In the Gaussian case, finite-sample versions of the standard multivariate skewness and kurtosis tests are derived. To do this, we exploit simple, double and multi-stage Monte Carlo test methods. For non-Gaussian distribution families involving nuisance parameters, confidence sets are derived for the the nuisance parameters and the error distribution. The procedures considered are evaluated in a small simulation experi-ment. Finally, the tests are applied to an asset pricing model with observable risk-free rates, using monthly returns on New York Stock Exchange (NYSE) portfolios over five-year subperiods from 1926-1995.
Resumo:
It is well known that there is a dynamic relationship between cerebral blood flow (CBF) and cerebral blood volume (CBV). With increasing applications of functional MRI, where the blood oxygen-level-dependent signals are recorded, the understanding and accurate modeling of the hemodynamic relationship between CBF and CBV becomes increasingly important. This study presents an empirical and data-based modeling framework for model identification from CBF and CBV experimental data. It is shown that the relationship between the changes in CBF and CBV can be described using a parsimonious autoregressive with exogenous input model structure. It is observed that neither the ordinary least-squares (LS) method nor the classical total least-squares (TLS) method can produce accurate estimates from the original noisy CBF and CBV data. A regularized total least-squares (RTLS) method is thus introduced and extended to solve such an error-in-the-variables problem. Quantitative results show that the RTLS method works very well on the noisy CBF and CBV data. Finally, a combination of RTLS with a filtering method can lead to a parsimonious but very effective model that can characterize the relationship between the changes in CBF and CBV.
Resumo:
4-Dimensional Variational Data Assimilation (4DVAR) assimilates observations through the minimisation of a least-squares objective function, which is constrained by the model flow. We refer to 4DVAR as strong-constraint 4DVAR (sc4DVAR) in this thesis as it assumes the model is perfect. Relaxing this assumption gives rise to weak-constraint 4DVAR (wc4DVAR), leading to a different minimisation problem with more degrees of freedom. We consider two wc4DVAR formulations in this thesis, the model error formulation and state estimation formulation. The 4DVAR objective function is traditionally solved using gradient-based iterative methods. The principle method used in Numerical Weather Prediction today is the Gauss-Newton approach. This method introduces a linearised `inner-loop' objective function, which upon convergence, updates the solution of the non-linear `outer-loop' objective function. This requires many evaluations of the objective function and its gradient, which emphasises the importance of the Hessian. The eigenvalues and eigenvectors of the Hessian provide insight into the degree of convexity of the objective function, while also indicating the difficulty one may encounter while iterative solving 4DVAR. The condition number of the Hessian is an appropriate measure for the sensitivity of the problem to input data. The condition number can also indicate the rate of convergence and solution accuracy of the minimisation algorithm. This thesis investigates the sensitivity of the solution process minimising both wc4DVAR objective functions to the internal assimilation parameters composing the problem. We gain insight into these sensitivities by bounding the condition number of the Hessians of both objective functions. We also precondition the model error objective function and show improved convergence. We show that both formulations' sensitivities are related to error variance balance, assimilation window length and correlation length-scales using the bounds. We further demonstrate this through numerical experiments on the condition number and data assimilation experiments using linear and non-linear chaotic toy models.
Resumo:
When missing data occur in studies designed to compare the accuracy of diagnostic tests, a common, though naive, practice is to base the comparison of sensitivity, specificity, as well as of positive and negative predictive values on some subset of the data that fits into methods implemented in standard statistical packages. Such methods are usually valid only under the strong missing completely at random (MCAR) assumption and may generate biased and less precise estimates. We review some models that use the dependence structure of the completely observed cases to incorporate the information of the partially categorized observations into the analysis and show how they may be fitted via a two-stage hybrid process involving maximum likelihood in the first stage and weighted least squares in the second. We indicate how computational subroutines written in R may be used to fit the proposed models and illustrate the different analysis strategies with observational data collected to compare the accuracy of three distinct non-invasive diagnostic methods for endometriosis. The results indicate that even when the MCAR assumption is plausible, the naive partial analyses should be avoided.
Resumo:
The code STATFLUX, implementing a new and simple statistical procedure for the calculation of transfer coefficients in radionuclide transport to animals and plants, is proposed. The method is based on the general multiple-compartment model, which uses a system of linear equations involving geometrical volume considerations. Flow parameters were estimated by employing two different least-squares procedures: Derivative and Gauss-Marquardt methods, with the available experimental data of radionuclide concentrations as the input functions of time. The solution of the inverse problem, which relates a given set of flow parameter with the time evolution of concentration functions, is achieved via a Monte Carlo Simulation procedure.Program summaryTitle of program: STATFLUXCatalogue identifier: ADYS_v1_0Program summary URL: http://cpc.cs.qub.ac.uk/summaries/ADYS_v1_0Program obtainable from: CPC Program Library, Queen's University of Belfast, N. IrelandLicensing provisions: noneComputer for which the program is designed and others on which it has been tested: Micro-computer with Intel Pentium III, 3.0 GHzInstallation: Laboratory of Linear Accelerator, Department of Experimental Physics, University of São Paulo, BrazilOperating system: Windows 2000 and Windows XPProgramming language used: Fortran-77 as implemented in Microsoft Fortran 4.0. NOTE: Microsoft Fortran includes non-standard features which are used in this program. Standard Fortran compilers such as, g77, f77, ifort and NAG95, are not able to compile the code and therefore it has not been possible for the CPC Program Library to test the program.Memory, required to execute with typical data: 8 Mbytes of RAM memory and 100 MB of Hard disk memoryNo. of bits in a word: 16No. of lines in distributed program, including test data, etc.: 6912No. of bytes in distributed Program, including test data, etc.: 229 541Distribution format: tar.gzNature of the physical problem: the investigation of transport mechanisms for radioactive substances, through environmental pathways, is very important for radiological protection of populations. One such pathway, associated with the food chain, is the grass-animal-man sequence. The distribution of trace elements in humans and laboratory animals has been intensively studied over the past 60 years [R.C. Pendlenton, C.W. Mays, R.D. Lloyd, A.L. Brooks, Differential accumulation of iodine-131 from local fallout in people and milk, Health Phys. 9 (1963) 1253-1262]. In addition, investigations on the incidence of cancer in humans, and a possible causal relationship to radioactive fallout, have been undertaken [E.S. Weiss, M.L. Rallison, W.T. London, W.T. Carlyle Thompson, Thyroid nodularity in southwestern Utah school children exposed to fallout radiation, Amer. J. Public Health 61 (1971) 241-249; M.L. Rallison, B.M. Dobyns, F.R. Keating, J.E. Rall, F.H. Tyler, Thyroid diseases in children, Amer. J. Med. 56 (1974) 457-463; J.L. Lyon, M.R. Klauber, J.W. Gardner, K.S. Udall, Childhood leukemia associated with fallout from nuclear testing, N. Engl. J. Med. 300 (1979) 397-402]. From the pathways of entry of radionuclides in the human (or animal) body, ingestion is the most important because it is closely related to life-long alimentary (or dietary) habits. Those radionuclides which are able to enter the living cells by either metabolic or other processes give rise to localized doses which can be very high. The evaluation of these internally localized doses is of paramount importance for the assessment of radiobiological risks and radiological protection. The time behavior of trace concentration in organs is the principal input for prediction of internal doses after acute or chronic exposure. The General Multiple-Compartment Model (GMCM) is the powerful and more accepted method for biokinetical studies, which allows the calculation of concentration of trace elements in organs as a function of time, when the flow parameters of the model are known. However, few biokinetics data exist in the literature, and the determination of flow and transfer parameters by statistical fitting for each system is an open problem.Restriction on the complexity of the problem: This version of the code works with the constant volume approximation, which is valid for many situations where the biological half-live of a trace is lower than the volume rise time. Another restriction is related to the central flux model. The model considered in the code assumes that exist one central compartment (e.g., blood), that connect the flow with all compartments, and the flow between other compartments is not included.Typical running time: Depends on the choice for calculations. Using the Derivative Method the time is very short (a few minutes) for any number of compartments considered. When the Gauss-Marquardt iterative method is used the calculation time can be approximately 5-6 hours when similar to 15 compartments are considered. (C) 2006 Elsevier B.V. All rights reserved.
Resumo:
Generalized linear mixed models (GLMMs) provide an elegant framework for the analysis of correlated data. Due to the non-closed form of the likelihood, GLMMs are often fit by computational procedures like penalized quasi-likelihood (PQL). Special cases of these models are generalized linear models (GLMs), which are often fit using algorithms like iterative weighted least squares (IWLS). High computational costs and memory space constraints often make it difficult to apply these iterative procedures to data sets with very large number of cases. This paper proposes a computationally efficient strategy based on the Gauss-Seidel algorithm that iteratively fits sub-models of the GLMM to subsetted versions of the data. Additional gains in efficiency are achieved for Poisson models, commonly used in disease mapping problems, because of their special collapsibility property which allows data reduction through summaries. Convergence of the proposed iterative procedure is guaranteed for canonical link functions. The strategy is applied to investigate the relationship between ischemic heart disease, socioeconomic status and age/gender category in New South Wales, Australia, based on outcome data consisting of approximately 33 million records. A simulation study demonstrates the algorithm's reliability in analyzing a data set with 12 million records for a (non-collapsible) logistic regression model.
Resumo:
Distributed Brillouin sensing of strain and temperature works by making spatially resolved measurements of the position of the measurand-dependent extremum of the resonance curve associated with the scattering process in the weakly nonlinear regime. Typically, measurements of backscattered Stokes intensity (the dependent variable) are made at a number of predetermined fixed frequencies covering the design measurand range of the apparatus and combined to yield an estimate of the position of the extremum. The measurand can then be found because its relationship to the position of the extremum is assumed known. We present analytical expressions relating the relative error in the extremum position to experimental errors in the dependent variable. This is done for two cases: (i) a simple non-parametric estimate of the mean based on moments and (ii) the case in which a least squares technique is used to fit a Lorentzian to the data. The question of statistical bias in the estimates is discussed and in the second case we go further and present for the first time a general method by which the probability density function (PDF) of errors in the fitted parameters can be obtained in closed form in terms of the PDFs of the errors in the noisy data.
Resumo:
Distributed Brillouin sensing of strain and temperature works by making spatially resolved measurements of the position of the measurand-dependent extremum of the resonance curve associated with the scattering process in the weakly nonlinear regime. Typically, measurements of backscattered Stokes intensity (the dependent variable) are made at a number of predetermined fixed frequencies covering the design measurand range of the apparatus and combined to yield an estimate of the position of the extremum. The measurand can then be found because its relationship to the position of the extremum is assumed known. We present analytical expressions relating the relative error in the extremum position to experimental errors in the dependent variable. This is done for two cases: (i) a simple non-parametric estimate of the mean based on moments and (ii) the case in which a least squares technique is used to fit a Lorentzian to the data. The question of statistical bias in the estimates is discussed and in the second case we go further and present for the first time a general method by which the probability density function (PDF) of errors in the fitted parameters can be obtained in closed form in terms of the PDFs of the errors in the noisy data.
Resumo:
High-speed videokeratoscopy is an emerging technique that enables study of the corneal surface and tear-film dynamics. Unlike its static predecessor, this new technique results in a very large amount of digital data for which storage needs become significant. We aimed to design a compression technique that would use mathematical functions to parsimoniously fit corneal surface data with a minimum number of coefficients. Since the Zernike polynomial functions that have been traditionally used for modeling corneal surfaces may not necessarily correctly represent given corneal surface data in terms of its optical performance, we introduced the concept of Zernike polynomial-based rational functions. Modeling optimality criteria were employed in terms of both the rms surface error as well as the point spread function cross-correlation. The parameters of approximations were estimated using a nonlinear least-squares procedure based on the Levenberg-Marquardt algorithm. A large number of retrospective videokeratoscopic measurements were used to evaluate the performance of the proposed rational-function-based modeling approach. The results indicate that the rational functions almost always outperform the traditional Zernike polynomial approximations with the same number of coefficients.
Resumo:
Data flow analysis techniques can be used to help assess threats to data confidentiality and integrity in security critical program code. However, a fundamental weakness of static analysis techniques is that they overestimate the ways in which data may propagate at run time. Discounting large numbers of these false-positive data flow paths wastes an information security evaluator's time and effort. Here we show how to automatically eliminate some false-positive data flow paths by precisely modelling how classified data is blocked by certain expressions in embedded C code. We present a library of detailed data flow models of individual expression elements and an algorithm for introducing these components into conventional data flow graphs. The resulting models can be used to accurately trace byte-level or even bit-level data flow through expressions that are normally treated as atomic. This allows us to identify expressions that safely downgrade their classified inputs and thereby eliminate false-positive data flow paths from the security evaluation process. To validate the approach we have implemented and tested it in an existing data flow analysis toolkit.
Resumo:
This study proceeds from a central interest in the importance of systematically evaluating operational large-scale integrated information systems (IS) in organisations. The study is conducted within the IS-Impact Research Track at Queensland University of Technology (QUT). The goal of the IS-Impact Track is, "to develop the most widely employed model for benchmarking information systems in organizations for the joint benefit of both research and practice" (Gable et al, 2009). The track espouses programmatic research having the principles of incrementalism, tenacity, holism and generalisability through replication and extension research strategies. Track efforts have yielded the bicameral IS-Impact measurement model; the ‘impact’ half includes Organisational-Impact and Individual-Impact dimensions; the ‘quality’ half includes System-Quality and Information-Quality dimensions. Akin to Gregor’s (2006) analytic theory, the ISImpact model is conceptualised as a formative, multidimensional index and is defined as "a measure at a point in time, of the stream of net benefits from the IS, to date and anticipated, as perceived by all key-user-groups" (Gable et al., 2008, p: 381). The study adopts the IS-Impact model (Gable, et al., 2008) as its core theory base. Prior work within the IS-Impact track has been consciously constrained to Financial IS for their homogeneity. This study adopts a context-extension strategy (Berthon et al., 2002) with the aim "to further validate and extend the IS-Impact measurement model in a new context - i.e. a different IS - Human Resources (HR)". The overarching research question is: "How can the impacts of large-scale integrated HR applications be effectively and efficiently benchmarked?" This managerial question (Cooper & Emory, 1995) decomposes into two more specific research questions – In the new HR context: (RQ1): "Is the IS-Impact model complete?" (RQ2): "Is the ISImpact model valid as a 1st-order formative, 2nd-order formative multidimensional construct?" The study adhered to the two-phase approach of Gable et al. (2008) to hypothesise and validate a measurement model. The initial ‘exploratory phase’ employed a zero base qualitative approach to re-instantiating the IS-Impact model in the HR context. The subsequent ‘confirmatory phase’ sought to validate the resultant hypothesised measurement model against newly gathered quantitative data. The unit of analysis for the study is the application, ‘ALESCO’, an integrated large-scale HR application implemented at Queensland University of Technology (QUT), a large Australian university (with approximately 40,000 students and 5000 staff). Target respondents of both study phases were ALESCO key-user-groups: strategic users, management users, operational users and technical users, who directly use ALESCO or its outputs. An open-ended, qualitative survey was employed in the exploratory phase, with the objective of exploring the completeness and applicability of the IS-Impact model’s dimensions and measures in the new context, and to conceptualise any resultant model changes to be operationalised in the confirmatory phase. Responses from 134 ALESCO users to the main survey question, "What do you consider have been the impacts of the ALESCO (HR) system in your division/department since its implementation?" were decomposed into 425 ‘impact citations.’ Citation mapping using a deductive (top-down) content analysis approach instantiated all dimensions and measures of the IS-Impact model, evidencing its content validity in the new context. Seeking to probe additional (perhaps negative) impacts; the survey included the additional open question "In your opinion, what can be done better to improve the ALESCO (HR) system?" Responses to this question decomposed into a further 107 citations which in the main did not map to IS-Impact, but rather coalesced around the concept of IS-Support. Deductively drawing from relevant literature, and working inductively from the unmapped citations, the new ‘IS-Support’ construct, including the four formative dimensions (i) training, (ii) documentation, (iii) assistance, and (iv) authorisation (each having reflective measures), was defined as: "a measure at a point in time, of the support, the [HR] information system key-user groups receive to increase their capabilities in utilising the system." Thus, a further goal of the study became validation of the IS-Support construct, suggesting the research question (RQ3): "Is IS-Support valid as a 1st-order reflective, 2nd-order formative multidimensional construct?" With the aim of validating IS-Impact within its nomological net (identification through structural relations), as in prior work, Satisfaction was hypothesised as its immediate consequence. The IS-Support construct having derived from a question intended to probe IS-Impacts, too was hypothesised as antecedent to Satisfaction, thereby suggesting the research question (RQ4): "What is the relative contribution of IS-Impact and IS-Support to Satisfaction?" With the goal of testing the above research questions, IS-Impact, IS-Support and Satisfaction were operationalised in a quantitative survey instrument. Partial least squares (PLS) structural equation modelling employing 221 valid responses largely evidenced the validity of the commencing IS-Impact model in the HR context. ISSupport too was validated as operationalised (including 11 reflective measures of its 4 formative dimensions). IS-Support alone explained 36% of Satisfaction; IS-Impact alone 70%; in combination both explaining 71% with virtually all influence of ISSupport subsumed by IS-Impact. Key study contributions to research include: (1) validation of IS-Impact in the HR context, (2) validation of a newly conceptualised IS-Support construct as important antecedent of Satisfaction, and (3) validation of the redundancy of IS-Support when gauging IS-Impact. The study also makes valuable contributions to practice, the research track and the sponsoring organisation.
Resumo:
Abstract An experimental dataset representing a typical flow field in a stormwater gross pollutant trap (GPT) was visualised. A technique was developed to apply the image-based flow visualisation (IBFV) algorithm to the raw dataset. Particle image velocimetry (PIV) software was previously used to capture the flow field data by tracking neutrally buoyant particles with a high speed camera. The dataset consisted of scattered 2D point velocity vectors and the IBFV visualisation facilitates flow feature characterisation within the GPT. The flow features played a pivotal role in understanding stormwater pollutant capture and retention behaviour within the GPT. It was found that the IBFV animations revealed otherwise unnoticed flow features and experimental artefacts. For example, a circular tracer marker in the IBFV program visually highlighted streamlines to investigate the possible flow paths of pollutants entering the GPT. The investigated flow paths were compared with the behaviour of pollutants monitored during experiments.
Resumo:
This paper describes a safety data recording and analysis system that has been developed to capture safety occurrences including precursors using high-definition forward-facing video from train cabs and data from other train-borne systems. The paper describes the data processing model and how events detected through data analysis are related to an underlying socio-technical model of accident causation. The integrated approach to safety data recording and analysis insures systemic factors that condition, influence or potentially contribute to an occurrence are captured both for safety occurrences and precursor events, providing a rich tapestry of antecedent causal factors that can significantly improve learning around accident causation. This can ultimately provide benefit to railways through the development of targeted and more effective countermeasures, better risk models and more effective use and prioritization of safety funds. Level crossing occurrences are a key focus in this paper with data analysis scenarios describing causal factors around near-miss occurrences. The paper concludes with a discussion on how the system can also be applied to other types of railway safety occurrences.
Resumo:
This paper offers numerical modelling of a waste heat recovery system. A thin layer of metal foam is attached to a cold plate to absorb heat from hot gases leaving the system. The heat transferred from the exhaust gas is then transferred to a cold liquid flowing in a secondary loop. Two different foam PPI (Pores Per Inch) values are examined over a range of fluid velocities. Numerical results are then compared to both experimental data and theoretical results available in the literature. Challenges in getting the simulation results to match those of the experiments are addressed and discussed in detail. In particular, interface boundary conditions specified between a porous layer and a fluid layer are investigated. While physically one expects much lower fluid velocity in the pores compared to that of free flow, capturing this sharp gradient at the interface can add to the difficulties of numerical simulation. The existing models in the literature are modified by considering the pressure gradient inside and outside the foam. Comparisons against the numerical modelling are presented. Finally, based on experimentally-validated numerical results, thermo-hydraulic performance of foam heat exchangers as waste heat recovery units is discussed with the main goal of reducing the excess pressure drop and maximising the amount of heat that can be recovered from the hot gas stream.
Resumo:
As a result of the more distributed nature of organisations and the inherently increasing complexity of their business processes, a significant effort is required for the specification and verification of those processes. The composition of the activities into a business process that accomplishes a specific organisational goal has primarily been a manual task. Automated planning is a branch of artificial intelligence (AI) in which activities are selected and organised by anticipating their expected outcomes with the aim of achieving some goal. As such, automated planning would seem to be a natural fit to the BPM domain to automate the specification of control flow. A number of attempts have been made to apply automated planning to the business process and service composition domain in different stages of the BPM lifecycle. However, a unified adoption of these techniques throughout the BPM lifecycle is missing. As such, we propose a new intention-centric BPM paradigm, which aims on minimising the specification effort by exploiting automated planning techniques to achieve a pre-stated goal. This paper provides a vision on the future possibilities of enhancing BPM using automated planning. A research agenda is presented, which provides an overview of the opportunities and challenges for the exploitation of automated planning in BPM.