406 resultados para Optimal Portfolio Selection
Resumo:
We investigate methods for data-based selection of working covariance models in the analysis of correlated data with generalized estimating equations. We study two selection criteria: Gaussian pseudolikelihood and a geodesic distance based on discrepancy between model-sensitive and model-robust regression parameter covariance estimators. The Gaussian pseudolikelihood is found in simulation to be reasonably sensitive for several response distributions and noncanonical mean-variance relations for longitudinal data. Application is also made to a clinical dataset. Assessment of adequacy of both correlation and variance models for longitudinal data should be routine in applications, and we describe open-source software supporting this practice.
Resumo:
A modeling paradigm is proposed for covariate, variance and working correlation structure selection for longitudinal data analysis. Appropriate selection of covariates is pertinent to correct variance modeling and selecting the appropriate covariates and variance function is vital to correlation structure selection. This leads to a stepwise model selection procedure that deploys a combination of different model selection criteria. Although these criteria find a common theoretical root based on approximating the Kullback-Leibler distance, they are designed to address different aspects of model selection and have different merits and limitations. For example, the extended quasi-likelihood information criterion (EQIC) with a covariance penalty performs well for covariate selection even when the working variance function is misspecified, but EQIC contains little information on correlation structures. The proposed model selection strategies are outlined and a Monte Carlo assessment of their finite sample properties is reported. Two longitudinal studies are used for illustration.
Resumo:
Sampling strategies are developed based on the idea of ranked set sampling (RSS) to increase efficiency and therefore to reduce the cost of sampling in fishery research. The RSS incorporates information on concomitant variables that are correlated with the variable of interest in the selection of samples. For example, estimating a monitoring survey abundance index would be more efficient if the sampling sites were selected based on the information from previous surveys or catch rates of the fishery. We use two practical fishery examples to demonstrate the approach: site selection for a fishery-independent monitoring survey in the Australian northern prawn fishery (NPF) and fish age prediction by simple linear regression modelling a short-lived tropical clupeoid. The relative efficiencies of the new designs were derived analytically and compared with the traditional simple random sampling (SRS). Optimal sampling schemes were measured by different optimality criteria. For the NPF monitoring survey, the efficiency in terms of variance or mean squared errors of the estimated mean abundance index ranged from 114 to 199% compared with the SRS. In the case of a fish ageing study for Tenualosa ilisha in Bangladesh, the efficiency of age prediction from fish body weight reached 140%.
Resumo:
This paper considers the one-sample sign test for data obtained from general ranked set sampling when the number of observations for each rank are not necessarily the same, and proposes a weighted sign test because observations with different ranks are not identically distributed. The optimal weight for each observation is distribution free and only depends on its associated rank. It is shown analytically that (1) the weighted version always improves the Pitman efficiency for all distributions; and (2) the optimal design is to select the median from each ranked set.
Resumo:
Consider a general regression model with an arbitrary and unknown link function and a stochastic selection variable that determines whether the outcome variable is observable or missing. The paper proposes U-statistics that are based on kernel functions as estimators for the directions of the parameter vectors in the link function and the selection equation, and shows that these estimators are consistent and asymptotically normal.
Resumo:
Yao, Begg, and Livingston (1996, Biometrics 52, 992-1001) considered the optimal group size for testing a series of potentially therapeutic agents to identify a promising one as soon as possible for given error rates. The number of patients to be tested with each agent was fixed as the group size. We consider a sequential design that allows early acceptance and rejection, and we provide an optimal strategy to minimize the sample sizes (patients) required using Markov decision processes. The minimization is under the constraints of the two types (false positive and false negative) of error probabilities, with the Lagrangian multipliers corresponding to the cost parameters for the two types of errors. Numerical studies indicate that there can be a substantial reduction in the number of patients required.
Resumo:
Efficiency of analysis using generalized estimation equations is enhanced when intracluster correlation structure is accurately modeled. We compare two existing criteria (a quasi-likelihood information criterion, and the Rotnitzky-Jewell criterion) to identify the true correlation structure via simulations with Gaussian or binomial response, covariates varying at cluster or observation level, and exchangeable or AR(l) intracluster correlation structure. Rotnitzky and Jewell's approach performs better when the true intracluster correlation structure is exchangeable, while the quasi-likelihood criteria performs better for an AR(l) structure.
Resumo:
Several articles in this journal have studied optimal designs for testing a series of treatments to identify promising ones for further study. These designs formulate testing as an ongoing process until a promising treatment is identified. This formulation is considered to be more realistic but substantially increases the computational complexity. In this article, we show that these new designs, which control the error rates for a series of treatments, can be reformulated as conventional designs that control the error rates for each individual treatment. This reformulation leads to a more meaningful interpretation of the error rates and hence easier specification of the error rates in practice. The reformulation also allows us to use conventional designs from published tables or standard computer programs to design trials for a series of treatments. We illustrate these using a study in soft tissue sarcoma.
Resumo:
Australia is the world’s third largest exporter of raw sugar after Brazil and Thailand, with around $2.0 billion in export earnings. Transport systems play a vital role in the raw sugar production process by transporting the sugarcane crop between farms and mills. In 2013, 87 per cent of sugarcane was transported to mills by cane railway. The total cost of sugarcane transport operations is very high. Over 35% of the total cost of sugarcane production in Australia is incurred in cane transport. A cane railway network mainly involves single track sections and multiple track sections used as passing loops or sidings. The cane railway system performs two main tasks: delivering empty bins from the mill to the sidings for filling by harvesters; and collecting the full bins of cane from the sidings and transporting them to the mill. A typical locomotive run involves an empty train (locomotive and empty bins) departing from the mill, traversing some track sections and delivering bins at specified sidings. The locomotive then, returns to the mill, traversing the same track sections in reverse order, collecting full bins along the way. In practice, a single track section can be occupied by only one train at a time, while more than one train can use a passing loop (parallel sections) at a time. The sugarcane transport system is a complex system that includes a large number of variables and elements. These elements work together to achieve the main system objectives of satisfying both mill and harvester requirements and improving the efficiency of the system in terms of low overall costs. These costs include delay, congestion, operating and maintenance costs. An effective cane rail scheduler will assist the traffic officers at the mill to keep a continuous supply of empty bins to harvesters and full bins to the mill with a minimum cost. This paper addresses the cane rail scheduling problem under rail siding capacity constraints where limited and unlimited siding capacities were investigated with different numbers of trains and different train speeds. The total operating time as a function of the number of trains, train shifts and a limited number of cane bins have been calculated for the different siding capacity constraints. A mathematical programming approach has been used to develop a new scheduler for the cane rail transport system under limited and unlimited constraints. The new scheduler aims to reduce the total costs associated with the cane rail transport system that are a function of the number of bins and total operating costs. The proposed metaheuristic techniques have been used to find near optimal solutions of the cane rail scheduling problem and provide different possible solutions to avoid being stuck in local optima. A numerical investigation and sensitivity analysis study is presented to demonstrate that high quality solutions for large scale cane rail scheduling problems are obtainable in a reasonable time. Keywords: Cane railway, mathematical programming, capacity, metaheuristics
Resumo:
Movement of tephritid flies underpins their survival, reproduction, and ability to establish in new areas and is thus of importance when designing effective management strategies. Much of the knowledge currently available on tephritid movement throughout landscapes comes from the use of direct or indirect methods that rely on the trapping of individuals. Here, we review published experimental designs and methods from mark-release-recapture (MRR) studies, as well as other methods, that have been used to estimate movement of the four major tephritid pest genera (Bactrocera, Ceratitis, Anastrepha, and Rhagoletis). In doing so, we aim to illustrate the theoretical and practical considerations needed to study tephritid movement. MRR studies make use of traps to directly estimate the distance that tephritid species can move within a generation and to evaluate the ecological and physiological factors that influence dispersal patterns. MRR studies, however, require careful planning to ensure that the results obtained are not biased by the methods employed, including marking methods, trap properties, trap spacing, and spatial extent of the trapping array. Despite these obstacles, MRR remains a powerful tool for determining tephritid movement, with data particularly required for understudied species that affect developing countries. To ensure that future MRR studies are successful, we suggest that site selection be carefully considered and sufficient resources be allocated to achieve optimal spacing and placement of traps in line with the stated aims of each study. An alternative to MRR is to make use of indirect methods for determining movement, or more correctly, gene flow, which have become widely available with the development of molecular tools. Key to these methods is the trapping and sequencing of a suitable number of individuals to represent the genetic diversity of the sampled population and investigate population structuring using nuclear genomic markers or non-recombinant mitochondrial DNA markers. Microsatellites are currently the preferred marker for detecting recent population displacement and provide genetic information that may be used in assignment tests for the direct determination of contemporary movement. Neither MRR nor molecular methods, however, are able to monitor fine-scale movements of individual flies. Recent developments in the miniaturization of electronics offer the tantalising possibility to track individual movements of insects using harmonic radar. Computer vision and radio frequency identification tags may also permit the tracking of fine-scale movements by tephritid flies by automated resampling, although these methods come with the same problems as traditional traps used in MRR studies. Although all methods described in this chapter have limitations, a better understanding of tephritid movement far outweighs the drawbacks of the individual methods because of the need for this information to manage tephritid populations.
Resumo:
Typically only a limited number of consortiums are able to competitively bid for Public Private Partnership (PPP) projects. Consequently, this may lead to oligopoly pricing constraints and ineffective competition, thus engendering ex ante market failure. In addressing this issue, this paper aims to determine the optimal number of bidders required to ensure a healthy level of competition is available to procure major infrastructure projects. The theories of Structure-Conduct-Performance (SCP) paradigm; Game Theory and Auction Theory and Transaction Cost Economics are reviewed and discussed and used to produce an optimal level of competition for major infrastructure procurement, that prevents market failure ex ante (lack of competition) and market failure ex post (due to asymmetric lock-in).
Resumo:
The quality of species distribution models (SDMs) relies to a large degree on the quality of the input data, from bioclimatic indices to environmental and habitat descriptors (Austin, 2002). Recent reviews of SDM techniques, have sought to optimize predictive performance e.g. Elith et al., 2006. In general SDMs employ one of three approaches to variable selection. The simplest approach relies on the expert to select the variables, as in environmental niche models Nix, 1986 or a generalized linear model without variable selection (Miller and Franklin, 2002). A second approach explicitly incorporates variable selection into model fitting, which allows examination of particular combinations of variables. Examples include generalized linear or additive models with variable selection (Hastie et al. 2002); or classification trees with complexity or model based pruning (Breiman et al., 1984, Zeileis, 2008). A third approach uses model averaging, to summarize the overall contribution of a variable, without considering particular combinations. Examples include neural networks, boosted or bagged regression trees and Maximum Entropy as compared in Elith et al. 2006. Typically, users of SDMs will either consider a small number of variable sets, via the first approach, or else supply all of the candidate variables (often numbering more than a hundred) to the second or third approaches. Bayesian SDMs exist, with several methods for eliciting and encoding priors on model parameters (see review in Low Choy et al. 2010). However few methods have been published for informative variable selection; one example is Bayesian trees (O’Leary 2008). Here we report an elicitation protocol that helps makes explicit a priori expert judgements on the quality of candidate variables. This protocol can be flexibly applied to any of the three approaches to variable selection, described above, Bayesian or otherwise. We demonstrate how this information can be obtained then used to guide variable selection in classical or machine learning SDMs, or to define priors within Bayesian SDMs.
Resumo:
We carried out a discriminant analysis with identity by descent (IBD) at each marker as inputs, and the sib pair type (affected-affected versus affected-unaffected) as the output. Using simple logistic regression for this discriminant analysis, we illustrate the importance of comparing models with different number of parameters. Such model comparisons are best carried out using either the Akaike information criterion (AIC) or the Bayesian information criterion (BIC). When AIC (or BIC) stepwise variable selection was applied to the German Asthma data set, a group of markers were selected which provide the best fit to the data (assuming an additive effect). Interestingly, these 25-26 markers were not identical to those with the highest (in magnitude) single-locus lod scores.
Resumo:
This research is a step forward in discovering knowledge from databases of complex structure like tree or graph. Several data mining algorithms are developed based on a novel representation called Balanced Optimal Search for extracting implicit, unknown and potentially useful information like patterns, similarities and various relationships from tree data, which are also proved to be advantageous in analysing big data. This thesis focuses on analysing unordered tree data, which is robust to data inconsistency, irregularity and swift information changes, hence, in the era of big data it becomes a popular and widely used data model.
Resumo:
Work ability describes employees' capability to carry out their work with respect to physical and psychological job demands. This study investigated direct and interactive effects of age, job control, and the use of successful aging strategies called selection, optimization, and compensation (SOC) in predicting work ability. We assessed SOC strategies and job control by using employee self-reports, and we measured employees' work ability using supervisor ratings. Data collected from 173 health-care employees showed that job control was positively associated with work ability. Additionally, we found a three-way interaction effect of age, job control, and use of SOC strategies on work ability. Specifically, the negative relationship between age and work ability was weakest for employees with high job control and high use of SOC strategies. These results suggest that the use of successful aging strategies and enhanced control at work are conducive to maintaining the work ability of aging employees. We discuss theoretical and practical implications regarding the beneficial role of the use of SOC strategies utilized by older employees and enhanced contextual resources at work for aging employees.