Digital Library

7 results for binary data

in BORIS: Bern Open Repository and Information System - Bern - Switzerland

Efficient binary classification of large data sets

Relevance:

40.00%

Index Tracking Using Data-Mining Techniques and Mixed-Binary Linear Programming

Relevance:

40.00%

Abstract:

Index tracking has become one of the most common strategies in asset management. The index-tracking problem consists of constructing a portfolio that replicates the future performance of an index by including only a subset of the index constituents in the portfolio. Finding the most representative subset is challenging when the number of stocks in the index is large. We introduce a new three-stage approach that at first identifies promising subsets by employing data-mining techniques, then determines the stock weights in the subsets using mixed-binary linear programming, and finally evaluates the subsets based on cross validation. The best subset is returned as the tracking portfolio. Our approach outperforms state-of-the-art methods in terms of out-of-sample performance and running times.

Index Tracking Using Data-Mining Techniques and Mixed-Binary Linear Programming

Relevance:

40.00%

A modified test for small-study effects in meta-analyses of controlled trials with binary endpoints

Relevance:

30.00%

Abstract:

Publication bias and related bias in meta-analysis is often examined by visually checking for asymmetry in funnel plots of treatment effect against its standard error. Formal statistical tests of funnel plot asymmetry have been proposed, but when applied to binary outcome data these can give false-positive rates that are higher than the nominal level in some situations (large treatment effects, or few events per trial, or all trials of similar sizes). We develop a modified linear regression test for funnel plot asymmetry based on the efficient score and its variance, Fisher's information. The performance of this test is compared to the other proposed tests in simulation analyses based on the characteristics of published controlled trials. When there is little or no between-trial heterogeneity, this modified test has a false-positive rate close to the nominal level while maintaining similar power to the original linear regression test ('Egger' test). When the degree of between-trial heterogeneity is large, none of the tests that have been proposed has uniformly good properties.
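
For orientation, the original linear regression ('Egger') test that the abstract uses as its baseline can be sketched as follows. This is not the modified score-based test the paper proposes. The sketch regresses each trial's standardized effect (effect divided by its standard error) on its precision (one over the standard error) and examines the intercept. The function name, argument names and example numbers are hypothetical.

```python
import math


def egger_intercept(effects, std_errors):
    """Sketch of the classic Egger regression for funnel plot asymmetry.

    effects    -- per-trial effect estimates (e.g. log odds ratios)
    std_errors -- their standard errors (all > 0)
    Returns (intercept, se_of_intercept, t_statistic, degrees_of_freedom).
    """
    n = len(effects)
    if n < 3 or n != len(std_errors):
        raise ValueError("need at least 3 trials with matching standard errors")

    # Precision is the regressor, the standardized effect is the response.
    x = [1.0 / s for s in std_errors]
    y = [e / s for e, s in zip(effects, std_errors)]

    # Ordinary least squares for a single regressor.
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Residual variance with n - 2 degrees of freedom.
    residuals = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    s2 = sum(r * r for r in residuals) / (n - 2)
    se_intercept = math.sqrt(s2 * (1.0 / n + mean_x ** 2 / sxx))

    # The t statistic is undefined when the fit is perfect.
    t_stat = intercept / se_intercept if se_intercept > 0 else float("nan")
    return intercept, se_intercept, t_stat, n - 2


# Hypothetical illustrative data: five trials.
example_effects = [0.42, 0.35, 0.61, 0.10, 0.95]
example_std_errors = [0.10, 0.20, 0.25, 0.50, 1.00]
result = egger_intercept(example_effects, example_std_errors)
```

An intercept that is far from zero relative to its standard error suggests funnel plot asymmetry; the t statistic would be compared against a t distribution with n - 2 degrees of freedom.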

On Graphically Checking Goodness-of-fit of Binary Logistic Regression Models

Relevance:

30.00%

Abstract:

OBJECTIVES: This paper is concerned with checking goodness-of-fit of binary logistic regression models. For the practitioners of data analysis, the broad classes of procedures for checking goodness-of-fit available in the literature are described. The challenges of model checking in the context of binary logistic regression are reviewed. As a viable solution, a simple graphical procedure for checking goodness-of-fit is proposed. METHODS: The graphical procedure proposed relies on pieces of information available from any logistic analysis; the focus is on combining and presenting these in an informative way. RESULTS: The information gained using this approach is presented with three examples. In the discussion, the proposed method is put into context and compared with other graphical procedures for checking goodness-of-fit of binary logistic models available in the literature. CONCLUSION: A simple graphical method can significantly improve the understanding of any logistic regression analysis and help to prevent faulty conclusions.

Sparse computation for large-scale binary classification

Relevance:

30.00%

Abstract:

Well-known data mining algorithms rely on inputs in the form of pairwise similarities between objects. For large datasets it is computationally impossible to perform all pairwise comparisons. We therefore propose a novel approach that uses approximate Principal Component Analysis to efficiently identify groups of similar objects. The effectiveness of the approach is demonstrated in the context of binary classification using the supervised normalized cut as a classifier. For large datasets from the UCI repository, the approach significantly improves run times with minimal loss in accuracy.
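
The idea of avoiding all pairwise comparisons can be illustrated with a deliberately simplified sketch. It is a stand-in under stated assumptions, not the authors' algorithm: it uses a single principal component found by plain power iteration instead of approximate PCA, splits the projected scores into equal-width bins, and proposes as candidate pairs only objects that fall in the same or adjacent bins. All function names and the `n_bins` parameter are hypothetical.

```python
import math


def first_component_scores(rows, iters=100):
    """Project rows onto their first principal component (power iteration)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]

    v = [1.0] * d  # arbitrary starting direction (assumption)
    for _ in range(iters):
        # w = X^T (X v), computed without forming the covariance matrix.
        proj = [sum(c[j] * v[j] for j in range(d)) for c in centered]
        w = [sum(proj[i] * centered[i][j] for i in range(n)) for j in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break
        v = [x / norm for x in w]
    return [sum(c[j] * v[j] for j in range(d)) for c in centered]


def candidate_pairs(rows, n_bins=4):
    """Return index pairs whose projected scores share a bin or adjacent bins."""
    scores = first_component_scores(rows)
    lo, hi = min(scores), max(scores)
    width = (hi - lo) / n_bins or 1.0  # guard against identical scores
    bins = [min(int((s - lo) / width), n_bins - 1) for s in scores]

    # Group object indices by bin so that only nearby groups are compared.
    by_bin = {}
    for idx, b in enumerate(bins):
        by_bin.setdefault(b, []).append(idx)

    pairs = set()
    for b, members in by_bin.items():
        neighbours = members + by_bin.get(b + 1, [])
        for pos, i in enumerate(members):
            for j in neighbours[pos + 1:]:
                pairs.add((min(i, j), max(i, j)))
    return pairs


# Hypothetical data: two tight clusters that are far apart.
example_rows = [[0.0, 0.0], [0.1, 0.1], [10.0, 10.0], [10.1, 10.1]]
example_pairs = candidate_pairs(example_rows, n_bins=4)
```

Only the pairs returned here would then have their similarities computed, so the number of comparisons depends on how the objects spread across bins rather than on the square of the dataset size.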

Reliability of operational data from pig herds and performance ratings by veterinarians and pig farmers collected during telephone interviews for the evaluation of a PCV2 piglet vaccination

Relevance:

30.00%

Abstract:

Background: The aim of the present study was to evaluate the feasibility of using a telephone survey in gaining an understanding of the possible herd and management factors influencing the performance (i.e. safety and efficacy) of a vaccine against porcine circovirus type 2 (PCV2) in a large number of herds and to estimate customers' satisfaction. Results: Datasets from 227 pig herds that currently applied or have applied a PCV2 vaccine were analysed. Since 1-, 2- and 3-site production systems were surveyed, the herds were allocated in one of two subsets, where only applicable variables out of 180 were analysed. Group 1 was comprised of herds with sows, suckling pigs and nursery pigs, whereas herds in Group 2 in all cases kept fattening pigs. Overall 14 variables evaluating the subjective satisfaction with one particular PCV2 vaccine were comingled to an abstract dependent variable for further models, which was characterized by a binary outcome from a cluster analysis: good/excellent satisfaction (green cluster) and moderate satisfaction (red cluster). The other 166 variables, which comprised information about diagnostics, vaccination, housing and management, were considered as independent variables. In Group 1, herds using the vaccine due to recognised PCV2 related health problems (wasting, mortality or porcine dermatitis and nephropathy syndrome) had a 2.4-fold increased chance (1/OR) of belonging to the green cluster. In the final model for Group 1, the diagnosis of diseases other than PCV2, the reason for vaccine administration being other than PCV2-associated diseases and using a single injection of iron had significant influence on allocating into the green cluster (P < 0.05). In Group 2, only unchanged time or delay of time of vaccination influenced the satisfaction (P < 0.05). Conclusion: The methodology and statistical approach used in this study were feasible to scientifically assess 'satisfaction', and to determine factors influencing farmers' and vets' opinion about the safety and efficacy of a new vaccine.
