Biblioteca Digital

Multiple regression analysis is a complex statistical method with many potential uses. It has also become one of the most abused of all statistical procedures since anyone with a data base and suitable software can carry it out. An investigator should always have a clear hypothesis in mind before carrying out such a procedure and knowledge of the limitations of each aspect of the analysis. In addition, multiple regression is probably best used in an exploratory context, identifying variables that might profitably be examined by more detailed studies. Where there are many variables potentially influencing Y, they are likely to be intercorrelated and to account for relatively small amounts of the variance. Any analysis in which R squared is less than 50% should be suspect as probably not indicating the presence of significant variables. A further problem relates to sample size. It is often stated that the number of subjects or patients must be at least 5-10 times the number of variables included in the study.5 This advice should be taken only as a rough guide but it does indicate that the variables included should be selected with great care as inclusion of an obviously unimportant variable may have a significant impact on the sample size required.

Veja mais

Large-scale data for multiple-view stereopsis

Relevância:

40.00% 40.00%

Publicador:

Resumo:

The seminal multiple-view stereo benchmark evaluations from Middlebury and by Strecha et al. have played a major role in propelling the development of multi-view stereopsis (MVS) methodology. The somewhat small size and variability of these data sets, however, limit their scope and the conclusions that can be derived from them. To facilitate further development within MVS, we here present a new and varied data set consisting of 80 scenes, seen from 49 or 64 accurate camera positions. This is accompanied by accurate structured light scans for reference and evaluation. In addition all images are taken under seven different lighting conditions. As a benchmark and to validate the use of our data set for obtaining reasonable and statistically significant findings about MVS, we have applied the three state-of-the-art MVS algorithms by Campbell et al., Furukawa et al., and Tola et al. to the data set. To do this we have extended the evaluation protocol from the Middlebury evaluation, necessitated by the more complex geometry of some of our scenes. The data set and accompanying evaluation framework are made freely available online. Based on this evaluation, we are able to observe several characteristics of state-of-the-art MVS, e.g. that there is a tradeoff between the quality of the reconstructed 3D points (accuracy) and how much of an object’s surface is captured (completeness). Also, several issues that we hypothesized would challenge MVS, such as specularities and changing lighting conditions did not pose serious problems. Our study finds that the two most pressing issues for MVS are lack of texture and meshing (forming 3D points into closed triangulated surfaces).

Veja mais

Improved data visualisation through multiple dissimilarity modelling

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Popular dimension reduction and visualisation algorithms rely on the assumption that input dissimilarities are typically Euclidean, for instance Metric Multidimensional Scaling, t-distributed Stochastic Neighbour Embedding and the Gaussian Process Latent Variable Model. It is well known that this assumption does not hold for most datasets and often high-dimensional data sits upon a manifold of unknown global geometry. We present a method for improving the manifold charting process, coupled with Elastic MDS, such that we no longer assume that the manifold is Euclidean, or of any particular structure. We draw on the benefits of different dissimilarity measures allowing for the relative responsibilities, under a linear combination, to drive the visualisation process.

Veja mais

An R-package for the Estimation and Testing of Multiple Covariates and Biomarker Interactions for Survival Data Based on Local Partial Likelihood

Relevância:

40.00% 40.00%

Publicador:

Resumo:

When we study the variables that a ffect survival time, we usually estimate their eff ects by the Cox regression model. In biomedical research, e ffects of the covariates are often modi ed by a biomarker variable. This leads to covariates-biomarker interactions. Here biomarker is an objective measurement of the patient characteristics at baseline. Liu et al. (2015) has built up a local partial likelihood bootstrap model to estimate and test this interaction e ffect of covariates and biomarker, but the R code developed by Liu et al. (2015) can only handle one variable and one interaction term and can not t the model with adjustment to nuisance variables. In this project, we expand the model to allow adjustment to nuisance variables, expand the R code to take any chosen interaction terms, and we set up many parameters for users to customize their research. We also build up an R package called "lplb" to integrate the complex computations into a simple interface. We conduct numerical simulation to show that the new method has excellent fi nite sample properties under both the null and alternative hypothesis. We also applied the method to analyze data from a prostate cancer clinical trial with acid phosphatase (AP) biomarker.

Veja mais

Joint modelling of multiple longitudinal and competing risks data, with applications in nephrology

Relevância:

40.00% 40.00%

Publicador:

Veja mais

Big data opportunities and challenges for assessing multiple stressors across scales in aquatic ecosystems

Relevância:

40.00% 40.00%

Publicador:

Veja mais

Hierarchical clustering with multiple-height branch-cut applied to short time-series gene expression data.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Rigid adherence to pre-specified thresholds and static graphical representations can lead to incorrect decisions on merging of clusters. As an alternative to existing automated or semi-automated methods, we developed a visual analytics approach for performing hierarchical clustering analysis of short time-series gene expression data. Dynamic sliders control parameters such as the similarity threshold at which clusters are merged and the level of relative intra-cluster distinctiveness, which can be used to identify "weak-edges" within clusters. An expert user can drill down to further explore the dendrogram and detect nested clusters and outliers. This is done by using the sliders and by pointing and clicking on the representation to cut the branches of the tree in multiple-heights. A prototype of this tool has been developed in collaboration with a small group of biologists for analysing their own datasets. Initial feedback on the tool has been positive.

Veja mais

Optimal power allocation and load distribution for multiple heterogeneous multicore server processors across clouds and data centers

Relevância:

40.00% 40.00%

Publicador:

Resumo:

For multiple heterogeneous multicore server processors across clouds and data centers, the aggregated performance of the cloud of clouds can be optimized by load distribution and balancing. Energy efficiency is one of the most important issues for large-scale server systems in current and future data centers. The multicore processor technology provides new levels of performance and energy efficiency. The present paper aims to develop power and performance constrained load distribution methods for cloud computing in current and future large-scale data centers. In particular, we address the problem of optimal power allocation and load distribution for multiple heterogeneous multicore server processors across clouds and data centers. Our strategy is to formulate optimal power allocation and load distribution for multiple servers in a cloud of clouds as optimization problems, i.e., power constrained performance optimization and performance constrained power optimization. Our research problems in large-scale data centers are well-defined multivariable optimization problems, which explore the power-performance tradeoff by fixing one factor and minimizing the other, from the perspective of optimal load distribution. It is clear that such power and performance optimization is important for a cloud computing provider to efficiently utilize all the available resources. We model a multicore server processor as a queuing system with multiple servers. Our optimization problems are solved for two different models of core speed, where one model assumes that a core runs at zero speed when it is idle, and the other model assumes that a core runs at a constant speed. Our results in this paper provide new theoretical insights into power management and performance optimization in data centers.

Veja mais

975 resultados para multiple data

Filtro por publicador