980 results for Sparse Data


Relevance: 100.00%

Abstract:

The biggest challenge in conservation biology is bridging the gap between research and practical management. A major obstacle is the fact that many researchers are unwilling to tackle projects likely to produce sparse or messy data because the results would be difficult to publish in refereed journals. The obvious solution to sparse data is to build up results from multiple studies. Consequently, we suggest that there needs to be greater emphasis in conservation biology on publishing papers that can be built on by subsequent research rather than on papers that produce clear results individually. This building approach requires: (1) a stronger theoretical framework, in which researchers attempt to anticipate models that will be relevant in future studies and incorporate expected differences among studies into those models; (2) use of modern methods for model selection and multi-model inference, and publication of parameter estimates under a range of plausible models; (3) explicit incorporation of prior information into each case study; and (4) planning management treatments in an adaptive framework that considers treatments applied in other studies. We encourage journals to publish papers that promote this building approach rather than expecting papers to conform to traditional standards of rigor as stand-alone papers, and we believe that this shift in publishing philosophy would better encourage researchers to tackle the most urgent conservation problems.

Relevance: 100.00%

Abstract:

Funded by COST (European Cooperation in Science and Technology) CEH projects. Grant Numbers: NEC05264, NEC05100 Natural Environment Research Council UK. Grant Number: NE/J008001/1 © 2016 The Authors. Global Change Biology Published by John Wiley & Sons Ltd. This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.

Relevance: 100.00%

Abstract:

A hydrological–economic model is introduced to describe the dynamics of groundwater-dependent economies (agriculture and tourism) for sustainable use in sparse-data drylands. The Amtoudi Oasis, a remote area of southern Morocco on the northern edge of the Sahara that is attractive to tourists and shows evidence of groundwater degradation, was chosen to demonstrate the model's operation. Governing system variables were identified and linked through System Dynamics (SD) causal diagrams, and basic formulations were programmed into a model with two modules coupled by the nexus 'pumping': (1) the hydrological module represents the net groundwater balance (G) dynamics; and (2) the economic module reproduces the variation in the consumers of water, both residents and tourists. The model was run under a similar influx of tourists and different scenarios of water availability, namely the wet 2009–2010 and the average 2010–2011 hydrological years. The rise in international tourism is identified as the main driving force reducing emigration and introducing new social habits in the population, particularly regarding water consumption. The urban water allotment (PU) doubled for a net increase of fewer than 100 inhabitants in recent decades, while the water allocation for agriculture (PI), the largest consumer of water, had remained constant for decades. Although the 2-year monitoring period is too short to support long-term conclusions, groundwater imbalance was reflected by net aquifer recharge (R) falling below PI + PU (G < 0) in the average year 2010–2011, with net lateral inflow from adjacent Cambrian formations being the largest recharge component. R is expected to fall well below PI + PU in recurrent dry spells. Some low-technology actions are tentatively proposed to mitigate groundwater degradation: wastewater capture, treatment, and reuse for irrigation; storm-water harvesting for irrigation; and active maintenance of the irrigation system to improve its efficiency.
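The pumping nexus described above can be sketched as a toy System Dynamics loop with two coupled modules. All parameter values below (per-capita allotments, recharge volumes, growth rate) are hypothetical placeholders, not the calibrated Amtoudi figures:

```python
# Toy sketch of the two coupled modules: hydrological (net balance G)
# and economic (water consumers). All numbers are hypothetical.

def simulate(years, recharge, pop0, tourists_per_year,
             lpd_resident=50.0, lpd_tourist=150.0, PI=0.4e6):
    """Annual net groundwater balance G = R - (PI + PU), in m3/yr.
    The urban allotment PU grows with residents and tourist influx."""
    pop = pop0
    history = []
    for _ in range(years):
        # urban demand: residents all year + tourists on short stays (litres -> m3)
        PU = (pop * lpd_resident * 365 + tourists_per_year * lpd_tourist * 3) / 1000.0
        G = recharge - (PI + PU)
        # tourism income slows emigration: small net population growth
        pop += int(0.01 * pop)
        history.append((PU, G))
    return history

# wet vs. average hydrological year (recharge R in m3/yr, hypothetical)
wet = simulate(5, recharge=1.2e6, pop0=900, tourists_per_year=4000)
avg = simulate(5, recharge=0.35e6, pop0=900, tourists_per_year=4000)
```

With these illustrative numbers the balance stays positive in the wet year and goes negative (G < 0) in the average year, mirroring the imbalance the abstract reports.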

Relevance: 80.00%

Abstract:

Maps of kriged soil properties for precision agriculture are often based on a variogram estimated from too few data because the costs of sampling and analysis are often prohibitive. If the variogram has been computed by the usual method of moments, it is likely to be unstable when there are fewer than 100 data. The scale of variation in soil properties should be investigated prior to sampling by computing a variogram from ancillary data, such as an aerial photograph of the bare soil. If the sampling interval suggested by this is large in relation to the size of the field there will be too few data to estimate a reliable variogram for kriging. Standardized variograms from aerial photographs can be used with standardized soil data that are sparse, provided the data are spatially structured and the nugget:sill ratio is similar to that of a reliable variogram of the property. The problem remains of how to set this ratio in the absence of an accurate variogram. Several methods of estimating the nugget:sill ratio for selected soil properties are proposed and evaluated. Standardized variograms with nugget:sill ratios set by these methods are more similar to those computed from intensive soil data than are variograms computed from sparse soil data. The results of cross-validation and mapping show that the standardized variograms provide more accurate estimates, and preserve the main patterns of variation better than those computed from sparse data.
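A minimal sketch of the method-of-moments (Matheron) variogram estimator the passage refers to, together with the standardization step that lets a variogram shape borrowed from ancillary data be applied to sparse soil samples. The 1-D transect layout and simple lag binning are simplifying assumptions:

```python
import numpy as np

def empirical_variogram(coords, values, lags, tol):
    """Method-of-moments variogram estimate:
    gamma(h) = mean of 0.5 * (z_i - z_j)^2 over pairs at lag ~ h."""
    coords = np.asarray(coords, float)
    values = np.asarray(values, float)
    d = np.abs(coords[:, None] - coords[None, :])      # pairwise distances (1-D)
    sq = 0.5 * (values[:, None] - values[None, :]) ** 2
    gamma = []
    for h in lags:
        mask = (d > h - tol) & (d <= h + tol)
        gamma.append(sq[mask].mean() if mask.any() else np.nan)
    return np.array(gamma)

def standardize(values):
    """Scale to zero mean, unit variance, so a variogram shape borrowed
    from ancillary data (e.g. an aerial photo) can be reused."""
    v = np.asarray(values, float)
    return (v - v.mean()) / v.std()

# Example: pure-nugget (white noise) data, so gamma(h) ~ sill at all lags
rng = np.random.default_rng(1)
x = np.arange(200.0)
z = standardize(rng.normal(size=200))
g = empirical_variogram(x, z, lags=[1, 5, 10, 20], tol=0.5)
```

For pure-nugget data the estimate sits near the sill (here 1.0) at every lag, i.e. a nugget:sill ratio near 1; spatially structured data would instead rise from a small nugget toward the sill.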

Relevance: 80.00%

Abstract:

A patient-specific surface model of the proximal femur plays an important role in planning and supporting various computer-assisted surgical procedures, including total hip replacement, hip resurfacing, and osteotomy of the proximal femur. The common approach to deriving 3D models of the proximal femur is to use imaging techniques such as computed tomography (CT) or magnetic resonance imaging (MRI). However, the high logistic effort, the extra radiation exposure (CT imaging), and the large quantity of data to be acquired and processed limit their practicality. In this paper, we present an integrated approach using a multi-level point distribution model (ML-PDM) to reconstruct a patient-specific model of the proximal femur from sparse data available intra-operatively. Results of experiments performed on dry cadaveric bones using dozens of 3D points are presented, as well as experiments using a limited number of 2D X-ray images, which demonstrate promising accuracy of the present approach.
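The core idea of a point distribution model (principal modes of shape variation fitted to a few digitized points) can be illustrated with a toy 1-D example. This is a generic PCA-based sketch, not the paper's multi-level PDM or its 2D X-ray registration:

```python
import numpy as np

def build_pdm(shapes, n_modes=2):
    """Point distribution model: mean shape plus principal variation
    modes of corresponding training shapes (rows: shapes, cols: coords)."""
    mean = shapes.mean(axis=0)
    U, s, Vt = np.linalg.svd(shapes - mean, full_matrices=False)
    return mean, Vt[:n_modes]

def fit_to_sparse(mean, modes, idx, observed):
    """Least-squares estimate of mode weights from a few observed
    coordinates (indices `idx` into the shape vector), then
    reconstruct the full shape."""
    A = modes[:, idx].T                      # (n_observed, n_modes)
    b, *_ = np.linalg.lstsq(A, observed - mean[idx], rcond=None)
    return mean + b @ modes

# Toy training set: shapes varying along a single direction
rng = np.random.default_rng(3)
direction = rng.normal(size=20)
shapes = np.array([5.0 * w * direction for w in rng.normal(size=30)]) + 10.0
mean, modes = build_pdm(shapes, n_modes=1)

true_shape = shapes[0]
idx = np.array([0, 5, 11])                   # three sparse "digitized points"
recon = fit_to_sparse(mean, modes, idx, true_shape[idx])
```

Because the toy shapes lie exactly in the model subspace, three points suffice to recover the full 20-coordinate shape; real intra-operative fitting additionally handles noise, pose, and multiple resolution levels.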

Relevance: 70.00%

Abstract:

Estimation of Taylor's power law for species abundance data may be performed by linear regression of the log empirical variances on the log means, but this method suffers from bias for sparse data. We show that the bias may be reduced by using a bias-corrected Pearson estimating function. Furthermore, we investigate a more general regression model allowing for site-specific covariates. This method may be efficiently implemented using a Newton scoring algorithm, with standard errors calculated from the inverse Godambe information matrix. The method is applied to a set of biomass data for benthic macrofauna from two Danish estuaries. © 2011 Elsevier B.V. All rights reserved.
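The naive estimator that the abstract says is biased for sparse data, ordinary least squares of log variance on log mean, can be sketched as follows; the bias-corrected Pearson estimating function itself is not reproduced here:

```python
import numpy as np

def taylor_power_law(counts):
    """Estimate Taylor's power law  variance = a * mean^b  by ordinary
    least squares of log variance on log mean.
    `counts` is a (sites x samples) array of abundances."""
    means = counts.mean(axis=1)
    variances = counts.var(axis=1, ddof=1)
    keep = (means > 0) & (variances > 0)   # log requires positive values
    x = np.log(means[keep])
    y = np.log(variances[keep])
    b, log_a = np.polyfit(x, y, 1)         # slope b, intercept log(a)
    return np.exp(log_a), b

# Synthetic data with variance = mean^2 (so b should be close to 2)
rng = np.random.default_rng(0)
site_means = np.linspace(1, 50, 30)
counts = rng.gamma(shape=1.0, scale=site_means[:, None], size=(30, 200))
a, b = taylor_power_law(counts)
```

With many samples per site the slope is recovered well; the bias the abstract addresses appears when each site contributes only a handful of sparse counts.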

Relevance: 70.00%

Abstract:

Monte Carlo simulation was used to evaluate properties of a simple Bayesian MCMC analysis of the random effects model for single-group Cormack-Jolly-Seber capture-recapture data. The MCMC method is applied to the model via a logit link, so the parameters p and S are on a logit scale, where logit(S) is assumed to have, and is generated from, a normal distribution with mean μ and variance σ². Marginal prior distributions on logit(p) and μ were independent normal with mean zero and standard deviation 1.75 for logit(p) and 100 for μ; hence minimally informative. The marginal prior distribution on σ² was placed on τ² = 1/σ² as a gamma distribution with α = β = 0.001. The study design has 432 points spread over 5 factors: occasions (t), new releases per occasion (u), p, μ, and σ. At each design point 100 independent trials were completed (hence 43,200 trials in total), each with sample size n = 10,000 from the parameter posterior distribution. At 128 of these design points, comparisons are made to previously reported results from a method-of-moments procedure. We looked at properties of point and interval inference on μ and σ based on the posterior mean, median, and mode and the equal-tailed 95% credibility interval. Bayesian inference did very well for the parameter μ, but under the conditions used here, MCMC inference performance for σ was mixed: poor for sparse data (i.e., only 7 occasions) or σ = 0, but good when there were sufficient data and σ was not small.
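The prior and random-effects structure described above can be sketched directly. This only draws from the priors and the logit-normal survival model, not a full MCMC run, and the underflow guard is an added assumption:

```python
import numpy as np

def inv_logit(x):
    x = np.clip(x, -700.0, 700.0)        # avoid overflow in exp
    return 1.0 / (1.0 + np.exp(-x))

def prior_predictive_survival(t, rng):
    """One prior-predictive draw of occasion survivals S_1..S_t:
      mu            ~ Normal(0, 100^2)          (minimally informative)
      tau2 = 1/s2   ~ Gamma(alpha=0.001, rate beta=0.001)
      logit(S_i)    ~ Normal(mu, s2)            (random effects model)
    """
    mu = rng.normal(0.0, 100.0)
    tau2 = rng.gamma(shape=0.001, scale=1.0 / 0.001)
    tau2 = max(tau2, 1e-12)              # guard against numerical underflow
    sigma = np.sqrt(1.0 / tau2)
    return inv_logit(rng.normal(mu, sigma, size=t))

rng = np.random.default_rng(42)
S = prior_predictive_survival(7, rng)    # 7 occasions: the "sparse" design
```

Draws like these make the vagueness of the priors visible: with σ near zero or only 7 occasions there is little information in the data to separate μ from σ, which is where the study found inference on σ to be poor.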

Relevance: 70.00%

Abstract:

An implementation of Sem-ODB, a database management system based on the Semantic Binary Model, is presented. A metaschema of the Sem-ODB database as well as the top-level architecture of the database engine is defined. A new benchmarking technique is proposed which allows databases built on different database models to compete fairly. This technique is applied to show that Sem-ODB has excellent efficiency compared to a relational database on a certain class of database applications. A new semantic benchmark is designed which allows evaluation of the performance of the features characteristic of semantic database applications. An application used in the benchmark represents a class of problems requiring databases with sparse data, complex inheritances, and many-to-many relations. Such databases can be naturally accommodated by the semantic model. A fixed predefined implementation is not enforced, allowing the database designer to choose the most efficient structures available in the DBMS tested. The results of the benchmark are analyzed.

A new high-level querying model for semantic databases is defined. It is proven adequate to serve as an efficient native semantic database interface, and has several advantages over the existing interfaces. It is optimizable and parallelizable, and supports the definition of semantic user views and the interoperability of semantic databases with other data sources such as the World Wide Web and relational and object-oriented databases. The query is structured as a semantic database schema graph with interlinking conditionals. The query result is a mini-database, accessible in the same way as the original database. The paradigm supports and utilizes the rich semantics and inherent ergonomics of semantic databases.

The analysis and high-level design of a system is presented that exploits the superiority of the Semantic Database Model over other data models in expressive power and ease of use, allowing uniform access to heterogeneous data sources such as semantic databases, relational databases, web sites, ASCII files, and others via a common query interface. The Sem-ODB engine is used to control all the data sources combined under a unified semantic schema. A particular application of the system, providing an ODBC interface to the WWW as a data source, is discussed.

Relevance: 60.00%

Abstract:

Many learning problems require handling high dimensional datasets with a relatively small number of instances. Learning algorithms are thus confronted with the curse of dimensionality, and need to address it in order to be effective. Examples of these types of data include the bag-of-words representation in text classification problems and gene expression data for tumor detection/classification. Usually, among the high number of features characterizing the instances, many may be irrelevant (or even detrimental) for the learning tasks. It is thus clear that there is a need for adequate techniques for feature representation, reduction, and selection, to improve both the classification accuracy and the memory requirements. In this paper, we propose combined unsupervised feature discretization and feature selection techniques, suitable for medium and high-dimensional datasets. The experimental results on several standard datasets, with both sparse and dense features, show the efficiency of the proposed techniques as well as improvements over previous related techniques.
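A generic sketch of the idea of combining unsupervised discretization with feature selection; equal-frequency binning and variance-based ranking here stand in for, and are not, the specific techniques proposed in the paper:

```python
import numpy as np

def equal_frequency_discretize(X, n_bins=5):
    """Unsupervised discretization: map each feature to integer bin
    indices using its own empirical quantiles (equal-frequency bins)."""
    X = np.asarray(X, float)
    Xd = np.empty_like(X, dtype=int)
    for j in range(X.shape[1]):
        edges = np.quantile(X[:, j], np.linspace(0, 1, n_bins + 1)[1:-1])
        Xd[:, j] = np.searchsorted(edges, X[:, j])
    return Xd

def select_by_variance(Xd, k):
    """Unsupervised selection: keep the k features whose discretized
    values vary most (near-constant features carry little information)."""
    variances = Xd.var(axis=0)
    return np.argsort(variances)[::-1][:k]

# 5 informative features plus 3 constant (irrelevant) ones
rng = np.random.default_rng(7)
X = np.hstack([rng.normal(size=(100, 5)),
               np.full((100, 3), 3.14)])
Xd = equal_frequency_discretize(X)
keep = select_by_variance(Xd, k=5)
```

Discretization shrinks the representation (one small integer per value), and the selection step discards the constant columns, illustrating how the two steps jointly address accuracy and memory requirements.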

Relevance: 60.00%

Abstract:

Background: There are sparse data on the performance of different types of drug-eluting stents (DES) in the acute, real-life setting. Objective: The aim of the study was to compare the safety and efficacy of first- versus second-generation DES in patients with acute coronary syndromes (ACS). Methods: This all-comer registry enrolled consecutive patients diagnosed with ACS and treated with percutaneous coronary intervention with implantation of first- or second-generation DES, with one-year follow-up. The primary efficacy endpoint was a major adverse cardiac and cerebrovascular event (MACCE), a composite of all-cause death, nonfatal myocardial infarction, target-vessel revascularization, and stroke. The primary safety outcome was definite stent thrombosis (ST) at one year. Results: Of the 1916 patients enrolled in the registry, 1328 were diagnosed with ACS; 426 were treated with first- and 902 with second-generation DES. There was no significant difference in the incidence of MACCE between the two types of DES at one year. The rates of acute and subacute ST were higher with first- than with second-generation DES (1.6% vs. 0.1%, p < 0.001, and 1.2% vs. 0.2%, p = 0.025, respectively), but there was no difference in late ST (0.7% vs. 0.2%, p = 0.18) or gastrointestinal bleeding (2.1% vs. 1.1%, p = 0.21). In Cox regression, first-generation DES was an independent predictor of cumulative ST (HR 3.29 [1.30-8.31], p = 0.01). Conclusions: In an all-comer registry of ACS, the one-year rate of MACCE was comparable between groups treated with first- and second-generation DES. The use of first-generation DES was associated with higher rates of acute and subacute ST and was an independent predictor of cumulative ST.

Relevance: 60.00%

Abstract:

CONTEXT: Sparse data exist on the combined associations between physical activity and sedentary time with cardiometabolic risk factors in healthy children. OBJECTIVE: To examine the independent and combined associations between objectively measured time in moderate- to vigorous-intensity physical activity (MVPA) and sedentary time with cardiometabolic risk factors. DESIGN, SETTING, AND PARTICIPANTS: Pooled data from 14 studies between 1998 and 2009 comprising 20 871 children (aged 4-18 years) from the International Children's Accelerometry Database. Time spent in MVPA and sedentary time were measured using accelerometry after reanalyzing raw data. The independent associations between time in MVPA and sedentary time, with outcomes, were examined using meta-analysis. Participants were stratified by tertiles of MVPA and sedentary time. MAIN OUTCOME MEASURES: Waist circumference, systolic blood pressure, fasting triglycerides, high-density lipoprotein cholesterol, and insulin. RESULTS: Times (mean [SD] min/d) accumulated by children in MVPA and being sedentary were 30 (21) and 354 (96), respectively. Time in MVPA was significantly associated with all cardiometabolic outcomes independent of sex, age, monitor wear time, time spent sedentary, and waist circumference (when not the outcome). Sedentary time was not associated with any outcome independent of time in MVPA. In the combined analyses, higher levels of MVPA were associated with better cardiometabolic risk factors across tertiles of sedentary time. The differences in outcomes between higher and lower MVPA were greater with lower sedentary time. Mean differences in waist circumference between the bottom and top tertiles of MVPA were 5.6 cm (95% CI, 4.8-6.4 cm) for high sedentary time and 3.6 cm (95% CI, 2.8-4.3 cm) for low sedentary time. 
Mean differences in systolic blood pressure for high and low sedentary time were 0.7 mm Hg (95% CI, -0.07 to 1.6) and 2.5 mm Hg (95% CI, 1.7-3.3), and for high-density lipoprotein cholesterol, differences were -2.6 mg/dL (95% CI, -1.4 to -3.9) and -4.5 mg/dL (95% CI, -3.3 to -5.6), respectively. Geometric mean differences for insulin and triglycerides showed similar variation. Those in the top tertile of MVPA accumulated more than 35 minutes per day in this intensity level compared with fewer than 18 minutes per day for those in the bottom tertile. In prospective analyses (N = 6413 at 2.1 years' follow-up), MVPA and sedentary time were not associated with waist circumference at follow-up, but a higher waist circumference at baseline was associated with higher amounts of sedentary time at follow-up. CONCLUSION: Higher MVPA time by children and adolescents was associated with better cardiometabolic risk factors regardless of the amount of sedentary time.