837 results for Entropy of a sampling design
Abstract:
This work concerns variance estimation under item nonresponse handled by an imputation procedure. Treating imputed values as if they had been observed can lead to substantial underestimation of the variance of point estimators. The usual variance estimators rely on the availability of second-order inclusion probabilities, which are sometimes difficult (or even impossible) to compute. We propose to examine the properties of variance estimators obtained by means of approximations of the second-order inclusion probabilities. These approximations are expressed as a function of the first-order inclusion probabilities and are generally valid for high-entropy designs. The results of a simulation study evaluating the properties of the proposed variance estimators in terms of bias and mean squared error will be presented.
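To illustrate what an approximation of second-order inclusion probabilities built from first-order ones can look like, here is a minimal sketch of one well-known example, Hájek's approximation for high-entropy designs. The abstract does not say which approximations the authors study, so the choice of Hájek's formula, the function name and the example probabilities are all assumptions made for illustration. The sketch does not cover the imputation step.

```python
def hajek_joint_probabilities(pi):
    """Approximate second-order inclusion probabilities from first-order ones.

    Uses Hajek's high-entropy approximation (an assumed choice, see lead-in):
        pi_kl ~= pi_k * pi_l * (1 - (1 - pi_k) * (1 - pi_l) / d),
    where d is the sum over the population of pi_k * (1 - pi_k).
    """
    d = sum(p * (1.0 - p) for p in pi)
    if d <= 0.0:
        raise ValueError("need at least one unit with 0 < pi_k < 1")
    size = len(pi)
    joint = [[0.0] * size for _ in range(size)]
    for k in range(size):
        for l in range(size):
            if k == l:
                # On the diagonal, pi_kk is the first-order probability itself.
                joint[k][l] = pi[k]
            else:
                correction = (1.0 - pi[k]) * (1.0 - pi[l]) / d
                joint[k][l] = pi[k] * pi[l] * (1.0 - correction)
    return joint


# Hypothetical first-order inclusion probabilities for a population of 5 units;
# they sum to a fixed sample size of 2.
pi = [0.2, 0.3, 0.4, 0.5, 0.6]
joint = hajek_joint_probabilities(pi)
```

The resulting matrix could then be plugged into a standard variance estimator that expects joint inclusion probabilities. How well that works depends on how close the actual design is to maximum entropy.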
Abstract:
An active learning method is proposed for the semi-automatic selection of training sets in remote sensing image classification. The method iteratively adds to the current training set the unlabeled pixels for which the predictions of an ensemble of classifiers based on bagged training sets show maximum entropy. In this way, the algorithm selects the pixels that are the most uncertain and that will improve the model if added to the training set. The user is asked to label such pixels at each iteration. Experiments using support vector machines (SVM) on an eight-class QuickBird image show the excellent performance of the method, which equals the accuracies of both a model trained with ten times more pixels and a model whose training set has been built using a state-of-the-art SVM-specific active learning method.
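The selection step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the ensemble predictions are already available as lists of class labels, and the function names, class labels and batch size are hypothetical.

```python
import math
from collections import Counter


def vote_entropy(votes):
    """Shannon entropy (in nats) of the class labels predicted by the ensemble."""
    counts = Counter(votes)
    total = len(votes)
    return -sum((c / total) * math.log(c / total) for c in counts.values())


def select_most_uncertain(ensemble_predictions, batch_size):
    """Return indices of the unlabeled pixels with the highest vote entropy.

    ensemble_predictions[i] is the list of labels that the ensemble members
    (e.g. classifiers trained on bagged training sets) predicted for pixel i.
    """
    scores = [vote_entropy(v) for v in ensemble_predictions]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:batch_size]


# Hypothetical predictions of a 4-member ensemble for 3 unlabeled pixels.
predictions = [
    ["water", "water", "water", "water"],  # full agreement: entropy 0
    ["water", "road", "tree", "roof"],     # full disagreement: maximum entropy
    ["water", "water", "road", "road"],    # split vote
]
chosen = select_most_uncertain(predictions, batch_size=2)  # pixels 1 and 2
```

In an active learning loop, the chosen pixels would be shown to the user for labeling, added to the training set, and the ensemble retrained before the next iteration.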
Abstract:
To test whether quantitative traits are under directional or homogenizing selection, it is common practice to compare population differentiation estimates at molecular markers (F(ST)) and quantitative traits (Q(ST)). If the trait is neutral and its determinism is additive, then theory predicts that Q(ST) = F(ST), while Q(ST) > F(ST) is predicted under directional selection for different local optima, and Q(ST) < F(ST) is predicted under homogenizing selection. However, nonadditive effects can alter these predictions. Here, we investigate the influence of dominance on the relation between Q(ST) and F(ST) for neutral traits. Using analytical results and computer simulations, we show that dominance generally deflates Q(ST) relative to F(ST). Under inbreeding, the effect of dominance vanishes, and we show that for selfing species, a better estimate of Q(ST) is obtained from selfed families than from half-sib families. We also compare several sampling designs and find that it is always best to sample many populations (>20) with few families (five) rather than few populations with many families. Provided that estimates of Q(ST) are derived from individuals originating from many populations, we conclude that the pattern Q(ST) > F(ST), and hence the inference of directional selection for different local optima, is robust to the effect of nonadditive gene actions.
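For readers unfamiliar with the quantity being compared, the sketch below computes Q(ST) from variance components using the standard definition for a diploid, randomly mating species with purely additive gene action, Q(ST) = V_B / (V_B + 2 V_W). The abstract itself does not state this formula, and the function name and the variance values are hypothetical. The dominance and inbreeding effects studied in the paper are exactly what this simple formula ignores.

```python
def q_st(var_between, var_within_additive):
    """Q_ST = V_B / (V_B + 2 * V_W), assuming additivity and random mating.

    var_between: between-population genetic variance of the trait (V_B)
    var_within_additive: additive genetic variance within populations (V_W)
    """
    denom = var_between + 2.0 * var_within_additive
    if var_between < 0.0 or var_within_additive < 0.0 or denom == 0.0:
        raise ValueError("variance components must be non-negative and not both zero")
    return var_between / denom


# Hypothetical variance components, chosen only to show the arithmetic.
q = q_st(var_between=0.5, var_within_additive=1.0)  # 0.5 / (0.5 + 2.0)
```

In practice the comparison with F(ST) is made with confidence intervals on both estimates rather than by comparing point values.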
Abstract:
1. Distance sampling is a widely used technique for estimating the size or density of biological populations. Many distance sampling designs and most analyses use the software Distance. 2. We briefly review distance sampling and its assumptions, outline the history, structure and capabilities of Distance, and provide hints on its use. 3. Good survey design is a crucial prerequisite for obtaining reliable results. Distance has a survey design engine, with a built-in geographic information system, that allows properties of different proposed designs to be examined via simulation, and survey plans to be generated. 4. A first step in analysis of distance sampling data is modeling the probability of detection. Distance contains three increasingly sophisticated analysis engines for this: conventional distance sampling, which models detection probability as a function of distance from the transect and assumes all objects at zero distance are detected; multiple-covariate distance sampling, which allows covariates in addition to distance; and mark–recapture distance sampling, which relaxes the assumption of certain detection at zero distance. 5. All three engines allow estimation of density or abundance, stratified if required, with associated measures of precision calculated either analytically or via the bootstrap. 6. Advanced analysis topics covered include the use of multipliers to allow analysis of indirect surveys (such as dung or nest surveys), the density surface modeling analysis engine for spatial and habitat-modeling, and information about accessing the analysis engines directly from other software. 7. Synthesis and applications. Distance sampling is a key method for producing abundance and density estimates in challenging field conditions. The theory underlying the methods continues to expand to cope with realistic estimation situations. 
In step with theoretical developments, state-of-the-art software implementing these methods is described, making the methods accessible to practicing ecologists.
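The idea behind conventional distance sampling (point 4 above) can be sketched for a line transect survey with a half-normal detection function. This is an illustrative calculation only, not the interface or algorithm of the Distance software. The half-normal form, the function names and the survey numbers are assumptions, and the scale parameter sigma is taken as given here, whereas in a real analysis it is estimated from the observed distances.

```python
import math


def half_normal_detection(x, sigma):
    """Detection probability at perpendicular distance x; equals 1 at x = 0."""
    return math.exp(-x * x / (2.0 * sigma * sigma))


def effective_strip_half_width(sigma, w):
    """Integral of the half-normal detection function from 0 to truncation distance w."""
    return sigma * math.sqrt(math.pi / 2.0) * math.erf(w / (sigma * math.sqrt(2.0)))


def line_transect_density(n_detected, total_line_length, sigma, w):
    """Density estimate D = n / (2 * L * mu), with mu the effective strip half-width."""
    mu = effective_strip_half_width(sigma, w)
    return n_detected / (2.0 * total_line_length * mu)


# Hypothetical survey: 60 detections along 10,000 m of transect,
# sigma = 20 m, distances truncated at w = 50 m. Result is in animals per m^2.
mu = effective_strip_half_width(sigma=20.0, w=50.0)
density = line_transect_density(60, 10_000.0, sigma=20.0, w=50.0)
```

The factor 2 accounts for both sides of the transect line. The assumption that all objects at zero distance are detected appears as the detection function equalling 1 at x = 0.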
Abstract:
Despite widespread use of species-area relationships (SARs), dispute remains over the most representative SAR model. Using data of small-scale SARs of Estonian dry grassland communities, we address three questions: (1) Which model describes these SARs best when known artifacts are excluded? (2) How do deviating sampling procedures (marginal instead of central position of the smaller plots in relation to the largest plot; single values instead of average values; randomly located subplots instead of nested subplots) influence the properties of the SARs? (3) Are those effects likely to bias the selection of the best model? Our general dataset consisted of 16 series of nested plots (1 cm² to 100 m², any-part system), each of which comprised five series of subplots located in the four corners and the centre of the 100-m² plot. Data for the three pairs of compared sampling designs were generated from this dataset by subsampling. Five function types (power, quadratic power, logarithmic, Michaelis-Menten, Lomolino) were fitted with non-linear regression. In some of the communities, we found extremely high species densities (including bryophytes and lichens), namely up to eight species in 1 cm² and up to 140 species in 100 m², which appear to be the highest documented values on these scales. For SARs constructed from nested-plot average-value data, the regular power function generally was the best model, closely followed by the quadratic power function, while the logarithmic and Michaelis-Menten functions performed poorly throughout. The relative fit of the latter two models increased significantly relative to the respective best model when the single-value or random-sampling method was applied, although the power function normally remained far superior. These results confirm the hypothesis that both single-value and random-sampling approaches cause artifacts by increasing stochasticity in the data, which can lead to the selection of inappropriate models.
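As a concrete illustration of the power function S = c * A^z referred to above, the sketch below estimates c and z from plot areas and species counts. Note the swap: the study fits its models with non-linear regression, whereas this sketch uses ordinary least squares on log-transformed data because it needs no external library. The function name is hypothetical and the data are synthetic, not taken from the study.

```python
import math


def fit_power_sar(areas, richness):
    """Fit S = c * A**z by ordinary least squares on log10-transformed data."""
    xs = [math.log10(a) for a in areas]
    ys = [math.log10(s) for s in richness]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    z = sxy / sxx                          # slope in log-log space
    c = 10.0 ** (mean_y - z * mean_x)      # back-transformed intercept
    return c, z


# Synthetic, noise-free data generated from S = 20 * A**0.25 (not from the study).
areas = [0.0001, 0.01, 1.0, 100.0]  # plot areas in m^2 (1 cm^2 to 100 m^2)
richness = [20.0 * a ** 0.25 for a in areas]
c, z = fit_power_sar(areas, richness)
```

On real data the log-log fit and a non-linear fit in the original scale weight the observations differently, so the two approaches will generally give somewhat different parameter estimates.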
Abstract:
Tree-rings offer one of the few possibilities to empirically quantify and reconstruct forest growth dynamics over years to millennia. Contemporaneously with the growing scientific community employing tree-ring parameters, recent research has suggested that commonly applied sampling designs (i.e. how and which trees are selected for dendrochronological sampling) may introduce considerable biases in quantifications of forest responses to environmental change. To date, a systematic assessment of the consequences of sampling design on dendroecological and -climatological conclusions has not yet been performed. Here, we investigate potential biases by sampling a large population of trees and replicating diverse sampling designs. This is achieved by retroactively subsetting the population and specifically testing for biases emerging for climate reconstruction, growth response to climate variability, long-term growth trends, and quantification of forest productivity. We find that commonly applied sampling designs can impart systematic biases of varying magnitude to any type of tree-ring-based investigation, independent of the total number of samples considered. Quantifications of forest growth and productivity are particularly susceptible to biases, whereas growth responses to short-term climate variability are less affected by the choice of sampling design. The world's most frequently applied sampling design, focusing on dominant trees only, can bias absolute growth rates by up to 459% and trends in excess of 200%. Our findings challenge paradigms, where a subset of samples is typically considered to be representative for the entire population. The only two sampling strategies meeting the requirements for all types of investigations are the (i) sampling of all individuals within a fixed area; and (ii) fully randomized selection of trees. 
This result argues for the consistent implementation of a widely applicable sampling design to simultaneously reduce uncertainties in tree-ring-based quantifications of forest growth and increase the comparability of datasets beyond individual studies, investigators, laboratories, and geographical boundaries.
Abstract:
Mathematical models and statistical analysis are key instruments in soil science research, as they can describe and/or predict the current state of a soil system. These tools allow us to explore the behavior of soil-related processes and properties and to generate new hypotheses for future experimentation. A good model and analysis of variations in soil properties, which allows suitable conclusions to be drawn and spatially correlated variables to be estimated at unsampled locations, clearly depends on the amount and quality of the data and on the robustness of the techniques and estimators. The quality of the data, in turn, depends on a competent data collection procedure and on capable laboratory analytical work. Following the standard soil sampling protocols available, soil samples should be collected according to key points such as a convenient spatial scale, landscape homogeneity (or non-homogeneity), land color, soil texture, land slope, and solar exposure. Obtaining good-quality data from forest soils is predictably expensive, as it is labor-intensive and demands considerable manpower and equipment, both in field work and in laboratory analysis. Also, the sampling scheme to be used for data collection in the forest is not simple to design, as the sampling strategies chosen are strongly dependent on soil taxonomy. In fact, a sampling grid cannot be followed if rocks are found at the planned collecting depth, if no soil at all is found, or if large trees block soil collection. Considering this, designing a proficient soil sampling campaign in the forest is not always a simple process and sometimes represents a huge challenge. In this work, we present some difficulties that occurred during two experiments on forest soil that were conducted in order to study the spatial variation of some soil physical-chemical properties. 
Two different sampling protocols were considered for monitoring two types of forest soils located in NW Portugal: umbric regosol and lithosol. Two different pieces of sampling equipment were also used: a manual auger and a shovel. Both scenarios were analyzed, and the results lead us to conclude that monitoring forest soil for mathematical and statistical investigation requires a data collection procedure compatible with established protocols, but that the assumption of a pre-defined grid often fails when the variability of the soil property is not uniform in space. In this case, the sampling grid should be adapted from one part of the landscape to another, and this fact should be taken into account in the mathematical procedure.
Abstract:
The present study evaluated the apparent digestibility of protein and energy of ingredients (soybean meal, fish meal, wheat bran and corn) by juvenile oscars (Astronotus ocellatus) using two different feces collection intervals (30 min and 12 h). The 160 juvenile oscars used (22.37 ± 3.06 g body weight) were divided among four cylindrical plastic net cages, each placed in a 1,000 L feeding tank. The experiment followed a completely randomized design in a 2 x 4 factorial arrangement (2 feces collection intervals and 4 ingredients) with four replicates. Statistical tests detected no effect of the interaction between collection interval and ingredient type on the digestibility coefficients. The collection interval did not affect the digestibility of protein or energy. The physical characteristics of the feces of juvenile oscars apparently make them less sensitive to nutrient loss by leaching, allowing longer collection intervals. Protein digestibility of the evaluated ingredients was similar, showing that the apparent digestibility of animal and plant ingredients by juvenile oscars is efficient. Energy digestibility coefficients were higher for fish meal and soybean meal than for wheat bran and corn. Carbohydrate-rich ingredients (wheat bran and corn) had the worst energy digestibility coefficients and are therefore not used efficiently by juvenile oscars.
Abstract:
Mode of access: Internet.
Abstract:
"Prepared by Research Triangle Institute under contract no. OEC-0-73-6666 with U.S. Dept. of Health, Education, and Welfare."
Abstract:
Project officer: William B. Fetters.
Abstract:
Activity of 7-ethoxyresorufin-O-deethylase (EROD) in fish is certainly the best-studied biomarker of exposure applied in the field to evaluate biological effects of contamination in the marine environment. Since 1991, a feasibility study for a monitoring network using this biomarker of exposure has been conducted along French coasts. Using data obtained during several cruises, this study aims to determine the number of fish required to detect a given difference between 2 mean EROD activities, i.e. to achieve an a priori fixed statistical power (1-beta) given the significance level (alpha), variance estimates and projected ratio of unequal sample sizes (k). Mean EROD activity and standard error were estimated at each of 82 sampling stations. The inter-individual variance component was dominant in estimating the variance of mean EROD activity. Influences of alpha, beta, k and variability on sample sizes are illustrated and discussed in terms of costs. In particular, sample sizes do not have to be equal, especially if such a requirement would lead to a significant cost in sampling extra material. Finally, the feasibility of long-term monitoring is discussed.
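The kind of calculation described above can be sketched with the textbook normal-approximation formula for comparing two means with unequal group sizes, n1 = (1 + 1/k) * sigma^2 * (z_{1-alpha/2} + z_{1-beta})^2 / delta^2 and n2 = k * n1. The abstract does not give the authors' exact formula, so this formula, the assumption of a common known standard deviation, the two-sided test, the function name and the example numbers are all assumptions.

```python
import math
from statistics import NormalDist


def sample_sizes(delta, sigma, alpha=0.05, power=0.80, k=1.0):
    """Sample sizes (n1, n2 = k * n1) for a two-sided test of two means.

    delta: difference between the two means to be detected
    sigma: common standard deviation (assumed known and equal in both groups)
    alpha: significance level; power: 1 - beta; k: ratio n2 / n1
    """
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z_beta = NormalDist().inv_cdf(power)
    n1 = (1.0 + 1.0 / k) * (sigma * (z_alpha + z_beta) / delta) ** 2
    n1 = math.ceil(n1)          # round up to whole fish
    n2 = math.ceil(k * n1)
    return n1, n2


# Hypothetical inputs: detect a difference of one standard deviation.
equal_groups = sample_sizes(delta=1.0, sigma=1.0, k=1.0)    # (16, 16)
unequal_groups = sample_sizes(delta=1.0, sigma=1.0, k=2.0)  # (12, 24)
```

The unequal allocation needs fewer fish at the first station (12 instead of 16) at the price of more fish overall (36 instead of 32), which is the kind of cost trade-off the abstract refers to.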
Abstract:
Pollen counts from samples taken from storage pots throughout one year (from October to September) were adjusted by Tasei's volumetric correction coefficient for the determination of pollen sources exploited by two colonies of Nannotrigona testaceicornis in Sao Paulo, Brazil. The results obtained by this sampling technique for seven months (December to June) were compared with those from corbicula load samples taken within the same period. This species visited a large variety of plant species, but few of them were frequently used. As a rule, pollen sources that appeared at frequencies greater than 1% were found with both sampling methods and significant positive correlations (Spearman correlation coefficient) were found between their values. The pollen load sample data showed that N. testaceicornis gathered pollen throughout the external activity period.
Abstract:
Purpose: The aim of this research was to assess the dimensional accuracy of orbital prostheses based on reversed images generated by computer-aided design/computer-assisted manufacturing (CAD/CAM) using computed tomography (CT) scans. Materials and Methods: CT scans of the faces of 15 adults, men and women older than 25 years of age not bearing any congenital or acquired craniofacial defects, were processed using CAD software to produce 30 reversed three-dimensional models of the orbital region. These models were then processed using the CAM system by means of selective laser sintering to generate surface prototypes of the volunteers' orbital regions. Two moulage impressions of the faces of each volunteer were taken to manufacture 15 pairs of casts. Orbital defects were created on the right or left side of each cast. The surface prototypes were adapted to the casts and then flasked to fabricate silicone prostheses. The establishment of anthropometric landmarks on the orbital region and facial midline allowed for the data collection of 31 linear measurements, used to assess the dimensional accuracy of the orbital prostheses and their location on the face. Results: The comparative analyses of the linear measurements taken from the orbital prostheses and the opposite sides that originated the surface prototypes demonstrated that the orbital prostheses presented similar vertical, transversal, and oblique dimensions, as well as similar depth. There was no transverse or oblique displacement of the prostheses. Conclusion: From a clinical perspective, the small differences observed after analyzing all 31 linear measurements did not indicate facial asymmetry. The dimensional accuracy of the orbital prostheses suggested that the CAD/CAM system assessed herein may be applicable for clinical purposes. Int J Prosthodont 2010;23:271-276.