14 results for sampling error

in Biblioteca Digital da Produção Intelectual da Universidade de São Paulo (Digital Library of Intellectual Production of the University of São Paulo)


Relevance:

60.00%

Abstract:

Background: An important challenge for transcript counting methods such as Serial Analysis of Gene Expression (SAGE), "Digital Northern" or Massively Parallel Signature Sequencing (MPSS) is to carry out statistical analyses that account for within-class variability, i.e., variability due to the intrinsic biological differences among sampled individuals of the same class, and not only variability due to technical sampling error. Results: We introduce a Bayesian model that accounts for within-class variability by means of a mixture distribution. We show that the previously available approaches of aggregation in pools ("pseudo-libraries") and the Beta-Binomial model are particular cases of the mixture model. We illustrate our method with a brain tumor vs. normal comparison using SAGE data from public databases. We show examples of tags regarded as differentially expressed with high significance if the within-class variability is ignored, but clearly not so significant once it is accounted for. Conclusion: Using available information about biological replicates, one can transform a list of candidate transcripts showing differential expression into a more reliable one. Our method is freely available, under GPL/GNU copyleft, through a user-friendly web-based online tool or as R language scripts at the supplemental website.
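To make the contrast between pure sampling error and within-class variability concrete, the following minimal sketch compares the variance of a tag count under a plain binomial model with the variance under a Beta-Binomial model. It illustrates only the Beta-Binomial special case named in the abstract, not the authors' Bayesian mixture model; the function names, the parameters `a` and `b`, and the small library size are assumptions chosen for illustration.

```python
from math import exp, lgamma


def log_beta(a, b):
    """Logarithm of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def log_choose(n, k):
    """Logarithm of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def beta_binomial_pmf(k, n, a, b):
    """P(K = k) when K ~ Binomial(n, p) and p ~ Beta(a, b)."""
    return exp(log_choose(n, k) + log_beta(k + a, n - k + b) - log_beta(a, b))


def binomial_variance(n, p):
    """Variance from technical sampling error alone."""
    return n * p * (1 - p)


def beta_binomial_variance(n, a, b):
    """Variance when the tag proportion itself varies between individuals."""
    p = a / (a + b)
    return n * p * (1 - p) * (a + b + n) / (a + b + 1)


# Hypothetical numbers: 50 sequenced tags, mean tag proportion 0.2.
n, a, b = 50, 2.0, 8.0
print(binomial_variance(n, a / (a + b)))   # sampling error only
print(beta_binomial_variance(n, a, b))     # sampling error plus within-class spread
```

With these assumed values the Beta-Binomial variance is several times the binomial one, which is why a difference that looks highly significant under the binomial model can lose significance once within-class variability is modelled.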

Relevance:

20.00%

Abstract:

Contamination by butyltin compounds (BTs) has been reported in estuarine environments worldwide, with serious impacts on the biota of these areas. Because BTs can be degraded under varying environmental conditions such as incident light and salinity, short-term variations in these factors may lead to inaccurate estimates of BT concentrations in nature. The present study therefore aimed to evaluate whether measurements of BTs in estuarine sediments are influenced by different sampling conditions, including period of the day (day or night), tidal zone (intertidal or subtidal), and tide (high or low). The study area is located on the southeastern Brazilian coast, in the São Vicente Estuary, at Pescadores Beach, where BT contamination was previously detected. Three replicate samples of surface sediment were collected randomly in each combination of period of the day, tidal zone, and tide condition, from three subareas along the beach, totaling 72 samples. BTs were analyzed by GC-PFPD using a tin filter and a VF-5 column, by means of a validated method. The concentrations of tributyltin (TBT), dibutyltin (DBT), and monobutyltin (MBT) ranged from undetectable to 161 ng Sn g⁻¹ (d.w.). In most samples (71%), only MBT was quantifiable, whereas TBT was measured in only 14, suggesting either an old contamination or rapid degradation processes. DBT was found in 27 samples, but could be quantified in only one. MBT concentrations did not differ significantly with time of day, zone, or tide condition. DBT and TBT could not be compared under all these environmental conditions, because only a few samples were above the quantification limit. Pooled samples of TBT did not reveal any difference between day and night. These results indicate that, in assessing contamination by butyltin compounds, surface-sediment samples can be collected under any environmental conditions.
However, the wide variation of BT concentrations in the study area, i.e., over a very small geographic scale, illustrates the need for representative hierarchical and composite sampling designs that are compatible with the multiscalar temporal and spatial variability common to most marine systems. Such sampling designs will be necessary for future attempts to quantitatively evaluate and monitor the occurrence and impact of these compounds in nature.

Relevance:

20.00%

Abstract:

This work develops a computational approach for boundary and initial-value problems that uses operational matrices to run an evolution process in a Hilbert space. In addition, upper bounds for the errors in the solutions and in their derivatives can be estimated, providing accuracy measures.

Relevance:

20.00%

Abstract:

Estimates of evapotranspiration on a local scale are important information for agricultural and hydrological practices. However, equations that estimate potential evapotranspiration from temperature data alone, although simple to use, are usually less trustworthy than the Food and Agriculture Organization (FAO) Penman-Monteith standard method. The present work describes two correction procedures for temperature-based potential evapotranspiration estimates that make the results more reliable. Initially, the standard FAO Penman-Monteith method was evaluated with a complete climatologic data set for the period between 2002 and 2006. Then temperature-based estimates by the Camargo and Jensen-Haise methods were adjusted by error autocorrelation evaluated over biweekly and monthly periods. In a second adjustment, simple linear regression was applied. The adjusted equations were validated with climatic data available for the year 2001. Both proposed methodologies showed good agreement with the standard method, indicating that they can be used for local potential evapotranspiration estimates.
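The second adjustment described above, simple linear regression of the standard-method values on the temperature-based estimates, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' fitted equations: the function names and the example numbers are hypothetical, and no coefficients from the study are reproduced.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def adjust(estimates, slope, intercept):
    """Apply the fitted correction to new temperature-based estimates."""
    return [intercept + slope * e for e in estimates]


# Hypothetical calibration data (mm/day): temperature-based estimates (x)
# against the standard-method values (y) for the same periods.
temperature_based = [2.0, 3.0, 4.0, 5.0]
standard_method = [2.6, 3.5, 4.4, 5.3]

slope, intercept = fit_line(temperature_based, standard_method)
corrected = adjust([3.5, 4.5], slope, intercept)
print(slope, intercept, corrected)
```

In the study's setup the line would be fitted on the 2002 to 2006 data and then checked against an independent year, which is what the validation with 2001 data provides.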

Relevance:

20.00%

Abstract:

We address the problem of selecting the best linear unbiased predictor (BLUP) of the latent value (e.g., fasting serum glucose level) of sample subjects with heteroskedastic measurement errors. Using a simple example, we compare the usual mixed model BLUP to a similar predictor based on a mixed model framed in a finite population (FPMM) setup with two sources of variability, the first of which corresponds to simple random sampling and the second to heteroskedastic measurement errors. Under this last approach, we show that when measurement errors are subject-specific, the BLUP shrinkage constants are based on a pooled measurement error variance, as opposed to the individual ones generally considered for the usual mixed model BLUP. In contrast, when the heteroskedastic measurement errors are measurement condition-specific, the FPMM BLUP involves different shrinkage constants. We also show that in this setup, when measurement errors are subject-specific, the usual mixed model predictor is biased but has a smaller mean squared error than the FPMM BLUP, which points to some difficulties in the interpretation of such predictors. (C) 2011 Elsevier B.V. All rights reserved.

Relevance:

20.00%

Abstract:

The aim of this study was to estimate calibrated values for dietary data obtained by the Food Frequency Questionnaire for Adolescents (FFQA) and to illustrate the effect of this approach on food consumption data. The adolescents were assessed on two occasions, with an average interval of twelve months. In 2004, 393 adolescents participated, and 289 were then reassessed in 2005. Dietary data obtained by the FFQA were calibrated using the regression coefficients estimated from the average of two 24-hour recalls (24HR) of the subsample. The calibrated values were similar to the 24HR reference measurement in the subsample. In 2004 and 2005, a significant difference was observed between the average consumption levels of the FFQA before and after calibration for all nutrients. With the use of calibrated data, the proportion of schoolchildren who had fiber intake below the recommended level increased. Calibrated data can therefore be used to obtain adjusted associations, owing to the reclassification of subjects within the predetermined categories.

Relevance:

20.00%

Abstract:

Since a genome is a discrete sequence, the elements of which belong to a set of four letters, the question as to whether or not there is an error-correcting code underlying DNA sequences is unavoidable. The most common approach to answering this question is to propose a methodology to verify the existence of such a code. However, none of the methodologies proposed so far, although quite clever, has achieved that goal. In a recent work, we showed that DNA sequences can be identified as codewords in a class of cyclic error-correcting codes known as Hamming codes. In this paper, we show that a complete intron-exon gene, and even a plasmid genome, can be identified as a Hamming code codeword as well. Although this does not constitute a definitive proof that there is an error-correcting code underlying DNA sequences, it is the first evidence in this direction.

Relevance:

20.00%

Abstract:

Workplace accidents involving machines are relevant because of their magnitude and their impact on worker health. Despite well-established criticism, explanations centered on operator error remain predominant among industry professionals, hampering preventive measures and the improvement of production-system reliability. Several initiatives have been adopted by enforcement agencies in partnership with universities to stimulate the production and diffusion of analysis methodologies with a systemic approach. Starting from one accident involving a worker who operated a brake-clutch type mechanical press, the article explores cognitive aspects and the existence of traps in the operation of this machine. The machine was a large press that, despite being equipped with a light curtain in the areas of access to the pressing zone, did not meet legal requirements. The safety devices gave rise to an illusion of safety, permitting activation of the machine while a worker was still within the operational zone. Preventive interventions must stimulate the tailoring of systems to the characteristics of workers, minimizing the creation of traps and encouraging safety policies and practices that replace judgments of the behaviors involved in accidents with analyses of the reasons that lead workers to act in that manner.

Relevance:

20.00%

Abstract:

Recent research has shown that the performance of metaheuristics can be affected by population initialization. Opposition-based Differential Evolution (ODE), Quasi-Oppositional Differential Evolution (QODE), and Uniform-Quasi-Opposition Differential Evolution (UQODE) are three state-of-the-art methods that improve the performance of the Differential Evolution (DE) algorithm based on population initialization and different search strategies. In a different approach to achieving similar results, this paper presents a technique to discover promising regions in the continuous search space of an optimization problem. Using machine-learning techniques, the algorithm, named Smart Sampling (SS), finds regions with a high probability of containing a global optimum. Next, a metaheuristic can be initialized inside each region to find that optimum. SS and DE were combined (originating the SSDE algorithm) to evaluate our approach, and experiments were conducted on the same set of benchmark functions used by the ODE, QODE and UQODE authors. Results show that the total number of function evaluations required by DE to reach the global optimum can be significantly reduced and that the success rate improves if SS is employed first. These results are also consistent with results from the literature that state the importance of an adequate starting population. Moreover, SS is more effective at finding initial populations of superior quality than the other three algorithms, which employ oppositional learning. Finally, and most importantly, the performance of SS in finding promising regions is independent of the metaheuristic with which it is combined, making SS suitable for improving the performance of a large variety of optimization techniques. (C) 2012 Elsevier Inc. All rights reserved.

Relevance:

20.00%

Abstract:

This study uses several measures derived from the error matrix to compare two thematic maps generated with the same sample set. The reference map was generated with all the sample elements, and the map set as the model was generated without the two points detected as influential by the analysis of local influence diagnostics. The data analyzed refer to wheat productivity in an agricultural area of 13.55 ha, considering a sampling grid of 50 x 50 m comprising 50 georeferenced sample elements. The comparison measures derived from the error matrix indicated that, despite some similarity between the maps, they are different. The difference between the production estimated by the reference map and the actual production was 350 kilograms. The same difference calculated with the model map was 50 kilograms, indicating that the study of influential points is of fundamental importance for obtaining a more reliable estimate, and that the use of measures obtained from the error matrix is a good option for comparing thematic maps.
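As an illustration of what measures derived from an error matrix look like, the sketch below computes overall accuracy and Cohen's kappa, two commonly used agreement measures. The abstract does not say which measures the study used, so this choice is an assumption, and the 2 x 2 matrix of class counts is hypothetical.

```python
def overall_accuracy(matrix):
    """Share of map cells assigned to the same class in both maps."""
    total = sum(sum(row) for row in matrix)
    agree = sum(matrix[i][i] for i in range(len(matrix)))
    return agree / total


def cohen_kappa(matrix):
    """Agreement between the two maps corrected for chance agreement."""
    size = len(matrix)
    total = sum(sum(row) for row in matrix)
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(size)) for j in range(size)]
    observed = overall_accuracy(matrix)
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    return (observed - expected) / (1 - expected)


# Hypothetical error matrix: rows are classes in the reference map,
# columns are classes in the model map, entries are cell counts.
error_matrix = [
    [20, 5],
    [10, 15],
]

print(overall_accuracy(error_matrix))
print(cohen_kappa(error_matrix))
```

Values near 1 for both measures would indicate nearly identical maps; lower values mean that removing the influential points changed the class assignment of a meaningful share of the area.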

Relevance:

20.00%

Abstract:

Within-site variability in species detectability is a problem common to many biodiversity assessments and can strongly bias the results. Such variability can be caused by many factors, including simple counting inaccuracies, which can be solved by increasing sample size, and temporal changes in species behavior, which make the design of the temporal sampling protocol very important as well. Here we use the example of mist-netted tropical birds to determine how design decisions in the temporal sampling protocol can alter the data collected and how these changes might affect the detection of ecological patterns, such as the species-area relationship (SAR). Using data from almost 3400 birds captured in 21,000 net-hours at 31 sites in the Brazilian Atlantic Forest, we found that the magnitude of ecological trends remained fairly stable, but the probability of detecting statistically significant ecological patterns varied depending on sampling effort and on the time of day and season in which sampling was conducted. For example, more species were detected in the wet season, but the SAR was strongest in the dry season. We found that the temporal distribution of sampling effort was more important than its total amount: similar ecological results could have been obtained with one-third of the total effort, as long as each site had been equally sampled over 2 yr. We argue that projects with the same sampling effort and spatial design but with different temporal sampling protocols are likely to report different ecological patterns, which may ultimately lead to inappropriate conservation strategies.

Relevance:

20.00%

Abstract:

Background: Air pollution in São Paulo is constantly measured by the State of São Paulo Environmental Agency; however, there is no information on the variation between places with different traffic densities. This study was intended to identify a gradient of exposure to traffic-related air pollution within different areas of São Paulo to provide information for future epidemiological studies. Methods: We measured NO2 using Palmes' diffusion tubes at 36 sites on streets chosen to be representative of different road types and traffic densities in São Paulo, in two one-week periods (July and August 2000). In each study period, two tubes were installed at each site, and two additional tubes were installed at 10 control sites. Results: Average NO2 concentrations were related to traffic density observed on the spot, to the number of vehicles counted, and to traffic density strata defined by the city Traffic Engineering Company (CET). Average NO2 concentrations were 63 μg/m³ and 49 μg/m³ in the first and second periods, respectively. Dividing the sites by the observed traffic density, we found: heavy traffic (n = 17): 64 μg/m³ (95% CI: 59 μg/m³ to 68 μg/m³); local traffic (n = 16): 48 μg/m³ (95% CI: 44 μg/m³ to 52 μg/m³) (p < 0.001). Conclusion: The differences in NO2 levels between heavy and local traffic sites are large enough to suggest the use of a more refined classification of exposure in epidemiological studies in the city. The number of vehicles counted, the traffic density observed on the spot and the traffic density strata defined by the CET might be used as proxies for traffic exposure in São Paulo when more accurate measurements are not available.

Relevance:

20.00%

Abstract:

This study uses several measures derived from the error matrix to compare two thematic maps generated with the same sample set. The reference map was generated with all the sample elements, and the map set as the model was generated without the two points detected as influential by the analysis of local influence diagnostics. The data analyzed refer to wheat productivity in an agricultural area of 13.55 ha, considering a sampling grid of 50 x 50 m comprising 50 georeferenced sample elements. The comparison measures derived from the error matrix indicated that, despite some similarity between the maps, they are different. The difference between the production estimated by the reference map and the actual production was 350 kilograms. The same difference calculated with the model map was 50 kilograms, indicating that the study of influential points is of fundamental importance for obtaining a more reliable estimate, and that the use of measures obtained from the error matrix is a good option for comparing thematic maps.

Relevance:

20.00%

Abstract:

The objective of this study was to validate three different models for predicting milk urea nitrogen under field conditions, with a view to evaluating the nutritional adequacy of diets for dairy cows and predicting the nitrogen excreted to the environment. Observations (4,749) from 855 cows were used. Milk yield, body weight (BW), days in milk and parity were recorded on the milk sampling days. Milk was sampled monthly for analysis of milk urea nitrogen (MUN), fat, protein, lactose and total solids concentration, and somatic cell count. Individual dry matter intake was estimated using the NRC (2001). The three models studied were derived from an initial model for predicting urinary nitrogen (UN). Model 1 was MUN = UN/12.54, model 2 was MUN = UN/17.6 and model 3 was MUN = UN/(0.0259 × BW), which adjusts for the effect of body weight. The models were tested for accuracy, precision and robustness. Although more accurate (mean bias = 0.94 mg/dL), model 2 was less precise (residual error = 4.50 mg/dL) than model 3 (mean bias = 1.41 and residual error = 4.11 mg/dL), while model 1 was the least accurate (mean bias = 6.94 mg/dL) and the least precise (residual error = 5.40 mg/dL). None of the models was robust, because all were influenced by almost all the variables studied. The three models for predicting milk urea nitrogen differed with respect to accuracy, precision and robustness.
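The three model equations quoted in the abstract, together with one plausible way of scoring them, can be sketched as follows. The equations come from the abstract; everything else is an assumption made for illustration: the function names, the sign convention for mean bias (observed minus predicted), the definition of residual error as the standard deviation of the prediction errors, and the example inputs. The abstract also does not state the units of UN, so none are implied here.

```python
from math import sqrt


def mun_model1(un):
    """Model 1 from the abstract: MUN = UN / 12.54."""
    return un / 12.54


def mun_model2(un):
    """Model 2 from the abstract: MUN = UN / 17.6."""
    return un / 17.6


def mun_model3(un, bw):
    """Model 3 from the abstract: MUN = UN / (0.0259 * BW)."""
    return un / (0.0259 * bw)


def mean_bias(observed, predicted):
    """Accuracy: average of observed minus predicted (assumed convention)."""
    errors = [o - p for o, p in zip(observed, predicted)]
    return sum(errors) / len(errors)


def residual_error(observed, predicted):
    """Precision: spread of the errors around the mean bias (assumed definition)."""
    errors = [o - p for o, p in zip(observed, predicted)]
    bias = sum(errors) / len(errors)
    return sqrt(sum((e - bias) ** 2 for e in errors) / len(errors))


# Hypothetical inputs for three cows: urinary nitrogen, body weight (kg)
# and observed milk urea nitrogen (mg/dL).
un_values = [176.0, 200.0, 150.0]
bw_values = [600.0, 650.0, 550.0]
observed_mun = [11.0, 12.5, 9.0]

predicted = [mun_model3(un, bw) for un, bw in zip(un_values, bw_values)]
print(mean_bias(observed_mun, predicted))
print(residual_error(observed_mun, predicted))
```

Separating the two scores in this way mirrors the abstract's finding that a model can be the more accurate one (smaller mean bias) while being less precise (larger residual error).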