980 results for Data errors
Abstract:
Traffic safety engineers are among the early adopters of Bayesian statistical tools for analyzing crash data. As in many other areas of application, empirical Bayes methods were their first choice, perhaps because they represent an intuitively appealing, yet relatively easy to implement alternative to purely classical approaches. With the enormous progress in numerical methods made in recent years and with the availability of free, easy to use software that permits implementing a fully Bayesian approach, however, there is now ample justification to progress towards fully Bayesian analyses of crash data. The fully Bayesian approach, in particular as implemented via multi-level hierarchical models, has many advantages over the empirical Bayes approach. In a full Bayesian analysis, prior information and all available data are seamlessly integrated into posterior distributions on which practitioners can base their inferences. All uncertainties are thus accounted for in the analyses and there is no need to pre-process data to obtain Safety Performance Functions and other such prior estimates of the effect of covariates on the outcome of interest. In this light, fully Bayesian methods may well be less costly to implement and may result in safety estimates with more realistic standard errors. In this manuscript, we present the full Bayesian approach to analyzing traffic safety data and focus on highlighting the differences between the empirical Bayes and the full Bayes approaches. We use an illustrative example to discuss a step-by-step Bayesian analysis of the data and to show some of the types of inferences that are possible within the full Bayesian framework.
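The empirical Bayes step that this abstract contrasts with the full Bayes approach can be sketched in a few lines. The sketch assumes the common Poisson-gamma setup, which the abstract does not spell out: a Safety Performance Function supplies a prior mean `mu` for a site, `phi` is a hypothetical inverse-dispersion parameter, and `y` is the observed crash count. All names and numbers are illustrative only.

```python
def eb_estimate(y, mu, phi):
    """Posterior mean of a site's crash rate under an assumed Poisson-gamma model.

    Assumes y | lam ~ Poisson(lam) and lam ~ Gamma(shape=phi, rate=phi / mu),
    so the prior mean is mu. These modelling choices are illustrative
    assumptions, not details taken from the abstract.
    """
    weight = phi / (phi + mu)  # weight placed on the prior (SPF) mean
    return weight * mu + (1.0 - weight) * y


def eb_variance(y, mu, phi):
    """Posterior variance under the same assumed model."""
    shape = phi + y          # posterior gamma shape
    rate = phi / mu + 1.0    # posterior gamma rate
    return shape / rate ** 2


if __name__ == "__main__":
    # Hypothetical site: 7 observed crashes, SPF prediction of 3, phi of 2.
    print(eb_estimate(y=7, mu=3.0, phi=2.0))
    print(eb_variance(y=7, mu=3.0, phi=2.0))
```

The estimate treats `mu` and `phi` as known constants, so the uncertainty from estimating them is not carried into the variance. In a fully Bayesian hierarchical model those quantities would receive priors of their own, which is the source of the more realistic standard errors the abstract mentions.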
Abstract:
The human body is composed of a huge number of cells acting together in a concerted manner. The current understanding is that proteins perform most of the activities needed to keep a cell alive. The DNA, on the other hand, stores in the genome the information on how to produce the different proteins. Regulating gene transcription is thus the first important step that can affect the life of a cell and modify its functions and its responses to the environment. Regulation is a complex operation that involves specialized proteins, the transcription factors. Transcription factors (TFs) can bind to DNA and activate the processes leading to the expression of genes into new proteins. Errors in this process may lead to diseases. In particular, some transcription factors have been associated with cancer, a lethal pathological state characterized by uncontrolled cellular proliferation, invasion of healthy tissues and abnormal responses to stimuli. Understanding cancer-related regulatory programs is a difficult task, often involving several TFs interacting together and influencing each other's activity. This thesis presents new computational methodologies to study gene regulation. In addition, we present applications of our methods to the understanding of cancer-related regulatory programs. The understanding of transcriptional regulation is a major challenge. We address this difficult question by combining computational approaches with large collections of heterogeneous experimental data. In detail, we design signal processing tools to recover transcription factor binding sites on the DNA from genome-wide surveys such as chromatin immunoprecipitation assays on tiling arrays (ChIP-chip). We then use the localization of TF binding to explain the expression levels of regulated genes. In this way we identify a regulatory synergy between two TFs, the oncogene C-MYC and SP1.
C-MYC and SP1 bind preferentially at promoters, and when SP1 binds next to C-MYC on the DNA, the nearby gene is strongly expressed. The association between the two TFs at promoters is reflected by the conservation of the binding sites across mammals and by the permissive underlying chromatin states; it represents an important control mechanism involved in cellular proliferation and thereby in cancer. Secondly, we identify the characteristics of the target genes of the TF estrogen receptor alpha (hERa) and we study the influence of hERa in regulating transcription. Upon estrogen hormone signaling, hERa binds to DNA to regulate transcription of its targets in concert with its co-factors. To overcome the scarcity of experimental data about the binding sites of other TFs that may interact with hERa, we conduct in silico analysis of the sequences underlying the ChIP sites using the collection of position weight matrices (PWMs) of hERa partners, the TFs FOXA1 and SP1. We combine ChIP-chip and ChIP-paired-end-diTags (ChIP-PET) data about hERa binding on DNA with the sequence information to explain gene expression levels in a large collection of cancer tissue samples and in studies of the response of cells to estrogen. We confirm that hERa binding sites are distributed throughout the genome. However, we distinguish between binding sites near promoters and binding sites along the transcripts. The first group shows weak binding of hERa and high occurrence of SP1 motifs, in particular near estrogen-responsive genes. The second group shows strong binding of hERa and a significant correlation between the number of binding sites along a gene and the strength of gene induction in the presence of estrogen. Some binding sites of the second group also show the presence of FOXA1, but the role of this TF still needs to be investigated. Different mechanisms have been proposed to explain hERa-mediated induction of gene expression.
Our work supports the model of hERa activating gene expression from distal binding sites by interacting with promoter-bound TFs, such as SP1. hERa has been associated with survival rates of breast cancer patients, though explanatory models are still incomplete: this result is important to better understand how hERa can control gene expression. Thirdly, we address the difficult question of regulatory network inference. We tackle this problem by analyzing time series of biological measurements such as quantifications of mRNA levels or protein concentrations. Our approach uses well-established penalized linear regression models in which we impose sparseness on the connectivity of the regulatory network. We extend this method by enforcing the coherence of the regulatory dependencies: a TF must behave coherently as an activator or as a repressor on all its targets. This requirement is implemented as constraints on the signs of the regressed coefficients in the penalized linear regression model. Our approach is better at reconstructing meaningful biological networks than previous methods based on penalized regression. The method is tested on the DREAM2 challenge of reconstructing a five-gene/TF regulatory network, obtaining the best performance in the "undirected signed excitatory" category. Thus, these bioinformatics methods, which are reliable, interpretable and fast enough to cover large biological datasets, have enabled us to better understand gene regulation in humans.
Abstract:
Naturalistic driving studies are the latest resource for gathering data associated with driver behavior. The University of Iowa has been studying teen driving using naturalistic methods since 2006. By instrumenting teen drivers’ vehicles with event-triggered video recorders (ETVR), we are able to record a 12-second video clip every time a vehicle exceeds a pre-set g-force threshold. Each of these video clips contains valuable data regarding the frequency and types of distractions present in vehicles driven by today’s young drivers. The 16-year-old drivers who participated in the study had a distraction present in nearly half of the events that were captured. While a lot of attention has been given to the distractions associated with technology in the vehicle (cell phones, navigation devices, entertainment systems, etc.), the most frequent type of distraction coded was the presence of teen passengers engaging in conversation (45%). Cognitive distractions, such as singing along with the radio, were the second most common distraction. Cell phone use was the third most common distraction, detected in only 10% of the events containing distraction.
Abstract:
The present study explores the statistical properties of a randomization test based on the random assignment of the intervention point in a two-phase (AB) single-case design. The focus is on randomization distributions constructed with the values of the test statistic for all possible random assignments and used to obtain p-values. The shape of those distributions is investigated for each specific data division defined by the moment at which the intervention is introduced. Another aim of the study was to test the detection of nonexistent effects (i.e., the production of false alarms) in autocorrelated data series, in which the assumption of exchangeability between observations may be untenable. In this way, it was possible to compare nominal and empirical Type I error rates in order to obtain evidence on the statistical validity of the randomization test for each individual data division. The results suggest that when either of the two phases has considerably fewer measurement times, Type I errors may be too probable and, hence, the decision-making process carried out by applied researchers may be jeopardized.
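The kind of randomization test described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's procedure: the test statistic (phase B mean minus phase A mean), the one-sided p-value, the minimum phase length `min_phase_len`, and the example series are all hypothetical choices.

```python
def mean(values):
    return sum(values) / len(values)


def ab_randomization_test(series, actual_start, min_phase_len=3):
    """Return (observed statistic, p-value) for a two-phase AB design.

    series        : measurements in time order
    actual_start  : index of the first phase-B observation
    min_phase_len : smallest allowed length of either phase (assumed value)
    """
    n = len(series)
    # Every intervention point that could have been drawn at random.
    candidates = range(min_phase_len, n - min_phase_len + 1)
    if actual_start not in candidates:
        raise ValueError("actual_start is not an admissible intervention point")

    def statistic(start):
        # Phase B mean minus phase A mean for one data division.
        return mean(series[start:]) - mean(series[:start])

    observed = statistic(actual_start)
    distribution = [statistic(start) for start in candidates]
    # One-sided p-value: share of divisions at least as large as the observed one.
    p_value = sum(s >= observed for s in distribution) / len(distribution)
    return observed, p_value


if __name__ == "__main__":
    data = [2, 3, 2, 3, 2, 8, 9, 8, 9, 8]  # made-up series, intervention at index 5
    print(ab_randomization_test(data, actual_start=5))
```

Because the p-value is a proportion over the admissible intervention points, it can never be smaller than one divided by the number of those points, so short series limit the significance levels that can be reached.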
Abstract:
Introduction: Medication errors are defined as any preventable incident that may cause harm to the patient or lead to inappropriate use of medications while these are under the control of health professionals or of the patient. Errors in the preparation and administration of medication are the most common in the hospital setting and, despite the long chain through which the drug passes, the nurse is ultimately responsible for the action and thus plays a very important role in patient safety. Nurses devote 40% of their working time to medication-related tasks. Objective: To determine whether nurses make more errors when working with stock-based medication distribution systems or with unit-dose medication distribution systems. Methodology: A quantitative, observational and descriptive study in which errors (or opportunities for error) made by the nurse in the preparation and administration phases of medication will be reported by means of a self-developed questionnaire. The elements to be identified are: the type of error, the causes that may have produced it, its potential severity and who was able to prevent it, as well as the type of professional who made it. Other relevant data are the medication involved, together with the dose and route of administration, and the distribution system used. Sampling and sample: Sampling will be non-probabilistic and by convenience. Nurses whom the investigator considers to have the characteristics needed to take part in the study will be selected, so the sample will consist of the nurses who work in unit 40 of the Hospital del Mar and use a unit-dose medication distribution system, and the nurses who work in the emergency department (specifically in the level-two area) of the Hospital del Mar and work with a stock-based medication distribution system.
Abstract:
A statewide study was performed to develop regional regression equations for estimating selected annual exceedance-probability statistics for ungaged stream sites in Iowa. The study area comprises streamgages located within Iowa and 50 miles beyond the State’s borders. Annual exceedance-probability estimates were computed for 518 streamgages by using the expected moments algorithm to fit a Pearson Type III distribution to the logarithms of annual peak discharges for each streamgage using annual peak-discharge data through 2010. The estimation of the selected statistics included a Bayesian weighted least-squares/generalized least-squares regression analysis to update regional skew coefficients for the 518 streamgages. Low-outlier and historic information were incorporated into the annual exceedance-probability analyses, and a generalized Grubbs-Beck test was used to detect multiple potentially influential low flows. Also, geographic information system software was used to measure 59 selected basin characteristics for each streamgage. Regional regression analysis, using generalized least-squares regression, was used to develop a set of equations for each flood region in Iowa for estimating discharges for ungaged stream sites with 50-, 20-, 10-, 4-, 2-, 1-, 0.5-, and 0.2-percent annual exceedance probabilities, which are equivalent to annual flood-frequency recurrence intervals of 2, 5, 10, 25, 50, 100, 200, and 500 years, respectively. A total of 394 streamgages were included in the development of regional regression equations for three flood regions (regions 1, 2, and 3) that were defined for Iowa based on landform regions and soil regions. Average standard errors of prediction range from 31.8 to 45.2 percent for flood region 1, 19.4 to 46.8 percent for flood region 2, and 26.5 to 43.1 percent for flood region 3.
The pseudo coefficients of determination for the generalized least-squares equations range from 90.8 to 96.2 percent for flood region 1, 91.5 to 97.9 percent for flood region 2, and 92.4 to 96.0 percent for flood region 3. The regression equations are applicable only to stream sites in Iowa with flows not significantly affected by regulation, diversion, channelization, backwater, or urbanization and with basin characteristics within the range of those used to develop the equations. These regression equations will be implemented within the U.S. Geological Survey StreamStats Web-based geographic information system tool. StreamStats allows users to click on any ungaged site on a river and compute estimates of the eight selected statistics; in addition, 90-percent prediction intervals and the measured basin characteristics for the ungaged sites also are provided by the Web-based tool. StreamStats also allows users to click on any streamgage in Iowa, and it provides the estimates computed for these eight selected statistics for that streamgage.
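The equivalence between annual exceedance probabilities and recurrence intervals quoted in this abstract is a simple reciprocal, which the short sketch below reproduces. The function names are hypothetical, and the multi-year calculation additionally assumes that years are independent, which the abstract does not state.

```python
def recurrence_interval_years(aep_percent):
    """Recurrence interval (years) for an annual exceedance probability in percent."""
    if not 0 < aep_percent <= 100:
        raise ValueError("annual exceedance probability must be in (0, 100]")
    return 100.0 / aep_percent


def chance_of_exceedance(aep_percent, years):
    """Probability of at least one exceedance in a span of years.

    Assumes independent years with a constant annual probability.
    """
    p = aep_percent / 100.0
    return 1.0 - (1.0 - p) ** years


if __name__ == "__main__":
    for aep in [50, 20, 10, 4, 2, 1, 0.5, 0.2]:
        print(aep, round(recurrence_interval_years(aep)))
    # Chance that a 1-percent flood is exceeded at least once in 30 years.
    print(chance_of_exceedance(1, 30))
```

The second function is a reminder that a "100-year" flood is not a once-per-century event: under the independence assumption it has a sizeable chance of occurring within a few decades.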
Abstract:
US Geological Survey (USGS) based elevation data are the most commonly used data source for highway hydraulic analysis; however, because of their limited vertical accuracy, USGS data may be too “coarse” to adequately describe surface profiles of watershed areas or drainage patterns. Additionally, hydraulic design requires delineation of much smaller drainage areas (watersheds) than other hydrologic applications, such as environmental, ecological, and water resource management. This research study investigated whether higher resolution LIDAR-based surface models would provide better delineation of watersheds and drainage patterns than surface models created from standard USGS-based elevation data. Differences in runoff values were the metric used to compare the data sets. The two data sets were compared for a pilot study area along the Iowa 1 corridor between Iowa City and Mount Vernon. Given the limited breadth of the analysis corridor, areas of particular emphasis were the location of drainage area boundaries and flow patterns parallel to and intersecting the road cross section. Traditional highway hydrology does not appear to be significantly impacted, or benefited, by the increased terrain detail that LIDAR provided for the study area. In fact, hydrologic outputs, such as streams and watersheds, may be too sensitive to the increased horizontal resolution and/or errors in the data set. However, a true comparison of LIDAR and USGS-based data sets of equal size and encompassing entire drainage areas could not be performed in this study. Differences may also result in areas with much steeper slopes or significant changes in terrain. LIDAR may provide valuable detail in areas of modified terrain, such as roads. Better representations of channel and terrain detail in the vicinity of the roadway may be useful in modeling problem drainage areas and evaluating structural surety during and after significant storm events.
Furthermore, LIDAR may be used to verify the intended/expected drainage patterns at newly constructed highways. LIDAR will likely provide the greatest benefit for highway projects in flood plains and areas with relatively flat terrain where slight changes in terrain may have a significant impact on drainage patterns.
Abstract:
The objective of this work was to develop a procedure to estimate soybean crop areas in Rio Grande do Sul state, Brazil. Estimations were made based on the temporal profiles of the enhanced vegetation index (Evi) calculated from moderate resolution imaging spectroradiometer (Modis) images. The methodology developed for soybean classification was named Modis crop detection algorithm (MCDA). The MCDA provides soybean area estimates in December (first forecast), using images from the sowing period, and March (second forecast), using images from the sowing and maximum crop development periods. The results obtained by the MCDA were compared with the official estimates on soybean area of the Instituto Brasileiro de Geografia e Estatística. The coefficients of determination ranged from 0.91 to 0.95, indicating good agreement between the estimates. For the 2000/2001 crop year, the MCDA soybean crop map was evaluated using a soybean crop map derived from Landsat images, and the overall map accuracy was approximately 82%, with similar commission and omission errors. The MCDA was able to estimate soybean crop areas in Rio Grande do Sul State and to generate an annual thematic map with the geographic position of the soybean fields. The soybean crop area estimates by the MCDA are in good agreement with the official agricultural statistics.
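The map-accuracy figures mentioned in this abstract (overall accuracy, commission and omission errors) follow from a confusion matrix between the classified map and a reference map. The sketch below shows the standard definitions for a single class; the pixel counts are made up for illustration and are not the study's data.

```python
def map_accuracy(tp, fp, fn, tn):
    """Accuracy measures for one class (e.g. soybean) against a reference map.

    tp : pixels mapped as the class and confirmed by the reference
    fp : pixels mapped as the class but not in the reference (commission)
    fn : reference pixels of the class that the map missed (omission)
    tn : pixels correctly left out of the class
    """
    total = tp + fp + fn + tn
    return {
        "overall_accuracy": (tp + tn) / total,
        "commission_error": fp / (tp + fp),  # share of mapped class that is wrong
        "omission_error": fn / (tp + fn),    # share of reference class that is missed
    }


if __name__ == "__main__":
    # Hypothetical counts chosen only to show the arithmetic.
    print(map_accuracy(tp=40, fp=10, fn=10, tn=40))
```

When commission and omission errors are similar, as the abstract reports, over-mapping and under-mapping roughly cancel, which helps explain why area totals can agree well with official statistics even when the per-pixel accuracy is below 100%.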
Abstract:
This letter presents a lossless data hiding scheme for digital images which uses an edge detector to locate plain areas for embedding. The proposed method takes advantage of the well-known gradient adjacent prediction utilized in image coding. In the suggested scheme, prediction errors and edge values are first computed; then, excluding the edge pixels, the prediction errors are slightly modified by shifting them to embed data. The aim of the proposed scheme is to decrease the number of modified pixels and so improve transparency by keeping the edge pixel values of the image. The experimental results demonstrate that the proposed method is capable of hiding more secret data than the known techniques at the same PSNR, showing that using an edge detector to locate plain areas for lossless data embedding can enhance performance in terms of data embedding rate versus the PSNR of the marked image with respect to the original image.
Abstract:
The purpose of this bachelor's thesis was to chart scientific research articles in order to present the factors that contribute to medication errors made by nurses in a hospital setting, and to introduce methods to prevent medication errors. Additionally, international and Finnish research was combined and the findings were considered in relation to the Finnish health care system. A literature review of 23 scientific articles was conducted. Data were searched systematically from the CINAHL, MEDIC and MEDLINE databases, and also manually. The literature was analysed and the findings combined using inductive content analysis. Findings revealed that both organisational and individual factors contributed to medication errors. High workload, communication breakdowns, unsuitable working environment, distractions and interruptions, and similar medication products were identified as organisational factors. Individual factors included nurses' inability to follow protocol, inadequate knowledge of medications and personal qualities of the nurse. Developing and improving the physical environment, error reporting, and medication management protocols were emphasised as methods to prevent medication errors. Investing in the staff's competence and well-being was also identified as a prevention method. The number of Finnish articles was small, and therefore the applicability of the findings to Finland is difficult to assess. However, the findings seem to fit the Finnish health care system relatively well. Further research is needed to identify the factors that contribute to medication errors in Finland. This is necessary for the development of methods to prevent medication errors that fit into the Finnish health care system.
Abstract:
This study is an empirical analysis of the impact of direct tax revenue budgeting errors on fiscal deficits. Using panel data from 26 Swiss cantons between 1980 and 2002, we estimate a single equation model on the fiscal balance, as well as a simultaneous equation model on revenue and expenditure. We use new data on budgeted and actual tax revenue to show that underestimating tax revenue significantly reduces fiscal deficits. Furthermore, we show that this effect is channeled through decreased expenditure. The effects of over and underestimation turn out to be symmetric.
Abstract:
The market place of the twenty-first century will demand that manufacturing assume a crucial role in a new competitive field. Two potential resources in the area of manufacturing are advanced manufacturing technology (AMT) and empowered employees. Surveys in Finland have shown the need to invest in new AMT in the Finnish sheet metal industry in the 1990s. In this effort the focus has been on hard technology, and less attention has been paid to the utilization of human resources. In many manufacturing companies an appreciable portion of the achievable profit is wasted due to poor quality of planning and workmanship. This thesis inspects the distribution of production errors in the production flow of sheet metal part based constructions. The objective of the thesis is to analyze the origins of production errors in that production flow. Employee empowerment is also investigated in theory, and its role in reducing the overall number of production errors is discussed. This study is most relevant to the sheet metal part fabricating industry, which produces sheet metal part based constructions for the electronics and telecommunication industries. This study concentrates on the manufacturing function of a company and is based on a field study carried out in five Finnish case factories. In each case factory studied, the work phases most prone to production errors were identified. It can be assumed that most production errors are caused in manually operated work phases and in mass production work phases. However, no common pattern could be found in the collected data on the distribution of production errors in the production flow. The most important finding was nevertheless that most of the production errors in each case factory studied belong to the category of 'human activity based errors'. This result indicates that most of the problems in the production flow are related to employees or work organization.
Development activities must therefore be focused on the development of employee skills or on the development of work organization. Employee empowerment gives the right tools and methods to achieve this.
A priori parameterisation of the CERES soil-crop models and tests against several European data sets
Abstract:
Mechanistic soil-crop models have become indispensable tools to investigate the effect of management practices on the productivity or environmental impacts of arable crops. Ideally these models may claim to be universally applicable because they simulate the major processes governing the fate of inputs such as fertiliser nitrogen or pesticides. However, because they deal with complex systems and uncertain phenomena, site-specific calibration is usually a prerequisite to ensure their predictions are realistic. This implies that some experimental knowledge of the system to be simulated should be available prior to any modelling attempt, which poses a major limitation to practical applications of models. Because the demand for more general simulation results is high, modellers have nevertheless taken the bold step of extrapolating a model tested within a limited sample of real conditions to a much larger domain. While methodological questions are often disregarded in this extrapolation process, they are specifically addressed in this paper, in particular the issue of a priori model parameterisation. We thus implemented and tested a standard procedure to parameterise the soil components of a modified version of the CERES models. The procedure converts routinely available soil properties into functional characteristics by means of pedo-transfer functions. The resulting predictions of soil water and nitrogen dynamics, as well as crop biomass, nitrogen content and leaf area index, were compared to observations from trials conducted in five locations across Europe (southern Italy, northern Spain, northern France and northern Germany). In three cases, the model’s performance was judged acceptable when compared to experimental errors on the measurements, based on a test of the model’s root mean squared error (RMSE). Significant deviations between observations and model outputs were however noted at all sites, and could be ascribed to various model routines.
In decreasing order of importance, these were: water balance, the turnover of soil organic matter, and crop N uptake. A better match to field observations could therefore be achieved by visually adjusting related parameters, such as field-capacity water content or the size of soil microbial biomass. As a result, model predictions fell within the measurement errors at all sites for most variables, and the model’s RMSE was within the range of published values for similar tests. We conclude that the proposed a priori method yields acceptable simulations with only a 50% probability, a figure which may be greatly increased through a posteriori calibration. Modellers should thus exercise caution when extrapolating their models to a large sample of pedo-climatic conditions for which they have only limited information.
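The root mean squared error used in this abstract to judge model performance can be computed as in the sketch below. The acceptance rule shown (RMSE no larger than a given measurement error) is an assumed simplification, since the abstract does not specify the exact test, and the sample values are invented.

```python
import math


def rmse(observed, simulated):
    """Root mean squared error between paired observations and model outputs."""
    if len(observed) != len(simulated) or not observed:
        raise ValueError("need two non-empty sequences of equal length")
    squared = [(s - o) ** 2 for o, s in zip(observed, simulated)]
    return math.sqrt(sum(squared) / len(squared))


def within_measurement_error(observed, simulated, measurement_error):
    """Assumed acceptance rule: model RMSE does not exceed the measurement error."""
    return rmse(observed, simulated) <= measurement_error


if __name__ == "__main__":
    obs = [2.0, 4.0, 6.0]   # invented observations
    sim = [2.5, 3.5, 6.5]   # invented model outputs
    print(rmse(obs, sim))
    print(within_measurement_error(obs, sim, measurement_error=0.6))
```

Because RMSE squares each deviation before averaging, a few large misses weigh more heavily than many small ones, which is worth keeping in mind when comparing values across sites or variables.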
Abstract:
BACKGROUND: Worldwide data for cancer survival are scarce. We aimed to initiate worldwide surveillance of cancer survival by central analysis of population-based registry data, as a metric of the effectiveness of health systems, and to inform global policy on cancer control. METHODS: Individual tumour records were submitted by 279 population-based cancer registries in 67 countries for 25·7 million adults (age 15-99 years) and 75 000 children (age 0-14 years) diagnosed with cancer during 1995-2009 and followed up to Dec 31, 2009, or later. We looked at cancers of the stomach, colon, rectum, liver, lung, breast (women), cervix, ovary, and prostate in adults, and adult and childhood leukaemia. Standardised quality control procedures were applied; errors were corrected by the registry concerned. We estimated 5-year net survival, adjusted for background mortality in every country or region by age (single year), sex, and calendar year, and by race or ethnic origin in some countries. Estimates were age-standardised with the International Cancer Survival Standard weights. FINDINGS: 5-year survival from colon, rectal, and breast cancers has increased steadily in most developed countries. For patients diagnosed during 2005-09, survival for colon and rectal cancer reached 60% or more in 22 countries around the world; for breast cancer, 5-year survival rose to 85% or higher in 17 countries worldwide. Liver and lung cancer remain lethal in all nations: for both cancers, 5-year survival is below 20% everywhere in Europe, in the range 15-19% in North America, and as low as 7-9% in Mongolia and Thailand. Striking rises in 5-year survival from prostate cancer have occurred in many countries: survival rose by 10-20% between 1995-99 and 2005-09 in 22 countries in South America, Asia, and Europe, but survival still varies widely around the world, from less than 60% in Bulgaria and Thailand to 95% or more in Brazil, Puerto Rico, and the USA. 
For cervical cancer, national estimates of 5-year survival range from less than 50% to more than 70%; regional variations are much wider, and improvements between 1995-99 and 2005-09 have generally been slight. For women diagnosed with ovarian cancer in 2005-09, 5-year survival was 40% or higher only in Ecuador, the USA, and 17 countries in Asia and Europe. 5-year survival for stomach cancer in 2005-09 was high (54-58%) in Japan and South Korea, compared with less than 40% in other countries. By contrast, 5-year survival from adult leukaemia in Japan and South Korea (18-23%) is lower than in most other countries. 5-year survival from childhood acute lymphoblastic leukaemia is less than 60% in several countries, but as high as 90% in Canada and four European countries, which suggests major deficiencies in the management of a largely curable disease. INTERPRETATION: International comparison of survival trends reveals very wide differences that are likely to be attributable to differences in access to early diagnosis and optimum treatment. Continuous worldwide surveillance of cancer survival should become an indispensable source of information for cancer patients and researchers and a stimulus for politicians to improve health policy and health-care systems. FUNDING: Canadian Partnership Against Cancer (Toronto, Canada), Cancer Focus Northern Ireland (Belfast, UK), Cancer Institute New South Wales (Sydney, Australia), Cancer Research UK (London, UK), Centers for Disease Control and Prevention (Atlanta, GA, USA), Swiss Re (London, UK), Swiss Cancer Research foundation (Bern, Switzerland), Swiss Cancer League (Bern, Switzerland), and University of Kentucky (Lexington, KY, USA).