927 results for Error correction coding


Relevance:

20.00%

Publisher:

Abstract:

Longitudinal surveys are increasingly used to collect event history data on person-specific processes such as transitions between labour market states. Survey-based event history data pose a number of challenges for statistical analysis. These challenges include survey errors due to sampling, non-response, attrition and measurement. This study deals with non-response, attrition and measurement errors in event history data and the bias they cause in event history analysis. The study also discusses some choices faced by a researcher using longitudinal survey data for event history analysis and demonstrates their effects. These choices include whether a design-based or a model-based approach is taken, which subset of data to use and, if a design-based approach is taken, which weights to use. The study takes advantage of the possibility of using combined longitudinal survey register data. The Finnish subset of the European Community Household Panel (FI ECHP) survey for waves 1–5 was linked at person level with longitudinal register data. Unemployment spells were used as the study variables of interest. Lastly, a simulation study was conducted to assess the statistical properties of the Inverse Probability of Censoring Weighting (IPCW) method in a survey data context. The study shows how combined longitudinal survey register data can be used to analyse and compare the non-response and attrition processes, test the missingness mechanism type and estimate the size of bias due to non-response and attrition. In our empirical analysis, initial non-response turned out to be a more important source of bias than attrition. Reported unemployment spells were subject to seam effects, omissions and, to a lesser extent, overreporting. The use of proxy interviews tended to cause spell omissions. An often-ignored phenomenon, classification error in reported spell outcomes, was also found in the data.
Neither the Missing At Random (MAR) assumption about non-response and attrition mechanisms nor the classical assumptions about measurement errors turned out to be valid. Measurement errors in both spell durations and spell outcomes were found to cause bias in estimates from event history models. Low measurement accuracy affected the estimates of the baseline hazard most. The design-based estimates based on data from respondents to all waves of interest and weighted by the last wave weights displayed the largest bias. Using all the available data, including the spells by attriters until the time of attrition, helped to reduce attrition bias. Lastly, the simulation study showed that the IPCW correction to design weights reduces bias due to dependent censoring in design-based Kaplan-Meier and Cox proportional hazards model estimators. The study discusses implications of the results for survey organisations collecting event history data, researchers using surveys for event history analysis, and researchers who develop methods to correct for non-sampling biases in event history data.
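
As a rough illustration of how weights enter a design-based Kaplan-Meier estimator of the kind mentioned in the abstract above, the following minimal Python sketch computes a weighted survival curve. The function name `weighted_kaplan_meier` and the idea of passing one precomputed weight per spell (for instance a design weight multiplied by an IPCW factor) are assumptions made for this example; the abstract does not describe the study's implementation, and estimating the IPCW factors themselves is not shown.

```python
from collections import defaultdict


def weighted_kaplan_meier(durations, events, weights):
    """Return [(time, survival)] for a weighted Kaplan-Meier estimator.

    durations: observed spell lengths
    events:    1 if the end of the spell was observed, 0 if censored
    weights:   one weight per spell (hypothetical: design weight x IPCW factor)
    """
    # Total weight of spells that end (event observed) at each distinct time.
    event_weight = defaultdict(float)
    for t, d, w in zip(durations, events, weights):
        if d:
            event_weight[t] += w

    survival = 1.0
    curve = []
    for t in sorted(event_weight):
        # Weighted risk set: spells still ongoing just before time t.
        at_risk = sum(w for u, w in zip(durations, weights) if u >= t)
        survival *= 1.0 - event_weight[t] / at_risk
        curve.append((t, survival))
    return curve


if __name__ == "__main__":
    # Three spells; the second one is censored at time 2.
    print(weighted_kaplan_meier([1, 2, 3], [1, 0, 1], [1.0, 1.0, 1.0]))
```

With all weights equal to one, the function reduces to the ordinary Kaplan-Meier product-limit estimator.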

Relevance:

20.00%

Publisher:

Abstract:

We introduce a new tool for correcting OCR errors in a repository of cultural materials. The poster is aimed at all who are interested in digital humanities and who might find our tool useful. The poster will focus on the OCR correction tool and on the background processes. We have started a project on materials published in Finno-Ugric languages in the Soviet Union in the 1920s and 1930s. The materials are digitised in Russia. As they arrive, we publish them in DSpace (fennougrica.kansalliskirjasto.fi). For research purposes, the results of the OCR must be corrected manually. For this we have built a new tool. Although similar tools exist, we found in-house development necessary in order to serve the researchers' needs. The tool enables exporting the corrected text as required by the researchers. It makes it possible to distribute the correction tasks and their supervision. After a supervisor has approved a text as finalised, the new version of the work will replace the old one in DSpace. The project has benefitted the small language communities, opened channels for cooperation in Russia and increased our capabilities in digital humanities. The OCR correction tool will be available to others.

Relevance:

20.00%

Publisher:

Abstract:

Poster at Open Repositories 2014, Helsinki, Finland, June 9-13, 2014

Relevance:

20.00%

Publisher:

Abstract:

The human immunoglobulin lambda variable locus (IGLV) is mapped at chromosome 22 band q11.1-q11.2. The 30 functional germline V-lambda genes sequenced until now have been subgrouped into 10 families (Vλ1 to Vλ10). The number of Vλ genes has been estimated at approximately 70. This locus is formed by three gene clusters (VA, VB and VC) that encompass the variable coding genes (V) responsible for the synthesis of lambda-type Ig light chains, and the Jλ-Cλ cluster with the joining segments and the constant genes. Recently the entire variable lambda gene locus was mapped by contig methodology and its one-megabase DNA totally sequenced. All the known functional V-lambda genes and pseudogenes were located. To study in detail the positions of the restriction sites surrounding the Vλ genes, we screened a human genomic DNA cosmid library and isolated a clone with an insert of 37 kb (cosmid 8.3) encompassing four functional genes (IGLV7S1, IGLV1S1, IGLV1S2 and IGLV5a), a pseudogene (VλA) and a vestigial sequence (vg1). We generated a high-resolution restriction map, locating 31 restriction sites in 37 kb of the VB cluster, a region rich in functional Vλ genes. This mapping information opens the perspective for further RFLP studies and sequencing.

Relevance:

20.00%

Publisher:

Abstract:

Nephrogenic diabetes insipidus (NDI) is a rare disease characterized by renal inability to respond properly to arginine vasopressin due to mutations in the vasopressin type 2 receptor (V2(R)) gene in affected kindreds. In most kindreds thus far reported, the mode of inheritance follows an X chromosome-linked recessive pattern, although autosomal-dominant and autosomal-recessive modes of inheritance have also been described. Studies demonstrating mutations in the V2(R) gene in affected kindreds that modify the receptor structure, resulting in a dys- or nonfunctional receptor, have been described, but phenotypically indistinguishable NDI patients with a structurally normal V2(R) gene have also been reported. In the present study, we analyzed exon 3 of the V2(R) gene in 20 unrelated individuals by direct sequencing. A C→T alteration in the third position of codon 331 (AGC→AGT), which did not alter the encoded amino acid, was found in nine individuals, including two unrelated patients with NDI. Taken together, these observations emphasize the molecular heterogeneity of a phenotypically homogeneous syndrome.
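
The statement that the codon 331 change does not alter the encoded amino acid can be checked against the standard genetic code, in which both AGC and AGT encode serine. The short Python sketch below builds the standard codon table and compares the two codons; the names `CODON_TABLE` and `is_synonymous` are illustrative and do not come from the abstract.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order
# for the first, second and third positions ('*' marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def is_synonymous(codon_a, codon_b):
    """Return True if both DNA codons encode the same amino acid."""
    return CODON_TABLE[codon_a] == CODON_TABLE[codon_b]


if __name__ == "__main__":
    print(CODON_TABLE["AGC"], CODON_TABLE["AGT"], is_synonymous("AGC", "AGT"))
```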

Relevance:

20.00%

Publisher:

Abstract:

Measles virus is a highly contagious agent that causes a major health problem in developing countries. The viral genomic RNA is single-stranded, nonsegmented and of negative polarity. Many live attenuated vaccines for measles virus have been developed using either the prototype Edmonston strain or other locally isolated measles strains. Despite the diverse geographic origins of the vaccine viruses and the different attenuation methods used, there is remarkable sequence similarity in the H, F and N genes among all vaccine strains. CAM-70 is a Japanese attenuated measles vaccine strain that has been widely used in Brazilian children and produced by Bio-Manguinhos since 1982. Previous studies have characterized this vaccine biologically and genomically. Nevertheless, only the F, H and N genes have been sequenced. In the present study we have sequenced the remaining P, M and L genes (approximately 1.6, 1.4 and 6.5 kb, respectively) to complete the genomic characterization of CAM-70 and to assess the extent of the genetic relationship between CAM-70 and other current vaccines. These genes were amplified using long-range or standard RT-PCR techniques, and the cDNA was cloned and automatically sequenced using the dideoxy chain-termination method. The sequence analysis comparing previously sequenced genotype A strains with the CAM-70 Bio-Manguinhos strain showed low divergence among them. However, the CAM-70 strains (CAM-70 Bio-Manguinhos and a recently sequenced CAM-70 submaster seed strain) were assigned to a specific group by phylogenetic analysis using the neighbor-joining method. Information about our product at the genomic level is important for monitoring vaccination campaigns and for future studies of measles virus attenuation.

Relevance:

20.00%

Publisher:

Abstract:

Our objective was to clone, express and characterize adult Dermatophagoides farinae group 1 (Der f 1) allergens to further produce recombinant allergens for future clinical applications in order to eliminate side reactions from crude extracts of mites. Based on GenBank data, we designed primers and amplified the cDNA fragment coding for Der f 1 by nested PCR. After purification and recovery, the cDNA fragment was cloned into the pMD19-T vector. The fragment was then sequenced, subcloned into the plasmid pET28a(+), expressed in Escherichia coli BL21 and identified by Western blotting. The cDNA coding for Der f 1 was cloned, sequenced and expressed successfully. Sequence analysis showed the presence of an open reading frame containing 966 bp that encodes a protein of 321 amino acids. Interestingly, homology analysis showed that Der p 1 shared more than 87% identity in amino acid sequence with Eur m 1 but only 80% with Der f 1. Furthermore, phylogenetic analyses suggested that D. pteronyssinus was evolutionarily closer to Euroglyphus maynei than to D. farinae, even though D. pteronyssinus and D. farinae belong to the same Dermatophagoides genus. A total of three cysteine peptidase active sites were found in the predicted amino acid sequence, including 127-138 (QGGCGSCWAFSG), 267-277 (NYHAVNIVGYG) and 284-303 (YWIVRNSWDTTWGDSGYGYF). Moreover, secondary structure analysis revealed that Der f 1 contained α helix (33.96%), extended strand (17.13%), β turn (5.61%) and random coil (43.30%). A simple three-dimensional model of this protein was constructed using the SWISS-MODEL server. The cDNA coding for Der f 1 was cloned, sequenced and expressed successfully. Alignment and phylogenetic analysis suggest that D. pteronyssinus is evolutionarily more similar to E. maynei than to D. farinae.
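
The figures quoted in the abstract above are arithmetically consistent, as the small Python check below shows: 966 bp corresponds to 322 codons, which gives 321 amino acids if the reading frame is counted with its stop codon. That the 966 bp includes the stop codon is an assumption made here, since the abstract does not say so. The helper `orf_protein_length` is a hypothetical name used only for this sketch.

```python
def orf_protein_length(orf_bp, includes_stop=True):
    """Number of amino acids encoded by an open reading frame of orf_bp bases.

    includes_stop: assume the base count covers the terminating stop codon.
    """
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_bp // 3
    return codons - 1 if includes_stop else codons


# Secondary structure fractions quoted in the abstract, in percent.
SECONDARY_STRUCTURE = {
    "alpha helix": 33.96,
    "extended strand": 17.13,
    "beta turn": 5.61,
    "random coil": 43.30,
}

if __name__ == "__main__":
    print(orf_protein_length(966))
    print(round(sum(SECONDARY_STRUCTURE.values()), 2))
```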