3 resultados para Probabilistic methods
em BORIS: Bern Open Repository and Information System - Berna - Suiça
Resumo:
Web-scale knowledge retrieval can be enabled by distributed information retrieval, clustering Web clients to a large-scale computing infrastructure for knowledge discovery from Web documents. Based on this infrastructure, we propose to apply semiotic (i.e., sub-syntactical) and inductive (i.e., probabilistic) methods for inferring concept associations in human knowledge. These associations can be combined to form a fuzzy (i.e.,gradual) semantic net representing a map of the knowledge in the Web. Thus, we propose to provide interactive visualizations of these cognitive concept maps to end users, who can browse and search the Web in a human-oriented, visual, and associative interface.
Resumo:
A protein of a biological sample is usually quantified by immunological techniques based on antibodies. Mass spectrometry offers alternative approaches that are not dependent on antibody affinity and avidity, protein isoforms, quaternary structures, or steric hindrance of antibody-antigen recognition in case of multiprotein complexes. One approach is the use of stable isotope-labeled internal standards; another is the direct exploitation of mass spectrometric signals recorded by LC-MS/MS analysis of protein digests. Here we assessed the peptide match score summation index based on probabilistic peptide scores calculated by the PHENYX protein identification engine for absolute protein quantification in accordance with the protein abundance index as proposed by Mann and co-workers (Rappsilber, J., Ryder, U., Lamond, A. I., and Mann, M. (2002) Large-scale proteomic analysis of the human spliceosome. Genome Res. 12, 1231-1245). Using synthetic protein mixtures, we demonstrated that this approach works well, although proteins can have different response factors. Applied to high density lipoproteins (HDLs), this new approach compared favorably to alternative protein quantitation methods like UV detection of protein peaks separated by capillary electrophoresis or quantitation of protein spots on SDS-PAGE. We compared the protein composition of a well defined HDL density class isolated from plasma of seven hypercholesterolemia subjects having low or high HDL cholesterol with HDL from nine normolipidemia subjects. The quantitative protein patterns distinguished individuals according to the corresponding concentration and distribution of cholesterol from serum lipid measurements of the same samples and revealed that hypercholesterolemia in unrelated individuals is the result of different deficiencies. The presented approach is complementary to HDL lipid analysis; does not rely on complicated sample treatment, e.g. chemical reactions, or antibodies; and can be used for projective clinical studies of larger patient groups.
Resumo:
BACKGROUND Record linkage of existing individual health care data is an efficient way to answer important epidemiological research questions. Reuse of individual health-related data faces several problems: Either a unique personal identifier, like social security number, is not available or non-unique person identifiable information, like names, are privacy protected and cannot be accessed. A solution to protect privacy in probabilistic record linkages is to encrypt these sensitive information. Unfortunately, encrypted hash codes of two names differ completely if the plain names differ only by a single character. Therefore, standard encryption methods cannot be applied. To overcome these challenges, we developed the Privacy Preserving Probabilistic Record Linkage (P3RL) method. METHODS In this Privacy Preserving Probabilistic Record Linkage method we apply a three-party protocol, with two sites collecting individual data and an independent trusted linkage center as the third partner. Our method consists of three main steps: pre-processing, encryption and probabilistic record linkage. Data pre-processing and encryption are done at the sites by local personnel. To guarantee similar quality and format of variables and identical encryption procedure at each site, the linkage center generates semi-automated pre-processing and encryption templates. To retrieve information (i.e. data structure) for the creation of templates without ever accessing plain person identifiable information, we introduced a novel method of data masking. Sensitive string variables are encrypted using Bloom filters, which enables calculation of similarity coefficients. For date variables, we developed special encryption procedures to handle the most common date errors. The linkage center performs probabilistic record linkage with encrypted person identifiable information and plain non-sensitive variables. RESULTS In this paper we describe step by step how to link existing health-related data using encryption methods to preserve privacy of persons in the study. CONCLUSION Privacy Preserving Probabilistic Record linkage expands record linkage facilities in settings where a unique identifier is unavailable and/or regulations restrict access to the non-unique person identifiable information needed to link existing health-related data sets. Automated pre-processing and encryption fully protect sensitive information ensuring participant confidentiality. This method is suitable not just for epidemiological research but also for any setting with similar challenges.