966 results for Database search
Abstract:
This paper analyses and discusses arguments that emerge from a recent discussion about the proper assessment of the evidential value of correspondences observed between the characteristics of a crime stain and those of a sample from a suspect when (i) this latter individual is found as a result of a database search and (ii) remaining database members are excluded as potential sources (because of different analytical characteristics). Using a graphical probability approach (i.e., Bayesian networks), this paper clarifies that there is no need to (i) introduce a correction factor equal to the size of the searched database (i.e., to reduce a likelihood ratio), nor to (ii) adopt a propositional level not directly related to the suspect matching the crime stain (i.e., a proposition of the kind 'some person in (outside) the database is the source of the crime stain' rather than 'the suspect (some other person) is the source of the crime stain'). The present research thus confirms existing literature on the topic that has repeatedly demonstrated that the latter two requirements (i) and (ii) should not be a cause of concern.
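The argument against a database-size correction can be illustrated with a small calculation. The sketch below is not the paper's Bayesian network; it assumes a uniform prior over a closed population of N potential sources, an error-free typing process, and a single match probability gamma for non-sources. All names and numbers are hypothetical.

```python
def posterior_source_probability(population_size, excluded_others, match_probability):
    """Posterior probability that the matching suspect is the source.

    Assumptions (illustrative only): uniform prior over `population_size`
    potential sources, no typing errors, and each non-source matches
    independently with probability `match_probability`.
    `excluded_others` counts the other typed individuals who did not match.
    """
    # Excluded individuals cannot be the source, so only the untyped
    # members of the population remain as alternatives to the suspect.
    remaining_alternatives = population_size - 1 - excluded_others
    return 1.0 / (1.0 + remaining_alternatives * match_probability)


def posterior_with_size_correction(population_size, database_size, match_probability):
    """For contrast: posterior obtained if the suspect-level likelihood
    ratio 1/gamma is divided by the database size n."""
    corrected_lr = 1.0 / (match_probability * database_size)
    prior_odds = 1.0 / (population_size - 1)
    posterior_odds = corrected_lr * prior_odds
    return posterior_odds / (1.0 + posterior_odds)


N = 1_000_000   # hypothetical population of potential sources
n = 10_000      # hypothetical database size
gamma = 1e-6    # hypothetical match probability

no_search = posterior_source_probability(N, 0, gamma)
after_search = posterior_source_probability(N, n - 1, gamma)
with_correction = posterior_with_size_correction(N, n, gamma)

print(f"single suspect, no search:        {no_search:.4f}")
print(f"only match in database of n:      {after_search:.4f}")
print(f"likelihood ratio divided by n:    {with_correction:.6f}")
```

Under these assumptions, excluding the other n - 1 database members removes alternatives, so the posterior after the search is slightly higher than without a search, whereas dividing the likelihood ratio by n lowers it sharply.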
Abstract:
This paper applies probability and decision theory in the graphical interface of an influence diagram to study the formal requirements of rationality which justify the individualization of a person found through a database search. The decision-theoretic part of the analysis studies the parameters that a rational decision maker would use to individualize the selected person. The modeling part (in the form of an influence diagram) clarifies the relationships between this decision and the ingredients that make up the database search problem, i.e., the results of the database search and the different pairs of propositions describing whether an individual is at the source of the crime stain. These analyses evaluate the desirability associated with the decision of 'individualizing' (and 'not individualizing'). They point out that this decision is a function of (i) the probability that the individual in question is, in fact, at the source of the crime stain (i.e., the state of nature), and (ii) the decision maker's preferences among the possible consequences of the decision (i.e., the decision maker's loss function). We discuss the relevance and argumentative implications of these insights with respect to recent comments in specialized literature, which suggest points of view that are opposed to the results of our study.
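The dependence of the decision on points (i) and (ii) can be sketched in a few lines. This is a minimal illustration, not the paper's influence diagram: it assumes a two-action problem in which correct decisions carry zero loss, and the loss values are hypothetical.

```python
def expected_losses(p_source, loss_false_individualization, loss_missed_individualization):
    """Expected loss of each action, assuming correct decisions cost nothing."""
    return {
        "individualize": (1.0 - p_source) * loss_false_individualization,
        "do not individualize": p_source * loss_missed_individualization,
    }


def best_decision(p_source, loss_false_individualization, loss_missed_individualization):
    """Action with the smallest expected loss."""
    losses = expected_losses(
        p_source, loss_false_individualization, loss_missed_individualization
    )
    return min(losses, key=losses.get)


def probability_threshold(loss_false_individualization, loss_missed_individualization):
    """Probability above which individualizing has the lower expected loss."""
    return loss_false_individualization / (
        loss_false_individualization + loss_missed_individualization
    )


# Hypothetical losses: a false individualization is judged 1000 times
# worse than a missed one.
print(probability_threshold(1000, 1))
print(best_decision(0.999, 1000, 1))
print(best_decision(0.9999, 1000, 1))
```

With these illustrative losses the threshold is 1000/1001, so a source probability of 0.999 is not yet enough to individualize while 0.9999 is.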
Abstract:
Background: The 'database search problem', that is, the strengthening of a case - in terms of probative value - against an individual who is found as a result of a database search, has been approached during the last two decades with substantial mathematical analyses, accompanied by lively debate and centrally opposing conclusions. This represents a challenging obstacle in teaching but also hinders a balanced and coherent discussion of the topic within the wider scientific and legal community. This paper revisits and tracks the associated mathematical analyses in terms of Bayesian networks. Their derivation and discussion for capturing probabilistic arguments that explain the database search problem are outlined in detail. The resulting Bayesian networks offer a distinct view on the main debated issues, along with further clarity. Methods: As a general framework for representing and analyzing formal arguments in probabilistic reasoning about uncertain target propositions (that is, whether or not a given individual is the source of a crime stain), this paper relies on graphical probability models, in particular, Bayesian networks. This graphical probability modeling approach is used to capture, within a single model, a series of key variables, such as the number of individuals in a database, the size of the population of potential crime stain sources, and the rarity of the corresponding analytical characteristics in a relevant population. Results: This paper demonstrates the feasibility of deriving Bayesian network structures for analyzing, representing, and tracking the database search problem. The output of the proposed models can be shown to agree with existing but exclusively formulaic approaches. Conclusions: The proposed Bayesian networks allow one to capture and analyze the currently most well-supported but reputedly counter-intuitive and difficult solution to the database search problem in a way that goes beyond the traditional, purely formulaic expressions. The method's graphical environment, along with its computational and probabilistic architectures, represents a rich package that offers analysts and discussants additional modes of interaction, concise representation, and coherent communication.
Abstract:
Thesis (Master's)--University of Washington, 2016-06
Abstract:
Homomorphic encryption is a particular type of encryption method that enables computing over encrypted data. This has a wide range of real-world ramifications, such as being able to blindly compute a search result sent to a remote server without revealing its content. In the first part of this thesis, we discuss how database search queries can be made secure using a homomorphic encryption scheme based on the ideas of Gahi et al. Gahi's method is based on the integer-based fully homomorphic encryption scheme proposed by van Dijk et al. We propose a new database search scheme called the Homomorphic Query Processing Scheme, which can be used with the ring-based fully homomorphic encryption scheme proposed by Brakerski. In the second part of this thesis, we discuss the cybersecurity of the smart electric grid. Specifically, we use the Homomorphic Query Processing Scheme to construct a keyword search technique in the smart grid. Our work is based on the Public Key Encryption with Keyword Search (PEKS) method introduced by Boneh et al. and a Multi-Key Homomorphic Encryption scheme proposed by López-Alt et al. A summary of the results of this thesis (specifically the Homomorphic Query Processing Scheme) is published at the 14th Canadian Workshop on Information Theory (CWIT).
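To give a feel for how an equality search can run over encrypted bits, here is a toy sketch in the spirit of the integer-based approach: a symmetric variant with ciphertext c = p*q + 2*r + m, where ciphertext addition acts as XOR and multiplication as AND on the hidden bits. It is not the thesis's Homomorphic Query Processing Scheme and not the ring-based scheme; the parameter sizes, function names, and the bitwise equality circuit are assumptions chosen for illustration, and the code is insecure by design.

```python
import random


def keygen(rng, bits=256):
    """Secret key: a large odd integer p (toy parameter size)."""
    return rng.getrandbits(bits) | (1 << (bits - 1)) | 1


def encrypt(rng, p, bit, noise_bound=16, mult_bits=64):
    """Encrypt one bit as p*q + 2*r + bit with small random noise r."""
    q = rng.getrandbits(mult_bits)
    r = rng.randrange(noise_bound)
    return p * q + 2 * r + bit


def decrypt(p, ciphertext):
    """Recover the bit; correct while the accumulated noise stays below p."""
    return (ciphertext % p) % 2


def to_bits(value, width):
    return [(value >> i) & 1 for i in range(width)]


def encrypt_word(rng, p, value, width=8):
    return [encrypt(rng, p, b) for b in to_bits(value, width)]


def encrypted_equal(query_ct, record_ct):
    """Server-side equality test on ciphertexts only.

    1 + a + b encrypts XNOR(a, b); multiplying the terms encrypts the
    AND of all bit comparisons, i.e. 1 exactly when the words are equal.
    """
    match = 1
    for a, b in zip(query_ct, record_ct):
        match *= 1 + a + b
    return match


rng = random.Random(1)                      # fixed seed for a repeatable demo
p = keygen(rng)
records = [3, 7, 12, 7]                     # hypothetical database contents
encrypted_db = [encrypt_word(rng, p, v) for v in records]
encrypted_query = encrypt_word(rng, p, 7)

# The server computes one encrypted match flag per record without the key.
encrypted_flags = [encrypted_equal(encrypted_query, rec) for rec in encrypted_db]
flags = [decrypt(p, c) for c in encrypted_flags]
print(flags)
```

The server never sees the query or the records in the clear; only the key holder can decrypt the match flags. Each multiplication grows the noise, which is why practical schemes need noise management that this sketch omits.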
Abstract:
Mitochondrial DNA (mtDNA) population data for forensic purposes are still scarce for some populations, which may limit the evaluation of forensic evidence especially when the rarity of a haplotype needs to be determined in a database search. In order to improve the collection of mtDNA lineages from the Iberian and South American subcontinents, we here report the results of a collaborative study involving nine laboratories from the Spanish and Portuguese Speaking Working Group of the International Society for Forensic Genetics (GHEP-ISFG) and EMPOP. The individual laboratories contributed population data that were generated throughout the past 10 years, but in the majority of cases have not been made available to the scientific community. A total of 1019 haplotypes from Iberia (Basque Country, 2 general Spanish populations, 2 North and 1 Central Portugal populations), and Latin America (3 populations from São Paulo) were collected, reviewed and harmonized according to defined EMPOP criteria. The majority of data ambiguities that were found during the reviewing process (41 in total) were transcription errors, confirming that the documentation process is still the most error-prone stage in reporting mtDNA population data, especially when performed manually. This GHEP-EMPOP collaboration has significantly improved the quality of the individual mtDNA datasets and adds mtDNA population data as a valuable resource to the EMPOP database (www.empop.org). © 2010 Elsevier Ireland Ltd. All rights reserved.
Abstract:
Aims: We conducted a meta-analysis to evaluate the accuracy of quantitative stress myocardial contrast echocardiography (MCE) in coronary artery disease (CAD). Methods and results: A database search was performed through January 2008. We included studies evaluating accuracy of quantitative stress MCE for detection of CAD compared with coronary angiography or single-photon emission computed tomography (SPECT) and measuring reserve parameters of A, beta, and A beta. Data from studies were verified and supplemented by the authors of each study. Using random effects meta-analysis, we estimated weighted mean difference (WMD), likelihood ratios (LRs), diagnostic odds ratios (DORs), and summary area under curve (AUC), all with 95% confidence interval (CI). Of 1443 studies, 13 including 627 patients (age range, 38-75 years) and comparing MCE with angiography (n = 10), SPECT (n = 1), or both (n = 2) were eligible. WMD (95% CI) were significantly less in CAD group than no-CAD group: 0.12 (0.06-0.18) (P < 0.001), 1.38 (1.28-1.52) (P < 0.001), and 1.47 (1.18-1.76) (P < 0.001) for A, beta, and A beta reserves, respectively. Pooled LRs for positive test were 1.33 (1.13-1.57), 3.76 (2.43-5.80), and 3.64 (2.87-4.78) and LRs for negative test were 0.68 (0.55-0.83), 0.30 (0.24-0.38), and 0.27 (0.22-0.34) for A, beta, and A beta reserves, respectively. Pooled DORs were 2.09 (1.42-3.07), 15.11 (7.90-28.91), and 14.73 (9.61-22.57) and AUCs were 0.637 (0.594-0.677), 0.851 (0.828-0.872), and 0.859 (0.842-0.750) for A, beta, and A beta reserves, respectively. Conclusion: Evidence supports the use of quantitative MCE as a non-invasive test for detection of CAD. Standardizing MCE quantification analysis and adherence to reporting standards for diagnostic tests could enhance the quality of evidence in this field.
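For readers less familiar with the accuracy measures reported here, the following sketch shows how likelihood ratios and the diagnostic odds ratio follow from a single study's 2x2 table. The counts are hypothetical and the function name is an assumption; the random effects pooling used in the meta-analysis is not reproduced.

```python
def diagnostic_summary(true_pos, false_pos, false_neg, true_neg):
    """Accuracy measures for one diagnostic study from its 2x2 table."""
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    lr_positive = sensitivity / (1.0 - specificity)
    lr_negative = (1.0 - sensitivity) / specificity
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "LR+": lr_positive,
        "LR-": lr_negative,
        # Within a single table, DOR = LR+ / LR- = (TP * TN) / (FP * FN).
        "DOR": lr_positive / lr_negative,
    }


# Hypothetical counts, not taken from any study in the meta-analysis.
summary = diagnostic_summary(true_pos=80, false_pos=30, false_neg=20, true_neg=70)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

The identity DOR = LR+ / LR- holds within one table; pooled estimates are each combined across studies separately, so a pooled DOR need not equal the ratio of the pooled LRs.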
Abstract:
We report a further characterization of the genomic region containing the soybean supernodulation gene NTS-1. We performed a search for new markers linked to NTS-1 by combining DNA amplification fingerprinting (DAF) and bulked segregant analysis (BSA). The search resulted in one cloned polymorphism (B44-456) linked in trans, 8.5 cM from the locus. Southern hybridization showed duplication of the B44-456 sequence in the soybean genome. Additionally, a DNA database search revealed one Arabidopsis thaliana genomic clone from chromosome I possessing 62% homology to the B44-456 marker. A relatively low number of polymorphisms were identified by several PCR marker technologies for this soybean genomic region, providing additional support for its highly conserved and/or duplicated organization.
Abstract:
We present a novel maximum-likelihood-based algorithm for estimating the distribution of alignment scores from the scores of unrelated sequences in a database search. Using a new method for measuring the accuracy of p-values, we show that our maximum-likelihood-based algorithm is more accurate than existing regression-based and lookup table methods. We explore a more sophisticated way of modeling and estimating the score distributions (using a two-component mixture model and expectation maximization), but conclude that this does not improve significantly over simply ignoring scores with small E-values during estimation. Finally, we measure the classification accuracy of p-values estimated in different ways and observe that inaccurate p-values can, somewhat paradoxically, lead to higher classification accuracy. We explain this paradox and argue that statistical accuracy, not classification accuracy, should be the primary criterion in comparisons of similarity search methods that return p-values that adjust for target sequence length.
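As a rough illustration of maximum-likelihood estimation of a score distribution, the sketch below fits an extreme-value (Gumbel) distribution to a set of scores and converts a score to a p-value. The abstract does not name the distribution or give the algorithm, so the Gumbel form, the bisection solver, and all names here are assumptions; censoring of low E-value scores and length adjustment are not modeled.

```python
import math
import random


def fit_gumbel_ml(scores, tol=1e-10):
    """Maximum-likelihood estimates (lam, mu) of a Gumbel distribution.

    Solves the standard likelihood equation
        1/lam - mean(x) + sum(x * w) / sum(w) = 0,  w = exp(-lam * x),
    by bisection; the left side is strictly decreasing in lam.
    Requires scores that are not all identical.
    """
    n = len(scores)
    mean = sum(scores) / n
    low = min(scores)  # shift exponents for numerical stability

    def gap(lam):
        weights = [math.exp(-lam * (x - low)) for x in scores]
        weighted_mean = sum(x * w for x, w in zip(scores, weights)) / sum(weights)
        return 1.0 / lam - mean + weighted_mean

    lo, hi = 1e-6, 1.0
    while gap(hi) > 0:          # expand until the root is bracketed
        hi *= 2.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if gap(mid) > 0:
            lo = mid
        else:
            hi = mid
    lam = (lo + hi) / 2.0
    mean_weight = sum(math.exp(-lam * (x - low)) for x in scores) / n
    mu = low - math.log(mean_weight) / lam
    return lam, mu


def p_value(score, lam, mu):
    """P(S >= score) under the fitted Gumbel distribution."""
    return -math.expm1(-math.exp(-lam * (score - mu)))


def sample_gumbel(rng, lam, mu, n):
    """Draw n synthetic scores by inverting the Gumbel CDF."""
    return [mu - math.log(-math.log(rng.uniform(1e-12, 1 - 1e-12))) / lam
            for _ in range(n)]


rng = random.Random(0)
synthetic_scores = sample_gumbel(rng, lam=0.3, mu=20.0, n=5000)
lam_hat, mu_hat = fit_gumbel_ml(synthetic_scores)
print(lam_hat, mu_hat, p_value(45.0, lam_hat, mu_hat))
```

The synthetic scores stand in for the scores of unrelated database sequences; with real search output, high-scoring true homologs would need to be excluded or down-weighted before fitting.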
Abstract:
This study reports the ability of one hyperthermophile and two thermophilic microorganisms to grow anaerobically by the reduction of chlorate and perchlorate. Physiological, genomic and proteome analyses suggest that the Crenarchaeon Aeropyrum pernix reduces perchlorate with a periplasmic enzyme related to nitrate reductases, but that it lacks a functional chlorite-disproportionating enzyme (Cld) to complete the pathway. A. pernix, previously described as a strictly aerobic microorganism, seems to rely on the chemical reactivity of reduced sulfur compounds with chlorite, a mechanism previously reported for perchlorate-reducing Archaeoglobus fulgidus. The chemical oxidation of thiosulfate (in excessive amounts present in the medium) and the reduction of chlorite result in the release of sulfate and chloride, which are the products of a biotic-abiotic perchlorate reduction pathway in A. pernix. The apparent absence of Cld in two other perchlorate-reducing microorganisms, Carboxydothermus hydrogenoformans and Moorella glycerini strain NMP, and their dependence on sulfide for perchlorate reduction is consistent with observations made on A. fulgidus. Our findings suggest that microbial perchlorate reduction at high temperature differs notably from the physiology of perchlorate- and chlorate-reducing mesophiles and that it is characterized by the lack of a chlorite dismutase and is enabled by a combination of biotic and abiotic reactions.
Abstract:
What genotype should the scientist specify for conducting a database search to try to find the source of a low-template-DNA (lt-DNA) trace? When the scientist answers this question, he or she makes a decision. Here, we approach this decision problem from a normative point of view by defining a decision-theoretic framework for answering this question for one locus. This framework combines the probability distribution describing the uncertainty over the trace's donor's possible genotypes with a loss function describing the scientist's preferences concerning false exclusions and false inclusions that may result from the database search. According to this approach, the scientist should choose the genotype designation that minimizes the expected loss. To illustrate the results produced by this approach, we apply it to two hypothetical cases: (1) the case of observing one peak for allele xi on a single electropherogram, and (2) the case of observing one peak for allele xi on one replicate, and a pair of peaks for alleles xi and xj, i ≠ j, on a second replicate. Given that the probabilities of allele drop-out are defined as functions of the observed peak heights, the threshold values marking the turning points when the scientist should switch from one designation to another are derived in terms of the observed peak heights. For each case, sensitivity analyses show the impact of the model's parameters on these threshold values. The results support the conclusion that the procedure should not focus on a single threshold value for making this decision for all alleles, all loci and in all laboratories.
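The rule "choose the designation that minimizes the expected loss" can be sketched for a single locus as follows. The genotype probabilities, the candidate designations (including a wildcard "F" for a possibly dropped-out allele), and the loss values are hypothetical placeholders, not the paper's drop-out model or loss function.

```python
def choose_designation(genotype_probs, loss_table):
    """Pick the genotype designation with the smallest expected loss.

    genotype_probs: {donor_genotype: probability}
    loss_table:     {designation: {donor_genotype: loss}}
    """
    expected = {
        designation: sum(genotype_probs[g] * losses[g] for g in genotype_probs)
        for designation, losses in loss_table.items()
    }
    best = min(expected, key=expected.get)
    return best, expected


# Hypothetical losses for one observed peak at allele xi:
# searching "xi/xi" risks a false exclusion if the donor is heterozygous,
# while the wildcard search "xi/F" costs extra false inclusions either way.
loss_table = {
    "xi/xi": {"xi/xi": 0.0, "xi/other": 10.0},
    "xi/F":  {"xi/xi": 1.0, "xi/other": 1.0},
}

# A weak peak (drop-out plausible) versus a strong peak (drop-out unlikely).
weak_peak = {"xi/xi": 0.70, "xi/other": 0.30}
strong_peak = {"xi/xi": 0.95, "xi/other": 0.05}

print(choose_designation(weak_peak, loss_table))
print(choose_designation(strong_peak, loss_table))
```

In this toy setting the preferred designation switches from the wildcard to the homozygote once the heterozygote probability falls below 0.1, which mirrors the idea that the switching point depends on both the drop-out model and the losses rather than on one universal threshold.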