957 resultados para false negative rate
Resumo:
With hundreds of single nucleotide polymorphisms (SNPs) in a candidate gene and millions of SNPs across the genome, selecting an informative subset of SNPs to maximize the ability to detect genotype-phenotype association is of great interest and importance. In addition, with a large number of SNPs, analytic methods are needed that allow investigators to control the false positive rate resulting from large numbers of SNP genotype-phenotype analyses. This dissertation uses simulated data to explore methods for selecting SNPs for genotype-phenotype association studies. I examined the pattern of linkage disequilibrium (LD) across a candidate gene region and used this pattern to aid in localizing a disease-influencing mutation. The results indicate that the r2 measure of linkage disequilibrium is preferred over the common D′ measure for use in genotype-phenotype association studies. Using step-wise linear regression, the best predictor of the quantitative trait was not usually the single functional mutation. Rather it was a SNP that was in high linkage disequilibrium with the functional mutation. Next, I compared three strategies for selecting SNPs for application to phenotype association studies: based on measures of linkage disequilibrium, based on a measure of haplotype diversity, and random selection. The results demonstrate that SNPs selected based on maximum haplotype diversity are more informative and yield higher power than randomly selected SNPs or SNPs selected based on low pair-wise LD. The data also indicate that for genes with small contribution to the phenotype, it is more prudent for investigators to increase their sample size than to continuously increase the number of SNPs in order to improve statistical power. When typing large numbers of SNPs, researchers are faced with the challenge of utilizing an appropriate statistical method that controls the type I error rate while maintaining adequate power. We show that an empirical genotype based multi-locus global test that uses permutation testing to investigate the null distribution of the maximum test statistic maintains a desired overall type I error rate while not overly sacrificing statistical power. The results also show that when the penetrance model is simple the multi-locus global test does as well or better than the haplotype analysis. However, for more complex models, haplotype analyses offer advantages. The results of this dissertation will be of utility to human geneticists designing large-scale multi-locus genotype-phenotype association studies. ^
Resumo:
The difficulty of detecting differential gene expression in microarray data has existed for many years. Several correction procedures try to avoid the family-wise error rate in multiple comparison process, including the Bonferroni and Sidak single-step p-value adjustments, Holm's step-down correction method, and Benjamini and Hochberg's false discovery rate (FDR) correction procedure. Each multiple comparison technique has its advantages and weaknesses. We studied each multiple comparison method through numerical studies (simulations) and applied the methods to the real exploratory DNA microarray data, which detect of molecular signatures in papillary thyroid cancer (PTC) patients. According to our results of simulation studies, Benjamini and Hochberg step-up FDR controlling procedure is the best process among these multiple comparison methods and we discovered 1277 potential biomarkers among 54675 probe sets after applying the Benjamini and Hochberg's method to PTC microarray data.^
Resumo:
SNP genotyping arrays have been developed to characterize single-nucleotide polymorphisms (SNPs) and DNA copy number variations (CNVs). The quality of the inferences about copy number can be affected by many factors including batch effects, DNA sample preparation, signal processing, and analytical approach. Nonparametric and model-based statistical algorithms have been developed to detect CNVs from SNP genotyping data. However, these algorithms lack specificity to detect small CNVs due to the high false positive rate when calling CNVs based on the intensity values. Association tests based on detected CNVs therefore lack power even if the CNVs affecting disease risk are common. In this research, by combining an existing Hidden Markov Model (HMM) and the logistic regression model, a new genome-wide logistic regression algorithm was developed to detect CNV associations with diseases. We showed that the new algorithm is more sensitive and can be more powerful in detecting CNV associations with diseases than an existing popular algorithm, especially when the CNV association signal is weak and a limited number of SNPs are located in the CNV.^
Resumo:
The purpose of this study was to evaluate the adequacy of computerized vital records in Texas for conducting etiologic studies on neural tube defects (NTDs), using the revised and expanded National Centers for Health Statistics vital record forms introduced in Texas in 1989.^ Cases of NTDs (anencephaly and spina bifida) among Harris County (Houston) residents were identified from the computerized birth and death records for 1989-1991. The validity of the system was then measured against cases ascertained independently through medical records and death certificates. The computerized system performed poorly in its identification of NTDs, particularly for anencephaly, where the false positive rate was 80% with little or no improvement over the 3-year period. For both NTDs the sensitivity and predictive value positive of the tapes were somewhat higher for Hispanic than non-Hispanic mothers.^ Case control studies were conducted utilizing the tape set and the independently verified data set, using controls selected from the live birth tapes. Findings varied widely between the data sets. For example, the anencephaly odds ratio for Hispanic mothers (vs. non-Hispanic) was 1.91 (CI = 1.38-2.65) for the tape file, but 3.18 (CI = 1.81-5.58) for verified records. The odds ratio for diabetes was elevated for the tape set (OR = 3.33, CI = 1.67-6.66) but not for verified cases (OR = 1.09, CI = 0.24-4.96), among whom few mothers were diabetic. It was concluded that computerized tapes should not be solely relied on for NTD studies.^ Using the verified cases, Hispanic mother was associated with spina bifida, and Hispanic mother, teen mother, and previous pregnancy terminations were associated with anencephaly. Mother's birthplace, education, parity, and diabetes were not significant for either NTD.^ Stratified analyses revealed several notable examples of statistical interaction. For anencephaly, strong interaction was observed between Hispanic origin and trimester of first prenatal care.^ The prevalence was 3.8 per 10,000 live births for anencephaly and 2.0 for spina bifida (5.8 per 10,000 births for the combined categories). ^
Resumo:
Children who experience early pubertal development have an increased risk of developing cancer (breast, ovarian, and testicular), osteoporosis, insulin resistance, and obesity as adults. Early pubertal development has been associated with depression, aggressiveness, and increased sexual prowess. Possible explanations for the decline in age of pubertal onset include genetics, exposure to environmental toxins, better nutrition, and a reduction in childhood infections. In this study we (1) evaluated the association between 415 single nucleotide polymorphisms (SNPs) from hormonal pathways and early puberty, defined as menarche prior to age 12 in females and Tanner Stage 2 development prior to age 11 in males, and (2) measured endocrine hormone trajectories (estradiol, testosterone, and DHEAS) in relation to age, race, and Tanner Stage in a cohort of children from Project HeartBeat! At the end of the 4-year study, 193 females had onset of menarche and 121 males had pubertal staging at age 11. African American females had a younger mean age at menarche than Non-Hispanic White females. African American females and males had a lower mean age at each pubertal stage (1-5) than Non-Hispanic White females and males. African American females had higher mean BMI measures at each pubertal stage than Non-Hispanic White females. Of the 415 SNPs evaluated in females, 22 SNPs were associated with early menarche, when adjusted for race ( p<0.05), but none remained significant after adjusting for multiple testing by False Discovery Rate (p<0.00017). In males, 17 SNPs were associated with early pubertal development when adjusted for race (p<0.05), but none remained significant when adjusted for multiple testing (p<0.00017). ^ There were 4955 hormone measurements taken during the 4-year study period from 632 African American and Non-Hispanic White males and females. On average, African American females started and ended the pubertal process at a younger age than Non-Hispanic White females. The mean age of Tanner Stage 2 breast development in African American and Non-Hispanic White females was 9.7 (S.D.=0.8) and 10.2 (S.D.=1.1) years, respectively. There was a significant difference by race in mean age for each pubertal stage, except Tanner Stage 1 for pubic hair development. Both Estradiol and DHEAS levels in females varied significantly with age, but not by race. Estradiol and DHEAS levels increased from Tanner Stage 1 to Tanner Stage 5.^ African American males had a lower mean age at each Tanner Stage of development than Non-Hispanic White males. The mean age of Tanner Stage 2 genital development in African American and Non-Hispanic White males was 10.5 (S.D.=1.1) and 10.8 (S.D.=1.1) years, respectively, but this difference was not significant (p=0.11). Testosterone levels varied significantly with age and race. Non-Hispanic White males had higher levels of testosterone than African American males from Tanner Stage 1-4. Testosterone levels increased for both races from Tanner Stage 1 to Tanner Stage 5. Testosterone levels had the steepest increase from ages 11-15 for both races. DHEAS levels in males varied significantly with age, but not by race. DHEAS levels had the steepest increase from ages 14-17. ^ In conclusion, African American males and females experience pubertal onset at a younger age than Non-Hispanic White males and females, but in this study, we could not find a specific gene that explained the observed variation in age of pubertal onset. Future studies with larger study populations may provide a better understanding of the contribution of genes in early pubertal onset.^
Resumo:
My dissertation focuses on two aspects of RNA sequencing technology. The first is the methodology for modeling the overdispersion inherent in RNA-seq data for differential expression analysis. This aspect is addressed in three sections. The second aspect is the application of RNA-seq data to identify the CpG island methylator phenotype (CIMP) by integrating datasets of mRNA expression level and DNA methylation status. Section 1: The cost of DNA sequencing has reduced dramatically in the past decade. Consequently, genomic research increasingly depends on sequencing technology. However it remains elusive how the sequencing capacity influences the accuracy of mRNA expression measurement. We observe that accuracy improves along with the increasing sequencing depth. To model the overdispersion, we use the beta-binomial distribution with a new parameter indicating the dependency between overdispersion and sequencing depth. Our modified beta-binomial model performs better than the binomial or the pure beta-binomial model with a lower false discovery rate. Section 2: Although a number of methods have been proposed in order to accurately analyze differential RNA expression on the gene level, modeling on the base pair level is required. Here, we find that the overdispersion rate decreases as the sequencing depth increases on the base pair level. Also, we propose four models and compare them with each other. As expected, our beta binomial model with a dynamic overdispersion rate is shown to be superior. Section 3: We investigate biases in RNA-seq by exploring the measurement of the external control, spike-in RNA. This study is based on two datasets with spike-in controls obtained from a recent study. We observe an undiscovered bias in the measurement of the spike-in transcripts that arises from the influence of the sample transcripts in RNA-seq. Also, we find that this influence is related to the local sequence of the random hexamer that is used in priming. We suggest a model of the inequality between samples and to correct this type of bias. Section 4: The expression of a gene can be turned off when its promoter is highly methylated. Several studies have reported that a clear threshold effect exists in gene silencing that is mediated by DNA methylation. It is reasonable to assume the thresholds are specific for each gene. It is also intriguing to investigate genes that are largely controlled by DNA methylation. These genes are called “L-shaped” genes. We develop a method to determine the DNA methylation threshold and identify a new CIMP of BRCA. In conclusion, we provide a detailed understanding of the relationship between the overdispersion rate and sequencing depth. And we reveal a new bias in RNA-seq and provide a detailed understanding of the relationship between this new bias and the local sequence. Also we develop a powerful method to dichotomize methylation status and consequently we identify a new CIMP of breast cancer with a distinct classification of molecular characteristics and clinical features.
Resumo:
The increasing demand of security oriented to mobile applications has raised the attention to biometrics, as a proper and suitable solution for providing secure environment to mobile devices. With this aim, this document presents a biometric system based on hand geometry oriented to mobile devices, involving a high degree of freedom in terms of illumination, hand rotation and distance to camera. The user takes a picture of their own hand in the free space, without requiring any flat surface to locate the hand, and without removals of rings, bracelets or watches. The proposed biometric system relies on an accurate segmentation procedure, able to isolate hands from any background; a feature extraction, invariant to orientation, illumination, distance to camera and background; and a user classification, based on k-Nearest Neighbor approach, able to provide an accurate results on individual identification. The proposed method has been evaluated with two own databases collected with a HTC mobile. First database contains 120 individuals, with 20 acquisitions of both hands. Second database is a synthetic database, containing 408000 images of hand samples in different backgrounds: tiles, grass, water, sand, soil and the like. The system is able to identify individuals properly with False Reject Rate of 5.78% and False Acceptance Rate of 0.089%, using 60 features (15 features per finger)
Resumo:
The area of Human-Machine Interface is growing fast due to its high importance in all technological systems. The basic idea behind designing human-machine interfaces is to enrich the communication with the technology in a natural and easy way. Gesture interfaces are a good example of transparent interfaces. Such interfaces must identify properly the action the user wants to perform, so the proper gesture recognition is of the highest importance. However, most of the systems based on gesture recognition use complex methods requiring high-resource devices. In this work, we propose to model gestures capturing their temporal properties, which significantly reduce storage requirements, and use clustering techniques, namely self-organizing maps and unsupervised genetic algorithm, for their classification. We further propose to train a certain number of algorithms with different parameters and combine their decision using majority voting in order to decrease the false positive rate. The main advantage of the approach is its simplicity, which enables the implementation using devices with limited resources, and therefore low cost. The testing results demonstrate its high potential.
Resumo:
El extraordinario auge de las nuevas tecnologías de la información, el desarrollo de la Internet de las Cosas, el comercio electrónico, las redes sociales, la telefonía móvil y la computación y almacenamiento en la nube, han proporcionado grandes beneficios en todos los ámbitos de la sociedad. Junto a éstos, se presentan nuevos retos para la protección y privacidad de la información y su contenido, como la suplantación de personalidad y la pérdida de la confidencialidad e integridad de los documentos o las comunicaciones electrónicas. Este hecho puede verse agravado por la falta de una frontera clara que delimite el mundo personal del mundo laboral en cuanto al acceso de la información. En todos estos campos de la actividad personal y laboral, la Criptografía ha jugado un papel fundamental aportando las herramientas necesarias para garantizar la confidencialidad, integridad y disponibilidad tanto de la privacidad de los datos personales como de la información. Por otro lado, la Biometría ha propuesto y ofrecido diferentes técnicas con el fin de garantizar la autentificación de individuos a través del uso de determinadas características personales como las huellas dáctilares, el iris, la geometría de la mano, la voz, la forma de caminar, etc. Cada una de estas dos ciencias, Criptografía y Biometría, aportan soluciones a campos específicos de la protección de datos y autentificación de usuarios, que se verían enormemente potenciados si determinadas características de ambas ciencias se unieran con vistas a objetivos comunes. Por ello es imperativo intensificar la investigación en estos ámbitos combinando los algoritmos y primitivas matemáticas de la Criptografía con la Biometría para dar respuesta a la demanda creciente de nuevas soluciones más técnicas, seguras y fáciles de usar que potencien de modo simultáneo la protección de datos y la identificacíón de usuarios. En esta combinación el concepto de biometría cancelable ha supuesto una piedra angular en el proceso de autentificación e identificación de usuarios al proporcionar propiedades de revocación y cancelación a los ragos biométricos. La contribución de esta tesis se basa en el principal aspecto de la Biometría, es decir, la autentificación segura y eficiente de usuarios a través de sus rasgos biométricos, utilizando tres aproximaciones distintas: 1. Diseño de un esquema criptobiométrico borroso que implemente los principios de la biometría cancelable para identificar usuarios lidiando con los problemas acaecidos de la variabilidad intra e inter-usuarios. 2. Diseño de una nueva función hash que preserva la similitud (SPHF por sus siglas en inglés). Actualmente estas funciones se usan en el campo del análisis forense digital con el objetivo de buscar similitudes en el contenido de archivos distintos pero similares de modo que se pueda precisar hasta qué punto estos archivos pudieran ser considerados iguales. La función definida en este trabajo de investigación, además de mejorar los resultados de las principales funciones desarrolladas hasta el momento, intenta extender su uso a la comparación entre patrones de iris. 3. Desarrollando un nuevo mecanismo de comparación de patrones de iris que considera tales patrones como si fueran señales para compararlos posteriormente utilizando la transformada de Walsh-Hadarmard. Los resultados obtenidos son excelentes teniendo en cuenta los requerimientos de seguridad y privacidad mencionados anteriormente. Cada uno de los tres esquemas diseñados han sido implementados para poder realizar experimentos y probar su eficacia operativa en escenarios que simulan situaciones reales: El esquema criptobiométrico borroso y la función SPHF han sido implementados en lenguaje Java mientras que el proceso basado en la transformada de Walsh-Hadamard en Matlab. En los experimentos se ha utilizado una base de datos de imágenes de iris (CASIA) para simular una población de usuarios del sistema. En el caso particular de la función de SPHF, además se han realizado experimentos para comprobar su utilidad en el campo de análisis forense comparando archivos e imágenes con contenido similar y distinto. En este sentido, para cada uno de los esquemas se han calculado los ratios de falso negativo y falso positivo. ABSTRACT The extraordinary increase of new information technologies, the development of Internet of Things, the electronic commerce, the social networks, mobile or smart telephony and cloud computing and storage, have provided great benefits in all areas of society. Besides this fact, there are new challenges for the protection and privacy of information and its content, such as the loss of confidentiality and integrity of electronic documents and communications. This is exarcebated by the lack of a clear boundary between the personal world and the business world as their differences are becoming narrower. In both worlds, i.e the personal and the business one, Cryptography has played a key role by providing the necessary tools to ensure the confidentiality, integrity and availability both of the privacy of the personal data and information. On the other hand, Biometrics has offered and proposed different techniques with the aim to assure the authentication of individuals through their biometric traits, such as fingerprints, iris, hand geometry, voice, gait, etc. Each of these sciences, Cryptography and Biometrics, provides tools to specific problems of the data protection and user authentication, which would be widely strengthen if determined characteristics of both sciences would be combined in order to achieve common objectives. Therefore, it is imperative to intensify the research in this area by combining the basics mathematical algorithms and primitives of Cryptography with Biometrics to meet the growing demand for more secure and usability techniques which would improve the data protection and the user authentication. In this combination, the use of cancelable biometrics makes a cornerstone in the user authentication and identification process since it provides revocable or cancelation properties to the biometric traits. The contributions in this thesis involve the main aspect of Biometrics, i.e. the secure and efficient authentication of users through their biometric templates, considered from three different approaches. The first one is designing a fuzzy crypto-biometric scheme using the cancelable biometric principles to take advantage of the fuzziness of the biometric templates at the same time that it deals with the intra- and inter-user variability among users without compromising the biometric templates extracted from the legitimate users. The second one is designing a new Similarity Preserving Hash Function (SPHF), currently widely used in the Digital Forensics field to find similarities among different files to calculate their similarity level. The function designed in this research work, besides the fact of improving the results of the two main functions of this field currently in place, it tries to expand its use to the iris template comparison. Finally, the last approach of this thesis is developing a new mechanism of handling the iris templates, considering them as signals, to use the Walsh-Hadamard transform (complemented with three other algorithms) to compare them. The results obtained are excellent taking into account the security and privacy requirements mentioned previously. Every one of the three schemes designed have been implemented to test their operational efficacy in situations that simulate real scenarios: The fuzzy crypto-biometric scheme and the SPHF have been implemented in Java language, while the process based on the Walsh-Hadamard transform in Matlab. The experiments have been performed using a database of iris templates (CASIA-IrisV2) to simulate a user population. The case of the new SPHF designed is special since previous to be applied i to the Biometrics field, it has been also tested to determine its applicability in the Digital Forensic field comparing similar and dissimilar files and images. The ratios of efficiency and effectiveness regarding user authentication, i.e. False Non Match and False Match Rate, for the schemes designed have been calculated with different parameters and cases to analyse their behaviour.
Resumo:
Entendemos por inteligencia colectiva una forma de inteligencia que surge de la colaboración y la participación de varios individuos o, siendo más estrictos, varias entidades. En base a esta sencilla definición podemos observar que este concepto es campo de estudio de las más diversas disciplinas como pueden ser la sociología, las tecnologías de la información o la biología, atendiendo cada una de ellas a un tipo de entidades diferentes: seres humanos, elementos de computación o animales. Como elemento común podríamos indicar que la inteligencia colectiva ha tenido como objetivo el ser capaz de fomentar una inteligencia de grupo que supere a la inteligencia individual de las entidades que lo forman a través de mecanismos de coordinación, cooperación, competencia, integración, diferenciación, etc. Sin embargo, aunque históricamente la inteligencia colectiva se ha podido desarrollar de forma paralela e independiente en las distintas disciplinas que la tratan, en la actualidad, los avances en las tecnologías de la información han provocado que esto ya no sea suficiente. Hoy en día seres humanos y máquinas a través de todo tipo de redes de comunicación e interfaces, conviven en un entorno en el que la inteligencia colectiva ha cobrado una nueva dimensión: ya no sólo puede intentar obtener un comportamiento superior al de sus entidades constituyentes sino que ahora, además, estas inteligencias individuales son completamente diferentes unas de otras y aparece por lo tanto el doble reto de ser capaces de gestionar esta gran heterogeneidad y al mismo tiempo ser capaces de obtener comportamientos aún más inteligentes gracias a las sinergias que los distintos tipos de inteligencias pueden generar. Dentro de las áreas de trabajo de la inteligencia colectiva existen varios campos abiertos en los que siempre se intenta obtener unas prestaciones superiores a las de los individuos. Por ejemplo: consciencia colectiva, memoria colectiva o sabiduría colectiva. Entre todos estos campos nosotros nos centraremos en uno que tiene presencia en la práctica totalidad de posibles comportamientos inteligentes: la toma de decisiones. El campo de estudio de la toma de decisiones es realmente amplio y dentro del mismo la evolución ha sido completamente paralela a la que citábamos anteriormente en referencia a la inteligencia colectiva. En primer lugar se centró en el individuo como entidad decisoria para posteriormente desarrollarse desde un punto de vista social, institucional, etc. La primera fase dentro del estudio de la toma de decisiones se basó en la utilización de paradigmas muy sencillos: análisis de ventajas e inconvenientes, priorización basada en la maximización de algún parámetro del resultado, capacidad para satisfacer los requisitos de forma mínima por parte de las alternativas, consultas a expertos o entidades autorizadas o incluso el azar. Sin embargo, al igual que el paso del estudio del individuo al grupo supone una nueva dimensión dentro la inteligencia colectiva la toma de decisiones colectiva supone un nuevo reto en todas las disciplinas relacionadas. Además, dentro de la decisión colectiva aparecen dos nuevos frentes: los sistemas de decisión centralizados y descentralizados. En el presente proyecto de tesis nos centraremos en este segundo, que es el que supone una mayor atractivo tanto por las posibilidades de generar nuevo conocimiento y trabajar con problemas abiertos actualmente así como en lo que respecta a la aplicabilidad de los resultados que puedan obtenerse. Ya por último, dentro del campo de los sistemas de decisión descentralizados existen varios mecanismos fundamentales que dan lugar a distintas aproximaciones a la problemática propia de este campo. Por ejemplo el liderazgo, la imitación, la prescripción o el miedo. Nosotros nos centraremos en uno de los más multidisciplinares y con mayor capacidad de aplicación en todo tipo de disciplinas y que, históricamente, ha demostrado que puede dar lugar a prestaciones muy superiores a otros tipos de mecanismos de decisión descentralizados: la confianza y la reputación. Resumidamente podríamos indicar que confianza es la creencia por parte de una entidad que otra va a realizar una determinada actividad de una forma concreta. En principio es algo subjetivo, ya que la confianza de dos entidades diferentes sobre una tercera no tiene porqué ser la misma. Por otro lado, la reputación es la idea colectiva (o evaluación social) que distintas entidades de un sistema tiene sobre otra entidad del mismo en lo que respecta a un determinado criterio. Es por tanto una información de carácter colectivo pero única dentro de un sistema, no asociada a cada una de las entidades del sistema sino por igual a todas ellas. En estas dos sencillas definiciones se basan la inmensa mayoría de sistemas colectivos. De hecho muchas disertaciones indican que ningún tipo de organización podría ser viable de no ser por la existencia y la utilización de los conceptos de confianza y reputación. A partir de ahora, a todo sistema que utilice de una u otra forma estos conceptos lo denominaremos como sistema de confianza y reputación (o TRS, Trust and Reputation System). Sin embargo, aunque los TRS son uno de los aspectos de nuestras vidas más cotidianos y con un mayor campo de aplicación, el conocimiento que existe actualmente sobre ellos no podría ser más disperso. Existen un gran número de trabajos científicos en todo tipo de áreas de conocimiento: filosofía, psicología, sociología, economía, política, tecnologías de la información, etc. Pero el principal problema es que no existe una visión completa de la confianza y reputación en su sentido más amplio. Cada disciplina focaliza sus estudios en unos aspectos u otros dentro de los TRS, pero ninguna de ellas trata de explotar el conocimiento generado en el resto para mejorar sus prestaciones en su campo de aplicación concreto. Aspectos muy detallados en algunas áreas de conocimiento son completamente obviados por otras, o incluso aspectos tratados por distintas disciplinas, al ser estudiados desde distintos puntos de vista arrojan resultados complementarios que, sin embargo, no son aprovechados fuera de dichas áreas de conocimiento. Esto nos lleva a una dispersión de conocimiento muy elevada y a una falta de reutilización de metodologías, políticas de actuación y técnicas de una disciplina a otra. Debido su vital importancia, esta alta dispersión de conocimiento se trata de uno de los principales problemas que se pretenden resolver con el presente trabajo de tesis. Por otro lado, cuando se trabaja con TRS, todos los aspectos relacionados con la seguridad están muy presentes ya que muy este es un tema vital dentro del campo de la toma de decisiones. Además también es habitual que los TRS se utilicen para desempeñar responsabilidades que aportan algún tipo de funcionalidad relacionada con el mundo de la seguridad. Por último no podemos olvidar que el acto de confiar está indefectiblemente unido al de delegar una determinada responsabilidad, y que al tratar estos conceptos siempre aparece la idea de riesgo, riesgo de que las expectativas generadas por el acto de la delegación no se cumplan o se cumplan de forma diferente. Podemos ver por lo tanto que cualquier sistema que utiliza la confianza para mejorar o posibilitar su funcionamiento, por su propia naturaleza, es especialmente vulnerable si las premisas en las que se basa son atacadas. En este sentido podemos comprobar (tal y como analizaremos en más detalle a lo largo del presente documento) que las aproximaciones que realizan las distintas disciplinas que tratan la violación de los sistemas de confianza es de lo más variado. únicamente dentro del área de las tecnologías de la información se ha intentado utilizar alguno de los enfoques de otras disciplinas de cara a afrontar problemas relacionados con la seguridad de TRS. Sin embargo se trata de una aproximación incompleta y, normalmente, realizada para cumplir requisitos de aplicaciones concretas y no con la idea de afianzar una base de conocimiento más general y reutilizable en otros entornos. Con todo esto en cuenta, podemos resumir contribuciones del presente trabajo de tesis en las siguientes. • La realización de un completo análisis del estado del arte dentro del mundo de la confianza y la reputación que nos permite comparar las ventajas e inconvenientes de las diferentes aproximación que se realizan a estos conceptos en distintas áreas de conocimiento. • La definición de una arquitectura de referencia para TRS que contempla todas las entidades y procesos que intervienen en este tipo de sistemas. • La definición de un marco de referencia para analizar la seguridad de TRS. Esto implica tanto identificar los principales activos de un TRS en lo que respecta a la seguridad, así como el crear una tipología de posibles ataques y contramedidas en base a dichos activos. • La propuesta de una metodología para el análisis, el diseño, el aseguramiento y el despliegue de un TRS en entornos reales. Adicionalmente se exponen los principales tipos de aplicaciones que pueden obtenerse de los TRS y los medios para maximizar sus prestaciones en cada una de ellas. • La generación de un software que permite simular cualquier tipo de TRS en base a la arquitectura propuesta previamente. Esto permite evaluar las prestaciones de un TRS bajo una determinada configuración en un entorno controlado previamente a su despliegue en un entorno real. Igualmente es de gran utilidad para evaluar la resistencia a distintos tipos de ataques o mal-funcionamientos del sistema. Además de las contribuciones realizadas directamente en el campo de los TRS, hemos realizado aportaciones originales a distintas áreas de conocimiento gracias a la aplicación de las metodologías de análisis y diseño citadas con anterioridad. • Detección de anomalías térmicas en Data Centers. Hemos implementado con éxito un sistema de deteción de anomalías térmicas basado en un TRS. Comparamos la detección de prestaciones de algoritmos de tipo Self-Organized Maps (SOM) y Growing Neural Gas (GNG). Mostramos como SOM ofrece mejores resultados para anomalías en los sistemas de refrigeración de la sala mientras que GNG es una opción más adecuada debido a sus tasas de detección y aislamiento para casos de anomalías provocadas por una carga de trabajo excesiva. • Mejora de las prestaciones de recolección de un sistema basado en swarm computing y odometría social. Gracias a la implementación de un TRS conseguimos mejorar las capacidades de coordinación de una red de robots autónomos distribuidos. La principal contribución reside en el análisis y la validación de las mejoras increméntales que pueden conseguirse con la utilización apropiada de la información existente en el sistema y que puede ser relevante desde el punto de vista de un TRS, y con la implementación de algoritmos de cálculo de confianza basados en dicha información. • Mejora de la seguridad de Wireless Mesh Networks contra ataques contra la integridad, la confidencialidad o la disponibilidad de los datos y / o comunicaciones soportadas por dichas redes. • Mejora de la seguridad de Wireless Sensor Networks contra ataques avanzamos, como insider attacks, ataques desconocidos, etc. Gracias a las metodologías presentadas implementamos contramedidas contra este tipo de ataques en entornos complejos. En base a los experimentos realizados, hemos demostrado que nuestra aproximación es capaz de detectar y confinar varios tipos de ataques que afectan a los protocoles esenciales de la red. La propuesta ofrece unas velocidades de detección muy altas así como demuestra que la inclusión de estos mecanismos de actuación temprana incrementa significativamente el esfuerzo que un atacante tiene que introducir para comprometer la red. Finalmente podríamos concluir que el presente trabajo de tesis supone la generación de un conocimiento útil y aplicable a entornos reales, que nos permite la maximización de las prestaciones resultantes de la utilización de TRS en cualquier tipo de campo de aplicación. De esta forma cubrimos la principal carencia existente actualmente en este campo, que es la falta de una base de conocimiento común y agregada y la inexistencia de una metodología para el desarrollo de TRS que nos permita analizar, diseñar, asegurar y desplegar TRS de una forma sistemática y no artesanal y ad-hoc como se hace en la actualidad. ABSTRACT By collective intelligence we understand a form of intelligence that emerges from the collaboration and competition of many individuals, or strictly speaking, many entities. Based on this simple definition, we can see how this concept is the field of study of a wide range of disciplines, such as sociology, information science or biology, each of them focused in different kinds of entities: human beings, computational resources, or animals. As a common factor, we can point that collective intelligence has always had the goal of being able of promoting a group intelligence that overcomes the individual intelligence of the basic entities that constitute it. This can be accomplished through different mechanisms such as coordination, cooperation, competence, integration, differentiation, etc. Collective intelligence has historically been developed in a parallel and independent way among the different disciplines that deal with it. However, this is not enough anymore due to the advances in information technologies. Nowadays, human beings and machines coexist in environments where collective intelligence has taken a new dimension: we yet have to achieve a better collective behavior than the individual one, but now we also have to deal with completely different kinds of individual intelligences. Therefore, we have a double goal: being able to deal with this heterogeneity and being able to get even more intelligent behaviors thanks to the synergies that the different kinds of intelligence can generate. Within the areas of collective intelligence there are several open topics where they always try to get better performances from groups than from the individuals. For example: collective consciousness, collective memory, or collective wisdom. Among all these topics we will focus on collective decision making, that has influence in most of the collective intelligent behaviors. The field of study of decision making is really wide, and its evolution has been completely parallel to the aforementioned collective intelligence. Firstly, it was focused on the individual as the main decision-making entity, but later it became involved in studying social and institutional groups as basic decision-making entities. The first studies within the decision-making discipline were based on simple paradigms, such as pros and cons analysis, criteria prioritization, fulfillment, following orders, or even chance. However, in the same way that studying the community instead of the individual meant a paradigm shift within collective intelligence, collective decision-making means a new challenge for all the related disciplines. Besides, two new main topics come up when dealing with collective decision-making: centralized and decentralized decision-making systems. In this thesis project we focus in the second one, because it is the most interesting based on the opportunities to generate new knowledge and deal with open issues in this area, as well as these results can be put into practice in a wider set of real-life environments. Finally, within the decentralized collective decision-making systems discipline, there are several basic mechanisms that lead to different approaches to the specific problems of this field, for example: leadership, imitation, prescription, or fear. We will focus on trust and reputation. They are one of the most multidisciplinary concepts and with more potential for applying them in every kind of environments. Besides, they have historically shown that they can generate better performance than other decentralized decision-making mechanisms. Shortly, we say trust is the belief of one entity that the outcome of other entities’ actions is going to be in a specific way. It is a subjective concept because the trust of two different entities in another one does not have to be the same. Reputation is the collective idea (or social evaluation) that a group of entities within a system have about another entity based on a specific criterion. Thus, it is a collective concept in its origin. It is important to say that the behavior of most of the collective systems are based on these two simple definitions. In fact, a lot of articles and essays describe how any organization would not be viable if the ideas of trust and reputation did not exist. From now on, we call Trust an Reputation System (TRS) to any kind of system that uses these concepts. Even though TRSs are one of the most common everyday aspects in our lives, the existing knowledge about them could not be more dispersed. There are thousands of scientific works in every field of study related to trust and reputation: philosophy, psychology, sociology, economics, politics, information sciences, etc. But the main issue is that a comprehensive vision of trust and reputation for all these disciplines does not exist. Every discipline focuses its studies on a specific set of topics but none of them tries to take advantage of the knowledge generated in the other disciplines to improve its behavior or performance. Detailed topics in some fields are completely obviated in others, and even though the study of some topics within several disciplines produces complementary results, these results are not used outside the discipline where they were generated. This leads us to a very high knowledge dispersion and to a lack in the reuse of methodologies, policies and techniques among disciplines. Due to its great importance, this high dispersion of trust and reputation knowledge is one of the main problems this thesis contributes to solve. When we work with TRSs, all the aspects related to security are a constant since it is a vital aspect within the decision-making systems. Besides, TRS are often used to perform some responsibilities related to security. Finally, we cannot forget that the act of trusting is invariably attached to the act of delegating a specific responsibility and, when we deal with these concepts, the idea of risk is always present. This refers to the risk of generated expectations not being accomplished or being accomplished in a different way we anticipated. Thus, we can see that any system using trust to improve or enable its behavior, because of its own nature, is especially vulnerable if the premises it is based on are attacked. Related to this topic, we can see that the approaches of the different disciplines that study attacks of trust and reputation are very diverse. Some attempts of using approaches of other disciplines have been made within the information science area of knowledge, but these approaches are usually incomplete, not systematic and oriented to achieve specific requirements of specific applications. They never try to consolidate a common base of knowledge that could be reusable in other context. Based on all these ideas, this work makes the following direct contributions to the field of TRS: • The compilation of the most relevant existing knowledge related to trust and reputation management systems focusing on their advantages and disadvantages. • We define a generic architecture for TRS, identifying the main entities and processes involved. • We define a generic security framework for TRS. We identify the main security assets and propose a complete taxonomy of attacks for TRS. • We propose and validate a methodology to analyze, design, secure and deploy TRS in real-life environments. Additionally we identify the principal kind of applications we can implement with TRS and how TRS can provide a specific functionality. • We develop a software component to validate and optimize the behavior of a TRS in order to achieve a specific functionality or performance. In addition to the contributions made directly to the field of the TRS, we have made original contributions to different areas of knowledge thanks to the application of the analysis, design and security methodologies previously presented: • Detection of thermal anomalies in Data Centers. Thanks to the application of the TRS analysis and design methodologies, we successfully implemented a thermal anomaly detection system based on a TRS.We compare the detection performance of Self-Organized- Maps and Growing Neural Gas algorithms. We show how SOM provides better results for Computer Room Air Conditioning anomaly detection, yielding detection rates of 100%, in training data with malfunctioning sensors. We also show that GNG yields better detection and isolation rates for workload anomaly detection, reducing the false positive rate when compared to SOM. • Improving the performance of a harvesting system based on swarm computing and social odometry. Through the implementation of a TRS, we achieved to improve the ability of coordinating a distributed network of autonomous robots. The main contribution lies in the analysis and validation of the incremental improvements that can be achieved with proper use information that exist in the system and that are relevant for the TRS, and the implementation of the appropriated trust algorithms based on such information. • Improving Wireless Mesh Networks security against attacks against the integrity, confidentiality or availability of data and communications supported by these networks. Thanks to the implementation of a TRS we improved the detection time rate against these kind of attacks and we limited their potential impact over the system. • We improved the security of Wireless Sensor Networks against advanced attacks, such as insider attacks, unknown attacks, etc. Thanks to the TRS analysis and design methodologies previously described, we implemented countermeasures against such attacks in a complex environment. In our experiments we have demonstrated that our system is capable of detecting and confining various attacks that affect the core network protocols. We have also demonstrated that our approach is capable of rapid attack detection. Also, it has been proven that the inclusion of the proposed detection mechanisms significantly increases the effort the attacker has to introduce in order to compromise the network. Finally we can conclude that, to all intents and purposes, this thesis offers a useful and applicable knowledge in real-life environments that allows us to maximize the performance of any system based on a TRS. Thus, we deal with the main deficiency of this discipline: the lack of a common and complete base of knowledge and the lack of a methodology for the development of TRS that allow us to analyze, design, secure and deploy TRS in a systematic way.
Resumo:
Microarrays can measure the expression of thousands of genes to identify changes in expression between different biological states. Methods are needed to determine the significance of these changes while accounting for the enormous number of genes. We describe a method, Significance Analysis of Microarrays (SAM), that assigns a score to each gene on the basis of change in gene expression relative to the standard deviation of repeated measurements. For genes with scores greater than an adjustable threshold, SAM uses permutations of the repeated measurements to estimate the percentage of genes identified by chance, the false discovery rate (FDR). When the transcriptional response of human cells to ionizing radiation was measured by microarrays, SAM identified 34 genes that changed at least 1.5-fold with an estimated FDR of 12%, compared with FDRs of 60 and 84% by using conventional methods of analysis. Of the 34 genes, 19 were involved in cell cycle regulation and 3 in apoptosis. Surprisingly, four nucleotide excision repair genes were induced, suggesting that this repair pathway for UV-damaged DNA might play a previously unrecognized role in repairing DNA damaged by ionizing radiation.
Resumo:
Thesis (Ph.D.)--University of Washington, 2016-06
Resumo:
In 48 university students performing single-item spelling recognition, prior exposure to misspelled words improved slightly the accuracy on correctly spelled words and increased markedly the 'false alarm' rate (classifying a misspelling seen at study as correct). In a group given a dictation test (N = 24) the only effect of exposure to misspellings was a small increment in the number of misspellings that matched the misspelling seen at study. The two test groups showed no advantage of having the same display format at study and test (AA or BB vs AB or BA). Experiment 2 (in progress) investigated a format match at study and test against a condition with a new test context (AA or BB vs AC or BC). The results to date suggest an influence of memory of the study trial rather than simply an updating by the study exposures of abstract lexical representations. [ABSTRACT FROM AUTHOR]
Resumo:
The Roche Cobas Amplicor system is widely used for the detection of Neisseria gonorrhoeae but is known to cross react with some commensal Neisseria spp. Therefore, a confirmatory test is required. The most common target for confirmatory tests is the cppB gene of N. gonorrhoeae. However, the cppB gene is also present in other Neisseria spp. and is absent in some N. gonorrhoeae isolates. As a result, laboratories targeting this gene run the risk of obtaining both false-positive and false-negative results. In the study presented here, a newly developed N. gonorrhoeae LightCycler assay (NGpapLC) targeting the N. gonorrhoeae porA pseudogene was tested. The NGpapLC assay was used to test 282 clinical samples, and the results were compared to those obtained using a testing algorithm combining the Cobas Amplicor System (Roche Diagnostics, Sydney, Australia) and an in-house LightCycler assay targeting the cppB gene (cppB-LC). In addition, the specificity of the NGpapLC assay was investigated by testing a broad panel of bacteria including isolates of several Neisseria spp. The NGpapLC assay proved to have comparable clinical sensitivity to the cppB-LC assay. In addition; testing of the bacterial panel showed the NGpapLC assay to be highly specific for N. gonorrhoeae DNA. The results of this study show the NGpapLC assay is a suitable alternative to the cppB-LC assay for confirmation of N. gonorrhoeae-positive results obtained with Cobas Amplicor.
Resumo:
This study presents tympanometric normative data for Australian children at school entry in view of the lack of age-specific population-based data for this group. Participants were 327 children (164 boys, 163 girls) aged between 5 and 6 years, who had no history of middle ear infection, and passed pure-tone screening at 20 dB HL. Normative values for static admittance (SA), ear canal volume (ECV), tympanometric peak pressure, tympanometric width (TW) and tympanometric gradient were established. Based on these normative data, the use of the ASHA (1997) guidelines for medical referral, in which ECV > 1.0 ml in the presence of a flat tympanogram, SA < 0.3 ml, or TW > 200 daPa may not provide the best criteria for Australian children aged between 5 and 6 years. If SA < 0.3 ml were used instead of SA < 0.16 ml, a greater proportion of Australian children would have failed tympanometry, thus increasing the false alarm rate.