895 results for discriminant analysis and cluster analysis


Relevance:

70.00% 70.00%

Publisher:

Abstract:

In numerous intervention studies and education field trials, random assignment to treatment occurs in clusters rather than at the level of observation. This departure from random assignment of units may be due to logistics, political feasibility, or ecological validity. Data within the same cluster or grouping are often correlated. Application of traditional regression techniques, which assume independence between observations, to clustered data produces consistent parameter estimates. However, such estimators are often inefficient compared to methods that incorporate the clustered nature of the data into the estimation procedure (Neuhaus 1993).¹ Multilevel models, also known as random effects or random components models, can be used to account for the clustering of data by estimating higher level (group) as well as lower level (individual) variation. Designing a study in which the unit of observation is nested within higher level groupings requires the determination of sample sizes at each level. This study investigates the effect of the design and analysis of various sampling strategies for a 3-level repeated measures design on the parameter estimates when the outcome variable of interest follows a Poisson distribution.
Results suggest that second-order penalized quasi-likelihood (PQL) estimation produces the least biased estimates in the 3-level multilevel Poisson model, followed by first-order PQL and then second- and first-order marginal quasi-likelihood (MQL). The MQL estimates of both fixed and random parameters are generally satisfactory when the level 2 and level 3 variation is less than 0.10. However, as the higher level error variance increases, the MQL estimates become increasingly biased. If the PQL procedure does not converge and the higher level error variance is large, the estimates may be significantly biased. In this case, bias correction techniques such as bootstrapping should be considered as an alternative procedure. For larger sample sizes, those structures with 20 or more units sampled at levels with normally distributed random errors produced more stable estimates with less sampling variance than structures with an increased number of level 1 units. For small sample sizes, sampling fewer units at the level with Poisson variation produces less sampling variation; however, this criterion is no longer important when sample sizes are large.
¹ Neuhaus J (1993). “Estimation Efficiency and Tests of Covariate Effects with Clustered Binary Data”. Biometrics, 49, 989–996.
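
The three-level design described in this abstract can be made concrete with a small simulation. The sketch below is only an illustration under stated assumptions: it assumes a log link with normally distributed random intercepts at levels 2 and 3 and Poisson variation at level 1. The function names (`poisson_draw`, `simulate_three_level`) and all parameter values are hypothetical and do not come from the study.

```python
import math
import random


def poisson_draw(rng, mean):
    """Draw one Poisson variate with Knuth's multiplication method (suitable for small means)."""
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_three_level(n3, n2, n1, intercept, var3, var2, seed=0):
    """Simulate counts for n3 level-3 units, n2 level-2 units each, n1 observations each.

    Assumed model: log(mean) = intercept + v_k + u_jk, with
    v_k ~ Normal(0, var3) and u_jk ~ Normal(0, var2).
    """
    rng = random.Random(seed)
    rows = []
    for k in range(n3):
        v_k = rng.gauss(0.0, math.sqrt(var3))       # level-3 random effect
        for j in range(n2):
            u_jk = rng.gauss(0.0, math.sqrt(var2))  # level-2 random effect
            mean = math.exp(intercept + v_k + u_jk)
            for i in range(n1):                     # level-1 Poisson variation
                rows.append((k, j, i, poisson_draw(rng, mean)))
    return rows


# Illustrative values only: 20 level-3 units, 5 level-2 units each, 4 repeated counts.
data = simulate_three_level(n3=20, n2=5, n1=4, intercept=0.5, var3=0.05, var2=0.05, seed=1)
print(len(data), data[:3])
```

Varying `n3`, `n2`, and `n1` while holding the total number of observations fixed is one way to compare sampling structures of the kind the abstract discusses; fitting the model itself (PQL or MQL) requires dedicated multilevel software and is not shown.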

Relevance:

70.00% 70.00%

Publisher:

Abstract:

Many studies in biostatistics deal with binary data. Some of these studies involve correlated observations, which can complicate the analysis of the resulting data. Studies of this kind typically arise when a high degree of commonality exists between test subjects. Two examples are measurements on identical twins and studies of symmetrical organs or appendages, as in ophthalmic studies. If there is a natural hierarchy in the data, multilevel analysis is an appropriate tool. Although this type of matching appears ideal for the purposes of comparison, analyzing the resulting data while ignoring the effect of intra-cluster correlation has been shown to produce biased results. This paper will explore the use of multilevel modeling of simulated binary data with predetermined levels of correlation. Data will be generated using the Beta-Binomial method with varying degrees of correlation between the lower level observations. The data will be analyzed using the multilevel software package MLwiN (Woodhouse et al., 1995). Comparisons between the specified intra-cluster correlation of these data and the correlations estimated by multilevel analysis will be used to examine the accuracy of this technique in analyzing this type of data.
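
The Beta-Binomial generation step mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the paper's procedure: it relies on the standard result that if each cluster's success probability is drawn from Beta(a, b), the intra-cluster correlation is 1/(a + b + 1) and the mean is a/(a + b). The function names and the example values of `p` and `rho` are assumptions made for the sketch.

```python
import random


def beta_parameters(p, rho):
    """Beta(a, b) shape parameters giving mean p and intra-cluster correlation rho."""
    if not (0.0 < p < 1.0 and 0.0 < rho < 1.0):
        raise ValueError("p and rho must lie strictly between 0 and 1")
    scale = (1.0 - rho) / rho  # equals a + b, so that 1 / (a + b + 1) == rho
    return p * scale, (1.0 - p) * scale


def simulate_clusters(n_clusters, cluster_size, p, rho, seed=0):
    """Generate correlated binary outcomes, one list of 0/1 values per cluster."""
    rng = random.Random(seed)
    a, b = beta_parameters(p, rho)
    clusters = []
    for _ in range(n_clusters):
        p_cluster = rng.betavariate(a, b)  # cluster-specific success probability
        clusters.append([1 if rng.random() < p_cluster else 0 for _ in range(cluster_size)])
    return clusters


# Illustrative values only: 1000 pairs (e.g. twins or paired eyes), p = 0.3, rho = 0.2.
pairs = simulate_clusters(n_clusters=1000, cluster_size=2, p=0.3, rho=0.2, seed=42)
print(pairs[:5])
```

A cluster size of 2 mirrors the twin and paired-organ examples; the specified `rho` can then be compared with the correlation recovered by whatever multilevel estimator is under evaluation.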

Relevance:

70.00% 70.00%

Publisher:

Abstract:

BACKGROUND Listeria (L.) monocytogenes causes fatal infections in many species including ruminants and humans. In ruminants, rhombencephalitis is the most prevalent form of listeriosis. Using multilocus variable number tandem repeat analysis (MLVA) we recently showed that L. monocytogenes isolates from ruminant rhombencephalitis cases are distributed over three genetic complexes (designated A, B and C). However, the majority of rhombencephalitis strains and virtually all those isolated from cattle cluster in MLVA complex A, indicating that strains of this complex may have increased neurotropism and neurovirulence. The aim of this study was to investigate whether ruminant rhombencephalitis strains have an increased ability to propagate in the bovine hippocampal brain-slice model and can be discriminated from strains of other sources. For this study, forty-seven strains were selected and assayed on brain-slice cultures, a bovine macrophage cell line (BoMac) and a human colorectal adenocarcinoma cell line (Caco-2). They were isolated from ruminant rhombencephalitis cases (n = 21) and other sources including the environment, food, human neurolisteriosis cases and ruminant/human non-encephalitic infection cases (n = 26). RESULTS All but one L. monocytogenes strain replicated in brain slices, irrespective of the source of the isolate or MLVA complex. The replication of strains from MLVA complex A was increased in hippocampal brain-slice cultures compared to complex C. Immunofluorescence revealed that microglia are the main target cells for L. monocytogenes and that strains from MLVA complex A caused larger infection foci than strains from MLVA complex C. Additionally, complex A strains caused larger plaques in BoMac cells, but not in Caco-2 cells. CONCLUSIONS First, our brain-slice model data show that all L. monocytogenes strains should be considered potentially neurovirulent.
Secondly, encephalitis strains cannot be conclusively discriminated from non-encephalitis strains with the bovine organotypic brain-slice model. The data indicate that MLVA complex A strains are particularly adept at establishing encephalitis, possibly by virtue of their higher resistance to antibacterial defense mechanisms in microglia, the main target cells of L. monocytogenes.

Relevance:

70.00% 70.00%

Publisher:

Abstract:

Background. Liver cancer continues to be a significant cause of death worldwide and in the U.S., yet there remains a lack of studies on how the mortality burden differs across racial groups or with heavy alcohol use. This study evaluated the geographic distribution of liver cancer mortality across population groups in Texas and the U.S. over a 24-year period and examined whether alcohol dependence or abuse correlates with mortality rates. Methods. The Spatial Scan Statistic was used to identify regions of excess liver cancer mortality in Texas counties and the U.S. from 1980 to 2003. The analysis was run with a spatial cluster size of 50% of the population at risk, and all analyses used publicly available data. Alcohol abuse data by state and ethnicity were extracted from SAMHSA datasets for the study period 2000–2004. Results. The geographic analysis of liver cancer mortality identified four regions in Texas and seven regions in the U.S. with statistically significant excess mortality rates, with elevated relative risks ranging from 1.38–2.07 and 1.05–1.623 (p = 0.001), respectively. Conclusion. This study revealed seven regions of excess liver cancer mortality across the U.S. and four regions of excess mortality in Texas between 1980 and 2003, and demonstrated a correlation between elevated liver cancer mortality rates and reporting of alcohol dependence among Hispanics and Other populations.
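
The kind of calculation behind a spatial scan for excess mortality can be sketched as below. This is a simplified illustration assuming the Poisson form of Kulldorff's scan statistic with expected counts scaled to sum to the total number of cases. The candidate zones, their counts, and the function names are invented for the example, and the Monte Carlo step used to obtain p-values is omitted.

```python
import math


def poisson_scan_llr(cases_in, expected_in, total_cases):
    """Log-likelihood ratio for one candidate zone; 0 when the zone shows no excess."""
    cases_out = total_cases - cases_in
    expected_out = total_cases - expected_in
    if cases_in <= expected_in:
        return 0.0
    llr = cases_in * math.log(cases_in / expected_in)
    if cases_out > 0:
        llr += cases_out * math.log(cases_out / expected_out)
    return llr


def relative_risk(cases_in, expected_in, total_cases):
    """Observed/expected ratio inside the zone divided by the same ratio outside it."""
    inside = cases_in / expected_in
    outside = (total_cases - cases_in) / (total_cases - expected_in)
    return inside / outside


def most_likely_cluster(zones, total_cases, total_population, max_fraction=0.5):
    """Pick the zone with the largest LLR among zones within the population size cap.

    zones: list of (name, cases, expected_cases, population) tuples.
    """
    allowed = [z for z in zones if z[3] <= max_fraction * total_population]
    return max(allowed, key=lambda z: poisson_scan_llr(z[1], z[2], total_cases))


# Invented candidate zones: (name, cases, expected cases, population at risk).
zones = [("zone A", 20, 10.0, 1000), ("zone B", 30, 28.0, 2800), ("zone C", 70, 60.0, 6000)]
best = most_likely_cluster(zones, total_cases=100, total_population=10000)
print(best[0], round(relative_risk(best[1], best[2], 100), 2))
```

Zone C is excluded here because it covers more than 50% of the population at risk, which mirrors the cluster size setting reported in the Methods.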

Relevance:

70.00% 70.00%

Publisher:

Abstract:

Background: Despite almost 40 years of research into the etiology of Kawasaki Syndrome (KS), there is little research published on spatial and temporal clustering of KS cases. Previous analysis has found significant spatial and temporal clustering of cases; therefore, cluster analyses were performed to substantiate these findings and provide insight into incident KS cases discharged from a pediatric tertiary care hospital. Identifying clusters from a single institution would allow for prospective analysis of risk factors and potential exposures for further insight into KS etiology. Methods: A retrospective study was carried out to examine the epidemiology and distribution of patients presenting to Texas Children’s Hospital in Houston, Texas, with a diagnosis of Acute Febrile Mucocutaneous Lymph Node Syndrome (MCLS) upon discharge from January 1, 2005 to December 31, 2009. Spatial, temporal, and space-time cluster analyses were performed using the Bernoulli model with case and control event data. Results: 397 of 102,761 total patients admitted to Texas Children’s Hospital had a principal or secondary diagnosis of Acute Febrile MCLS upon discharge over the 5-year period. Demographic data for KS cases remained consistent with known disease epidemiology. Spatial, temporal, and space-time analyses of clustering using the Bernoulli model demonstrated no statistically significant clusters. Discussion: Despite previous findings of spatial-temporal clustering of KS cases, there were no significant clusters of KS cases discharged from a single institution. This indicates the need for an expanded approach to conducting spatial-temporal cluster analysis and KS surveillance, given the limitations of evaluating data from a single institution.
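
For case and control event data, the Bernoulli form of the scan statistic compares the case proportion inside a candidate window with the proportion outside it. The sketch below shows that comparison for a single window. The counts are invented, the function names are hypothetical, and significance testing by random relabelling of cases and controls is not shown.

```python
import math


def _xlogy(x, y):
    """x * ln(y), with the convention that 0 * ln(0) = 0."""
    return 0.0 if x == 0 else x * math.log(y)


def _binomial_loglik(cases, total):
    """Maximized binomial log-likelihood for `cases` events among `total` subjects."""
    rate = cases / total
    return _xlogy(cases, rate) + _xlogy(total - cases, 1.0 - rate)


def bernoulli_scan_llr(cases_in, total_in, cases_all, total_all):
    """Log-likelihood ratio for one window; 0 if the inside rate is not elevated.

    total_in and total_all count cases plus controls. The window and its
    complement must both be non-empty.
    """
    cases_out = cases_all - cases_in
    total_out = total_all - total_in
    if cases_in / total_in <= cases_out / total_out:
        return 0.0
    alternative = _binomial_loglik(cases_in, total_in) + _binomial_loglik(cases_out, total_out)
    null = _binomial_loglik(cases_all, total_all)
    return alternative - null


# Invented counts: 5 cases among 10 subjects inside the window, 10 cases among 100 overall.
print(round(bernoulli_scan_llr(5, 10, 10, 100), 4))
```

A window whose case proportion matches the rest of the study area returns 0, which corresponds to the "no significant clusters" outcome when no window stands out.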

Relevance:

70.00% 70.00%

Publisher:

Abstract:

Decorin, a dermatan/chondroitin sulfate proteoglycan, is ubiquitously distributed in the extracellular matrix (ECM) of mammals. Decorin belongs to the small leucine-rich proteoglycan (SLRP) family, a proteoglycan family characterized by a core protein dominated by leucine-rich repeat motifs. The decorin core protein appears to mediate the binding of decorin to ECM molecules, such as collagens and fibronectin. It is believed that the interactions of decorin with these ECM molecules contribute to the regulation of ECM assembly, cell adhesion, and cell proliferation. These basic biological processes play critical roles during embryonic development and wound healing and are altered in pathological conditions such as fibrosis and tumorigenesis. In this dissertation, we report that the decorin core protein can bind Zn²⁺ ions with high affinity. Zinc is an essential trace element in mammals. Zn²⁺ ions play a catalytic role in the activation of many enzymes and a structural role in the stabilization of protein conformation. By examining purified recombinant decorin and its core protein fragments for Zn²⁺ binding activity using Zn²⁺-chelating column chromatography and Zn²⁺-equilibrium dialysis approaches, we have located the Zn²⁺ binding domain to the N-terminal sequence of the decorin core protein. The decorin N-terminal domain appears to contain two Zn²⁺ binding sites with similar high binding affinity. The sequence of the decorin N-terminal domain does not resemble any other reported zinc-binding motif and therefore represents a novel Zn²⁺ binding motif. By investigating the influence of Zn²⁺ ions on decorin binding interactions, we found a novel Zn²⁺-dependent interaction with fibrinogen, the major plasma protein in blood clots. Furthermore, a recombinant peptide (MD4) consisting of a 41 amino acid sequence of the mouse decorin N-terminal domain can prolong thrombin-induced fibrinogen/fibrin clot formation.
This suggests that, in the presence of Zn²⁺, the decorin N-terminal domain has anticoagulation activity. The changed Zn²⁺-binding activities of truncated MD4 peptides and of mutant peptides generated by site-directed mutagenesis revealed that the functional MD4 peptide might contain both a structural zinc-binding site in the cysteine cluster region and a catalytic zinc site that could be created by the flanking sequences of the cysteine cluster region. A model of a loop-like structure for the MD4 peptide is proposed.

Relevance:

70.00% 70.00%

Publisher:

Abstract:

The AND-1B drill core recovered a 13.57 million year Miocene through Pleistocene record from beneath the McMurdo Ice Shelf in Antarctica (77.9°S, 167.1°E). Varying sedimentary facies in the 1285 m core indicate glacial-interglacial cyclicity with the proximity of ice at the site ranging from grounding of ice in 917 m of water to ice free marine conditions. Broader interpretation of climatic conditions of the wider Ross Sea Embayment is deduced from provenance studies. Here we present an analysis of the iron oxide assemblages in the AND-1B core and interpret their variability with respect to wider paleoclimatic conditions. The core is naturally divided into an upper and lower succession by an expanded 170 m thick volcanic interval between 590 and 760 m. Above 590 m the Plio-Pleistocene glacial cycles are diatom rich and below 760 m late Miocene glacial cycles are terrigenous. Electron microscopy and rock magnetic parameters confirm the subdivision with biogenic silica diluting the terrigenous input (fine pseudo-single domain and stable single domain titanomagnetite from the McMurdo Volcanic Group with a variety of textures and compositions) above 590 m. Below 760 m, the Miocene section consists of coarse-grained ilmenite and multidomain magnetite derived from Transantarctic Mountain lithologies. This may reflect ice flow patterns and the absence of McMurdo Volcanic Group volcanic centers or indicate that volcanic centers had not yet grown to a significant size. The combined rock magnetic and electron microscopy signatures of magnetic minerals serve as provenance tracers in both ice proximal and distal sedimentary units, aiding in the study of ice sheet extent and dynamics, and the identification of ice rafted debris sources and dispersal patterns in the Ross Sea sector of Antarctica.

Relevance:

70.00% 70.00%

Publisher:

Abstract:

The Lagoon of Venice is a large water basin that exchanges water with the Northern Adriatic Sea through three large inlets. We examined two adjacent sites within the Southern Basin and at the Chioggia inlet in autumn 2007 and summer 2008. A pilot study in June 2007 on a surface water sample from Chioggia with a rather high salinity of 36.9 PSU had revealed a conspicuous bloom of CF319a-positive cells likely affiliated with the Cytophaga/Flavobacteria cluster of Bacteroidetes. These flavobacterial abundances were one to two orders of magnitude higher than in other marine surface waters. DAPI-stained cells were identified as bacteria with the general bacterial probe mixture EUB338 I-III. CARD-FISH counts with group-specific probes confirmed the dominance of Bacteroidetes (CF319a), Alphaproteobacteria (ALF968), and Gammaproteobacteria (GAM42a). CARD-FISH showed that Betaproteobacteria and Planctomycetes were minor components of the bacterioplankton in the Lagoon of Venice.

Relevance:

70.00% 70.00%

Publisher:

Abstract:

This thesis focuses on the analysis of two complementary aspects of cybercrime (that is, crime perpetrated over the network for monetary gain). These two aspects are the infected machines used to profit from crime through different actions (for example, click fraud, DDoS, spam) and the server infrastructure used to manage those machines (for example, C&C servers, exploit servers, monetization servers, redirectors). The first part investigates the exposure of victim computers to threats. For this analysis we used the metadata contained in WINE-BR, a Symantec dataset. This dataset contains installation metadata for executable files (for example, file hash, publisher, installation date, file name, file version) from 8.4 million Windows users. We associated this metadata with the vulnerabilities in the National Vulnerability Database (NVD) and the Open Sourced Vulnerability Database (OSVDB) in order to track the decay of vulnerabilities over time and observe how quickly users patch their systems and, therefore, how exposed they are to possible attacks. We identified 3 factors that can influence the patching activity of victim computers: shared code, user type, and exploits. We present 2 new attacks against shared code and an analysis of how user knowledge and exploit availability influence patching activity.
For the 80 vulnerabilities in our database that affect code shared between two applications, the time between patch releases in the different applications is up to 118 days (with a median of 11 days). The second part proposes new active probing techniques to detect and analyze malicious server infrastructures. We leverage active probing to detect malicious servers on the Internet. We begin with the analysis and detection of exploit server operations. We define an operation as the set of servers that are controlled by the same people and possibly participate in the same infection campaign. We analyzed a total of 500 exploit servers over a period of 1 year, where 2/3 of the operations had a single server and 1/3 had multiple servers. We extended the technique for detecting exploit servers to different types of servers (for example, C&C servers, monetization servers, redirectors) and achieved Internet-scale probing for the different categories of malicious servers. These new techniques have been incorporated into a new tool called CyberProbe. To detect these servers we developed a novel technique called Adversarial Fingerprint Generation, a methodology for generating a unique request-response model that identifies the server family (that is, the type and the operation to which the server belongs). Given a malware file and a live server of a given family, CyberProbe can generate a valid fingerprint to detect all live servers of that family. We performed 11 Internet-wide scans, detecting 151 malicious servers; of these 151 servers, 75% are unknown to public databases of malicious servers.
Another issue that arises when detecting malicious servers is that some of these servers may be hidden behind a silent reverse proxy. To identify the prevalence of this network configuration and improve the capabilities of CyberProbe, we developed RevProbe, a new tool that detects reverse proxies by exploiting leakages in the configuration of web reverse proxies. RevProbe finds that 16% of the active malicious IP addresses analyzed correspond to reverse proxies, that 92% of them are silent compared with 55% for benign reverse proxies, and that they are used mainly for load balancing across multiple servers. ABSTRACT In this dissertation we investigate two fundamental aspects of cybercrime: the infection of machines used to monetize the crime and the malicious server infrastructures that are used to manage the infected machines. In the first part of this dissertation, we analyze how fast software vendors apply patches to secure client applications, identifying shared code as an important factor in patch deployment. Shared code is code present in multiple programs. When a vulnerability affects shared code, the usual linear vulnerability life cycle no longer describes how patch deployment takes place. In this work we show the consequences of shared code vulnerabilities and demonstrate two novel attacks that can be used to exploit this condition. In the second part of this dissertation we analyze malicious server infrastructures. Our contributions are: a technique to cluster exploit server operations; a tool named CyberProbe to perform large-scale detection of different malicious server categories; and RevProbe, a tool that detects silent reverse proxies. We start by identifying exploit server operations, that is, exploit servers managed by the same people.
We investigate a total of 500 exploit servers over a period of more than 13 months. We collected malware from these servers and all the metadata related to the communication with them. From this metadata we extracted features to group together servers managed by the same entity (i.e., an exploit server operation). We found that 2/3 of the operations have a single server while 1/3 have multiple servers. Next, we present CyberProbe, a tool that detects different malicious server types through a novel technique called adversarial fingerprint generation (AFG). The idea behind CyberProbe’s AFG is to run a piece of malware and observe its network communication with malicious servers. It then replays this communication to the malicious server and outputs a fingerprint (i.e., a port selection function, a probe generation function and a signature generation function). Once the fingerprint is generated, CyberProbe scans the Internet with it and finds all the servers of a given family. We performed a total of 11 Internet-wide scans, finding 151 new servers starting from 15 seed servers. This gives CyberProbe a 10-fold amplification factor. Moreover, we compared CyberProbe with existing blacklists on the Internet and found that only 40% of the servers detected by CyberProbe were listed. To enhance the capabilities of CyberProbe we developed RevProbe, a reverse proxy detection tool that can be integrated with CyberProbe to allow precise detection of silent reverse proxies used to hide malicious servers. RevProbe leverages leakage-based detection techniques to detect whether a malicious server is hidden behind a silent reverse proxy and to identify the infrastructure of servers behind it. At the core of RevProbe is the analysis of differences in the traffic observed when interacting with a remote server.
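
The abstract describes a fingerprint as three functions (port selection, probe generation, signature generation) that are then used to scan for servers of one family. The sketch below illustrates only that structure. It is not CyberProbe's code or API: the `Fingerprint` class, the `scan` function, the probe bytes, the signature string, and the simulated network are all hypothetical, and the step that derives a fingerprint by replaying captured malware traffic is not shown.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Fingerprint:
    """Hypothetical container for the three functions named in the abstract."""
    select_ports: Callable[[], List[int]]  # which ports to probe
    make_probe: Callable[[int], bytes]     # request to send to a given port
    matches: Callable[[bytes], bool]       # signature check on the reply


Sender = Callable[[str, int, bytes], Optional[bytes]]


def scan(fp: Fingerprint, hosts: List[str], send: Sender) -> List[Tuple[str, int]]:
    """Probe every host on the selected ports and keep the ones whose reply matches."""
    hits = []
    for host in hosts:
        for port in fp.select_ports():
            reply = send(host, port, fp.make_probe(port))
            if reply is not None and fp.matches(reply):
                hits.append((host, port))
    return hits


def amplification(servers_found: int, seed_servers: int) -> float:
    """Servers found per seed server."""
    return servers_found / seed_servers


# Simulated network (documentation addresses), so the sketch sends no real traffic.
FAKE_NETWORK: Dict[Tuple[str, int], bytes] = {
    ("192.0.2.10", 8080): b"HTTP/1.1 200 OK\r\nX-Panel: demo-family\r\n\r\n",
    ("192.0.2.11", 8080): b"HTTP/1.1 404 Not Found\r\n\r\n",
}


def fake_send(host: str, port: int, probe: bytes) -> Optional[bytes]:
    return FAKE_NETWORK.get((host, port))


demo_fp = Fingerprint(
    select_ports=lambda: [80, 8080],
    make_probe=lambda port: b"GET /gate.php HTTP/1.1\r\n\r\n",
    matches=lambda reply: b"X-Panel: demo-family" in reply,
)

print(scan(demo_fp, ["192.0.2.10", "192.0.2.11", "192.0.2.12"], fake_send))
print(round(amplification(151, 15), 1))
```

The last line reproduces the arithmetic behind the reported amplification factor: 151 servers found from 15 seeds is roughly a 10-fold gain.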

Relevance:

70.00% 70.00%

Publisher:

Abstract:

The molecular mechanisms of pulmonary fibrosis are poorly understood. We have used oligonucleotide arrays to analyze the gene expression programs that underlie pulmonary fibrosis in response to bleomycin, a drug that causes lung inflammation and fibrosis, in two strains of susceptible mice (129 and C57BL/6). We then compared the gene expression patterns in these mice with 129 mice carrying a null mutation in the epithelial-restricted integrin β6 subunit (β6−/−), which develop inflammation but are protected from pulmonary fibrosis. Cluster analysis identified two distinct groups of genes involved in the inflammatory and fibrotic responses. Analysis of gene expression at multiple time points after bleomycin administration revealed sequential induction of subsets of genes that characterize each response. The availability of this comprehensive data set should accelerate the development of more effective strategies for intervention at the various stages in the development of fibrotic diseases of the lungs and other organs.

Relevance:

70.00% 70.00%

Publisher:

Abstract:

Olfactory receptor (OR) genes represent ≈1% of genomic coding sequence in mammals, and these genes are clustered on multiple chromosomes in both the mouse and human genomes. We have taken a comparative genomics approach to identify features that may be involved in the dynamic evolution of this gene family and in the transcriptional control that results in a single OR gene expressed per olfactory neuron. We sequenced ≈350 kb of the murine P2 OR cluster and used synteny, gene linkage, and phylogenetic analysis to identify and sequence ≈111 kb of an orthologous cluster in the human genome. In total, 18 mouse and 8 human OR genes were identified, including 7 orthologs that appear to be functional in both species. Noncoding homology is evident between orthologs and generally is confined within the transcriptional unit. We find no evidence for common regulatory features shared among paralogs, and promoter regions generally do not contain strong promoter motifs. We discuss these observations, as well as OR clustering, in the context of evolutionary expansion and transcriptional regulation of OR repertoires.

Relevance:

70.00% 70.00%

Publisher:

Abstract:

Most chloroplast genes in vascular plants are organized into polycistronic transcription units, which generate a complex pattern of mono-, di-, and polycistronic transcripts. In contrast, most Chlamydomonas reinhardtii chloroplast transcripts characterized to date have been monocistronic. This paper describes the atpA gene cluster in the C. reinhardtii chloroplast genome, which includes the atpA, psbI, cemA, and atpH genes, encoding the α-subunit of the coupling-factor-1 (CF1) ATP synthase, a small photosystem II polypeptide, a chloroplast envelope membrane protein, and subunit III of the CF0 ATP synthase, respectively. We show that promoters precede the atpA, psbI, and atpH genes, but not the cemA gene, and that cemA mRNA is present only as part of di-, tri-, or tetracistronic transcripts. Deletions introduced into the gene cluster reveal, first, that CF1-α can be translated from di- or polycistronic transcripts, and, second, that substantial reductions in mRNA quantity have minimal effects on protein synthesis rates. We suggest that posttranscriptional mRNA processing is common in C. reinhardtii chloroplasts, permitting the expression of multiple genes from a single promoter.

Relevance:

70.00% 70.00%

Publisher:

Abstract:

Varicella-zoster virus open reading frame 10 (ORF10) protein, the homolog of the herpes simplex virus protein VP16, can transactivate immediate-early promoters from both viruses. A protein sequence comparison procedure termed hydrophobic cluster analysis was used to identify a motif centered at Phe-28, near the amino terminus of ORF10, that strongly resembles the sequence of the activating domain surrounding Phe-442 of VP16. With a series of GAL4-ORF10 fusion proteins, we mapped the ORF10 transcriptional-activation domain to the amino-terminal region (aa 5-79). Extensive mutagenesis of Phe-28 in GAL4-ORF10 fusion proteins demonstrated the importance of an aromatic or bulky hydrophobic amino acid at this position, as shown previously for Phe-442 of VP16. Transactivation by the native ORF10 protein was abolished when Phe-28 was replaced by Ala. Similar amino-terminal domains were identified in the VP16 homologs of other alphaherpesviruses. Hydrophobic cluster analysis correctly predicted activation domains of ORF10 and VP16 that share critical characteristics of a distinctive subclass of acidic activation domains.