995 results for clustered data
Abstract:
An interim analysis is usually applied in later phase II or phase III trials to find convincing evidence of a significant treatment difference that may lead to trial termination at an earlier point than planned at the beginning. This can result in the saving of patient resources and shortening of drug development and approval time. In addition, ethics and economics are also the reasons to stop a trial earlier. In clinical trials of eyes, ears, knees, arms, kidneys, lungs, and other clustered treatments, data may include distribution-free random variables with matched and unmatched subjects in one study. It is important to properly include both subjects in the interim and the final analyses so that the maximum efficiency of statistical and clinical inferences can be obtained at different stages of the trials. So far, no publication has applied a statistical method for distribution-free data with matched and unmatched subjects in the interim analysis of clinical trials. In this simulation study, the hybrid statistic was used to estimate the empirical powers and the empirical type I errors among the simulated datasets with different sample sizes, different effect sizes, different correlation coefficients for matched pairs, and different data distributions, respectively, in the interim and final analysis with 4 different group sequential methods. Empirical powers and empirical type I errors were also compared to those estimated by using the meta-analysis t-test among the same simulated datasets. Results from this simulation study show that, compared to the meta-analysis t-test commonly used for data with normally distributed observations, the hybrid statistic has a greater power for data observed from normally, log-normally, and multinomially distributed random variables with matched and unmatched subjects and with outliers. Powers rose with the increase in sample size, effect size, and correlation coefficient for the matched pairs. 
In addition, the hybrid statistic yielded lower type I errors, which indicates that this test is also conservative for data with outliers in the interim analysis of clinical trials.
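The empirical type I error and power described in this abstract can be estimated by simulating many trials and counting how often a stopping boundary is crossed. The sketch below illustrates that idea only, under assumptions not taken from the abstract: it uses an ordinary two-sample z statistic with known unit variance (not the hybrid statistic), unmatched normal data, two equally spaced looks, and the standard two-look Pocock critical value of 2.178 for a two-sided 5% level. All function names are hypothetical.

```python
import math
import random

POCOCK_Z = 2.178  # standard two-look Pocock critical value, two-sided alpha = 0.05


def z_statistic(treated, control):
    """Two-sample z statistic assuming known unit variance in both arms."""
    n_t, n_c = len(treated), len(control)
    diff = sum(treated) / n_t - sum(control) / n_c
    return diff / math.sqrt(1.0 / n_t + 1.0 / n_c)


def trial_rejects(rng, n_per_arm, effect, boundary=POCOCK_Z):
    """Simulate one trial with an interim look at half the sample and a final look."""
    treated = [rng.gauss(effect, 1.0) for _ in range(n_per_arm)]
    control = [rng.gauss(0.0, 1.0) for _ in range(n_per_arm)]
    half = n_per_arm // 2
    # Interim analysis: stop early if the boundary is crossed.
    if abs(z_statistic(treated[:half], control[:half])) >= boundary:
        return True
    # Final analysis on all accumulated data.
    return abs(z_statistic(treated, control)) >= boundary


def rejection_rate(n_trials, n_per_arm, effect, seed=0):
    """Empirical type I error when effect == 0, empirical power otherwise."""
    rng = random.Random(seed)
    hits = sum(trial_rejects(rng, n_per_arm, effect) for _ in range(n_trials))
    return hits / n_trials
```

Calling `rejection_rate(4000, 40, 0.0)` estimates the type I error, and a nonzero `effect` gives an empirical power. Swapping in a different test statistic or boundary only requires changing `z_statistic` or `boundary`.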
Abstract:
Major histocompatibility complex (MHC) class II molecules displayed clustered patterns at the surfaces of T (HUT-102B2) and B (JY) lymphoma cells characterized by interreceptor distances in the micrometer range as detected by scanning force microscopy of immunogold-labeled antigens. Electron microscopy revealed that a fraction of the MHC class II molecules was also heteroclustered with MHC class I antigens at the same hierarchical level as described by the scanning force microscopy data, after specifically and sequentially labeling the antigens with 30- and 15-nm immunogold beads. On JY cells the estimated fraction of co-clustered HLA II was 0.61, whereas that of the HLA I was 0.24. Clusterization of the antigens was detected by the deviation of their spatial distribution from the Poissonian distribution representing the random case. Fluorescence resonance energy transfer measurements also confirmed partial co-clustering of the HLA class I and II molecules at another hierarchical level characterized by the 2- to 10-nm Förster distance range and providing fine details of the molecular organization of receptors. The larger-scale topological organization of the MHC class I and II antigens may reflect underlying membrane lipid domains and may fulfill significant functions in cell-to-cell contacts and signal transduction.
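The comparison against a Poisson distribution mentioned in this abstract can be illustrated with a quadrat-count dispersion check. This is a minimal sketch and not the authors' analysis: the function names and grid size are hypothetical, and it assumes label positions are given as (x, y) coordinates in a unit square. For a spatially random (Poisson) pattern the variance-to-mean ratio of the counts is close to 1, and values well above 1 suggest clustering.

```python
from collections import Counter


def quadrat_counts(points, grid):
    """Count points in each cell of a grid x grid partition of the unit square."""
    counts = Counter()
    for x, y in points:
        col = min(int(x * grid), grid - 1)
        row = min(int(y * grid), grid - 1)
        counts[(row, col)] += 1
    # Cells with no points are reported as zero.
    return [counts[(r, c)] for r in range(grid) for c in range(grid)]


def dispersion_index(counts):
    """Variance-to-mean ratio of quadrat counts (using the sample variance)."""
    n = len(counts)
    mean = sum(counts) / n
    var = sum((c - mean) ** 2 for c in counts) / (n - 1)
    return var / mean
```

A formal test would compare the index with its sampling distribution under randomness. The ratio alone is only a descriptive indicator.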
Abstract:
TCL1 and TCL1b genes on human chromosome 14q32.1 are activated in T cell leukemias by translocations and inversions at 14q32.1, juxtaposing them to regulatory elements of T cell receptor genes. In this report we present the cloning, mapping, and expression analysis of the human and murine TCL1/Tcl1 locus. In addition to TCL1 and TCL1b, the human locus contains two additional genes, TCL1-neighboring genes (TNG) 1 and 2, encoding proteins of 141 and 110 aa, respectively. Neither gene shows homology to any known gene, but their expression profiles are very similar to those of TCL1 and TCL1b. TNG1 and TNG2 also are activated in T cell leukemias with rearrangements at 14q32.1. To aid in the development of a mouse model we also have characterized the murine Tcl1 locus and found five genes homologous to human TCL1b. Tcl1b1–Tcl1b5 proteins range from 117 to 123 aa and are 65–80% similar, but they show only a 30–40% similarity to human TCL1b. All five mouse Tcl1b and murine Tcl1 mRNAs are abundant in mouse oocytes and two-cell embryos but rare in various adult tissues and lymphoid cell lines. These data suggest a similar or complementary function of these proteins in early embryogenesis.
Abstract:
Natural killer (NK) cells express C-type lectin-like receptors, encoded in the NK gene complex, that interact with major histocompatibility complex class I and either inhibit or activate functional activity. Human NK cells express heterodimers consisting of CD94 and NKG2 family molecules, whereas murine NK cells express homodimers belonging to the Ly-49 family. The corresponding orthologues for other species, however, have not been described. In this report, we used probes derived from the expressed sequence tag database to clone C57BL/6-derived cDNAs homologous to human NKG2-D and CD94. Among normal tissues, murine NKG2-D and CD94 transcripts are highly expressed only in activated NK cells, including both Ly-49A+ and Ly-49A− subpopulations. Additionally, mNKG2-D is expressed in murine NK cell clones KY-1 and KY-2, whereas mCD94 expression is observed only in KY-1 cells but not KY-2. Last, we have finely mapped the physical location of the Cd94 (centromeric) and Nkg2d (telomeric) genes between Cd69 and the Ly49 cluster in the NK complex. Thus, these data indicate the expanding complexity of the NK complex and the corresponding repertoire of C-type lectin-like receptors on murine NK cells.
Abstract:
Sequences of cloned resistance genes from a wide range of plant taxa reveal significant similarities in sequence and structural motifs. This is observed among genes conferring resistance to viral, bacterial, and fungal pathogens. In this study, oligonucleotide primers designed for conserved sequences from coding regions of disease resistance genes N (tobacco), RPS2 (Arabidopsis) and L6 (flax) were used to amplify related sequences from soybean [Glycine max (L.) Merr.]. Sequencing of amplification products indicated that at least nine classes of resistance gene analogs (RGAs) were detected. Genetic mapping of members of these classes located them to eight different linkage groups. Several RGA loci mapped near known resistance genes. A bacterial artificial chromosome library of soybean DNA was screened using primers and probes specific for eight RGA classes, and clones were identified containing sequences unique to seven classes. Individual bacterial artificial chromosomes contained 2-10 members of single RGA classes. Clustering and sequence similarity of members of RGA classes suggest a common process in their evolution. Our data indicate that it may be possible to use sequence homologies from conserved motifs of cloned resistance genes to identify candidate resistance loci from widely diverse plant taxa.
Abstract:
A progressive spatial query retrieves spatial data based on previous queries (e.g., to fetch data in a more restricted area with higher resolution). A direct query, on the other hand, is defined as an isolated window query. A multi-resolution spatial database system should support both progressive queries and traditional direct queries. It is conceptually challenging to support both types of query at the same time, as direct queries favour location-based data clustering, whereas progressive queries require fragmented data clustered by resolution. Two new scaleless data structures are proposed in this paper. Experimental results using both synthetic and real-world datasets demonstrate that the query processing time of the new multiresolution approaches is comparable to, and often better than, that of multi-representation data structures for both types of queries.
Abstract:
The paper describes two new transport layer (TCP) options and an expanded transport layer queuing strategy that facilitate three functions that are fundamental to the dispatching-based clustered service. A transport layer option has been developed to facilitate the use of client wait time data within the service request processing of the cluster. A second transport layer option has been developed to facilitate the redirection of service requests by the cluster dispatcher to the cluster processing member. An expanded transport layer service request queuing strategy facilitates the trust-based filtering of incoming service requests so that a graceful degradation of service delivery may be achieved during periods of overload, most dramatically evidenced by distributed denial of service attacks against the clustered service. We describe how these new options and queues have been implemented and successfully tested within the transport layer of the Linux kernel.
Abstract:
Overlaying maps using a desktop GIS is often the first step of a multivariate spatial analysis. The potential of this operation has increased considerably as data sources and Web services to manipulate them are becoming widely available via the Internet. Standards from the OGC enable such geospatial mashups to be seamless and user driven, involving discovery of thematic data. The user is naturally inclined to look for spatial clusters and correlation of outcomes. Using classical cluster detection scan methods to identify multivariate associations can be problematic in this context, because of a lack of control over, or knowledge about, background populations. For public health and epidemiological mapping, this limiting factor can be critical, but often the focus is on spatial identification of risk factors associated with health or clinical status. In this article we point out that this association itself can ensure some control on underlying populations, and develop an exploratory scan statistic framework for multivariate associations. Inference using statistical map methodologies can be used to test the clustered associations. The approach is illustrated with a hypothetical data example and an epidemiological study on community MRSA. Scenarios of potential use for online mashups are introduced but full implementation is left for further research.
[Figure caption: Spatial entropy index HSu for the ScankOO analysis of the hypothetical dataset, using a vicinity fixed by the number of points without distinction between their labels. The size of the labels is proportional to the inverse of the index.]
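The figure caption in this entry mentions a spatial entropy index computed over a vicinity fixed by the number of points. As a rough illustration of that idea, and not the article's HSu index (whose definition is not given here), the sketch below computes the plain Shannon entropy of the labels among the k points nearest to a chosen centre. The function name and the input layout are assumptions made for the example.

```python
import math
from collections import Counter


def local_label_entropy(points, labels, centre, k):
    """Shannon entropy (in bits) of the labels of the k points nearest to centre."""
    # Rank all points by Euclidean distance to the centre, ignoring their labels.
    order = sorted(range(len(points)), key=lambda i: math.dist(points[i], centre))
    nearest = [labels[i] for i in order[:k]]
    total = len(nearest)
    return -sum(
        (n / total) * math.log2(n / total) for n in Counter(nearest).values()
    )
```

Low entropy means one label dominates the neighbourhood, and high entropy means the labels are mixed. Scanning many centres and mapping the result gives an exploratory picture of where labels co-occur.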
Abstract:
An Automatic Vehicle Location (AVL) system is a computer-based vehicle tracking system that is capable of determining a vehicle's location in real time. As a major technology of the Advanced Public Transportation System (APTS), AVL systems have been widely deployed by transit agencies for purposes such as real-time operation monitoring, computer-aided dispatching, and arrival time prediction. AVL systems make a large amount of transit performance data available that are valuable for transit performance management and planning purposes. However, the difficulties of extracting useful information from the huge spatial-temporal database have hindered off-line applications of the AVL data.
In this study, a data mining process, including data integration, cluster analysis, and multiple regression, is proposed. The AVL-generated data are first integrated into a Geographic Information System (GIS) platform. The model-based cluster method is employed to investigate the spatial and temporal patterns of transit travel speeds, which may be easily translated into travel time. The transit speed variations along the route segments are identified. Transit service periods such as morning peak, mid-day, afternoon peak, and evening periods are determined based on analyses of transit travel speed variations for different times of day. The seasonal patterns of transit performance are investigated by using the analysis of variance (ANOVA). Travel speed models based on the clustered time-of-day intervals are developed using important factors identified as having significant effects on speed for different time-of-day periods.
It has been found that transit performance varied across seasons and time-of-day periods. The geographic location of a transit route segment also plays a role in the variation of the transit performance. The results of this research indicate that advanced data mining techniques have good potential to provide automated techniques that assist transit agencies in service planning, scheduling, and operations control.
Abstract:
Twitter is among the largest social networks in the world, and every day millions of tweets are posted and discussed, expressing various views and opinions. A wide range of research has studied how these opinions can be clustered and analyzed so that trends can be uncovered. Because of the inherent weaknesses of tweets (very short texts and a very informal writing style), it is hard to analyze tweet data with good performance and accuracy. In this paper, we approach the problem from another angle, using a two-layer structure to analyze Twitter data: LDA with topic map modelling. The experimental results demonstrate that this approach improves Twitter data analysis. However, more experiments with this method are needed to ensure that accurate analytic results can be maintained.
Abstract:
The brain is a network spanning multiple scales from subcellular to macroscopic. In this thesis I present four projects studying brain networks at different levels of abstraction. The first involves determining a functional connectivity network based on neural spike trains and using a graph theoretical method to cluster groups of neurons into putative cell assemblies. In the second project I model neural networks at a microscopic level. Using different clustered wiring schemes, I show that almost identical spatiotemporal activity patterns can be observed, demonstrating that there is a broad neuro-architectural basis to attain structured spatiotemporal dynamics. Remarkably, irrespective of the precise topological mechanism, this behavior can be predicted by examining the spectral properties of the synaptic weight matrix. The third project introduces, via two circuit architectures, a new paradigm for feedforward processing in which inhibitory neurons have a complex and pivotal role in governing information flow in cortical network models. Finally, I analyze axonal projections in sleep-deprived mice using data collected as part of the Allen Institute's Mesoscopic Connectivity Atlas. After normalizing for experimental variability, the results indicate there is no single explanatory difference in the mesoscale network between control and sleep-deprived mice. Using machine learning techniques, however, animal classification could be done at levels significantly above chance. This reveals that intricate changes in connectivity do occur due to chronic sleep deprivation.
Abstract:
High-throughput screening of physical, genetic and chemical-genetic interactions brings important perspectives in the Systems Biology field, as the analysis of these interactions provides new insights into protein/gene function, cellular metabolic variations and the validation of therapeutic targets and drug design. However, such analysis depends on a pipeline connecting different tools that can automatically integrate data from diverse sources and result in a more comprehensive dataset that can be properly interpreted. We describe here the Integrated Interactome System (IIS), an integrative platform with a web-based interface for the annotation, analysis and visualization of the interaction profiles of proteins/genes, metabolites and drugs of interest. IIS works in four connected modules: (i) Submission module, which receives raw data derived from Sanger sequencing (e.g. two-hybrid system); (ii) Search module, which enables the user to search for the processed reads to be assembled into contigs/singlets, or for lists of proteins/genes, metabolites and drugs of interest, and add them to the project; (iii) Annotation module, which assigns annotations from several databases for the contigs/singlets or lists of proteins/genes, generating tables with automatic annotation that can be manually curated; and (iv) Interactome module, which maps the contigs/singlets or the uploaded lists to entries in our integrated database, building networks that gather novel identified interactions, protein and metabolite expression/concentration levels, subcellular localization and computed topological metrics, GO biological processes and KEGG pathways enrichment. This module generates an XGMML file that can be imported into Cytoscape or be visualized directly on the web. We have developed IIS by integrating diverse databases, in response to the need for appropriate tools for a systematic analysis of physical, genetic and chemical-genetic interactions.
IIS was validated with yeast two-hybrid, proteomics and metabolomics datasets, but it is also extendable to other datasets. IIS is freely available online at: http://www.lge.ibi.unicamp.br/lnbio/IIS/.
Abstract:
The article investigates patterns of performance in grip strength, gait speed and self-rated health and the relationships between them, considering the variables of gender, age and family income. The study was conducted in a probabilistic sample of community-dwelling elderly people aged 65 and over, members of a population study on frailty. A total of 689 elderly people without cognitive deficit suggestive of dementia underwent tests of gait speed and grip strength. Comparisons between groups were based on low, medium and high speed and strength. Self-rated health was assessed using a 5-point scale. Men and the younger elderly individuals scored significantly higher on grip strength and gait speed than women and the oldest individuals did; the richest scored higher than the poorest on grip strength and gait speed; women and men aged over 80 had weaker grip strength and lower gait speed; slow gait speed and low income emerged as risk factors for a worse health evaluation. Lower muscular strength affects the self-rated assessment of health because it results in a reduction in functional capacity, especially in the presence of poverty and a lack of compensatory factors.
Abstract:
Obstructive sleep apnea syndrome has a high prevalence among adults. Cephalometric analysis can be a valuable method for evaluating patients with this syndrome. This study aimed to correlate cephalometric data with the apnea-hypopnea index. We performed a retrospective and cross-sectional study that analyzed the cephalometric data of patients followed in the Sleep Disorders Outpatient Clinic of the Discipline of Otorhinolaryngology of a university hospital, from June 2007 to May 2012. Ninety-six patients were included, 45 men and 51 women, with a mean age of 50.3 years. A total of 11 patients had snoring, 20 had mild apnea, 26 had moderate apnea, and 39 had severe apnea. The distance from the hyoid bone to the mandibular plane was the only variable that showed a statistically significant correlation with the apnea-hypopnea index. Cephalometric variables are useful tools for the understanding of obstructive sleep apnea syndrome. The distance from the hyoid bone to the mandibular plane showed a statistically significant correlation with the apnea-hypopnea index.
Abstract:
Resource specialisation, although a fundamental component of ecological theory, is employed in disparate ways. Most definitions derive from simple counts of resource species. We build on recent advances in ecophylogenetics and null model analysis to propose a concept of specialisation that comprises affinities among resources as well as their co-occurrence with consumers. In the distance-based specialisation index (DSI), specialisation is measured as relatedness (phylogenetic or otherwise) of resources, scaled by the null expectation of random use of locally available resources. Thus, specialists use significantly clustered sets of resources, whereas generalists use over-dispersed resources. Intermediate species are classed as indiscriminate consumers. The effectiveness of this approach was assessed with differentially restricted null models, applied to a data set of 168 herbivorous insect species and their hosts. Incorporation of plant relatedness and relative abundance greatly improved specialisation measures compared to taxon counts or simpler null models, which overestimate the fraction of specialists, a problem compounded by insufficient sampling effort. This framework disambiguates the concept of specialisation with an explicit measure applicable to any mode of affinity among resource classes, and is also linked to ecological and evolutionary processes. This will enable a more rigorous deployment of ecological specialisation in empirical and theoretical studies.
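The idea of measuring specialisation as resource relatedness scaled by a null expectation can be sketched as a standardised effect size. The code below is an illustration under stated assumptions, not the published DSI formula: it uses mean pairwise distance as the relatedness measure, draws null resource sets uniformly from the local pool (ignoring relative abundance), and adopts the sign convention that positive scores mean clustered resource use. The function names and the distance-table layout (a dict keyed by `frozenset` pairs) are hypothetical.

```python
import random
from itertools import combinations
from statistics import mean, stdev


def mean_pairwise_distance(resources, dist):
    """Mean distance over all unordered pairs of resources."""
    return mean(dist[frozenset(pair)] for pair in combinations(resources, 2))


def specialisation_score(used, pool, dist, n_null=999, seed=0):
    """Standardised departure of observed relatedness from random resource use.

    Positive values mean the used resources are more closely related than
    expected (clustered); negative values mean they are over-dispersed.
    """
    rng = random.Random(seed)
    observed = mean_pairwise_distance(used, dist)
    # Null model: the same number of resources drawn at random from the pool.
    null = [
        mean_pairwise_distance(rng.sample(pool, len(used)), dist)
        for _ in range(n_null)
    ]
    return (mean(null) - observed) / stdev(null)
```

A consumer feeding on two closely related hosts scores positive, and one feeding on distantly related hosts scores negative. Weighting the null draws by local abundance, as the abstract describes, would replace `rng.sample` with a weighted draw.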