17 resultados para hypotheses

em Boston University Digital Common


Relevância:

10.00% 10.00%

Publicador:

Resumo:

BACKGROUND:Recent advances in genome sequencing suggest a remarkable conservation in gene content of mammalian organisms. The similarity in gene repertoire present in different organisms has increased interest in studying regulatory mechanisms of gene expression aimed at elucidating the differences in phenotypes. In particular, a proximal promoter region contains a large number of regulatory elements that control the expression of its downstream gene. Although many studies have focused on identification of these elements, a broader picture on the complexity of transcriptional regulation of different biological processes has not been addressed in mammals. The regulatory complexity may strongly correlate with gene function, as different evolutionary forces must act on the regulatory systems under different biological conditions. We investigate this hypothesis by comparing the conservation of promoters upstream of genes classified in different functional categories.RESULTS:By conducting a rank correlation analysis between functional annotation and upstream sequence alignment scores obtained by human-mouse and human-dog comparison, we found a significantly greater conservation of the upstream sequence of genes involved in development, cell communication, neural functions and signaling processes than those involved in more basic processes shared with unicellular organisms such as metabolism and ribosomal function. This observation persists after controlling for G+C content. Considering conservation as a functional signature, we hypothesize a higher density of cis-regulatory elements upstream of genes participating in complex and adaptive processes.CONCLUSION:We identified a class of functions that are associated with either high or low promoter conservation in mammals. We detected a significant tendency that points to complex and adaptive processes were associated with higher promoter conservation, despite the fact that they have emerged relatively recently during evolution. We described and contrasted several hypotheses that provide a deeper insight into how transcriptional complexity might have been emerged during evolution.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

INTRODUCTION:Subclinical atherosclerosis (SCA) measures in multiple arterial beds are heritable phenotypes that are associated with increased incidence of cardiovascular disease. We conducted a genome-wide association study (GWAS) for SCA measurements in the community-based Framingham Heart Study.METHODS:Over 100,000 single nucleotide polymorphisms (SNPs) were genotyped (Human 100K GeneChip, Affymetrix) in 1345 subjects from 310 families. We calculated sex-specific age-adjusted and multivariable-adjusted residuals in subjects tested for quantitative SCA phenotypes, including ankle-brachial index, coronary artery calcification and abdominal aortic calcification using multi-detector computed tomography, and carotid intimal medial thickness (IMT) using carotid ultrasonography. We evaluated associations of these phenotypes with 70,987 autosomal SNPs with minor allele frequency [greater than or equal to] 0.10, call rate [greater than or equal to] 80%, and Hardy-Weinberg p-value [greater than or equal to] 0.001 in samples ranging from 673 to 984 subjects, using linear regression with generalized estimating equations (GEE) methodology and family-based association testing (FBAT). Variance components LOD scores were also calculated.RESULTS:There was no association result meeting criteria for genome-wide significance, but our methods identified 11 SNPs with p < 10-5 by GEE and five SNPs with p < 10-5 by FBAT for multivariable-adjusted phenotypes. Among the associated variants were SNPs in or near genes that may be considered candidates for further study, such as rs1376877 (GEE p < 0.000001, located in ABI2) for maximum internal carotid artery IMT and rs4814615 (FBAT p = 0.000003, located in PCSK2) for maximum common carotid artery IMT. Modest significant associations were noted with various SCA phenotypes for variants in previously reported atherosclerosis candidate genes, including NOS3 and ESR1. Associations were also noted of a region on chromosome 9p21 with CAC phenotypes that confirm associations with coronary heart disease and CAC in two recently reported genome-wide association studies. In linkage analyses, several regions of genome-wide linkage were noted, confirming previously reported linkage of internal carotid artery IMT on chromosome 12. All GEE, FBAT and linkage results are provided as an open-access results resource at http://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?id=phs000007.CONCLUSION:The results from this GWAS generate hypotheses regarding several SNPs that may be associated with SCA phenotypes in multiple arterial beds. Given the number of tests conducted, subsequent independent replication in a staged approach is essential to identify genetic variants that may be implicated in atherosclerosis.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

BACKGROUND:Blood lipid levels including low-density lipoprotein cholesterol (LDL-C), high-density lipoprotein cholesterol (HDL-C), and triglycerides (TG) are highly heritable. Genome-wide association is a promising approach to map genetic loci related to these heritable phenotypes.METHODS:In 1087 Framingham Heart Study Offspring cohort participants (mean age 47 years, 52% women), we conducted genome-wide analyses (Affymetrix 100K GeneChip) for fasting blood lipid traits. Total cholesterol, HDL-C, and TG were measured by standard enzymatic methods and LDL-C was calculated using the Friedewald formula. The long-term averages of up to seven measurements of LDL-C, HDL-C, and TG over a ~30 year span were the primary phenotypes. We used generalized estimating equations (GEE), family-based association tests (FBAT) and variance components linkage to investigate the relationships between SNPs (on autosomes, with minor allele frequency [greater than or equal to]10%, genotypic call rate [greater than or equal to]80%, and Hardy-Weinberg equilibrium p [greater than or equal to] 0.001) and multivariable-adjusted residuals. We pursued a three-stage replication strategy of the GEE association results with 287 SNPs (P < 0.001 in Stage I) tested in Stage II (n ~1450 individuals) and 40 SNPs (P < 0.001 in joint analysis of Stages I and II) tested in Stage III (n~6650 individuals).RESULTS:Long-term averages of LDL-C, HDL-C, and TG were highly heritable (h2 = 0.66, 0.69, 0.58, respectively; each P < 0.0001). Of 70,987 tests for each of the phenotypes, two SNPs had p < 10-5 in GEE results for LDL-C, four for HDL-C, and one for TG. For each multivariable-adjusted phenotype, the number of SNPs with association p < 10-4 ranged from 13 to 18 and with p < 10-3, from 94 to 149. Some results confirmed previously reported associations with candidate genes including variation in the lipoprotein lipase gene (LPL) and HDL-C and TG (rs7007797; P = 0.0005 for HDL-C and 0.002 for TG). The full set of GEE, FBAT and linkage results are posted at the database of Genotype and Phenotype (dbGaP). After three stages of replication, there was no convincing statistical evidence for association (i.e., combined P < 10-5 across all three stages) between any of the tested SNPs and lipid phenotypes.CONCLUSION:Using a 100K genome-wide scan, we have generated a set of putative associations for common sequence variants and lipid phenotypes. Validation of selected hypotheses in additional samples did not identify any new loci underlying variability in blood lipids. Lack of replication may be due to inadequate statistical power to detect modest quantitative trait locus effects (i.e., < 1% of trait variance explained) or reduced genomic coverage of the 100K array. GWAS in FHS using a denser genome-wide genotyping platform and a better-powered replication strategy may identify novel loci underlying blood lipids.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

BACKGROUND: Family studies and heritability estimates provide evidence for a genetic contribution to variation in the human life span. METHODS:We conducted a genome wide association study (Affymetrix 100K SNP GeneChip) for longevity-related traits in a community-based sample. We report on 5 longevity and aging traits in up to 1345 Framingham Study participants from 330 families. Multivariable-adjusted residuals were computed using appropriate models (Cox proportional hazards, logistic, or linear regression) and the residuals from these models were used to test for association with qualifying SNPs (70, 987 autosomal SNPs with genotypic call rate [greater than or equal to]80%, minor allele frequency [greater than or equal to]10%, Hardy-Weinberg test p [greater than or equal to] 0.001).RESULTS:In family-based association test (FBAT) models, 8 SNPs in two regions approximately 500 kb apart on chromosome 1 (physical positions 73,091,610 and 73, 527,652) were associated with age at death (p-value < 10-5). The two sets of SNPs were in high linkage disequilibrium (minimum r2 = 0.58). The top 30 SNPs for generalized estimating equation (GEE) tests of association with age at death included rs10507486 (p = 0.0001) and rs4943794 (p = 0.0002), SNPs intronic to FOXO1A, a gene implicated in lifespan extension in animal models. FBAT models identified 7 SNPs and GEE models identified 9 SNPs associated with both age at death and morbidity-free survival at age 65 including rs2374983 near PON1. In the analysis of selected candidate genes, SNP associations (FBAT or GEE p-value < 0.01) were identified for age at death in or near the following genes: FOXO1A, GAPDH, KL, LEPR, PON1, PSEN1, SOD2, and WRN. Top ranked SNP associations in the GEE model for age at natural menopause included rs6910534 (p = 0.00003) near FOXO3a and rs3751591 (p = 0.00006) in CYP19A1. Results of all longevity phenotype-genotype associations for all autosomal SNPs are web posted at http://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?id=phs000007. CONCLUSION: Longevity and aging traits are associated with SNPs on the Affymetrix 100K GeneChip. None of the associations achieved genome-wide significance. These data generate hypotheses and serve as a resource for replication as more genes and biologic pathways are proposed as contributing to longevity and healthy aging.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

We investigate the problem of learning disjunctions of counting functions, which are general cases of parity and modulo functions, with equivalence and membership queries. We prove that, for any prime number p, the class of disjunctions of integer-weighted counting functions with modulus p over the domain Znq (or Zn) for any given integer q ≥ 2 is polynomial time learnable using at most n + 1 equivalence queries, where the hypotheses issued by the learner are disjunctions of at most n counting functions with weights from Zp. The result is obtained through learning linear systems over an arbitrary field. In general a counting function may have a composite modulus. We prove that, for any given integer q ≥ 2, over the domain Zn2, the class of read-once disjunctions of Boolean-weighted counting functions with modulus q is polynomial time learnable with only one equivalence query, and the class of disjunctions of log log n Boolean-weighted counting functions with modulus q is polynomial time learnable. Finally, we present an algorithm for learning graph-based counting functions.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

A probabilistic, nonlinear supervised learning model is proposed: the Specialized Mappings Architecture (SMA). The SMA employs a set of several forward mapping functions that are estimated automatically from training data. Each specialized function maps certain domains of the input space (e.g., image features) onto the output space (e.g., articulated body parameters). The SMA can model ambiguous, one-to-many mappings that may yield multiple valid output hypotheses. Once learned, the mapping functions generate a set of output hypotheses for a given input via a statistical inference procedure. The SMA inference procedure incorporates an inverse mapping or feedback function in evaluating the likelihood of each of the hypothesis. Possible feedback functions include computer graphics rendering routines that can generate images for given hypotheses. The SMA employs a variant of the Expectation-Maximization algorithm for simultaneous learning of the specialized domains along with the mapping functions, and approximate strategies for inference. The framework is demonstrated in a computer vision system that can estimate the articulated pose parameters of a human’s body or hands, given silhouettes from a single image. The accuracy and stability of the SMA are also tested using synthetic images of human bodies and hands, where ground truth is known.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

This technical report presents a combined solution for two problems, one: tracking objects in 3D space and estimating their trajectories and second: computing the similarity between previously estimated trajectories and clustering them using the similarities that we just computed. For the first part, trajectories are estimated using an EKF formulation that will provide the 3D trajectory up to a constant. To improve accuracy, when occlusions appear, multiple hypotheses are followed. For the second problem we compute the distances between trajectories using a similarity based on LCSS formulation. Similarities are computed between projections of trajectories on coordinate axes. Finally we group trajectories together based on previously computed distances, using a clustering algorithm. To check the validity of our approach, several experiments using real data were performed.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

A system is described that tracks moving objects in a video dataset so as to extract a representation of the objects' 3D trajectories. The system then finds hierarchical clusters of similar trajectories in the video dataset. Objects' motion trajectories are extracted via an EKF formulation that provides each object's 3D trajectory up to a constant factor. To increase accuracy when occlusions occur, multiple tracking hypotheses are followed. For trajectory-based clustering and retrieval, a modified version of edit distance, called longest common subsequence (LCSS) is employed. Similarities are computed between projections of trajectories on coordinate axes. Trajectories are grouped based, using an agglomerative clustering algorithm. To check the validity of the approach, experiments using real data were performed.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

A new deformable shape-based method for color region segmentation is described. The method includes two stages: over-segmentation using a traditional color region segmentation algorithm, followed by deformable model-based region merging via grouping and hypothesis selection. During the second stage, region merging and object identification are executed simultaneously. A statistical shape model is used to estimate the likelihood of region groupings and model hypotheses. The prior distribution on deformation parameters is precomputed using principal component analysis over a training set of region groupings. Once trained, the system autonomously segments deformed shapes from the background, while not merging them with similarly colored adjacent objects. Furthermore, the recovered parametric shape model can be used directly in object recognition and comparison. Experiments in segmentation and image retrieval are reported.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

An approach for estimating 3D body pose from multiple, uncalibrated views is proposed. First, a mapping from image features to 2D body joint locations is computed using a statistical framework that yields a set of several body pose hypotheses. The concept of a "virtual camera" is introduced that makes this mapping invariant to translation, image-plane rotation, and scaling of the input. As a consequence, the calibration matrices (intrinsics) of the virtual cameras can be considered completely known, and their poses are known up to a single angular displacement parameter. Given pose hypotheses obtained in the multiple virtual camera views, the recovery of 3D body pose and camera relative orientations is formulated as a stochastic optimization problem. An Expectation-Maximization algorithm is derived that can obtain the locally most likely (self-consistent) combination of body pose hypotheses. Performance of the approach is evaluated with synthetic sequences as well as real video sequences of human motion.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Object detection is challenging when the object class exhibits large within-class variations. In this work, we show that foreground-background classification (detection) and within-class classification of the foreground class (pose estimation) can be jointly learned in a multiplicative form of two kernel functions. One kernel measures similarity for foreground-background classification. The other kernel accounts for latent factors that control within-class variation and implicitly enables feature sharing among foreground training samples. Detector training can be accomplished via standard SVM learning. The resulting detectors are tuned to specific variations in the foreground class. They also serve to evaluate hypotheses of the foreground state. When the foreground parameters are provided in training, the detectors can also produce parameter estimate. When the foreground object masks are provided in training, the detectors can also produce object segmentation. The advantages of our method over past methods are demonstrated on data sets of human hands and vehicles.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Object detection and recognition are important problems in computer vision. The challenges of these problems come from the presence of noise, background clutter, large within class variations of the object class and limited training data. In addition, the computational complexity in the recognition process is also a concern in practice. In this thesis, we propose one approach to handle the problem of detecting an object class that exhibits large within-class variations, and a second approach to speed up the classification processes. In the first approach, we show that foreground-background classification (detection) and within-class classification of the foreground class (pose estimation) can be jointly solved with using a multiplicative form of two kernel functions. One kernel measures similarity for foreground-background classification. The other kernel accounts for latent factors that control within-class variation and implicitly enables feature sharing among foreground training samples. For applications where explicit parameterization of the within-class states is unavailable, a nonparametric formulation of the kernel can be constructed with a proper foreground distance/similarity measure. Detector training is accomplished via standard Support Vector Machine learning. The resulting detectors are tuned to specific variations in the foreground class. They also serve to evaluate hypotheses of the foreground state. When the image masks for foreground objects are provided in training, the detectors can also produce object segmentation. Methods for generating a representative sample set of detectors are proposed that can enable efficient detection and tracking. In addition, because individual detectors verify hypotheses of foreground state, they can also be incorporated in a tracking-by-detection frame work to recover foreground state in image sequences. To run the detectors efficiently at the online stage, an input-sensitive speedup strategy is proposed to select the most relevant detectors quickly. The proposed approach is tested on data sets of human hands, vehicles and human faces. On all data sets, the proposed approach achieves improved detection accuracy over the best competing approaches. In the second part of the thesis, we formulate a filter-and-refine scheme to speed up recognition processes. The binary outputs of the weak classifiers in a boosted detector are used to identify a small number of candidate foreground state hypotheses quickly via Hamming distance or weighted Hamming distance. The approach is evaluated in three applications: face recognition on the face recognition grand challenge version 2 data set, hand shape detection and parameter estimation on a hand data set, and vehicle detection and estimation of the view angle on a multi-pose vehicle data set. On all data sets, our approach is at least five times faster than simply evaluating all foreground state hypotheses with virtually no loss in classification accuracy.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

A common design of an object recognition system has two steps, a detection step followed by a foreground within-class classification step. For example, consider face detection by a boosted cascade of detectors followed by face ID recognition via one-vs-all (OVA) classifiers. Another example is human detection followed by pose recognition. Although the detection step can be quite fast, the foreground within-class classification process can be slow and becomes a bottleneck. In this work, we formulate a filter-and-refine scheme, where the binary outputs of the weak classifiers in a boosted detector are used to identify a small number of candidate foreground state hypotheses quickly via Hamming distance or weighted Hamming distance. The approach is evaluated in three applications: face recognition on the FRGC V2 data set, hand shape detection and parameter estimation on a hand data set and vehicle detection and view angle estimation on a multi-view vehicle data set. On all data sets, our approach has comparable accuracy and is at least five times faster than the brute force approach.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Gesture spotting is the challenging task of locating the start and end frames of the video stream that correspond to a gesture of interest, while at the same time rejecting non-gesture motion patterns. This paper proposes a new gesture spotting and recognition algorithm that is based on the continuous dynamic programming (CDP) algorithm, and runs in real-time. To make gesture spotting efficient a pruning method is proposed that allows the system to evaluate a relatively small number of hypotheses compared to CDP. Pruning is implemented by a set of model-dependent classifiers, that are learned from training examples. To make gesture spotting more accurate a subgesture reasoning process is proposed that models the fact that some gesture models can falsely match parts of other longer gestures. In our experiments, the proposed method with pruning and subgesture modeling is an order of magnitude faster and 18% more accurate compared to the original CDP algorithm.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Spotting patterns of interest in an input signal is a very useful task in many different fields including medicine, bioinformatics, economics, speech recognition and computer vision. Example instances of this problem include spotting an object of interest in an image (e.g., a tumor), a pattern of interest in a time-varying signal (e.g., audio analysis), or an object of interest moving in a specific way (e.g., a human's body gesture). Traditional spotting methods, which are based on Dynamic Time Warping or hidden Markov models, use some variant of dynamic programming to register the pattern and the input while accounting for temporal variation between them. At the same time, those methods often suffer from several shortcomings: they may give meaningless solutions when input observations are unreliable or ambiguous, they require a high complexity search across the whole input signal, and they may give incorrect solutions if some patterns appear as smaller parts within other patterns. In this thesis, we develop a framework that addresses these three problems, and evaluate the framework's performance in spotting and recognizing hand gestures in video. The first contribution is a spatiotemporal matching algorithm that extends the dynamic programming formulation to accommodate multiple candidate hand detections in every video frame. The algorithm finds the best alignment between the gesture model and the input, and simultaneously locates the best candidate hand detection in every frame. This allows for a gesture to be recognized even when the hand location is highly ambiguous. The second contribution is a pruning method that uses model-specific classifiers to reject dynamic programming hypotheses with a poor match between the input and model. Pruning improves the efficiency of the spatiotemporal matching algorithm, and in some cases may improve the recognition accuracy. The pruning classifiers are learned from training data, and cross-validation is used to reduce the chance of overpruning. The third contribution is a subgesture reasoning process that models the fact that some gesture models can falsely match parts of other, longer gestures. By integrating subgesture reasoning the spotting algorithm can avoid the premature detection of a subgesture when the longer gesture is actually being performed. Subgesture relations between pairs of gestures are automatically learned from training data. The performance of the approach is evaluated on two challenging video datasets: hand-signed digits gestured by users wearing short sleeved shirts, in front of a cluttered background, and American Sign Language (ASL) utterances gestured by ASL native signers. The experiments demonstrate that the proposed method is more accurate and efficient than competing approaches. The proposed approach can be generally applied to alignment or search problems with multiple input observations, that use dynamic programming to find a solution.