124 results for Cong shu.


Relevance:

10.00% 10.00%

Publisher:

Abstract:

With the proliferation of geo-positioning and geo-tagging techniques, spatio-textual objects that possess both a geographical location and a textual description are gaining in prevalence, and spatial keyword queries that exploit both location and textual description are gaining in prominence. However, the queries studied so far generally focus on finding individual objects that each satisfy a query rather than finding groups of objects where the objects in a group together satisfy a query.

We define the problem of retrieving a group of spatio-textual objects such that the group's keywords cover the query's keywords and such that the objects are nearest to the query location and have the smallest inter-object distances. Specifically, we study three instantiations of this problem, all of which are NP-hard. We devise exact solutions as well as approximate solutions with provable approximation bounds to the problems. In addition, we solve the problem of retrieving the top-k groups for the three instantiations, and study a weighted version of the problem that incorporates object weights. We present empirical studies that offer insight into the efficiency of the solutions, as well as the accuracy of the approximate solutions.
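To make the problem statement concrete, here is a minimal brute-force sketch in Python. It is not one of the algorithms from the abstract. The cost function (distance of the farthest group member from the query plus the group's diameter) is an assumption chosen for illustration, and the names `group_cost` and `best_group` are hypothetical.

```python
from itertools import combinations
from math import dist


def group_cost(query_loc, group):
    """Assumed cost: farthest member from the query plus the group's diameter."""
    to_query = max(dist(query_loc, loc) for loc, _ in group)
    diameter = max(
        (dist(a, b) for (a, _), (b, _) in combinations(group, 2)),
        default=0.0,
    )
    return to_query + diameter


def best_group(query_loc, query_keywords, objects):
    """Exhaustively search for the cheapest group covering all query keywords.

    objects is a list of (location, keywords) pairs. Under the assumed cost,
    adding objects never lowers the cost, so groups larger than the number
    of query keywords never need to be examined.
    """
    needed = set(query_keywords)
    # Keep only objects that carry at least one query keyword.
    relevant = [(loc, set(kws) & needed) for loc, kws in objects if set(kws) & needed]
    best, best_cost = None, float("inf")
    for size in range(1, len(needed) + 1):
        for group in combinations(relevant, size):
            covered = set().union(*(kws for _, kws in group))
            if covered == needed:
                cost = group_cost(query_loc, group)
                if cost < best_cost:
                    best, best_cost = group, cost
    return best, best_cost
```

The search is exponential in the number of query keywords, which is consistent with the NP-hardness stated above and is why approximate solutions are of interest.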

Relevance:

10.00% 10.00%

Publisher:

Abstract:

As an important type of spatial keyword query, the m-closest keywords (mCK) query finds a group of objects such that they cover all query keywords and have the smallest diameter, which is defined as the largest distance between any pair of objects in the group. The query is useful in many applications such as detecting locations of web resources. However, the existing work does not study the intractability of this problem and only provides exact algorithms, which are computationally expensive.

In this paper, we prove that the problem of answering mCK queries is NP-hard. We first devise a greedy algorithm that has an approximation ratio of 2. Then, we observe that an mCK query can be approximately answered by finding the circle with the smallest diameter that encloses a group of objects together covering all query keywords. We prove that the group enclosed in the circle can answer the mCK query with an approximation ratio of 2/√3. Based on this, we develop an algorithm for finding such a circle exactly, which has a high time complexity. To improve efficiency, we propose another two algorithms that find such a circle approximately, with a ratio of (2/√3 + ε). Finally, we propose an exact algorithm that utilizes the group found by the (2/√3 + ε)-approximation algorithm to obtain the optimal group. We conduct extensive experiments using real-life datasets. The experimental results offer insights into both efficiency and accuracy of the proposed approximation algorithms, and the results also demonstrate that our exact algorithm outperforms the best known algorithm by an order of magnitude.
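The abstract does not spell out its greedy algorithm, so the following Python sketch shows one plausible greedy strategy that achieves a ratio of 2, not necessarily the authors' method. It anchors on each object that holds the rarest query keyword and picks, for every other keyword, the object nearest to that anchor. The name `greedy_mck` and the representation of objects as (location, keywords) pairs are assumptions.

```python
from itertools import combinations
from math import dist


def diameter(points):
    """Largest pairwise distance in a set of points (0.0 for a single point)."""
    return max((dist(a, b) for a, b in combinations(points, 2)), default=0.0)


def greedy_mck(objects, query_keywords):
    """Return a set of locations covering all keywords, or None if impossible.

    Objects are identified by location, so two objects at the same
    coordinates are treated as one (a simplification of this sketch).
    """
    needed = sorted(set(query_keywords))
    by_kw = {k: [loc for loc, kws in objects if k in kws] for k in needed}
    if any(not locs for locs in by_kw.values()):
        return None
    # Anchor on the least frequent keyword to limit the number of candidates.
    anchor_kw = min(needed, key=lambda k: len(by_kw[k]))
    best = None
    for anchor in by_kw[anchor_kw]:
        group = {anchor}
        for k in needed:
            if k != anchor_kw:
                group.add(min(by_kw[k], key=lambda loc: dist(anchor, loc)))
        if best is None or diameter(group) < diameter(best):
            best = group
    return best
```

The bound follows from the triangle inequality: the optimal group contains some anchor, every other optimal member lies within the optimal diameter of it, so each nearest pick does too, and any two picks are at most twice the optimal diameter apart.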

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Massive amounts of data that are geo-tagged and associated with text information are being generated at an unprecedented scale. These geo-textual data cover a wide range of topics. Users are interested in receiving up-to-date tweets such that their locations are close to a user-specified location and their texts are interesting to users. For example, a user may want to be updated with tweets near her home on the topic “food poisoning vomiting.” We consider the Temporal Spatial-Keyword Top-k Subscription (TaSK) query. Given a TaSK query, we continuously maintain up-to-date top-k most relevant results over a stream of geo-textual objects (e.g., geo-tagged tweets) for the query. The TaSK query takes into account text relevance, spatial proximity, and recency of geo-textual objects in evaluating its relevance to a geo-textual object. We propose a novel solution to efficiently process a large number of TaSK queries over a stream of geo-textual objects. We evaluate the efficiency of our approach on two real-world datasets and the experimental results show that our solution is able to achieve a reduction of the processing time by 70-80% compared with two baselines.
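The abstract names three ingredients of the relevance score (text relevance, spatial proximity, recency) but gives no formula. The Python sketch below shows one way they might be combined. The weighted sum, the exponential decay, the parameters `alpha`, `max_dist` and `decay`, and the dictionary layout of queries and objects are all assumptions, and the top-k is recomputed from scratch rather than maintained incrementally as the proposed solution does.

```python
import heapq
from math import dist, exp


def task_score(query, obj, now, alpha=0.5, max_dist=10.0, decay=0.1):
    """Assumed score: weighted text and spatial relevance, damped by age."""
    q_terms, o_terms = set(query["keywords"]), set(obj["terms"])
    # Text relevance: fraction of query keywords present in the object.
    text = len(q_terms & o_terms) / len(q_terms)
    # Spatial proximity: 1 at the query location, 0 at max_dist or beyond.
    spatial = max(0.0, 1.0 - dist(query["loc"], obj["loc"]) / max_dist)
    # Recency: exponential decay with the age of the object.
    recency = exp(-decay * (now - obj["time"]))
    return (alpha * text + (1 - alpha) * spatial) * recency


def top_k(query, stream, now, k):
    """Naive baseline: rescore every object seen so far and keep the best k."""
    return heapq.nlargest(k, stream, key=lambda o: task_score(query, o, now))
```

Because the recency term changes as time passes, the ranking can shift even when no new object arrives, which is part of what makes continuous maintenance harder than a one-off top-k query.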

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Tumor cells require angiogenesis to deliver nutrients and oxygen to support their fast growth and metabolism. The vascular endothelial growth factor (VEGF) pathway plays an important role in promoting angiogenesis, including tumor-induced angiogenesis. Recent clinical trials have demonstrated the benefit of targeting VEGF in the treatment of glioblastoma. However, the prognostic significance of the expression of VEGFA and its receptors VEGFR1 (FLT1) and VEGFR2 (KDR) is still largely elusive. In the present study, we aimed to investigate the prognostic significance of these three factors, alone or in combination, in glioma patients. Gene mRNA expression was extracted from three independent brain tumor cohorts totaling 242 patients and the association between gene expression and survival was tested. We found that when VEGFA, FLT1 and KDR expressions were considered alone, only VEGFA demonstrated a significant association with patient survival. Patients with high expression of both VEGFA and either receptor had significantly worse survival than patients expressing both factors at a low level. Importantly, we found that those patients whose tumors overexpressed all three genes had a significantly shorter survival compared to those patients with a low level expression of these genes. Our results suggest that a high level expression of VEGFA and its receptors, both FLT1 and KDR, may be required for brain tumor progression, and that these three factors should be considered together as a prognostic indicator for brain tumor patients.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Although PTP4A3 has been shown to be a very important factor in promoting cancer progression, the role of its close family member PTP4A2 is still largely unknown. Recent reports have shown contradictory results on the role of PTP4A2 in breast cancer progression. Considering this, we aimed to investigate the prognostic value of PTP4A2 in five independent breast cancer data sets (minimum 198 patients per cohort, totaling 1,124 patients) in the Gene Expression Omnibus Database. We found that high expression of PTP4A2 was a favorable prognostic marker in all five independent breast cancer data sets, as well as in the combined cohort, with a hazard ratio of 0.68 (95% confidence interval = 0.56-0.83; P < 0.001). Low PTP4A2 expression was associated with estrogen receptor-negative tumors and tumors with higher histological grading; furthermore, low expression was inversely correlated with the expression of genes involved in proliferation, including MKI67 and the MCM gene family encoding the minichromosome maintenance proteins. These findings suggest that PTP4A2 may play a role in breast cancer progression by dysregulating cell proliferation. PTP4A2 expression was positively correlated with ESR1, the gene encoding estrogen receptor-alpha, and inversely correlated with EGFR expression, suggesting that PTP4A2 may be involved in these two important oncogenic pathways. Together, our results suggest that expression of PTP4A2 is a favorable prognostic marker in breast cancer.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

BACKGROUND: While the discovery of new drugs is a complex, lengthy and costly process, identifying new uses for existing drugs is a cost-effective approach to therapeutic discovery. Connectivity mapping integrates gene expression profiling with advanced algorithms to connect genes, diseases and small molecule compounds and has been applied in a large number of studies to identify potential drugs, particularly to facilitate drug repurposing. Colorectal cancer (CRC) is a commonly diagnosed cancer with high mortality rates, presenting a worldwide health problem. With the advancement of high-throughput omics technologies, a number of large-scale gene expression profiling studies have been conducted on CRCs, providing multiple datasets in gene expression data repositories. In this work, we systematically apply gene expression connectivity mapping to multiple CRC datasets to identify candidate therapeutics for this disease.

RESULTS: We developed a robust method to compile a combined gene signature for colorectal cancer across multiple datasets. Connectivity mapping analysis with this signature of 148 genes identified 10 candidate compounds, including irinotecan and etoposide, which are chemotherapy drugs currently used to treat CRCs. These results indicate that we have discovered high-quality connections between the CRC disease state and the candidate compounds, and that the gene signature we created may be used as a potential therapeutic target in treating the disease. The method we proposed is highly effective in generating a quality gene signature from multiple datasets; the publication of the combined CRC gene signature and the list of candidate compounds from this work will benefit both the cancer and systems biology research communities for further development and investigations.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

In this paper, we present a meeting report for the 2nd Summer School in Computational Biology organized by the Queen's University of Belfast. We describe the organization of the summer school, its underlying concept and student feedback we received after the completion of the summer school.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

One of the major challenges in systems biology is to understand the complex responses of a biological system to external perturbations or internal signalling depending on its biological conditions. Genome-wide transcriptomic profiling of cellular systems under various chemical perturbations allows the manifestation of certain features of the chemicals through their transcriptomic expression profiles. The insights obtained may help to establish the connections between human diseases, associated genes and therapeutic drugs. The main objective of this study was to systematically analyse cellular gene expression data under various drug treatments to elucidate drug-feature specific transcriptomic signatures. We first extracted drug-related information (drug features) from the collected textual description of DrugBank entries using text-mining techniques. A novel statistical method employing orthogonal least square learning was proposed to obtain drug-feature-specific signatures by integrating gene expression with DrugBank data. To obtain robust signatures from noisy input datasets, a stringent ensemble approach was applied with the combination of three techniques: resampling, leave-one-out cross validation, and aggregation. The validation experiments showed that the proposed method has the capacity of extracting biologically meaningful drug-feature-specific gene expression signatures. It was also shown that most of the signature genes are connected with common hub genes by regulatory network analysis. The common hub genes were further shown to be related to general drug metabolism by Gene Ontology analysis. Each set of genes has relatively few interactions with other sets, indicating the modular nature of each signature and its drug-feature-specificity. Based on Gene Ontology analysis, we also found that each set of drug feature (DF)-specific genes was indeed enriched in biological processes related to the drug feature.
The results of these experiments demonstrated the potential of the method for predicting certain features of new drugs using their transcriptomic profiles, providing a useful methodological framework and a valuable resource for drug development and characterization.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Quantile normalization (QN) is a technique for microarray data processing and is the default normalization method in the Robust Multi-array Average (RMA) procedure, which was primarily designed for analysing gene expression data from Affymetrix arrays. Given the abundance of Affymetrix microarrays and the popularity of the RMA method, it is crucially important that the normalization procedure is applied appropriately. In this study we carried out simulation experiments and also analysed real microarray data to investigate the suitability of RMA when it is applied to a dataset with different groups of biological samples. From our experiments, we showed that RMA with QN does not preserve the biological signal included in each group, but rather mixes the signals between the groups. We also showed that the Median Polish method in the summarization step of RMA has a similar mixing effect. RMA is one of the most widely used methods in microarray data processing and has been applied to a vast volume of data in biomedical research. The problematic behaviour of this method suggests that previous studies employing RMA could have been misadvised or adversely affected. Therefore we think it is crucially important that the research community recognizes the issue and starts to address it. The two core elements of the RMA method, quantile normalization and Median Polish, both have the undesirable effect of mixing biological signals between different sample groups, which can be detrimental to drawing valid biological conclusions and to any subsequent analyses. Based on the evidence presented here and that in the literature, we recommend exercising caution when using RMA as a method of processing microarray gene expression data, particularly in situations where there are likely to be unknown subgroups of samples.
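For readers unfamiliar with the technique, the following is a minimal pure-Python sketch of textbook quantile normalization. It is not the RMA implementation: ties are broken by position, and the function name `quantile_normalize` and the list-of-lists input layout are assumptions for illustration.

```python
def quantile_normalize(samples):
    """Quantile-normalize equal-length samples (one list of values per array).

    Each value is replaced by the mean, taken across all samples, of the
    values that share its rank. Ties are broken by position, which is a
    simplification of this sketch.
    """
    n = len(samples[0])
    sorted_samples = [sorted(s) for s in samples]
    # Reference distribution: mean of the r-th smallest value of every sample.
    reference = [sum(s[r] for s in sorted_samples) / len(samples) for r in range(n)]
    normalized = []
    for sample in samples:
        # Indices of the sample ordered from smallest to largest value.
        order = sorted(range(n), key=lambda i: sample[i])
        out = [0.0] * n
        for rank, i in enumerate(order):
            out[i] = reference[rank]
        normalized.append(out)
    return normalized


group_a_sample = [1.0, 2.0, 3.0]
group_b_sample = [10.0, 30.0, 20.0]
print(quantile_normalize([group_a_sample, group_b_sample]))
```

After normalization both samples hold exactly the same set of values, because the reference distribution is pooled over all samples. A global difference between two sample groups therefore cannot survive the procedure, which illustrates the kind of mechanism by which signals from different groups can be mixed.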