8 results for National Cancer Institute (U.S.). Viral Oncology Program
in CentAUR: Central Archive University of Reading - UK
Abstract:
Visual exploration of scientific data in the life sciences is a growing research field due to the large amount of available data. Kohonen’s Self-Organizing Map (SOM) is a widely used tool for visualizing multidimensional data. In this paper we present a fast learning algorithm for SOMs that uses a simulated annealing method to adapt the learning parameters. The algorithm has been adopted in a data analysis framework for the generation of similarity maps. Such maps provide an effective tool for the visual exploration of large, multi-dimensional input spaces. The approach has been applied to data generated during the High Throughput Screening of molecular compounds; the generated maps allow a visual exploration of molecules with similar topological properties. The experimental analysis on real-world data from the National Cancer Institute shows the speed-up of the proposed SOM training process in comparison to a traditional approach. The resulting visual landscape groups molecules with similar chemical properties in densely connected regions.
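To make the SOM training loop concrete, here is a minimal sketch in plain Python. It assumes a fixed exponential decay of the learning rate and neighbourhood radius, i.e. a traditional schedule of the kind the abstract compares against, not the simulated annealing adaptation, which the abstract does not specify. The names `train_som` and `best_matching_unit`, the grid size and the decay constant are all hypothetical.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the grid cell whose weight vector is closest to x (squared Euclidean)."""
    return min(
        weights,
        key=lambda cell: sum((w - v) ** 2 for w, v in zip(weights[cell], x)),
    )


def train_som(data, rows, cols, epochs=50, lr0=0.5, sigma0=None, seed=0):
    """Train a small SOM with a fixed (assumed) exponential decay schedule."""
    rng = random.Random(seed)
    dim = len(data[0])
    sigma0 = sigma0 or max(rows, cols) / 2.0
    # One weight vector per grid cell, initialised uniformly at random.
    weights = {
        (r, c): [rng.random() for _ in range(dim)]
        for r in range(rows)
        for c in range(cols)
    }
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        samples = list(data)
        rng.shuffle(samples)
        for x in samples:
            # Learning rate and radius shrink as training progresses.
            decay = math.exp(-3.0 * step / total_steps)  # assumed constant 3.0
            lr = lr0 * decay
            sigma = sigma0 * decay
            bmu = best_matching_unit(weights, x)
            for cell, w in weights.items():
                # Gaussian neighbourhood centred on the best matching unit.
                d2 = (cell[0] - bmu[0]) ** 2 + (cell[1] - bmu[1]) ** 2
                h = math.exp(-d2 / (2.0 * sigma * sigma))
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])
            step += 1
    return weights
```

After training, calling `best_matching_unit(weights, x)` for each input places it on the grid; inputs with similar feature vectors tend to land on the same or neighbouring cells, which is the property a similarity map relies on.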
Abstract:
With many cancers showing resistance to current chemotherapies, the search for novel anti-cancer agents is attracting considerable attention. Natural flavonoids have been identified as useful leads in such programmes. However, since an in-depth understanding of the structural requirements for optimum activity is generally lacking, further research is required before the full potential of flavonoids as anti-proliferative agents can be realised. Herein a broad library of 76 methoxy and hydroxy flavones, and their 4-thio analogues, was constructed and their structure-activity relationships for anti-proliferative activity against the breast cancer cell lines MCF-7 (ER+ve), MCF-7/DX (ER+ve, anthracycline resistant) and MDA-MB-231 (ER-ve) were probed. Within this library, 42 compounds were novel, and all compounds were afforded in good yields and > 95% purity. The most promising lead compounds, specifically the novel hydroxy 4-thioflavones 15f and 16f, were further evaluated for their anti-proliferative activities against a broader range of cancer cell lines by the National Cancer Institute (NCI), USA and displayed significant growth inhibition profiles (e.g. compound 15f: MCF-7 (GI50 = 0.18 μM), T-47D (GI50 = 0.03 μM) and MDA-MB-468 (GI50 = 0.47 μM); compound 16f: MCF-7 (GI50 = 1.46 μM), T-47D (GI50 = 1.27 μM) and MDA-MB-231 (GI50 = 1.81 μM)). Overall, 15f and 16f exhibited 7- to 46-fold greater anti-proliferative potency than the natural flavone chrysin (2d). A systematic structure-activity relationship study against the breast cancer cell lines highlighted that free hydroxyl groups and the B-ring phenyl groups were essential for enhanced anti-proliferative activities. Substitution of the 4-C=O functionality with a 4-C=S functionality, and incorporation of electron-withdrawing groups at C4’ of the B-ring phenyl, also enhanced activity.
Molecular docking and mechanistic studies suggest that the anti-proliferative effects of flavones 15f and 16f are mediated via ER-independent cleavage of PARP and downregulation of GSK-3β for the MCF-7 and MCF-7/DX cell lines. For the MDA-MB-231 cell line, restoration of the wild-type p53 DNA binding activity of the mutant p53 tumour suppressor gene was indicated.
Abstract:
In molecular biology, it is often desirable to find common properties in large numbers of drug candidates. One family of methods stems from the data mining community, where algorithms to find frequent graphs have received increasing attention over the past years. However, the computational complexity of the underlying problem and the large amount of data to be explored essentially render sequential algorithms useless. In this paper, we present a distributed approach to the frequent subgraph mining problem to discover interesting patterns in molecular compounds. This problem is characterized by a highly irregular search tree, whereby no reliable workload prediction is available. We describe the three main aspects of the proposed distributed algorithm, namely, a dynamic partitioning of the search space, a distribution process based on a peer-to-peer communication framework, and a novel receiver-initiated load balancing algorithm. The effectiveness of the distributed method has been evaluated on the well-known National Cancer Institute’s HIV-screening data set, where we were able to show close-to-linear speedup in a network of workstations. The proposed approach also allows for dynamic resource aggregation in a non-dedicated computational environment. These features make it suitable for large-scale, multi-domain, heterogeneous environments, such as computational grids.
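As an illustration of receiver-initiated load balancing on an irregular search tree, the following sketch simulates several workers in a single process. It is a generic work-stealing scheme under stated assumptions, not the protocol of the paper: the tree produced by `children`, the policy of asking the most loaded peer, and the choice to hand over half of the donor's oldest tasks are all hypothetical.

```python
from collections import deque


def children(node, limit):
    """Hypothetical irregular search tree: a binary tree pruned at multiples of 5."""
    return [c for c in (2 * node, 2 * node + 1) if c <= limit and c % 5 != 0]


def run_receiver_initiated(num_workers, limit):
    """Simulate workers that request work only when their own stack is empty."""
    stacks = [deque() for _ in range(num_workers)]
    stacks[0].append(1)  # the whole search space starts on worker 0
    processed = [0] * num_workers
    steals = 0
    while any(stacks):
        for w in range(num_workers):
            if not stacks[w]:
                # Receiver-initiated: the idle worker asks the most loaded peer.
                donor = max(range(num_workers), key=lambda p: len(stacks[p]))
                if len(stacks[donor]) < 2:
                    continue  # nothing worth splitting this round
                # The donor hands over half of its oldest (shallowest) tasks.
                for _ in range(len(stacks[donor]) // 2):
                    stacks[w].append(stacks[donor].popleft())
                steals += 1
            # Depth-first expansion of one local task.
            node = stacks[w].pop()
            processed[w] += 1
            stacks[w].extend(children(node, limit))
    return processed, steals
```

Calling `run_receiver_initiated(4, 1000)` returns the number of nodes each worker expanded and the number of successful work requests. Because idle workers pull work themselves, no up-front estimate of subtree sizes is needed, which matches the setting where no reliable workload prediction is available.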
Abstract:
In real-world applications, sequential data mining and data exploration algorithms are often unsuitable for datasets of enormous size, high dimensionality and complex structure. Grid computing promises unprecedented opportunities for unlimited computing and storage resources. In this context there is a need to develop high-performance distributed data mining algorithms. However, the computational complexity of the problem and the large amount of data to be explored often make the design of large-scale applications particularly challenging. In this paper we present the first distributed formulation of a frequent subgraph mining algorithm for discriminative fragments of molecular compounds. Two distributed approaches have been developed and compared on the well-known National Cancer Institute’s HIV-screening dataset. We present experimental results on a small-scale computing environment.
Abstract:
Structured data represented in the form of graphs arises in several fields of science, and the growing amount of available data makes distributed graph mining techniques particularly relevant. In this paper, we present a distributed approach to the frequent subgraph mining problem to discover interesting patterns in molecular compounds. The problem is characterized by a highly irregular search tree, whereby no reliable workload prediction is available. We describe the three main aspects of the proposed distributed algorithm, namely a dynamic partitioning of the search space, a distribution process based on a peer-to-peer communication framework, and a novel receiver-initiated load balancing algorithm. The effectiveness of the distributed method has been evaluated on the well-known National Cancer Institute’s HIV-screening dataset, where the approach attains close-to-linear speedup in a network of workstations.
Abstract:
Frequent pattern discovery in structured data is receiving increasing attention in many application areas of science. However, the computational complexity and the large amount of data to be explored often make sequential algorithms unsuitable. In this context, high-performance distributed computing becomes a very interesting and promising approach. In this paper we present a parallel formulation of the frequent subgraph mining problem to discover interesting patterns in molecular compounds. The application is characterized by a highly irregular tree-structured computation. No estimation is available for task workloads, which show a power-law distribution over a wide range. The proposed approach allows dynamic resource aggregation and provides fault and latency tolerance. These features make the distributed application suitable for multi-domain heterogeneous environments, such as computational Grids. The distributed application has been evaluated on the well-known National Cancer Institute’s HIV-screening dataset.
Abstract:
The growing energy consumption in the residential sector represents about 30% of global demand. This calls for Demand Side Management solutions that promote changes in the behaviors of end consumers, with the aim of reducing overall consumption and shifting it to periods in which demand and the cost of generating energy are lower. Demand Side Management solutions require detailed knowledge about the patterns of energy consumption. The profile of electricity demand in the residential sector is highly correlated with the time of active occupancy of the dwellings; therefore, in this study the occupancy patterns in Spanish properties were determined using the 2009–2010 Time Use Survey (TUS), conducted by the National Statistical Institute of Spain. The survey identifies three peaks in active occupancy, which coincide with morning, noon and evening. This information was used as input to a stochastic model that generates active occupancy profiles of dwellings, with the aim of simulating domestic electricity consumption. TUS data were also used to identify which appliance-related activities could be considered for Demand Side Management solutions during the three peaks of occupancy.
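A stochastic occupancy generator of the kind mentioned above can be sketched as follows. The abstract does not describe the model, so this is only an illustration: it draws each 10-minute slot independently from a time-dependent probability with three peaks and ignores persistence between slots. The peak hours, probabilities and all names are hypothetical and are not taken from the TUS data.

```python
import random

SLOTS_PER_DAY = 144  # 10-minute resolution (assumed)

# Hypothetical peak hours and peak probabilities, not taken from the TUS data.
PEAKS = {8.0: 0.7, 14.0: 0.8, 21.0: 0.9}
BASE_PROBABILITY = 0.05      # assumed night-time level
PEAK_HALF_WIDTH_HOURS = 2.0  # assumed width of each triangular peak


def occupancy_probability(slot):
    """Probability of active occupancy in a slot: baseline plus triangular peaks."""
    hour = slot * 24.0 / SLOTS_PER_DAY
    best = BASE_PROBABILITY
    for peak_hour, peak_prob in PEAKS.items():
        weight = max(0.0, 1.0 - abs(hour - peak_hour) / PEAK_HALF_WIDTH_HOURS)
        best = max(best, peak_prob * weight)
    return best


def generate_profile(rng):
    """Draw one daily profile: 1 = actively occupied, 0 = not, per slot."""
    return [
        1 if rng.random() < occupancy_probability(slot) else 0
        for slot in range(SLOTS_PER_DAY)
    ]
```

Averaging many profiles from `generate_profile(random.Random(seed))` recovers the three-peak shape; a model intended for load simulation would more plausibly add dependence between consecutive slots so that occupancy periods have realistic durations.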
Abstract:
Background. Leopards Panthera pardus show genetically determined colour variation. Erythristic (strawberry) morphs, where individuals are paler and black pigment in the coat is replaced by a red-brown colour, are exceptionally rare in the wild. Historically, few records exist, with only five putative records from India known. Objectives. To record the presence of erythristic leopards in our study site (Thabo Thalo Wilderness Reserve, Mpumalanga), and to collate records from across South Africa. Method. A network of camera traps was used to record individual leopards at Thabo Thalo. We also surveyed local experts, searched the popular South African press and used social media to request observations. Results. Two out of 27 individual leopards (7.1%) recorded in our study site over three years were of this colour morph. We obtained records of five other erythristic leopards in the Waterberg and Mpumalanga region, with no reports outside of this population. Conclusions. Erythristic leopards are widely dispersed across north-west South Africa, predominantly in the Lydenburg region. The presence of this rare colour morph may reflect the consequences of population fragmentation.