30 results for Distant supervision
Abstract:
Repeats are two or more contiguous segments of amino acid residues that are believed to have arisen as a result of intragenic duplication, recombination and mutation events. These repeats can be utilized for protein structure prediction and can provide insights into protein evolution and phylogenetic relationships. Therefore, to aid structural biologists and phylogeneticists in their research, a computing resource (a web server and a database), Repeats in Protein Sequences (RPS), has been created. Using RPS, users can obtain useful information regarding identical, similar and distant repeats (of varying lengths) in protein sequences. In addition, users can check the frequency of occurrence of the repeats in sequence databases such as the Genome Database, PIR and SWISS-PROT and among the protein sequences available in the Protein Data Bank archive. Furthermore, users can view the three-dimensional structure of the repeats using the Java visualization plug-in Jmol. The proposed computing resource can be accessed over the World Wide Web at http://bioserver1.physics.iisc.ernet.in/rps/.
Abstract:
In a typical sensor network scenario a goal is to monitor a spatio-temporal process through a number of inexpensive sensing nodes, the key parameter being the fidelity at which the process has to be estimated at distant locations. We study such a scenario in which multiple encoders transmit their correlated data at finite rates to a distant and common decoder. In particular, we derive inner and outer bounds on the rate region for the random field to be estimated with a given mean distortion.
Abstract:
Background & objectives: There is a need to develop an affordable and reliable tool for hearing screening of neonates in resource constrained, medically underserved areas of developing nations. This study evaluates a strategy of health worker based screening of neonates using a low cost mechanical calibrated noisemaker, followed up with parental monitoring of age appropriate auditory milestones, for detecting severe-profound hearing impairment in infants by 6 months of age. Methods: A trained health worker under the supervision of a qualified audiologist screened 425 neonates, of whom 20 had confirmed severe-profound hearing impairment. Mechanical calibrated noisemakers of 50, 60, 70 and 80 dB (A) were used to elicit the behavioural responses. The parents of screened neonates were instructed to monitor the normal language and auditory milestones till 6 months of age. This strategy was validated against the reference standard consisting of a battery of tests, namely auditory brain stem response (ABR), otoacoustic emissions (OAE) and behavioural assessment at 2 years of age. Bayesian prevalence weighted measures of screening were calculated. Results: Sensitivity and specificity were high, with the fewest false positive referrals, for the 70 and 80 dB (A) noisemakers. All the noisemakers had 100 per cent negative predictive value. The 70 and 80 dB (A) noisemakers had high positive likelihood ratios of 19 and 34, respectively. The differences between the pre- and post-test probabilities of a positive result were 43 and 58 for the 70 and 80 dB (A) noisemakers, respectively. Interpretation & conclusions: In a controlled setting, health workers with primary education can be trained to use a mechanical calibrated noisemaker made of locally available material to reliably screen for severe-profound hearing loss in neonates. The monitoring of auditory responses could be done by informed parents.
Multi-centre field trials of this strategy need to be carried out to examine the feasibility of community health care workers using it in resource constrained settings of developing nations to implement an effective national neonatal hearing screening programme.
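The Bayesian step behind the reported likelihood ratios can be sketched in Python. This is an illustrative reconstruction, not the authors' calculation: it assumes the pre-test probability is the sample prevalence (20 of 425), which the abstract does not state, and the function names are hypothetical.

```python
def positive_likelihood_ratio(sensitivity, specificity):
    """LR+ = sensitivity / (1 - specificity)."""
    return sensitivity / (1.0 - specificity)


def post_test_probability(pre_test_probability, likelihood_ratio):
    """Convert a pre-test probability to a post-test probability via odds."""
    pre_odds = pre_test_probability / (1.0 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)


# Assumption: pre-test probability equals the prevalence in the screened sample.
prevalence = 20 / 425

for noisemaker_db, lr_plus in ((70, 19), (80, 34)):
    post = post_test_probability(prevalence, lr_plus)
    gain = 100.0 * (post - prevalence)
    print(f"{noisemaker_db} dB (A): post-test {post:.3f}, gain {gain:.1f} points")
```

Under that assumption the gains come out at roughly 44 and 58 percentage points, close to the reported 43 and 58.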
Abstract:
Over the past two decades, many ingenious efforts have been made in protein remote homology detection. Because homologous proteins often diversify extensively in sequence, it is challenging to demonstrate such relatedness through entirely sequence-driven searches. Here, we describe a computational method for the generation of 'protein-like' sequences that serves to bridge gaps in protein sequence space. Sequence profile information, as embodied in a position-specific scoring matrix of multiply aligned sequences of bona fide family members, serves as the starting point in this algorithm. The observed amino acid propensity and the selection of a random number dictate the selection of a residue for each position in the sequence. In a systematic manner, and by applying a 'roulette-wheel' selection approach at each position, we generate parent family-like sequences and thus facilitate an enlargement of sequence space around the family. When generated for a large number of families, we demonstrate that they expand the utility of natural intermediately related sequences in linking distant proteins. In 91% of the assessed examples, inclusion of designed sequences improved fold coverage by 5-10% over searches made in their absence. Furthermore, with several examples from proteins adopting folds such as TIM, globin, lipocalin and others, we demonstrate that including designed sequences in a database positively sensitized methods such as PSI-BLAST and Cascade PSI-BLAST, and is a promising opportunity for greatly improved remote homology recognition using sequence information alone.
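The roulette-wheel step described above can be illustrated with a minimal Python sketch. It assumes each profile column is given as non-negative residue propensities (a real position-specific scoring matrix usually stores log-odds scores that would first need converting); the toy profile and the function names are hypothetical, not taken from the paper.

```python
import random


def roulette_pick(propensities, rng):
    """Pick one residue with probability proportional to its propensity."""
    total = sum(propensities.values())
    if total <= 0:
        raise ValueError("column needs at least one positive propensity")
    threshold = rng.random() * total  # where the 'wheel' stops
    cumulative = 0.0
    for residue, weight in propensities.items():
        cumulative += weight
        if threshold < cumulative:
            return residue
    return residue  # guard against floating-point round-off


def generate_sequence(profile, rng):
    """Sample one residue per profile column to build a family-like sequence."""
    return "".join(roulette_pick(column, rng) for column in profile)


# Hypothetical three-column profile (propensities need not sum to 1).
toy_profile = [
    {"G": 1.0},
    {"A": 0.6, "S": 0.3, "T": 0.1},
    {"L": 0.5, "I": 0.5},
]

rng = random.Random(0)  # seeded so the run is repeatable
print([generate_sequence(toy_profile, rng) for _ in range(5)])
```

Because each position is sampled independently, residues that are strongly preferred in a column dominate the output while rarer ones still appear occasionally, which is what lets the generated sequences spread out around the family.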
Abstract:
Evaluating the hazard potential of the Makran subduction zone requires understanding the previous records of large earthquakes and tsunamis. We address this problem by searching for earthquake and tectonic proxies along the Makran Coast and linking those observations with the available constraints on historical seismicity and the tell-tale characteristics of sea floor morphology. The Mw 8.1 earthquake of 1945 and the consequent tsunami that originated on the eastern part of the Makran are the only historically known hazardous events in this region. The seismic status of the western part of the subduction zone outside the rupture area of the 1945 earthquake remains an enigma. The near-shore shallow stratigraphy of the central part of Makran near Chabahar shows evidence of seismically induced liquefaction that we attribute to the distant effects of the 1945 earthquake. The coastal sites further westward around Jask are remarkable for the absence of liquefaction features, at least at the shallow level. Although this is negative evidence, it possibly implies that the western part of the Makran Coast region may not have been impacted by near-field large earthquakes in the recent past, a fact also supported by the analysis of historical data. On the other hand, the elevated marine terraces on the western Makran and their uplift rates are indicative of a comparable degree of long-term tectonic activity, at least around Chabahar. The offshore data suggest occurrences of recently active submarine slumps on the eastern part of the Makran, reflective of shaking events, owing to the great 1945 earthquake. The ocean floor morphologic features on the western segment, on the contrary, are much subdued and the prograding delta lobes on the shelf edge also remain intact. The coast on the western Makran, in general, shows indications of progradation and uplift.
The various lines of evidence thus suggest that although the western segment is potentially seismogenic, large earthquakes have not occurred there in the recent past, at least during the last 600 years. The recurrence period of earthquakes may range up to 1,000 years or more, an assessment based on the age of the youngest dated coastal ridge. The long elapsed time points to the fact that the western segment may have accumulated sufficient slip to produce a major earthquake.
Abstract:
Background: Development of sensitive sequence search procedures for the detection of distant relationships between proteins at superfamily/fold level is still a big challenge. The intermediate sequence search approach is the most frequently employed manner of identifying remote homologues effectively. In this study, serine proteases of the prolyl oligopeptidase, rhomboid and subtilisin protein families were examined using plant serine proteases as queries from two genomes, A. thaliana and O. sativa, along with 13 other families of unrelated folds, to identify the distant homologues which could not be obtained using PSI-BLAST. Methodology/Principal Findings: We have proposed to start with multiple queries of classical serine protease members to identify remote homologues in families, using a rigorous approach like Cascade PSI-BLAST. We found that classical sequence based approaches, like PSI-BLAST, showed very low sequence coverage in identifying plant serine proteases. The algorithm was applied to an enriched sequence database of homologous domains and we obtained an overall average coverage of 88% at family level and 77% at superfamily or fold level, along with a specificity of ~100% and a Matthews correlation coefficient of 0.91. A similar approach was also implemented on 13 other protein families representing every structural class in the SCOP database. Further investigation with statistical tests, like jackknifing, helped us to better understand the influence of neighbouring protein families. Conclusions/Significance: Our study suggests that employment of multiple queries of a family for the Cascade PSI-BLAST searches is useful for predicting distant relationships effectively even at superfamily level. We have proposed a generalized strategy to cover all the distant members of a particular family using multiple query sequences.
Our findings reveal that prior selection of sequences as query and the presence of neighbouring families can be important for covering the search space effectively in minimal computational time. This study also provides an understanding of the 'bridging' role of related families.
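For readers unfamiliar with the reported metrics, the standard definitions of coverage (sensitivity), specificity and the Matthews correlation coefficient can be sketched in Python as follows. The confusion-matrix counts used here are hypothetical placeholders, not the study's data, and the function names are illustrative.

```python
import math


def sensitivity(tp, fn):
    """Fraction of true homologues that were detected (coverage)."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Fraction of non-homologues that were correctly rejected."""
    return tn / (tn + fp)


def matthews_corrcoef(tp, tn, fp, fn):
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0  # common convention when any marginal total is zero
    return (tp * tn - fp * fn) / denominator


# Hypothetical counts, chosen only to exercise the formulas.
tp, fn, fp, tn = 90, 10, 5, 95
print(f"coverage    = {sensitivity(tp, fn):.2f}")
print(f"specificity = {specificity(tn, fp):.2f}")
print(f"MCC         = {matthews_corrcoef(tp, tn, fp, fn):.3f}")
```

The MCC ranges from -1 to 1 and, unlike coverage or specificity alone, penalizes both missed homologues and false hits in a single number.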
Abstract:
Protein functional annotation relies on the identification of accurate relationships, sequence divergence being a key factor. This is especially evident when distant protein relationships are demonstrated only with three-dimensional structures. To address this challenge, we describe a computational approach to purposefully bridge gaps between related protein families through directed design of protein-like "linker" sequences. For this, we represented SCOP domain families, integrated with sequence homologues, as multiple profiles and performed HMM-HMM alignments between related domain families. Where convincing alignments were achieved, we applied a roulette wheel-based method to design 3,611,010 protein-like sequences corresponding to 374 SCOP folds. To analyze their ability to link proteins in homology searches, we used 3024 queries to search two databases, one containing only natural sequences and another one additionally containing designed sequences. Our results showed that augmented database searches gave up to 30% improvement in fold coverage for over 74% of the folds, with 52 folds achieving all theoretically possible connections. Although sequences could not be designed between some families, the availability of designed sequences between other families within the fold established the sequence continuum to demonstrate 373 difficult relationships. Ultimately, as a practical and realistic extension, we demonstrate that such protein-like sequences can be "plugged into" routine and generic sequence database searches to empower not only remote homology detection but also fold recognition. Our statistically well-supported findings show that complementary searches in both databases will increase the effectiveness of sequence-based searches in recognizing all homologues sharing a common fold. (C) 2013 Elsevier Ltd. All rights reserved.
Abstract:
Cancer stem cells are becoming recognised as being responsible for metastasis and treatment resistance. The complex cellular and molecular network that regulates cancer stem cells and the role that inflammation plays in cancer progression are slowly being elucidated. Cytokines, secreted by tumour associated immune cells, activate the necessary pathways required by cancer stem cells to facilitate cancer stem cells progressing through the epithelial-mesenchymal transition and migrating to distant sites. Once in situ, these cancer stem cells can secrete their own attractants, thus providing an environment whereby these cells can continue to propagate the tumour in a secondary niche. (C) 2013 Elsevier Ireland Ltd. All rights reserved.
Abstract:
Head pose classification from surveillance images acquired with distant, large field-of-view cameras is difficult as faces are captured at low resolution and have a blurred appearance. Domain adaptation approaches are useful for transferring knowledge from the training (source) to the test (target) data when they have different attributes, minimizing target data labeling efforts in the process. This paper examines the use of transfer learning for efficient multi-view head pose classification with minimal target training data under three challenging situations: (i) where the range of head poses in the source and target images is different, (ii) where source images capture a stationary person while target images capture a moving person whose facial appearance varies under motion due to changing perspective and scale, and (iii) a combination of (i) and (ii). On the whole, the presented methods represent novel transfer learning solutions employed in the context of multi-view head pose classification. We demonstrate that the proposed solutions considerably outperform the state-of-the-art through extensive experimental validation. Finally, the DPOSE dataset, compiled for benchmarking head pose classification performance with moving persons and to aid behavioral understanding applications, is presented in this work.
Abstract:
High conservation of glycyl residues in homologous proteins is fairly frequent. It is commonly understood that glycine tends to be highly conserved either because of its unique Ramachandran angles or to avoid steric clash that would arise with a larger side chain. Using a database of aligned 3D structures of homologous proteins we identified conserved Gly in 288 alignment positions from 85 families. Ninety-six of these alignment positions correspond to conserved Gly residues with (phi, psi) values allowed for non-glycyl residues. Reasons for this observation were investigated by in-silico mutation of these glycyl residues to Ala. We found that in 94% of the cases a short contact exists between the C-beta atom of the introduced Ala and atoms which are often distant in the primary structure. This suggests a lack of space even for a short side chain, thereby explaining the high conservation of glycyl residues even when they adopt (phi, psi) values allowed for Ala. In 189 alignment positions, the conserved glycyl residues adopt (phi, psi) values which are disallowed for Ala. In-silico mutation of these Gly residues to Ala almost always results in steric hindrance involving the C-beta atom of Ala, as one would expect by comparing Ramachandran maps for Ala and Gly. Rare occurrences of the disallowed glycyl conformations, even in ultrahigh resolution protein structures, are accompanied by short contacts in the crystal structures, and such disallowed conformations are not conserved in the homologues. These observations raise doubts about the accuracy of such glycyl conformations in proteins.
Abstract:
The Mw 8.6 and 8.2 strike-slip earthquakes that struck the northeast Indian Ocean on 11 April 2012 resulted in coseismic deformation both at near and distant sites. The slip distribution, deduced using seismic-wave analysis for the orthogonal faults that ruptured during these earthquakes, is sufficient to predict the coseismic displacements at the Global Positioning System (GPS) sites, such as NTUS, PALK, and CUSV, but falls short at four continuous sites in the Andaman Islands region. Slip modeling, for times prior to the events, suggests that the lower portion of the thrust fault beneath the Andaman Islands has been slipping at least at the rate of 40 cm/yr, in response to the 2004 Sumatra-Andaman coseismic stress change. Modeling of GPS displacements suggests that the en echelon and orthogonal fault ruptures of the 2012 intraplate oceanic earthquakes could have accelerated the ongoing slow slip along the lower portion of the thrust fault beneath the islands, with a month-long slip of 4-10 cm. The misfit to the coseismic GPS displacements along the Andaman Islands could be improved with a better source model, assuming that no local process contributed to this anomaly.
Abstract:
The 2004 earthquake left several traces of coseismic land deformation and tsunami deposits, both on the islands along the plate boundary and distant shores of the Indian Ocean rim countries. Researchers are now exploring these sites to develop a chronology of past events. Where the coastal regions are also inundated by storm surges, there is an additional challenge to discriminate between the deposits formed by these two processes. Paleo-tsunami research relies largely on finding deposits where preservation potential is high and storm surge origin can be excluded. During the past decade of our work along the Andaman and Nicobar Islands and the east coast of India, we have observed that the 2004 tsunami deposits are best preserved in lagoons, inland streams and also on elevated terraces. Chronological evidence for older events obtained from such sites is better correlated with those from Thailand, Sri Lanka and Indonesia, reiterating their usefulness in tsunami geology studies. (C) 2014 Elsevier Ltd. All rights reserved.
Abstract:
Since the time of Kirkwood, observed deviations in the magnitude of the dielectric constant of aqueous protein solutions from that of neat water (~80) and the slower decay of polarization have been subjects of enormous interest, controversy, and debate. Most of the common proteins have large permanent dipole moments (often more than 100 D) that can influence the structure and dynamics of even distant water molecules, thereby affecting the collective polarization fluctuation of the solution, which in turn can significantly alter the solution's dielectric constant. Therefore, the distance dependence of polarization fluctuation can provide important insight into the nature of biological water. We explore these aspects by studying aqueous solutions of four different proteins of different characteristics and varying sizes: chicken villin headpiece subdomain (HP-36), immunoglobulin binding domain protein G (GB1), hen-egg white lysozyme (LYS), and myoglobin (MYO). We simulate fairly large systems consisting of a single protein molecule and 20000-30000 water molecules (varied according to the protein size), providing a concentration in the range of ~2-3 mM. We find that the calculated dielectric constant of the system shows a noticeable increment in all the cases compared to that of neat water. The total dipole moment time autocorrelation function of water, ⟨δM_W(0) δM_W(t)⟩, is found to be sensitive to the nature of the protein. Surprisingly, the dipole moment of the protein and the total dipole moment of the water molecules are found to be only weakly coupled. Shellwise decomposition of water molecules around the protein reveals a higher density of the first layer compared to the succeeding ones. We also calculate a heuristic effective dielectric constant of successive layers and find that the layer adjacent to the protein has a much lower value (~50).
However, successive layers exhibit a progressive increase in dielectric constant, finally reaching a value close to that of the bulk 4-5 layers away. We also calculate the shellwise orientational correlation function and tetrahedral order parameter to understand the local dynamics and structural rearrangement of water. A theoretical analysis providing a simple method for calculating the shellwise local dielectric constant, and the implications of these findings, are discussed in detail in the present work. (C) 2014 AIP Publishing LLC.
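The abstract does not say which tetrahedral order parameter is used. A minimal Python sketch, assuming the commonly used form q = 1 - (3/8) * sum over the six neighbour pairs of (cos(psi_jk) + 1/3)^2 for a water oxygen and its four nearest neighbours, might look like this; the function name and coordinates are illustrative only.

```python
import math
from itertools import combinations


def tetrahedral_order(center, neighbours):
    """q = 1 - 3/8 * sum over the 6 neighbour pairs of (cos(psi) + 1/3)^2.

    q is 1 for a perfect tetrahedral arrangement and averages to about 0
    for randomly oriented neighbours.
    """
    if len(neighbours) != 4:
        raise ValueError("exactly four nearest neighbours are required")
    unit_vectors = []
    for point in neighbours:
        vector = [point[i] - center[i] for i in range(3)]
        norm = math.sqrt(sum(c * c for c in vector))
        unit_vectors.append([c / norm for c in vector])
    total = 0.0
    for a, b in combinations(unit_vectors, 2):
        cos_psi = sum(x * y for x, y in zip(a, b))
        total += (cos_psi + 1.0 / 3.0) ** 2
    return 1.0 - 0.375 * total


# Ideal tetrahedron around the origin: every pair has cos(psi) = -1/3.
ideal = [(1, 1, 1), (1, -1, -1), (-1, 1, -1), (-1, -1, 1)]
print(tetrahedral_order((0, 0, 0), ideal))
```

In a shellwise analysis one would average q over all water molecules whose oxygen lies within a given distance band from the protein surface; that binning step is omitted here.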
Abstract:
Despite significant improvements in their properties as emitters, colloidal quantum dots have not had much success in emerging as suitable materials for laser applications. Gain in most colloidal systems is short lived, and needs to compete with biexcitonic decay. This has necessitated the use of short pulsed lasers to pump quantum dots to thresholds needed for amplified spontaneous emission or lasing. Continuous wave pumping of gain that is possible in some inorganic phosphors has therefore remained a very distant possibility for quantum dots. Here, we demonstrate that trilayer heterostructures could provide optimal conditions for demonstration of continuous wave lasing in colloidal materials. The design considerations for these materials are discussed in terms of a kinetic model. The electronic structure of the proposed dot architectures is modeled within effective mass theory.
Abstract:
Insects of the order Hemiptera (true bugs) use a wide range of mechanisms of sex determination, including genetic sex determination, paternal genome elimination, and haplodiploidy. Genetic sex determination, the prevalent mode, is generally controlled by a pair of XY sex chromosomes or by an XX/XO system, but different configurations that include additional sex chromosomes are also present. Although this diversity of sex determining systems has been extensively studied at the cytogenetic level, only the X chromosome of the model pea aphid Acyrthosiphon pisum has been analyzed at the genomic level, and little is known about X chromosome biology in the rest of the order. In this study, we take advantage of published DNA- and RNA-seq data from three additional Hemiptera species to perform a comparative analysis of the gene content and expression of the X chromosome throughout this clade. We find that, despite showing evidence of dosage compensation, the X chromosomes of these species show female-biased expression, and a deficit of male-biased genes, in direct contrast to the pea aphid X. We further detect an excess of shared gene content between these very distant species, suggesting that despite the diversity of sex determining systems, the same chromosomal element is used as the X throughout a large portion of the order.