3 results for Discriminative model training
in National Center for Biotechnology Information - NCBI
Abstract:
Structural genomics aims to solve a large number of protein structures that represent the protein space. Currently an exhaustive solution for all structures seems prohibitively expensive, so the challenge is to define a relatively small set of proteins with new, currently unknown folds. This paper presents a method that assigns each protein a probability of having an unsolved fold. The method makes extensive use of ProtoMap, a sequence-based classification, and SCOP, a structure-based classification. ProtoMap encodes the relationships among proteins in the protein space as a graph whose vertices correspond to 13,354 clusters of proteins. A representative fold for a cluster with at least one solved protein is determined after superposition of all SCOP (release 1.37) folds onto ProtoMap clusters. Distances within the ProtoMap graph are computed from each representative fold to the neighboring folds. The distribution of these distances is used to create a statistical model for distances among those folds that are already known and those that have yet to be discovered. The distance distributions for solved and unsolved proteins differ significantly. This difference makes it possible to use Bayes' rule to derive a statistical estimate of the probability that a given protein has an as-yet-undetermined fold. Proteins with the highest probability of representing a new fold constitute the target list for structural determination. Our predicted probabilities for unsolved proteins correlate very well with the proportion of new folds among recently solved structures (new SCOP 1.39 records) that are disjoint from our original training set.
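The Bayes' rule step mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `posterior_new_fold`, the two likelihood values, and the prior are all hypothetical, and the abstract does not say how the distance distributions were actually estimated.

```python
def posterior_new_fold(lik_new, lik_known, prior_new):
    """Bayes' rule: P(new fold | observed graph distance).

    lik_new   -- P(distance | protein has a new fold)    (assumed given)
    lik_known -- P(distance | protein has a known fold)  (assumed given)
    prior_new -- prior probability of a new fold         (assumed given)
    """
    # Total probability of the observed distance under both hypotheses.
    evidence = lik_new * prior_new + lik_known * (1.0 - prior_new)
    if evidence == 0.0:
        raise ValueError("observed distance has zero probability under both models")
    return lik_new * prior_new / evidence


# Illustrative numbers only, not data from the paper.
print(posterior_new_fold(lik_new=0.6, lik_known=0.2, prior_new=0.25))
```

Ranking proteins by this posterior would yield a target list of the kind the abstract describes, provided the likelihoods come from the fitted distance distributions.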
Abstract:
By evoking changes in climbing fiber activity, movement errors are thought to modify synapses from parallel fibers onto Purkinje cells (pf*Pkj) so as to improve subsequent motor performance. Theoretical arguments suggest there is an intrinsic tradeoff, however, between motor adaptation and long-term storage. Assuming a baseline rate of motor errors is always present, then repeated performance of any learned movement will generate a series of climbing fiber-mediated corrections. By reshuffling the synaptic weights responsible for any given movement, such corrections will degrade the memories for other learned movements stored in overlapping sets of synapses. The present paper shows that long-term storage can be accomplished by a second site of plasticity at synapses from parallel fibers onto stellate/basket interneurons (pf*St/Bk). Plasticity at pf*St/Bk synapses can be insulated from ongoing fluctuations in climbing fiber activity by assuming that changes in pf*St/Bk synapses occur only after changes in pf*Pkj synapses have built up to a threshold level. Although climbing fiber-dependent plasticity at pf*Pkj synapses allows for the exploration of novel motor strategies in response to changing environmental conditions, plasticity at pf*St/Bk synapses transfers successful strategies to stable long-term storage. To quantify this hypothesis, both sites of plasticity are incorporated into a dynamical model of the cerebellar cortex and its interactions with the inferior olive. When used to simulate idealized motor conditioning trials, the model predicts that plasticity develops first at pf*Pkj synapses, but with additional training is transferred to pf*St/Bk synapses for long-term storage.
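The threshold-gated transfer between the two plasticity sites can be caricatured in a few lines. This is a toy sketch only, not the dynamical model from the paper: the single scalar weights, the learning rate, the threshold, and the transfer fraction are all hypothetical values chosen for illustration.

```python
def simulate(errors, lr_fast=0.1, threshold=0.5, transfer=0.05):
    """Toy two-site plasticity (hypothetical parameters).

    w_fast stands in for pf*Pkj plasticity: it follows every error signal.
    w_slow stands in for pf*St/Bk plasticity: it changes only once the
    accumulated fast change exceeds `threshold`.
    """
    w_fast, w_slow = 0.0, 0.0
    for err in errors:
        # Climbing fiber-driven correction acts on the fast site.
        w_fast += lr_fast * err
        # Transfer to the slow site only after a threshold build-up.
        if abs(w_fast) > threshold:
            delta = transfer * w_fast
            w_slow += delta
            w_fast -= delta
    return w_fast, w_slow


# Zero-mean fluctuations never reach the threshold; consistent errors do.
print(simulate([1, -1] * 10))
print(simulate([1] * 20))
```

In this caricature, baseline fluctuations leave the slow weight untouched, while a sustained error drives a gradual transfer, mirroring the qualitative claim in the abstract.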
Abstract:
Speech recognition involves three processes: extraction of acoustic indices from the speech signal, estimation of the probability that the observed index string was caused by a hypothesized utterance segment, and determination of the recognized utterance via a search among hypothesized alternatives. This paper is not concerned with the first process. Estimation of the probability of an index string involves a model of index production by any given utterance segment (e.g., a word). Hidden Markov models (HMMs) are used for this purpose [Makhoul, J. & Schwartz, R. (1995) Proc. Natl. Acad. Sci. USA 92, 9956-9963]. Their parameters are state transition probabilities and output probability distributions associated with the transitions. The Baum algorithm that obtains the values of these parameters from speech data via their successive reestimation will be described in this paper. The recognizer wishes to find the most probable utterance that could have caused the observed acoustic index string. That probability is the product of two factors: the probability that the utterance will produce the string and the probability that the speaker will wish to produce the utterance (the language model probability). Even if the vocabulary size is moderate, it is impossible to search for the utterance exhaustively. One practical algorithm is described [Viterbi, A. J. (1967) IEEE Trans. Inf. Theory IT-13, 260-267] that, given the index string, has a high likelihood of finding the most probable utterance.
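The Viterbi search cited in this abstract can be sketched for a small discrete HMM. Two assumptions are worth stating: the sketch attaches output distributions to states rather than to transitions (a common simplification, whereas the abstract describes outputs associated with transitions), and the function name and dictionary layout are hypothetical. Log probabilities are used to avoid underflow on long index strings.

```python
import math


def viterbi(obs, states, log_start, log_trans, log_emit):
    """Return (most probable state path, its log probability).

    log_start[s]    -- log P(first state is s)
    log_trans[r][s] -- log P(next state is s | current state is r)
    log_emit[s][o]  -- log P(output o | state s)
    """
    if not obs:
        raise ValueError("observation sequence must be non-empty")
    # best[t][s]: log probability of the best path ending in s at time t.
    best = [{s: log_start[s] + log_emit[s][obs[0]] for s in states}]
    back = []  # back[t-1][s]: best predecessor of s at time t
    for o in obs[1:]:
        prev, cur, ptr = best[-1], {}, {}
        for s in states:
            p = max(states, key=lambda r: prev[r] + log_trans[r][s])
            cur[s] = prev[p] + log_trans[p][s] + log_emit[s][o]
            ptr[s] = p
        best.append(cur)
        back.append(ptr)
    # Trace back from the best final state.
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    path.reverse()
    return path, best[-1][last]
```

In a recognizer of the kind described, the language model probability would enter as an additional log term when scoring competing utterances; that step is omitted here.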