860 results for computer vision, machine learning, centernet, volleyball, sports


Relevance:

100.00% 100.00%

Publisher:

Abstract:

Selection of machine learning techniques requires a certain sensitivity to the requirements of the problem. In particular, the problem can be made more tractable by deliberately using algorithms that are biased toward solutions of the requisite kind. In this paper, we argue that recurrent neural networks have a natural bias toward a problem domain of which biological sequence analysis tasks are a subset. We use experiments with synthetic data to illustrate this bias. We then demonstrate that this bias can be exploited, using a data set of protein sequences containing several classes of subcellular localization targeting peptides. The results show that, compared with feed-forward networks, recurrent neural networks will generally perform better on sequence analysis tasks. Furthermore, as the patterns within the sequence become more ambiguous, the choice of specific recurrent architecture becomes more critical.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Document classification is a supervised machine learning process in which predefined category labels are assigned to documents based on a hypothesis derived from a training set of labelled documents. Documents cannot be directly interpreted by a computer system unless they have been modelled as a collection of computable features. Rogati and Yang [M. Rogati and Y. Yang, Resource selection for domain-specific cross-lingual IR, in SIGIR 2004: Proceedings of the 27th annual international conference on Research and Development in Information Retrieval, ACM Press, Sheffield, United Kingdom, pp. 154-161.] pointed out that the effectiveness of a document classification system may vary across domains. This implies that the quality of the document model contributes to the effectiveness of document classification. Conventionally, model evaluation is accomplished by comparing the effectiveness scores of classifiers on model candidates. However, this kind of evaluation method may encounter either under-fitting or over-fitting problems, because the effectiveness scores are restricted by the learning capacities of the classifiers. We propose a model fitness evaluation method to determine whether a model is sufficient to distinguish positive and negative instances while still being able to provide satisfactory effectiveness with a small feature subset. Our experiments demonstrate how the fitness of models is assessed. The results of our work contribute to research on feature selection, dimensionality reduction and document classification.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Learning from mistakes has proven to be an effective way of learning in interactive document classification. In this paper we propose an approach to learning effectively from mistakes in the email filtering process. Our system employs both SVM and Winnow machine learning algorithms to learn from misclassified email documents and refine the email filtering process accordingly. Our experiments have shown that the training of an email filter becomes much more effective and faster.
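
As an illustration of mistake-driven learning, the sketch below implements a basic Winnow update over binary features. It is not the authors' system: the class name `Winnow`, the promotion factor `alpha`, the threshold choice (the number of features), and the toy feature vectors are all assumptions made for this example, and the SVM component is not shown.

```python
from typing import List, Sequence


class Winnow:
    """Minimal Winnow classifier over binary features (illustrative sketch)."""

    def __init__(self, n_features: int, alpha: float = 2.0) -> None:
        self.w: List[float] = [1.0] * n_features  # all weights start at 1
        self.theta: float = float(n_features)     # assumed decision threshold
        self.alpha: float = alpha                 # assumed promotion/demotion factor

    def predict(self, x: Sequence[int]) -> int:
        # Sum the weights of the active (non-zero) features only.
        score = sum(wi for wi, xi in zip(self.w, x) if xi)
        return 1 if score >= self.theta else 0

    def learn(self, x: Sequence[int], y: int) -> bool:
        """Update weights only when the prediction is wrong.

        Returns True if a mistake was made (and the weights changed).
        """
        if self.predict(x) == y:
            return False
        # Promote active features on a missed positive, demote on a false alarm.
        factor = self.alpha if y == 1 else 1.0 / self.alpha
        self.w = [wi * factor if xi else wi for wi, xi in zip(self.w, x)]
        return True


# Hypothetical toy data: feature 0 marks unwanted mail, the others do not.
spam, ham = [1, 0, 0, 0], [0, 1, 1, 1]
email_filter = Winnow(n_features=4)
mistakes = 0
for _ in range(5):
    mistakes += email_filter.learn(spam, 1)
    mistakes += email_filter.learn(ham, 0)
print("mistakes made during training:", mistakes)
```

Because weights change only on misclassified messages, correctly filtered mail costs nothing to process, which is the property that makes this family of algorithms a natural fit for interactive correction.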

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Traditionally, machine learning algorithms have been evaluated in applications where assumptions can be reliably made about class priors and/or misclassification costs. In this paper, we consider the case of imprecise environments, where little may be known about these factors and they may well vary significantly when the system is applied. Specifically, the use of precision-recall analysis is investigated and compared to better-known performance measures such as error rate and the receiver operating characteristic (ROC). We argue that while ROC analysis is invariant to variations in class priors, this invariance in fact hides an important factor of the evaluation in imprecise environments. Therefore, we develop a generalised precision-recall analysis methodology in which variation due to prior class probabilities is incorporated into a multi-way analysis of variance (ANOVA). The increased sensitivity and reliability of this approach are demonstrated in a remote sensing application.
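
The point about class priors can be made concrete with a small calculation. The sketch below is not the paper's ANOVA methodology. It only shows that a fixed ROC operating point (true-positive rate and false-positive rate) implies a different precision for each positive-class prior. The function name `precision_at_prior` and the operating point used are assumptions chosen for illustration.

```python
def precision_at_prior(tpr: float, fpr: float, prior: float) -> float:
    """Precision implied by a fixed ROC operating point at a given positive-class prior."""
    true_pos = tpr * prior            # expected fraction of all samples that are true positives
    false_pos = fpr * (1.0 - prior)   # expected fraction of all samples that are false positives
    return true_pos / (true_pos + false_pos)


# Hypothetical operating point: the ROC coordinates stay fixed while the prior varies.
tpr, fpr = 0.8, 0.1
for prior in (0.5, 0.1, 0.01):
    print(f"prior={prior:.2f}  precision={precision_at_prior(tpr, fpr, prior):.3f}")
```

With these assumed rates, precision falls from roughly 0.89 at a balanced prior to roughly 0.07 when positives make up 1% of the data, even though the ROC point itself has not moved.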

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This thesis aimed to find out how the teaching staff of the Universidade Estadual de Mato Grosso do Sul (UEMS) perceive, understand, and react to the incorporation and use of Information and Communication Technologies (ICTs) in the institution's undergraduate courses, considering the new dialogic communication processes that these technologies can enable in today's society. Methodologically, the thesis comprises bibliographic research, seeking to ground the fields of Education and Communication as well as Educommunication; documentary research to contextualize the research site; and an exploratory study based on an online questionnaire administered to 165 UEMS faculty members, who responded voluntarily. The study found that teachers use ICTs daily in their personal activities and, to a lesser extent, in professional settings. The challenges lie in training these teachers better and offering continuing professional development so that they use ICTs more effectively in the classroom. The thesis also notes that advances in technology and the new communication ecosystems have created new and different realities, making learning a non-linear process and requiring a revision of pedagogical projects in higher education so that they enable constructive dialogue between communication and education. Institutional infrastructure for ICTs is another obstacle identified, in both the acquisition and the maintenance of this technological equipment by the University. Finally, the thesis proposes studies and research to discuss changes to faculty employment contract arrangements, since working with ICTs appropriately demands more time and dedication from teachers.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The Vapnik-Chervonenkis (VC) dimension is a combinatorial measure of a certain class of machine learning problems, which may be used to obtain upper and lower bounds on the number of training examples needed to learn to prescribed levels of accuracy. Most of the known bounds apply to the Probably Approximately Correct (PAC) framework, which is the framework within which we work in this paper. For a learning problem with some known VC dimension, much is known about the order of growth of the sample-size requirement of the problem, as a function of the PAC parameters. The exact value of the sample-size requirement is, however, less well known, and depends heavily on the particular learning algorithm being used. This is a major obstacle to the practical application of the VC dimension. Hence it is important to know exactly how the sample-size requirement depends on VC dimension, and with that in mind, we describe a general algorithm for learning problems having VC dimension 1. Its sample-size requirement is minimal (as a function of the PAC parameters), and turns out to be the same for all non-trivial learning problems having VC dimension 1. While the method used cannot be naively generalised to higher VC dimension, it suggests that optimal algorithm-dependent bounds may improve substantially on current upper bounds.
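
For orientation, order-of-growth results of the kind mentioned above are usually quoted in the following standard textbook form, for a class of VC dimension d, accuracy parameter ε, and confidence parameter δ. These are the general PAC bounds, not the exact dimension-1 result of the paper, whose constants the abstract does not give. The symbol names are assumptions of this note.

```latex
m_{\text{upper}}(\epsilon, \delta)
  = O\!\left( \frac{1}{\epsilon} \left( d \ln\frac{1}{\epsilon} + \ln\frac{1}{\delta} \right) \right),
\qquad
m_{\text{lower}}(\epsilon, \delta)
  = \Omega\!\left( \frac{1}{\epsilon} \left( d + \ln\frac{1}{\delta} \right) \right)
```

The two expressions differ by a logarithmic factor in 1/ε and by unspecified constants, which is the gap that exact, algorithm-dependent analyses aim to close.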

Relevance:

100.00% 100.00%

Publisher:

Abstract:

A theoretical model is presented which describes selection in a genetic algorithm (GA) under a stochastic fitness measure and correctly accounts for finite population effects. Although this model describes a number of selection schemes, we only consider Boltzmann selection in detail here as results for this form of selection are particularly transparent when fitness is corrupted by additive Gaussian noise. Finite population effects are shown to be of fundamental importance in this case, as the noise has no effect in the infinite population limit. In the limit of weak selection we show how the effects of any Gaussian noise can be removed by increasing the population size appropriately. The theory is tested on two closely related problems: the one-max problem corrupted by Gaussian noise and generalization in a perceptron with binary weights. The averaged dynamics can be accurately modelled for both problems using a formalism which describes the dynamics of the GA using methods from statistical mechanics. The second problem is a simple example of a learning problem and by considering this problem we show how the accurate characterization of noise in the fitness evaluation may be relevant in machine learning. The training error (negative fitness) is the number of misclassified training examples in a batch and can be considered as a noisy version of the generalization error if an independent batch is used for each evaluation. The noise is due to the finite batch size and in the limit of large problem size and weak selection we show how the effect of this noise can be removed by increasing the population size. This allows the optimal batch size to be determined, which minimizes computation time as well as the total number of training examples required.
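
A minimal sketch of the selection step may help make the setting concrete. It assumes the common form of Boltzmann selection, in which an individual is chosen with probability proportional to exp(β × fitness), with independent Gaussian noise added to each fitness value before selection. The function name `boltzmann_select`, the parameter names, and the toy population are assumptions for illustration; the statistical-mechanics analysis of the paper is not reproduced here.

```python
import math
import random
from typing import List


def boltzmann_select(
    fitness: List[float], beta: float, noise_std: float, rng: random.Random
) -> List[int]:
    """Return indices of a new population drawn by Boltzmann selection on noisy fitness."""
    # Corrupt each fitness evaluation with additive Gaussian noise.
    noisy = [f + rng.gauss(0.0, noise_std) for f in fitness]
    # Subtract the maximum before exponentiating to avoid overflow.
    top = max(noisy)
    weights = [math.exp(beta * (f - top)) for f in noisy]
    # Sample with replacement, keeping the population size fixed (an assumption).
    return rng.choices(range(len(fitness)), weights=weights, k=len(fitness))


# Hypothetical toy population: weak selection (small beta) with noticeable noise.
rng = random.Random(0)
population_fitness = [0.0, 1.0, 2.0, 3.0]
chosen = boltzmann_select(population_fitness, beta=0.5, noise_std=1.0, rng=rng)
print("selected indices:", chosen)
```

In this sketch the noise perturbs which individuals receive the largest selection weights, so a small population can lose good individuals by chance, while a larger population averages the effect out.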

Relevance:

100.00% 100.00%

Publisher:

Abstract:

In this paper we introduce and illustrate non-trivial upper and lower bounds on the learning curves for one-dimensional Gaussian Processes. The analysis is carried out emphasising the effects induced on the bounds by the smoothness of the random process described by the Modified Bessel and the Squared Exponential covariance functions. We present an explanation of the early, linearly-decreasing behavior of the learning curves and the bounds as well as a study of the asymptotic behavior of the curves. The effects of the noise level and the lengthscale on the tightness of the bounds are also discussed.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

There have been two main approaches to feature detection in human and computer vision: luminance-based and energy-based. Bars and edges might arise from peaks of luminance and luminance gradient respectively, or bars and edges might be found at peaks of local energy, where local phases are aligned across spatial frequency. This basic issue of definition is important because it guides more detailed models and interpretations of early vision. Which approach better describes the perceived positions of elements in a 3-element contour-alignment task? We used the class of 1-D images defined by Morrone and Burr in which the amplitude spectrum is that of a (partially blurred) square wave and Fourier components in a given image have a common phase. Observers judged whether the centre element (e.g. ±45° phase) was to the left or right of the flanking pair (e.g. 0° phase). Lateral offset of the centre element was varied to find the point of subjective alignment from the fitted psychometric function. This point shifted systematically to the left or right according to the sign of the centre phase, increasing with the degree of blur. These shifts were well predicted by the location of luminance peaks and other derivative-based features, but not by energy peaks which (by design) predicted no shift at all. These results on contour alignment agree well with earlier ones from a more explicit feature-marking task, and strongly suggest that human vision does not use local energy peaks to locate basic first-order features. [Supported by the Wellcome Trust (ref: 056093)]

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Retrospective clinical data presents many challenges for data mining and machine learning. The transcription of patient records from paper charts and subsequent manipulation of data often result in high volumes of noise as well as a loss of other important information. In addition, such datasets often fail to represent expert medical knowledge and reasoning in any explicit manner. In this research we describe applying data mining methods to retrospective clinical data to build a prediction model for asthma exacerbation severity for pediatric patients in the emergency department. Difficulties in building such a model forced us to investigate alternative strategies for analyzing and processing retrospective data. This paper describes this process together with an approach to mining retrospective clinical data by incorporating formalized external expert knowledge (secondary knowledge sources) into the classification task. This knowledge is used to partition the data into a number of coherent sets, where each set is explicitly described in terms of the secondary knowledge source. Instances from each set are then classified in a manner appropriate for the characteristics of the particular set. We present our methodology and outline a set of experimental results that demonstrate some advantages and some limitations of our approach. © 2008 Springer-Verlag Berlin Heidelberg.