966 results for Crowd density estimation


Relevance: 80.00%

Publisher:

Abstract:

To enhance the efficiency of regression parameter estimation by modeling the correlation structure of correlated binary error terms in quantile regression with repeated measurements, we propose a Gaussian pseudolikelihood approach for estimating correlation parameters and selecting the most appropriate working correlation matrix simultaneously. The induced smoothing method is applied to estimate the covariance of the regression parameter estimates, which can bypass density estimation of the errors. Extensive numerical studies indicate that the proposed method performs well in selecting an accurate correlation structure and improving regression parameter estimation efficiency. The proposed method is further illustrated by analyzing a dental dataset.
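
As a rough illustration of the structure-selection step only (not the authors' estimator; the function names such as `gaussian_pseudo_loglik` are illustrative), one can score candidate working correlation matrices for standardized residuals by a Gaussian pseudolikelihood and keep the best-scoring structure:

```python
import numpy as np

def ar1_corr(m, rho):
    """AR(1) working correlation matrix of size m x m."""
    idx = np.arange(m)
    return rho ** np.abs(idx[:, None] - idx[None, :])

def exchangeable_corr(m, rho):
    """Exchangeable (compound symmetry) working correlation matrix."""
    return np.full((m, m), rho) + (1.0 - rho) * np.eye(m)

def gaussian_pseudo_loglik(residuals, corr):
    """Gaussian pseudolikelihood of standardized residual vectors under a working correlation."""
    _, logdet = np.linalg.slogdet(corr)
    inv = np.linalg.inv(corr)
    ll = 0.0
    for r in residuals:                     # one residual vector per subject
        ll += -0.5 * (logdet + r @ inv @ r)
    return ll

def select_structure(residuals, rhos=np.linspace(0.05, 0.95, 19)):
    """Pick the working correlation family and rho maximizing the pseudolikelihood."""
    m = residuals.shape[1]
    best = None
    for name, builder in [("AR(1)", ar1_corr), ("exchangeable", exchangeable_corr)]:
        for rho in rhos:
            ll = gaussian_pseudo_loglik(residuals, builder(m, rho))
            if best is None or ll > best[0]:
                best = (ll, name, rho)
    return best

# Toy usage: 50 subjects, 4 repeated measurements with AR(1)-like correlation.
rng = np.random.default_rng(0)
L = np.linalg.cholesky(ar1_corr(4, 0.6))
res = rng.standard_normal((50, 4)) @ L.T
print(select_structure(res))                # expected to favour the AR(1) structure
```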

Relevance: 80.00%

Publisher:

Abstract:

We consider online prediction problems where the loss between the prediction and the outcome is measured by the squared Euclidean distance and its generalization, the squared Mahalanobis distance. We derive the minimax solutions for the case where the prediction and action spaces are the simplex (this setup is sometimes called the Brier game) and the ℓ2 ball (this setup is related to Gaussian density estimation). We show that in both cases the value of each sub-game is a quadratic function of a simple statistic of the state, with coefficients that can be efficiently computed using an explicit recurrence relation. The resulting deterministic minimax strategy and randomized maximin strategy are linear functions of the statistic.
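
The two losses in question are simple to write down; the sketch below only fixes notation (the minimax recurrence for the sub-game values is derived in the paper and is not reproduced here):

```python
import numpy as np

def brier_loss(pred, outcome_index):
    """Squared Euclidean loss of a probability vector against the realized vertex
    of the simplex (the Brier game setting described in the abstract)."""
    e = np.zeros_like(pred)
    e[outcome_index] = 1.0
    d = pred - e
    return float(d @ d)

def sq_mahalanobis(pred, outcome, W):
    """Generalized loss ||pred - outcome||_W^2 = (pred - outcome)^T W (pred - outcome),
    with W symmetric positive definite."""
    d = pred - outcome
    return float(d @ W @ d)
```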

Relevance: 80.00%

Publisher:

Abstract:

Aerial surveys of kangaroos (Macropus spp.) in Queensland are used to make economically important judgements on the levels of viable commercial harvest. Previous analysis methods for aerial kangaroo surveys have used both mark-recapture methodologies and conventional distance-sampling analyses. Conventional distance sampling has the disadvantage that detection is assumed to be perfect on the transect line, while mark-recapture methods are notoriously sensitive to problems with unmodelled heterogeneity in capture probabilities. We introduce three methodologies for combining mark-recapture and distance-sampling data, aimed at exploiting the strengths of both methodologies and overcoming the weaknesses. Of these methods, two are based on the assumption of full independence between observers in the mark-recapture component, and this appears to introduce more bias in density estimation than it resolves through allowing uncertain trackline detection. Both of these methods give lower density estimates than conventional distance sampling, indicating a clear failure of the independence assumption. The third method, termed point independence, appears to perform very well, giving credible density estimates and good properties in terms of goodness-of-fit and percentage coefficient of variation. Estimated densities of eastern grey kangaroos range from 21 to 36 individuals km^-2, with estimated coefficients of variation between 11% and 14% and estimated trackline detection probabilities primarily between 0.7 and 0.9.
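
For context, here is a minimal sketch of the conventional line-transect distance-sampling baseline that the paper improves on, assuming a half-normal detection function with certain detection on the line (g(0) = 1); the combined mark-recapture/distance-sampling estimators themselves are not reproduced here:

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import norm

def half_normal_esw(distances, w):
    """MLE of the half-normal scale and the effective strip half-width (ESW)."""
    d = np.asarray(distances, dtype=float)

    def negloglik(log_sigma):
        sigma = np.exp(log_sigma)
        # Half-normal detection g(x) = exp(-x^2 / (2 sigma^2)), truncated at w;
        # perpendicular-distance density f(x) = g(x) / esw with esw = integral_0^w g.
        esw = sigma * np.sqrt(2 * np.pi) * (norm.cdf(w / sigma) - 0.5)
        return np.sum(d**2 / (2 * sigma**2)) + d.size * np.log(esw)

    res = minimize_scalar(negloglik, bounds=(-8, 8), method="bounded")
    sigma = np.exp(res.x)
    esw = sigma * np.sqrt(2 * np.pi) * (norm.cdf(w / sigma) - 0.5)
    return sigma, esw

def line_transect_density(distances, transect_length_km, w_km):
    """Conventional distance-sampling estimator D = n / (2 * L * ESW), per km^2."""
    n = len(distances)
    _, esw = half_normal_esw(distances, w_km)
    return n / (2.0 * transect_length_km * esw)

# Example with simulated perpendicular distances (km), 100 km of transect, 0.2 km truncation.
rng = np.random.default_rng(0)
dists = np.abs(rng.normal(0.0, 0.07, size=300))
dists = dists[dists <= 0.2]
print(line_transect_density(dists, transect_length_km=100.0, w_km=0.2))
```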

Relevance: 80.00%

Publisher:

Abstract:

Interstellar clouds are not featureless, but show quite complex internal structures of filaments and clumps when observed with high enough resolution. These structures have been generated by 1) turbulent motions driven mainly by supernovae, 2) magnetic fields working on the ions and, through neutral-ion collisions, on neutral gas as well, and 3) self-gravity pulling a dense clump together to form a new star. The study of the cloud structure gives us information on the relative importance of each of these mechanisms, and helps us to gain a better understanding of the details of the star formation process. Interstellar dust is often used as a tracer for the interstellar gas which forms the bulk of the interstellar matter. Some of the methods that are used to derive the column density are summarized in this thesis. A new method, which uses the scattered light to map the column density in large fields with high spatial resolution, is introduced. This thesis also takes a look at the grain alignment with respect to the magnetic fields. The aligned grains give rise to the polarization of starlight and dust emission, thus revealing the magnetic field. The alignment mechanisms have been debated for the last half century. The strongest candidate at present is the radiative torques mechanism. In the first four papers included in this thesis, the scattered light method of column density estimation is formulated, tested in simulations, and finally used to obtain a column density map from observations. They demonstrate that the scattered light method is a very useful and reliable tool in column density estimation, and is able to provide higher resolution than the near-infrared color excess method. These two methods are complementary. The derived column density maps are also used to gain information on the dust emissivity within the observed cloud. The two final papers present simulations of polarized thermal dust emission assuming that the alignment happens by the radiative torques mechanism. We show that the radiative torques can explain the observed decline of the polarization degree towards dense cores. Furthermore, the results indicate that the dense cores themselves might not contribute significantly to the polarized signal, and hence one needs to be careful when interpreting the observations and deriving the magnetic field.
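
As a hedged illustration of the near-infrared colour excess route mentioned above, using typical literature coefficients (the exact calibration depends on the adopted extinction law and is an assumption here, not the thesis's calibration):

```python
# Convert an observed near-infrared colour excess into a rough hydrogen column density.
def column_density_from_jk_excess(jk_observed, jk_intrinsic):
    e_jk = jk_observed - jk_intrinsic     # colour excess E(J-K) in magnitudes
    a_v = e_jk / 0.17                     # A_V, assuming E(J-K)/A_V ~ 0.17 (extinction-law dependent)
    n_h = 1.9e21 * a_v                    # N(H) in cm^-2, assuming ~1.9e21 cm^-2 per mag of A_V
    return n_h

print(column_density_from_jk_excess(1.2, 0.5))   # ~7.8e21 cm^-2 for E(J-K) = 0.7 mag
```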

Relevance: 80.00%

Publisher:

Abstract:

We present a measurement of the mass of the top quark using data corresponding to an integrated luminosity of 1.9 fb^-1 of ppbar collisions collected at sqrt(s) = 1.96 TeV with the CDF II detector at Fermilab's Tevatron. This is the first measurement of the top quark mass using top-antitop pair candidate events in the lepton + jets and dilepton decay channels simultaneously. We reconstruct two observables in each channel and use a non-parametric kernel density estimation technique to derive two-dimensional probability density functions from simulated signal and background samples. The observables are the top quark mass and the invariant mass of two jets from the W decay in the lepton + jets channel, and the top quark mass and the scalar sum of transverse energy of the event in the dilepton channel. We perform a simultaneous fit for the top quark mass and the jet energy scale, which is constrained in situ by the hadronic W boson mass. Using 332 lepton + jets candidate events and 144 dilepton candidate events, we measure the top quark mass to be m_top = 171.9 +/- 1.7 (stat. + JES) +/- 1.1 (syst.) GeV/c^2 = 171.9 +/- 2.0 GeV/c^2.
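
A minimal sketch of the kernel density estimation step, with `scipy.stats.gaussian_kde` as a stand-in smoother and purely illustrative toy numbers (the analysis itself derives the two-dimensional densities from full detector simulation):

```python
import numpy as np
from scipy.stats import gaussian_kde

# Toy stand-in for simulated signal samples: two reconstructed observables per event,
# e.g. (reconstructed top mass, dijet W mass). The values here are purely illustrative.
rng = np.random.default_rng(1)
mt  = rng.normal(172.0, 12.0, size=5000)   # GeV/c^2
mjj = rng.normal(80.4, 10.0, size=5000)    # GeV/c^2
signal_kde = gaussian_kde(np.vstack([mt, mjj]))

# Evaluate the smoothed two-dimensional probability density on a grid of (mt, mjj) points.
grid_mt, grid_mjj = np.meshgrid(np.linspace(130, 220, 90), np.linspace(50, 110, 60))
pdf = signal_kde(np.vstack([grid_mt.ravel(), grid_mjj.ravel()])).reshape(grid_mt.shape)
```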

Relevance: 80.00%

Publisher:

Abstract:

In this paper, we compare experimental results for Tamil online handwritten character recognition using HMM and Statistical Dynamic Time Warping (SDTW) classifiers. The HMM was used for a 156-class problem. Different feature sets and numbers of HMM states and mixtures were tried; the best combination was found to be 16 states and 14 mixtures, giving an accuracy of 85%. The features used in this combination were retained, and an SDTW model with 20 states and a single Gaussian was used as the classifier. The symbol set was also increased to include numerals, punctuation marks and special symbols such as $, & and #, taking the number of classes to 188. It was found that, with a small addition to the feature set, this simple SDTW classifier performed on par with the more complicated HMM model, giving an accuracy of 84%, while reducing the mixture density estimation computation by a factor of 11. The recognition is writer-independent, as the dataset used is quite large and covers a variety of handwriting styles.
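
For reference, here is a minimal sketch of the classical DTW recursion that SDTW extends (SDTW replaces the local Euclidean cost with per-state Gaussian statistics, which is not shown here):

```python
import numpy as np

def dtw_distance(a, b):
    """Classical dynamic time warping distance between two feature sequences.

    a, b: arrays of shape (length, n_features), e.g. resampled pen coordinates.
    """
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(a[i - 1] - b[j - 1])   # local Euclidean cost
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]
```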

Relevance: 80.00%

Publisher:

Abstract:

The mixtures of factor analyzers (MFA) model allows data to be modeled as a mixture of Gaussians with a reduced parametrization. We present the formulation of a nonparametric form of the MFA model, the Dirichlet process MFA (DPMFA). The proposed model can be used for density estimation or clustering of high-dimensional data. We utilize the DPMFA for clustering the action potentials of different neurons from extracellular recordings, a problem known as spike sorting. The DPMFA model is compared to the Dirichlet process mixture of Gaussians model (DPGMM), which has a higher computational complexity. We show that the DPMFA has similar modeling performance to the DPGMM in lower dimensions and is able to work in higher dimensions. ©2009 IEEE.
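
A minimal sketch of the density a (finite) mixture of factor analyzers assigns to a point, showing the low-rank-plus-diagonal covariance parametrization; the Dirichlet-process machinery that infers the number of components is not sketched:

```python
import numpy as np
from scipy.stats import multivariate_normal

def mfa_logpdf(x, weights, means, loadings, noise_vars):
    """Log-density under a finite mixture of factor analyzers.

    Each component k is a Gaussian with covariance  Lambda_k Lambda_k^T + diag(Psi_k),
    i.e. a low-rank-plus-diagonal parametrization.
    """
    logps = []
    for w, mu, Lam, psi in zip(weights, means, loadings, noise_vars):
        cov = Lam @ Lam.T + np.diag(psi)
        logps.append(np.log(w) + multivariate_normal.logpdf(x, mean=mu, cov=cov))
    logps = np.array(logps)
    m = logps.max()
    return m + np.log(np.exp(logps - m).sum())   # log-sum-exp over components
```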

Relevance: 80.00%

Publisher:

Abstract:

A mixture of Gaussians fit to a single curved or heavy-tailed cluster will report that the data contains many clusters. To produce more appropriate clusterings, we introduce a model which warps a latent mixture of Gaussians to produce nonparametric cluster shapes. The possibly low-dimensional latent mixture model allows us to summarize the properties of the high-dimensional clusters (or density manifolds) describing the data. The number of manifolds, as well as the shape and dimension of each manifold, is automatically inferred. We derive a simple inference scheme for this model which analytically integrates out both the mixture parameters and the warping function. We show that our model is effective for density estimation, performs better than infinite Gaussian mixture models at recovering the true number of clusters, and produces interpretable summaries of high-dimensional datasets.
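
A toy generative sketch of the warped-latent-mixture idea, with a fixed polynomial warp standing in for the Gaussian-process warping that the paper places a prior over and integrates out:

```python
import numpy as np

rng = np.random.default_rng(2)

# Latent low-dimensional mixture of two elongated Gaussians.
means = np.array([[-2.0, 0.0], [2.0, 0.0]])
comp = rng.integers(0, 2, size=400)
z = means[comp] + rng.normal(0.0, [1.0, 0.1], size=(400, 2))

# A fixed nonlinear warp standing in for the (integrated-out) warping function:
# it bends each latent Gaussian into a curved, non-Gaussian cluster in observed space.
x = np.column_stack([z[:, 0], z[:, 1] + 0.4 * z[:, 0] ** 2])
```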

Relevance: 80.00%

Publisher:

Abstract:

Modelling is fundamental to many fields of science and engineering. A model can be thought of as a representation of possible data one could predict from a system. The probabilistic approach to modelling uses probability theory to express all aspects of uncertainty in the model. The probabilistic approach is synonymous with Bayesian modelling, which simply uses the rules of probability theory in order to make predictions, compare alternative models, and learn model parameters and structure from data. This simple and elegant framework is most powerful when coupled with flexible probabilistic models. Flexibility is achieved through the use of Bayesian non-parametrics. This article provides an overview of probabilistic modelling and an accessible survey of some of the main tools in Bayesian non-parametrics. The survey covers the use of Bayesian non-parametrics for modelling unknown functions, density estimation, clustering, time-series modelling, and representing sparsity, hierarchies, and covariance structure. More specifically, it gives brief non-technical overviews of Gaussian processes, Dirichlet processes, infinite hidden Markov models, Indian buffet processes, Kingman's coalescent, Dirichlet diffusion trees and Wishart processes.
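
As one concrete example of the surveyed tools, here is a minimal truncated stick-breaking draw from a Dirichlet process mixture of Gaussians:

```python
import numpy as np

def stick_breaking_weights(alpha, truncation, rng):
    """Truncated stick-breaking construction of Dirichlet-process mixture weights."""
    betas = rng.beta(1.0, alpha, size=truncation)
    remaining = np.concatenate([[1.0], np.cumprod(1.0 - betas[:-1])])
    return betas * remaining

rng = np.random.default_rng(3)
w = stick_breaking_weights(alpha=2.0, truncation=50, rng=rng)
atoms = rng.normal(0.0, 3.0, size=50)          # component means drawn from the base measure
k = rng.choice(50, size=1000, p=w / w.sum())   # component assignments (renormalized truncation)
x = rng.normal(atoms[k], 1.0)                  # one draw per data point
```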

Relevance: 80.00%

Publisher:

Abstract:

The effect of different factors (spawning biomass, environmental conditions) on recruitment is a subject of great importance in fisheries management, recovery plans and scenario exploration. In this study, recently proposed supervised classification techniques, tested by the machine-learning community, are applied to forecast the recruitment of seven fish species of the North East Atlantic (anchovy, sardine, mackerel, horse mackerel, hake, blue whiting and albacore), using spawning, environmental and climatic data. In addition, the probabilistic flexible naive Bayes classifier (FNBC) is proposed as the modelling approach in order to reduce uncertainty for fisheries management purposes. These improvements aim to sharpen the probability estimates for each possible outcome (low, medium and high recruitment), based on kernel density estimation, which is crucial for informed management decision-making under high uncertainty. Finally, a comparison between goodness-of-fit and generalization power is provided in order to assess the reliability of the final forecasting models. It is found that in most cases the proposed methodology provides useful information for management, whereas the case of horse mackerel illustrates the limitations of the approach. The proposed improvements allow for a better probabilistic estimation of the different scenarios, i.e. they reduce the uncertainty in the provided forecasts.
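
A generic sketch of a kernel-density-based ("flexible") naive Bayes classifier of the kind described; this is a stand-in under the usual naive Bayes assumptions, not the authors' FNBC implementation:

```python
import numpy as np
from scipy.stats import gaussian_kde

class KDENaiveBayes:
    """Naive Bayes with one univariate Gaussian-kernel density estimate per feature
    and class, a common way to obtain 'flexible' class-conditional densities.
    Class labels could be e.g. low / medium / high recruitment."""

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.priors_ = {c: np.mean(y == c) for c in self.classes_}
        self.kdes_ = {c: [gaussian_kde(X[y == c, j]) for j in range(X.shape[1])]
                      for c in self.classes_}
        return self

    def predict_proba(self, X):
        # Sum of log prior and per-feature log KDE densities, then normalize per row.
        logp = np.column_stack([
            np.log(self.priors_[c])
            + sum(np.log(self.kdes_[c][j](X[:, j]) + 1e-300) for j in range(X.shape[1]))
            for c in self.classes_
        ])
        logp -= logp.max(axis=1, keepdims=True)
        p = np.exp(logp)
        return p / p.sum(axis=1, keepdims=True)
```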

Relevance: 80.00%

Publisher:

Abstract:

A real-time VHF swept frequency (20–300 MHz) reflectometry measurement for radio-frequency capacitive-coupled atmospheric pressure plasmas is described. The measurement is scalar, non-invasive and deployed on the main power line of the plasma chamber. The purpose of this VHF signal injection is to remotely interrogate, in real time, the frequency reflection properties of the plasma. The information obtained is used for remote monitoring of high-value atmospheric plasma processing. Measurements are performed under varying gas feed (helium mixed with 0–2% oxygen) and power conditions (0–40 W) on two contrasting reactors. The first is a classical parallel-plate chamber driven at 16 MHz with well-defined electrical grounding but limited optical access, and the second is a cross-field plasma jet driven at 13.56 MHz with open optical access but poor electrical shielding of the driven electrode. The electrical measurements are modelled using a lumped-element electrical circuit to provide an estimate of the power dissipated in the plasma as a function of gas and applied power. The performances of the two reactors are evaluated against each other. The scalar measurements reveal that a 0.1% oxygen admixture in helium plasma can be detected. The equivalent electrical model indicates that the current density between the parallel-plate electrodes is of the order of 8–20 mA cm^-2. This value is in accord with the 0.03 A cm^-2 values reported by Park et al (2001 J. Appl. Phys. 89 20–8). The current density at the cross-field plasma jet electrodes is found to be 20 times higher. When the cross-field plasma jet's unshielded electrode area is factored into the current density estimation, the resultant current density agrees with the parallel-plate reactor. This indicates that the unshielded reactor radiates electromagnetic energy into free space and so acts as a plasma antenna.
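
Two pieces of generic bookkeeping behind scalar reflectometry and the electrode-area argument, stated as assumptions rather than the paper's lumped-element model:

```python
def absorbed_power_w(p_forward_w, reflection_coeff):
    """P_abs = P_fwd * (1 - |Gamma|^2), assuming a matched source impedance."""
    return p_forward_w * (1.0 - abs(reflection_coeff) ** 2)

def current_density_ma_cm2(i_rms_a, electrode_area_cm2):
    """J = I / A in mA cm^-2; a larger (e.g. unshielded) effective area lowers the inferred J."""
    return 1.0e3 * i_rms_a / electrode_area_cm2
```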

Relevance: 80.00%

Publisher:

Abstract:

Camera traps are used to estimate densities or abundances using capture-recapture and, more recently, random encounter models (REMs). We deploy REMs to describe an invasive-native species replacement process, and to demonstrate their wider application beyond abundance estimation. The Irish hare Lepus timidus hibernicus is a high-priority endemic of conservation concern. It is threatened by an expanding population of non-native European hares L. europaeus, an invasive species of global importance. Camera traps were deployed in thirteen 1 km squares, wherein the ratio of invader to native densities was corroborated by night-driven line-transect distance sampling throughout the study area of 1652 km^2. Spatial patterns of invasive and native densities between the invader's core and peripheral ranges, and native allopatry, were comparable between methods. Native densities in the peripheral range were comparable to those in native allopatry using REM, or marginally depressed using distance sampling. Numbers of the invader were substantially higher than the native in the core range, irrespective of method, with a 5:1 invader-to-native ratio indicating species replacement. We also describe a post hoc optimization protocol for REM which will inform subsequent (re-)surveys, allowing survey effort (camera hours) to be reduced by up to 57% without compromising the width of confidence intervals associated with density estimates. This approach will form the basis of a more cost-effective means of surveillance and monitoring for both the endemic and invasive species. The European hare undoubtedly represents a significant threat to the endemic Irish hare.
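
A minimal sketch of a random encounter model density estimate using the standard gas-model formula of Rowcliffe et al. (2008); whether the paper uses exactly this parametrization is an assumption:

```python
import math

def rem_density(photos, camera_days, day_range_km, radius_km, angle_rad):
    """Random encounter model density (animals per km^2):
    D = (y / t) * pi / (v * r * (2 + theta)).

    photos       -- number of independent detections y
    camera_days  -- total camera effort t (days)
    day_range_km -- average distance travelled per day v (km/day)
    radius_km    -- effective detection radius r (km)
    angle_rad    -- detection arc angle theta (radians)
    """
    trap_rate = photos / camera_days
    return trap_rate * math.pi / (day_range_km * radius_km * (2.0 + angle_rad))
```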

Relevance: 80.00%

Publisher:

Abstract:

One of the unsupervised learning models generating the most active research is the Boltzmann machine, in particular the restricted Boltzmann machine, or RBM. An important aspect of both training and using such a model is drawing samples. Two recent developments, fast persistent contrastive divergence (FPCD) and herding, aim to improve this aspect, focusing mainly on the learning process itself. Notably, herding gives up on obtaining a precise estimate of the RBM parameters, instead defining a distribution through a dynamical system driven by the training examples. We generalize these ideas to obtain algorithms for exploiting the probability distribution defined by a pre-trained RBM, by drawing samples that are representative of it, without requiring the training set. We present three methods: sample penalization (based on a theoretical intuition), as well as FPCD and herding using constant statistics for the positive phase. These methods define dynamical systems producing samples with the desired statistics, and we evaluate them using a non-parametric density estimation method. We show that these methods mix substantially better than the conventional method, Gibbs sampling.
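
For reference, here is a minimal sketch of the conventional baseline the abstract compares against, block Gibbs sampling in a binary RBM (not the proposed dynamical systems):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def rbm_gibbs_chain(W, b, c, v0, n_steps, rng):
    """Block Gibbs sampling in a binary RBM.

    W: visible-hidden weight matrix, b: visible biases, c: hidden biases,
    v0: initial binary visible vector. Returns the visible state after n_steps.
    """
    v = v0.copy()
    for _ in range(n_steps):
        h = (rng.random(c.shape) < sigmoid(v @ W + c)).astype(float)
        v = (rng.random(b.shape) < sigmoid(h @ W.T + b)).astype(float)
    return v
```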

Relevance: 80.00%

Publisher:

Abstract:

Supervised learning of large-scale hierarchical networks is currently enjoying spectacular success. Despite this momentum, unsupervised learning remains, according to many researchers, a key element of Artificial Intelligence, in which agents must learn from a potentially limited amount of data. This thesis follows that line of thinking and addresses various research topics related to the density estimation problem through Boltzmann machines (BMs), probabilistic graphical models at the heart of deep learning. Our contributions touch on sampling, partition function estimation, optimization, and the learning of invariant representations. The thesis begins by presenting a new adaptive sampling algorithm, which automatically adjusts the temperature of the Markov chains being simulated in order to maintain a high convergence rate throughout learning. When used in the context of stochastic maximum likelihood (SML) learning, our algorithm yields increased robustness to the choice of learning rate as well as faster convergence. Our results are presented for BMs, but the method is general and applicable to learning any probabilistic model that relies on Markov chain sampling. While the maximum likelihood gradient can be approximated by sampling, evaluating the log-likelihood requires an estimate of the partition function. In contrast to traditional approaches that treat a given model as a black box, we propose to exploit the dynamics of learning by estimating the successive changes in log-partition incurred at each parameter update. The estimation problem is reformulated as an inference problem similar to the Kalman filter, but on a two-dimensional graph whose dimensions correspond to the time axis and the temperature parameter. On the topic of optimization, we also present an algorithm for efficiently applying the natural gradient to Boltzmann machines with thousands of units. Until now, its adoption has been limited by its high computational cost and memory requirements. Our algorithm, Metric-Free Natural Gradient (MFNG), avoids explicitly computing the Fisher information matrix (and its inverse) by combining a linear solver with an efficient matrix-vector product. The algorithm is promising: in terms of the number of function evaluations, MFNG converges faster than SML. Unfortunately, its implementation remains computationally inefficient. This work also explores the mechanisms underlying the learning of invariant representations. To this end, we use the family of spike-and-slab restricted Boltzmann machines (ssRBM), which we modify so that it can model binary and sparse distributions. The binary latent variables of the ssRBM can be made invariant to a vector subspace by associating with each of them a vector of continuous latent variables (called "slabs"). This translates into increased invariance at the representation level and a better classification rate when few labelled data are available.
We conclude this thesis with an ambitious topic: learning representations that can separate the factors of variation present in the input signal. We propose a solution based on a bilinear ssRBM (with two groups of latent factors) and formulate the problem as one of pooling in complementary vector subspaces.
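
A minimal sketch of the stochastic maximum likelihood (persistent contrastive divergence) update that the thesis builds on; the adaptive tempering of the negative-phase chains is not shown:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sml_update(W, b, c, v_data, v_persistent, lr, rng):
    """One stochastic maximum likelihood (persistent CD) update for a binary RBM.

    The positive phase uses a minibatch of data; the negative phase advances
    persistent Gibbs chains by one block-Gibbs step. Parameters are updated in place.
    """
    # Positive phase: expected hidden activations given the data.
    h_data = sigmoid(v_data @ W + c)

    # Negative phase: one block-Gibbs step on the persistent chains.
    h_model = (rng.random((v_persistent.shape[0], c.size))
               < sigmoid(v_persistent @ W + c)).astype(float)
    v_model = (rng.random(v_persistent.shape)
               < sigmoid(h_model @ W.T + b)).astype(float)
    h_model_probs = sigmoid(v_model @ W + c)

    n, m = v_data.shape[0], v_model.shape[0]
    W += lr * (v_data.T @ h_data / n - v_model.T @ h_model_probs / m)
    b += lr * (v_data.mean(axis=0) - v_model.mean(axis=0))
    c += lr * (h_data.mean(axis=0) - h_model_probs.mean(axis=0))
    return v_model   # updated persistent chain state
```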

Relevance: 80.00%

Publisher:

Abstract:

Sigmoid type belief networks, a class of probabilistic neural networks, provide a natural framework for compactly representing probabilistic information in a variety of unsupervised and supervised learning problems. Often the parameters used in these networks need to be learned from examples. Unfortunately, estimating the parameters via exact probabilistic calculations (i.e., the EM algorithm) is intractable even for networks with fairly small numbers of hidden units. We propose to avoid the infeasibility of the E step by bounding likelihoods instead of computing them exactly. We introduce extended and complementary representations for these networks and show that the estimation of the network parameters can be made fast (reduced to quadratic optimization) by performing the estimation in either of the alternative domains. The complementary networks can be used for continuous density estimation as well.
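
A quadratic lower bound on log sigma(x) of the kind that makes likelihood bounding for sigmoid networks tractable; whether this is exactly the bound used in the paper is an assumption:

```python
import numpy as np

def log_sigmoid(x):
    return -np.log1p(np.exp(-x))

def jj_lower_bound(x, xi):
    """Jaakkola-Jordan quadratic lower bound on log sigma(x), tight at x = +/- xi:

    log sigma(x) >= log sigma(xi) + (x - xi)/2 - lambda(xi) * (x**2 - xi**2),
    with lambda(xi) = tanh(xi / 2) / (4 * xi).

    Bounds that are quadratic in x are what reduce likelihood bounding for
    sigmoid networks to tractable (quadratic) optimization.
    """
    lam = np.tanh(xi / 2.0) / (4.0 * xi)
    return log_sigmoid(xi) + 0.5 * (x - xi) - lam * (x**2 - xi**2)

x = np.linspace(-6, 6, 13)
assert np.all(jj_lower_bound(x, xi=2.5) <= log_sigmoid(x) + 1e-12)
```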