923 results for model selection in binary regression


Relevance:

100.00% 100.00%

Abstract:

Affiliation: Département de Biochimie, Université de Montréal

Relevance:

100.00% 100.00%

Abstract:

The software packages used are Splus and R.

Relevance:

100.00% 100.00%

Abstract:

The work presented in this thesis concerns the role of the dorsal premotor cortex (PMd) in decision making (the selection of one action among many choices) and in the visual guidance of arm movements. It describes electrophysiological experiments in the awake monkey (Macaca mulatta) that address a substantial fraction of the predictions of the affordance competition hypothesis (Cisek, 2006; Cisek, 2007a). This hypothesis suggests that the choice of any action is the outcome of a competition between internal representations of the demands and advantages of each of the options presented (affordances; Gibson, 1979). Particular attention is given to the processing of spatial information and of the value of the options (expected value, EV) in decision making. The first study (article 1) explores how PMd reflects these two parameters during the delay period, as well as their interaction. The second study (article 2) explores the decision mechanism in more detail and extends the results to the ventral premotor cortex (PMv). This study also examines the representation of spatial information and EV from a learning perspective. In a novel environment, the spatial parameters of actions appear to be present in PMd at all times, whereas the representation of EV appears only once the animals begin to make informed decisions about the value of the available options. The third study (article 3) explores how PMd is involved in "changes of mind" during a decision process. This study describes how the selection of an action is updated following a movement instruction (GO signal). The main results of the studies are reproduced by a computational model (Cisek, 2006) suggesting that decisions between several alternative actions can be made through a competition mechanism (biased competition) taking place in the same region that specifies the actions.

Relevance:

100.00% 100.00%

Abstract:

This thesis aims to improve automation in model-driven engineering (MDE). MDE is a paradigm that promises to reduce software complexity through the intensive use of models and of automatic model transformations (MT). Put simply, in the MDE vision, specialists use several models to represent a piece of software, and they produce the source code by automatically transforming these models. Consequently, automation is a key factor and a founding principle of MDE. Besides MT, other activities need automation, e.g. the definition of modeling languages and software migration. In this context, the main contribution of this thesis is a general approach for improving automation in MDE. Our approach is based on example-guided meta-heuristic search. We apply this approach to two important MDE problems: (1) model transformation and (2) the precise definition of modeling languages. For the first problem, we distinguish between transformation in the context of migration and general transformations between models. For migration, we propose a software clustering method based on a meta-heuristic guided by clustering examples. Similarly, for general transformations, we learn model transformations using a genetic programming algorithm that draws on examples of past transformations. For the precise definition of modeling languages, we propose a method based on meta-heuristic search that derives well-formedness rules for metamodels, with the objective of discriminating well between valid and invalid models. The empirical studies we conducted show that the proposed approaches obtain good results, both quantitative and qualitative. These results allow us to conclude that improving automation in MDE using meta-heuristic search methods and examples can contribute to a wider adoption of MDE in industry in the future.

Relevance:

100.00% 100.00%

Abstract:

Logistic regression is a generalized linear model (GLM) used for binary response variables. The model seeks to estimate the probability of success of this variable through a linear combination of explanatory variables. When the objective is to estimate as precisely as possible the impact of the different incentives of a marketing campaign (the coefficients of the logistic regression), the most precise estimation method must be identified. Using the slice-sampling MCMC method, we compare different prior densities specified by different density types, location parameters, and scale parameters. These comparisons are applied to samples of different sizes generated with different success probabilities. The maximum likelihood estimator, Gelman's method, and Genkin's method complete the comparison. Our results show that three estimation methods yield estimates that are overall more precise for the logistic regression coefficients: slice-sampling MCMC with a normal prior density centered at 0 with variance 3.125; slice-sampling MCMC with a Student prior density with 3 degrees of freedom, also centered at 0 with variance 3.125; and Gelman's method with a Cauchy density centered at 0 with scale parameter 2.5.
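As a rough illustration of how a prior density enters the estimation of logistic regression coefficients, the sketch below evaluates the unnormalized log-posterior of a one-covariate model under a normal prior centered at 0 with variance 3.125 and locates its mode by Newton's method. It is a MAP (posterior mode) estimate, not the slice-sampling MCMC compared in the abstract, and the function names and toy data are hypothetical.

```python
import math

PRIOR_VARIANCE = 3.125  # normal prior N(0, 3.125) on each coefficient, as quoted in the abstract


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def log_posterior(beta, xs, ys, prior_var=PRIOR_VARIANCE):
    """Unnormalized log-posterior for P(y = 1 | x) = sigmoid(b0 + b1 * x)."""
    b0, b1 = beta
    loglik = 0.0
    for x, y in zip(xs, ys):
        z = b0 + b1 * x
        # Bernoulli log-likelihood y*z - log(1 + exp(z)), written to avoid overflow.
        loglik += y * z - (max(z, 0.0) + math.log1p(math.exp(-abs(z))))
    logprior = -(b0 * b0 + b1 * b1) / (2.0 * prior_var)
    return loglik + logprior


def map_estimate(xs, ys, prior_var=PRIOR_VARIANCE, iterations=25):
    """Posterior mode by plain Newton iterations (no line search; a toy-scale assumption)."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        # Gradient of the log-posterior, starting from the prior contribution.
        g0, g1 = -b0 / prior_var, -b1 / prior_var
        # Negative Hessian; the prior term keeps it positive definite.
        h00, h01, h11 = 1.0 / prior_var, 0.0, 1.0 / prior_var
        for x, y in zip(xs, ys):
            p = sigmoid(b0 + b1 * x)
            w = p * (1.0 - p)
            g0 += y - p
            g1 += (y - p) * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: beta += (negative Hessian)^-1 * gradient, using the 2x2 inverse.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1
```

With perfectly separated data such as `xs = [-2, -1, 1, 2]`, `ys = [0, 0, 1, 1]`, the maximum likelihood slope diverges, whereas the prior keeps the posterior mode finite, which is one reason prior choice matters for small samples.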

Relevance:

100.00% 100.00%

Abstract:

This research was undertaken with the primary objective of explaining differences in consumption of personal care products using personality variables. Several streams of reported research were reviewed and a conceptual model was developed. Theories on the relationship between self-concept and behaviour were reviewed, and the need to use individual difference variables to conceptualize and measure the salient dimensions of the self was emphasized. Theories relating to social comparison, eating disorders, the role of idealized media images in shaping the self-concept, evidence on cosmetic surgery, and persuasibility were reviewed in the study. These came from diverse fields such as social psychology, use of cosmetics, women's studies, media studies, self-concept literature in psychology and consumer research, and marketing. From the review, three basic dimensions, namely self-evaluation, self-awareness and persuasibility, were identified and posited to be related to consumption. Several personality variables from these conceptual domains were identified, and factor analysis confirmed the expected structure fitting the basic theoretical dimensions. Demographic variables such as gender and income were also considered. It was found that self-awareness, measured by the variable public self-consciousness, explains differences in consumption of personal care products. The relationship between public self-consciousness and consumption was found to be most conspicuous in cases of poor self-evaluation, measured by self-esteem. Susceptibility to advertising was also found to explain differences in consumption. From the research, it may be concluded that personality variables are useful for explaining consumption and that they must be used together to explain and understand the process. There may not be obvious and conspicuous links between individual measures and behaviour in marketing. However, when used in proper combination and with the help of theoretical models, personality offers considerable explanatory power, as illustrated by the seventy-five percent prediction accuracy obtained in binary logistic regression.

Relevance:

100.00% 100.00%

Abstract:

Summary: Productivity and forage quality of legume-grass swards are important factors for successful arable farming in both organic and conventional farming systems. For these objectives the botanical composition of the swards is of particular importance, especially the content of legumes, owing to their ability to fix atmospheric nitrogen. As legume content can vary considerably within a field, a non-destructive detection method that runs while other tasks are being carried out would facilitate more targeted sward management and could predict the nitrogen supply of the soil for the subsequent crop. This study was undertaken to explore the potential of digital image analysis (DIA) for non-destructive prediction of the legume dry matter (DM) contribution of legume-grass mixtures. For this purpose an experiment was conducted in a greenhouse, comprising 64 experimental swards: pure swards of red clover (Trifolium pratense L.), white clover (Trifolium repens L.) and lucerne (Medicago sativa L.) as well as binary mixtures of each legume with perennial ryegrass (Lolium perenne L.). Growth stages ranged from tillering to heading and the proportion of legumes from 0 to 80 %. Based on digital sward images, three steps were considered in order to estimate the legume contribution (% of DM): i) the development of a DIA procedure to estimate legume coverage (% of area); ii) the description of the relationship between legume coverage (% of area) and legume contribution (% of DM), derived from digital analysis of legume coverage related to the green area in a digital image; iii) the estimation of the legume DM contribution with the findings of i) and ii).

i) In order to identify the most suitable approach for estimating legume coverage by means of DIA, different tools were tested. Morphological operators such as erode and dilate support the differentiation of objects of different shape by shrinking and dilating objects (Soille, 1999). When applied to digital images of legume-grass mixtures, thin grass leaves were removed whereas rounder clover leaves were left. After this process legume leaves were identified by threshold segmentation. Segmentation of greyscale images proved inapplicable because it failed to separate legumes from bare soil. The advanced procedure comprising morphological operators and HSL colour information could determine bare soil areas in young and open swards very accurately. Legume-specific HSL thresholds also allowed precise estimation of legume coverage across a wide range, from 11.8 to 72.4 %. Based on this legume-specific DIA procedure, estimated legume coverage correlated well with the measured values across the whole range of sward ages (R² = 0.96, SE = 4.7 %). A wide range of form parameters (i.e. size, breadth, rectangularity, and circularity of areas) was tested across all sward types, but none improved the prediction accuracy of legume coverage significantly.

ii) Using measured reference data of legume coverage and contribution, a first approach found a common relationship based on all three legumes and sward ages of 35, 49 and 63 days, with R² = 0.90. This relationship was improved by a legume-specific approach restricted to 49- and 63-day-old swards (R² = 0.94, 0.96 and 0.97 for red clover, white clover, and lucerne, respectively), since differing structural attributes of the legume species influence the relationship between these two parameters. In a second approach, biomass was included in the model in order to allow for the different structures of swards of different ages. Hence, a model was developed that provides a close look at the relationship between legume coverage in binary legume-ryegrass communities and the legume contribution: at the same level of legume coverage, legume contribution decreased with increasing total biomass. This phenomenon may be caused by more non-leguminous biomass being covered by legume leaves at high levels of total biomass. Additionally, values of legume contribution and coverage were transformed to the logit scale in order to avoid problems with heteroscedasticity and negative predictions. The resulting relationships between the measured and the calculated legume contribution indicated high model accuracy for all legume species (R² = 0.93, 0.97, 0.98 with SE = 4.81, 3.22, 3.07 % of DM for red clover, white clover, and lucerne swards, respectively). Validation of the model using digital images collected over field-grown swards, with biomass ranges within the scope of the model, shows that the model is able to predict legume contribution for most common legume-grass swards (Frame, 1992; Ledgard and Steele, 1992; Loges, 1998).

iii) An advanced procedure for determining legume DM contribution by DIA is suggested, which includes morphological operators and HSL colour information in the analysis of images and which applies an advanced function to predict legume DM contribution from legume coverage by considering total sward biomass. Low residuals between measured and calculated values of legume dry matter contribution were found for the separate legume species (R² = 0.90, 0.94, 0.93 with SE = 5.89, 4.31, 5.52 % of DM for red clover, white clover, and lucerne swards, respectively). The introduced DIA procedure provides a rapid and precise estimation of legume DM contribution for different legume species across a wide range of sward ages. Further research is needed in order to adapt the procedure to field scale, dealing with differing light effects and potentially taller swards. The integration of total biomass into the model for determining legume contribution does not necessarily reduce its applicability in practice, as a combined estimation of total biomass by field spectroscopy (Biewer et al. 2009) and of legume coverage by DIA may allow for an accurate prediction of the legume contribution in legume-grass mixtures.
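The effect of the erode and dilate operators described in step i) can be sketched on a toy binary mask. The example below is only an illustration under stated assumptions: a 3x3 square structuring element, out-of-image pixels treated as background, and a hypothetical 8x8 mask in which a one-pixel-wide column stands in for a grass leaf and a 4x4 block for a clover leaf. The study's actual procedure additionally used HSL colour thresholds, which are not shown.

```python
NEIGHBOURHOOD = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)]  # 3x3 square (assumed)


def _is_set(mask, r, c):
    """True if (r, c) lies inside the image and is a foreground pixel."""
    return 0 <= r < len(mask) and 0 <= c < len(mask[0]) and mask[r][c] == 1


def erode(mask):
    """Keep a pixel only if every pixel of its 3x3 neighbourhood is foreground."""
    return [[int(all(_is_set(mask, r + dr, c + dc) for dr, dc in NEIGHBOURHOOD))
             for c in range(len(mask[0]))] for r in range(len(mask))]


def dilate(mask):
    """Set a pixel if any pixel of its 3x3 neighbourhood is foreground."""
    return [[int(any(_is_set(mask, r + dr, c + dc) for dr, dc in NEIGHBOURHOOD))
             for c in range(len(mask[0]))] for r in range(len(mask))]


def opening(mask):
    """Erosion followed by dilation: removes thin structures, restores compact ones."""
    return dilate(erode(mask))


def toy_sward_mask():
    """Hypothetical 8x8 mask: a 4x4 'clover' block plus a 1-pixel-wide 'grass' column."""
    mask = [[0] * 8 for _ in range(8)]
    for r in range(1, 5):
        for c in range(1, 5):
            mask[r][c] = 1
    for r in range(8):
        mask[r][6] = 1
    return mask
```

Calling `opening(toy_sward_mask())` removes the thin column while leaving the compact block in place, which mirrors the behaviour reported for grass and clover leaves.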

Relevance:

100.00% 100.00%

Abstract:

The increasing interconnection of information and communication systems leads to a further increase in complexity and thus also to a further growth in security vulnerabilities. Classical protection mechanisms such as firewall systems and anti-malware solutions have long ceased to offer protection against intrusion attempts into IT infrastructures. Intrusion detection systems (IDS) have established themselves as a very effective instrument for protection against cyber attacks. Such systems collect and analyze information from network components and hosts in order to detect unusual behavior and security violations automatically. While signature-based approaches can only detect already known attack patterns, anomaly-based IDS are also able to detect new, previously unknown attacks (zero-day attacks) at an early stage. The core problem of intrusion detection systems, however, lies in the optimal processing of the massive amount of network data and in the development of an adaptive detection model that works in real time. To meet these challenges, this dissertation provides a framework consisting of two main parts. The first part, called OptiFilter, uses a dynamic "queuing concept" to process the large volume of incoming network data, continuously constructs network connections, and exports structured input data for the IDS. The second part is an adaptive classifier comprising a classifier model based on the "Enhanced Growing Hierarchical Self Organizing Map" (EGHSOM), a model of normal network behavior (NNB), and an "update model". In OptiFilter, Tcpdump and SNMP traps are used to continuously aggregate network packets and host events. These aggregated network packets and host events are further analyzed and converted into connection vectors. To improve the detection rate of the adaptive classifier, the artificial neural network GHSOM is studied intensively and substantially extended. Different approaches are proposed and discussed in this dissertation: a classification-confidence margin threshold is defined to uncover unknown malicious connections; the stability of the growth topology is increased through novel approaches to initializing the weight vectors and through strengthening the winner neurons; and a self-adaptive procedure is introduced so that the model can be updated continuously. In addition, the main task of the NNB model is to examine further the unknown connections detected by the EGHSOM and to check whether they are normal. However, network traffic data change constantly because of the concept drift phenomenon, which produces non-stationary network data in real time. This phenomenon is better controlled by the update model. The EGHSOM model can detect new anomalies effectively, and the NNB model adapts optimally to the changes in network data. In the experimental evaluation the framework showed promising results. In the first experiment the framework was evaluated in offline mode. OptiFilter was evaluated with offline, synthetic, and realistic data. The adaptive classifier was evaluated with 10-fold cross-validation to estimate its accuracy. In the second experiment the framework was installed on a 1 to 10 GB network link and evaluated online in real time. OptiFilter successfully converted the massive amount of network data into structured connection vectors, and the adaptive classifier classified them precisely. The comparative study between the developed framework and other well-known IDS approaches shows that the proposed IDS framework outperforms all other approaches. This can be attributed to the following key points: processing of the collected network data, achieving the best performance (such as overall accuracy), detecting unknown connections, and developing a real-time detection model for intrusion attempts.
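The 10-fold cross-validation used to estimate the classifier's accuracy can be sketched generically as follows. This is a minimal illustration, not the dissertation's evaluation code: the `fit` and `predict` callables, the shuffling seed, and the round-robin fold assignment are all assumptions introduced here, and the sketch assumes at least as many samples as folds.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) pairs; each sample is tested exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Round-robin assignment gives folds whose sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


def cross_validated_accuracy(samples, labels, fit, predict, k=10):
    """Mean accuracy over k held-out folds.

    fit(train_samples, train_labels) -> model and predict(model, sample) -> label
    are hypothetical callables supplied by the caller.
    """
    scores = []
    for train, test in k_fold_indices(len(samples), k):
        model = fit([samples[i] for i in train], [labels[i] for i in train])
        hits = sum(predict(model, samples[i]) == labels[i] for i in test)
        scores.append(hits / len(test))
    return sum(scores) / len(scores)
```

Because the model is refitted on nine folds and scored on the tenth each time, the averaged score estimates accuracy on connections the classifier has not seen during training.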

Relevance:

100.00% 100.00%

Abstract:

The HMAX model has recently been proposed by Riesenhuber & Poggio as a hierarchical model of position- and size-invariant object recognition in visual cortex. It has also turned out to model successfully a number of other properties of the ventral visual stream (the visual pathway thought to be crucial for object recognition in cortex), and particularly of (view-tuned) neurons in macaque inferotemporal cortex, the brain area at the top of the ventral stream. The original modeling study only used "paperclip" stimuli, as in the corresponding physiology experiment, and did not explore systematically how model units' invariance properties depended on model parameters. In this study, we aimed at a deeper understanding of the inner workings of HMAX and its performance for various parameter settings and "natural" stimulus classes. We examined HMAX responses for different stimulus sizes and positions systematically and found a dependence of model units' responses on stimulus position for which a quantitative description is offered. Interestingly, we find that scale invariance properties of hierarchical neural models are not independent of stimulus class, as opposed to translation invariance, even though both are affine transformations within the image plane.
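The idea behind position invariance through MAX pooling, from which HMAX takes its name, can be shown with a one-dimensional toy. The sketch assumes dot-product template matching and pooling over every position, both simplifications chosen here for illustration; it is not the HMAX implementation, and unlike the real model it shows no residual dependence on stimulus position.

```python
def simple_unit_responses(signal, template):
    """Dot-product template match at every position (one position-tuned unit per offset)."""
    n = len(template)
    return [sum(s * t for s, t in zip(signal[i:i + n], template))
            for i in range(len(signal) - n + 1)]


def complex_unit_response(signal, template):
    """MAX pooling over all positions: the strongest match wins, wherever it occurs."""
    return max(simple_unit_responses(signal, template))
```

For the template `[1, 2, 1]`, the signals `[0, 1, 2, 1, 0, 0, 0, 0]` and `[0, 0, 0, 0, 1, 2, 1, 0]` produce different position-tuned responses but the same pooled response, because the maximum does not depend on where the pattern sits.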

Relevance:

100.00% 100.00%

Abstract:

A novel technique for estimating the rank of the trajectory matrix in the local subspace affinity (LSA) motion segmentation framework is presented. This new rank estimation is based on the relationship between the estimated rank of the trajectory matrix and the affinity matrix built with LSA. The result is an enhanced model selection technique for trajectory matrix rank estimation by which it is possible to automate LSA, without requiring any a priori knowledge, and to improve the final segmentation.

Relevance:

100.00% 100.00%

Abstract:

I test for the presence of hidden information and hidden action in the automobile insurance market using a data set from several Colombian insurers. To identify the presence of hidden information, I find a common-knowledge variable that provides information on the policyholder's risk type, is related to both experienced risk and insurance demand, and was excluded from the pricing mechanism. This unused variable is the record of the policyholder's traffic offenses. I find evidence of adverse selection in six of the nine insurance companies for which the test is performed. Regarding hidden action, I develop a dynamic model of effort in accident prevention given an insurance contract with a bonus experience rating scheme, and I show that individual accident probability decreases with previous accidents. This result yields a testable implication for the empirical identification of hidden action, and based on it I estimate an econometric model of the time spans between the purchase of the insurance and the first claim, between the first claim and the second one, and so on. I find strong evidence of unobserved heterogeneity that confounds the testable implication. Once the unobserved heterogeneity is controlled for, I find conclusive statistical grounds supporting the presence of moral hazard in the Colombian insurance market.

Relevance:

100.00% 100.00%

Abstract:

This article seeks empirical evidence on the determinants of health, as a measure of health capital, in a developing country after a profound reform of the health sector. It follows the Grossman (1972) model and considers institutional factors in addition to individual and socioeconomic variables. The 1997 and 2000 surveys were used, in which respondents report subjectively on their health status and their type of affiliation to the health system. The estimation procedure used is an ordered probit. The results show an important connection between individual, institutional, and socioeconomic variables and health status. The effect of the type of access to the health system puts pressure on health inequities.
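For readers unfamiliar with the ordered probit, the sketch below computes the category probabilities it assigns to an ordinal outcome such as self-reported health status. It uses the textbook form P(y = j) = Phi(c_j - x'b) - Phi(c_(j-1) - x'b); the cutpoints and the linear index passed in are hypothetical values, not estimates from the article.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def ordered_probit_probabilities(linear_index, cutpoints):
    """Probabilities of each ordered category given x'b and the cutpoints.

    With J - 1 cutpoints there are J categories; the outer bounds are -inf and +inf.
    """
    bounds = [-math.inf] + sorted(cutpoints) + [math.inf]
    cdf = [normal_cdf(b - linear_index) for b in bounds]
    return [upper - lower for lower, upper in zip(cdf, cdf[1:])]
```

A larger linear index shifts probability mass toward the higher categories, which is how covariates such as type of affiliation would move the predicted distribution of health status in a model of this kind.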

Relevance:

100.00% 100.00%

Abstract:

Detailed knowledge of waterfowl abundance and distribution across Canada is lacking, which limits our ability to effectively conserve and manage their populations. We used 15 years of data from an aerial transect survey to model the abundance of 17 species or species groups of ducks within southern and boreal Canada. We included 78 climatic, hydrological, and landscape variables in Boosted Regression Tree models, allowing flexible response curves and multiway interactions among variables. We assessed predictive performance of the models using four metrics and calculated uncertainty as the coefficient of variation of predictions across 20 replicate models. Maps of predicted relative abundance were generated from the resulting models, and they largely match spatial patterns evident in the transect data. We observed two main distribution patterns: a concentrated prairie-parkland distribution and a more dispersed pan-Canadian distribution. These patterns were congruent with the relative importance of predictor variables and model evaluation statistics between the two groups of distributions. Most species had a hydrological variable as the most important predictor, although the specific hydrological variable differed somewhat among species. In some cases, important variables had clear ecological interpretations, but in others, e.g., topographic roughness, they may simply reflect chance correlations between species distributions and environmental variables identified by the model-building process. Given the performance of our models, we suggest that the resulting prediction maps can be used in future research and to guide conservation activities, particularly within the bounds of the survey area.
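The uncertainty measure mentioned above, the coefficient of variation of predictions across replicate models, can be computed per map cell as sketched below. The nested-list layout (one list of cell predictions per replicate model) and the use of the sample standard deviation are assumptions; the abstract does not specify either, and the sketch assumes strictly positive mean predictions.

```python
import statistics


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (mean assumed non-zero)."""
    return statistics.stdev(values) / statistics.mean(values)


def uncertainty_map(replicate_predictions):
    """Per-cell CV, where replicate_predictions[m][cell] is model m's prediction for a cell."""
    return [coefficient_of_variation(cell_values)
            for cell_values in zip(*replicate_predictions)]
```

Cells where the replicate models agree closely get a CV near zero, while cells with widely differing predictions get a high CV and would be mapped as more uncertain.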

Relevance:

100.00% 100.00%

Abstract:

Understanding the effect of habitat fragmentation is a fundamental yet complicated aim of many ecological studies. Beni savanna is a naturally fragmented forest habitat, where forest islands exhibit variation in resources and threats. To understand how the availability of resources and threats affect the use of forest islands by parrots, we applied occupancy modeling to quantify use and detection probabilities for 12 parrot species on 60 forest islands. The presence of urucuri (Attalea phalerata) and macaw (Acrocomia aculeata) palms, the number of tree cavities on the islands, and the presence of selective logging and fire were included as covariates associated with availability of resources and threats. The model-selection analysis indicated that both resource and threat variables explained the use of forest islands by parrots. For most species, the best models confirmed predictions. The number of cavities was positively associated with use of forest islands by 11 species. The area of the island and the presence of macaw palm showed a positive association with the probability of use by seven and five species, respectively, while selective logging and fire showed a negative association with five and six species, respectively. The Blue-throated Macaw (Ara glaucogularis), the critically endangered parrot species endemic to our study area, was the only species that showed a negative association with both threats. Monitoring continues to be essential to evaluate conservation and management actions for parrot populations. Understanding how species use this naturally fragmented habitat will help determine which fragments should be preserved and which conservation actions are needed.
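Occupancy models separate the probability that a site is used from the probability of detecting the species on a visit. The sketch below gives a standard single-season likelihood with constant probabilities; the study itself modeled these probabilities with covariates such as cavities, palms, logging, and fire, so the constant `psi` and `p` and the example detection histories are simplifying assumptions.

```python
import math


def site_likelihood(history, psi, p):
    """Likelihood of one site's detection history (list of 0/1 per visit).

    psi: probability the site is used; p: per-visit detection probability given use.
    """
    detections = sum(history)
    visits = len(history)
    used = psi * p ** detections * (1.0 - p) ** (visits - detections)
    if detections > 0:
        return used  # at least one detection proves the site is used
    # No detections: either used but always missed, or not used at all.
    return used + (1.0 - psi)


def log_likelihood(histories, psi, p):
    """Sum of log-likelihoods over sites, the quantity maximized when fitting the model."""
    return sum(math.log(site_likelihood(h, psi, p)) for h in histories)
```

The second branch is what distinguishes occupancy modeling from a naive presence tally: an all-zero history still contributes evidence about use because non-detection is not treated as absence.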

Relevance:

100.00% 100.00%

Abstract:

The characteristics of convectively generated gravity waves during an episode of deep convection near the coast of Wales are examined in both high-resolution mesoscale simulations [with the (UK) Met Office Unified Model] and in observations from a Mesosphere-Stratosphere-Troposphere (MST) wind-profiling Doppler radar. Deep convection reached the tropopause and generated vertically propagating, high-frequency waves in the lower stratosphere that produced vertical velocity perturbations of O(1 m/s). Wavelet analysis is applied in order to determine the characteristic periods and wavelengths of the waves. In both the simulations and observations, the wavelet spectra contain several distinct preferred scales indicated by multiple spectral peaks. The peaks are most pronounced in the horizontal spectra at several wavelengths less than 50 km. Although these peaks are clearest and of largest amplitude in the highest-resolution simulations (with 1 km horizontal grid length), they are also evident in coarser simulations (with 4 km horizontal grid length). Peaks also exist in the vertical and temporal spectra (between approximately 2.5 and 4.5 km, and 10 to 30 minutes, respectively) with good agreement between simulation and observation. Two-dimensional (wavenumber-frequency) spectra demonstrate that each of the selected horizontal scales contains peaks at each of the preferred temporal scales revealed by the one-dimensional spectra alone.