967 results for Feature Model
Abstract:
Personalized medicine will revolutionize our capabilities to combat disease. Working toward this goal, a fundamental task is the deciphering of genetic variants that are predictive of complex diseases. Modern studies, in the form of genome-wide association studies (GWAS), have afforded researchers the opportunity to reveal new genotype-phenotype relationships through the extensive scanning of genetic variants. These studies typically contain over half a million genetic features for thousands of individuals. Examining such data with methods other than univariate statistics is a challenging task requiring advanced algorithms that scale to the genome-wide level. In the future, next-generation sequencing (NGS) studies will contain an even larger number of common and rare variants. Machine learning-based feature selection algorithms have been shown to create effective predictive models for various genotype-phenotype relationships. This work explores the problem of selecting the genetic variant subsets that are most predictive of complex disease phenotypes through various feature selection methodologies, including filter, wrapper, and embedded algorithms. The examined machine learning algorithms were shown not only to predict the disease phenotypes effectively, but also to do so efficiently through the use of computational shortcuts. While much of the work could be run on high-end desktops, some of it was further extended so that it could be implemented on parallel computers, helping to ensure that the methods will also scale to NGS data sets. Further, these studies analyzed the relationships between various feature selection methods and demonstrated the need for careful testing when selecting an algorithm.
It was shown that there is no universally optimal algorithm for variant selection in GWAS, but rather methodologies need to be selected based on the desired outcome, such as the number of features to be included in the prediction model. It was also demonstrated that without proper model validation, for example using nested cross-validation, the models can result in overly-optimistic prediction accuracies and decreased generalization ability. It is through the implementation and application of machine learning methods that one can extract predictive genotype–phenotype relationships and biological insights from genetic data sets.
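The nested cross-validation mentioned above can be sketched as follows. This is a minimal illustration and not the thesis's pipeline: the 1-D k-nearest-neighbour classifier, the function names (`k_folds`, `tune`, `nested_cv`), and the tuned quantity `k` are all assumptions made for the example. In a GWAS setting, the tuned quantity would more likely be something like the number of selected variants. The point is only that the choice is made inside each outer training split, so the outer test fold never influences it.

```python
import random

def k_folds(indices, k):
    """Split a list of indices into k interleaved folds."""
    return [indices[i::k] for i in range(k)]

def knn_predict(train, x, k):
    """Toy classifier (assumption): majority vote of the k nearest 1-D points."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    ones = sum(label for _, label in nearest)
    return 1 if 2 * ones > len(nearest) else 0

def accuracy(train, test, k):
    """Fraction of test points classified correctly."""
    hits = sum(knn_predict(train, x, k) == y for x, y in test)
    return hits / len(test)

def tune(data, train_idx, candidate_ks, inner):
    """Inner loop: pick k by cross-validation on the outer-training indices only."""
    def cv_score(k):
        scores = []
        for val_idx in k_folds(train_idx, inner):
            val = set(val_idx)
            fit = [data[i] for i in train_idx if i not in val]
            scores.append(accuracy(fit, [data[i] for i in val_idx], k))
        return sum(scores) / len(scores)
    return max(candidate_ks, key=cv_score)

def nested_cv(data, candidate_ks, outer=5, inner=3, seed=0):
    """Outer loop: score each tuned model on a fold it never saw during tuning."""
    idx = list(range(len(data)))
    random.Random(seed).shuffle(idx)
    outer_scores = []
    for test_idx in k_folds(idx, outer):
        held_out = set(test_idx)
        train_idx = [i for i in idx if i not in held_out]
        best_k = tune(data, train_idx, candidate_ks, inner)
        outer_scores.append(accuracy([data[i] for i in train_idx],
                                     [data[i] for i in test_idx], best_k))
    return sum(outer_scores) / len(outer_scores)
```

Reporting the inner-loop score of the winning setting instead of the outer-loop average is the shortcut that tends to produce the overly optimistic accuracies described above.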
Abstract:
This thesis discusses the basic problem of modern portfolio theory: how to find the optimal allocation for an investment portfolio. The theory provides a solution for an efficient portfolio, which minimises the portfolio's risk for a given expected return. A central feature of all the portfolios on the efficient frontier is that the investor needs to provide the expected return for each asset. Market anomalies are persistent patterns seen in the financial markets that cannot be explained by current asset pricing theory. The goal of this thesis is to study whether these anomalies can be observed among different asset classes. If persistent patterns are found, the thesis then investigates whether the anomalies hold valuable information for determining the expected returns used in the portfolio optimisation. Market anomalies and the investment strategies based on them are studied with a rolling estimation window, where the return for the following period is always based on historical information. This is also crucial when rebalancing the portfolio. The anomalies investigated in this thesis are value, momentum, reversal, and idiosyncratic volatility. The research data includes price series of country-level stock indices, government bonds, currencies, and commodities. Modern portfolio theory and the views given by the anomalies are combined by means of the Black-Litterman model. This makes it possible to optimise the portfolio so that the investor's views are taken into account. When constructing the portfolios, the goal is to maximise the Sharpe ratio. The significance of the results is studied by assessing whether the strategy yields excess returns relative to those explained by the three-factor model. The most notable finding is that anomaly-based factors contain valuable information for enhancing efficient portfolio diversification.
When the factors with the highest Sharpe ratio for each asset class are picked from the test factors and applied to the Black-Litterman model, the final portfolio achieves a superior risk-return combination. The highest Sharpe ratios are provided by the momentum strategy for stocks and by long-term reversal for the rest of the asset classes. Additionally, a strategy based on the value effect was highly appealing, and it performs essentially as well as the previously mentioned Sharpe strategy. When studying the anomalies, it is found that 12-month momentum is the strongest effect, especially for stock indices. In addition, high idiosyncratic volatility seems to be positively correlated with country stock indices.
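The step of picking the strategy with the highest Sharpe ratio can be illustrated with a short sketch. The return series below are made-up numbers, not results from the thesis, and the helper names `sharpe_ratio` and `best_strategy` are hypothetical. The ratio here is the plain per-period mean excess return over its sample standard deviation, with no annualisation.

```python
from statistics import mean, stdev

def sharpe_ratio(returns, risk_free=0.0):
    """Mean excess return divided by the sample standard deviation of excess returns."""
    excess = [r - risk_free for r in returns]
    return mean(excess) / stdev(excess)

def best_strategy(strategy_returns, risk_free=0.0):
    """Return the name of the strategy with the highest Sharpe ratio."""
    return max(strategy_returns,
               key=lambda name: sharpe_ratio(strategy_returns[name], risk_free))

# Illustrative (invented) per-period returns for two anomaly-based strategies.
example_returns = {
    "momentum": [0.02, 0.01, 0.03, 0.02],
    "reversal": [0.05, -0.04, 0.06, -0.03],
}
```

With these invented inputs, `best_strategy(example_returns)` selects `"momentum"`, because its returns are far less volatile even though its best single period is lower than that of `"reversal"`.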
Abstract:
Finnish Defence Studies is published under the auspices of the War College, and the contributions reflect the fields of research and teaching of the College. Finnish Defence Studies will occasionally feature documentation on Finnish Security Policy. Views expressed are those of the authors and do not necessarily imply endorsement by the War College.
Abstract:
The purpose of this study is to examine the impact of the choice of cut-off points, sampling procedures, and the business cycle on the accuracy of bankruptcy prediction models. Misclassification can result in erroneous predictions, leading to prohibitive costs to firms, investors, and the economy. To test the impact of the choice of cut-off points and sampling procedures, three bankruptcy prediction models are assessed: Bayesian, Hazard, and Mixed Logit. A salient feature of the study is that the analysis includes both parametric and nonparametric bankruptcy prediction models. A sample of U.S. firms from the Lynn M. LoPucki Bankruptcy Research Database was used to evaluate the relative performance of the three models. The choice of cut-off point and the sampling procedure were found to affect the rankings of the various models. In general, the results indicate that the empirical cut-off point estimated from the training sample resulted in the lowest misclassification costs for all three models. Although the Hazard and Mixed Logit models resulted in lower misclassification costs in the randomly selected samples, the Mixed Logit model did not perform as well across varying business cycles. In general, the Hazard model has the highest predictive power. However, the higher predictive power of the Bayesian model when the ratio of the cost of Type I errors to the cost of Type II errors is high is relatively consistent across all sampling methods. Such an advantage may make the Bayesian model more attractive in the current economic environment. This study extends recent research comparing the performance of bankruptcy prediction models by identifying the conditions under which a model performs better. It also addresses the concerns of a range of user groups, including auditors, shareholders, employees, suppliers, rating agencies, and creditors, with respect to assessing failure risk.
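One way to read "the empirical cut-off point estimated from the training sample" is as the threshold that minimises total misclassification cost on the training data. The sketch below illustrates that reading. It assumes that a Type I error means classifying a bankrupt firm as healthy and a Type II error means flagging a healthy firm as bankrupt (a common convention, not stated in the abstract). The function names and the sample scores are hypothetical.

```python
def misclassification_cost(scores, labels, cutoff, cost_type1, cost_type2):
    """Total cost when firms with score >= cutoff are predicted bankrupt.

    labels: 1 = bankrupt, 0 = healthy (assumed encoding).
    """
    cost = 0.0
    for score, label in zip(scores, labels):
        predicted = 1 if score >= cutoff else 0
        if label == 1 and predicted == 0:
            cost += cost_type1   # Type I (assumed): missed bankruptcy
        elif label == 0 and predicted == 1:
            cost += cost_type2   # Type II (assumed): false alarm
    return cost

def empirical_cutoff(scores, labels, cost_type1, cost_type2):
    """Pick the candidate cut-off with the lowest training-sample cost."""
    # Candidates: every observed score, plus "never predict bankruptcy".
    candidates = sorted(set(scores)) + [float("inf")]
    return min(candidates,
               key=lambda c: misclassification_cost(scores, labels, c,
                                                    cost_type1, cost_type2))

# Invented training scores (predicted bankruptcy probabilities) and outcomes.
train_scores = [0.1, 0.2, 0.4, 0.6, 0.8, 0.9]
train_labels = [0, 0, 1, 0, 1, 1]
```

On this toy sample the chosen cut-off moves from 0.4 to 0.8 when the cost ratio flips from favouring Type I avoidance (10:1) to favouring Type II avoidance (1:10), which is the kind of sensitivity to the cost ratio that the abstract describes.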
Abstract:
This lexical decision study with eye tracking of Japanese two-kanji-character words investigated the order in which a whole two-character word and its morphographic constituents are activated in the course of lexical access, the relative contributions of the left and the right characters in lexical decision, the depth to which semantic radicals are processed, and how nonlinguistic factors affect lexical processes. Mixed-effects regression analyses of response times and subgaze durations (i.e., first-pass fixation time spent on each of the two characters) revealed joint contributions of morphographic units at all levels of the linguistic structure with the magnitude and the direction of the lexical effects modulated by readers’ locus of attention in a left-to-right preferred processing path. During the early time frame, character effects were larger in magnitude and more robust than radical and whole-word effects, regardless of the font size and the type of nonwords. Extending previous radical-based and character-based models, we propose a task/decision-sensitive character-driven processing model with a level-skipping assumption: Connections from the feature level bypass the lower radical level and link up directly to the higher character level.
Abstract:
I study long-term financial contracts between lenders and borrowers in the absence of perfect enforceability and when both parties are credit constrained. Borrowers repeatedly have projects to undertake and need external financing. Lenders can commit to contractual agreements, whereas borrowers can renege in any period. I show that equilibrium contracts feature interesting dynamics: the economy exhibits efficient investment cycles; the absence of perfect enforcement and a shortage of capital skew the cycles toward states of liquidity drought; and credit is rationed if either the lender has too little capital or the borrower has too little collateral. This paper's technical contribution is its demonstration of the existence and characterization of financial contracts that are solutions to a non-convex dynamic programming problem.
Abstract:
This master's thesis presents a new unsupervised approach for detecting and segmenting urban regions in hyperspectral images. The proposed method comprises three steps. First, to reduce the computational cost of our algorithm, a color image of the spectral content is estimated. To this end, a nonlinear dimensionality reduction step, based on two complementary but conflicting criteria of good visualization, namely accuracy and contrast, is carried out for the color display of each hyperspectral image. Next, to discriminate urban from non-urban regions, the second step extracts a few discriminative (and complementary) features from this color hyperspectral image. To this end, we extracted a series of discriminative parameters to describe the characteristics of an urban area, which is mainly composed of man-made objects with simple, regular geometric shapes. We used textural features based on gray levels, gradient magnitude, or parameters derived from the co-occurrence matrix, combined with structural features based on the local orientation of the image gradient and the local detection of straight line segments. To further reduce the computational complexity of our approach and to avoid the "curse of dimensionality" that arises when clustering high-dimensional data, we chose to classify each textural or structural feature individually, in the final step, with a simple K-means procedure, and then to combine these coarse, cheaply obtained segmentations with an efficient segmentation-map fusion model.
The experiments reported here show that this strategy is visually effective and compares favorably with other methods for detecting and segmenting urban areas from hyperspectral images.
Abstract:
Supervised learning of large-scale hierarchical networks is currently enjoying tremendous success. Despite this momentum, unsupervised learning remains, according to many researchers, a key element of Artificial Intelligence, where agents must learn from a potentially limited amount of data. This thesis follows that line of thought and addresses several research topics related to the problem of density estimation with Boltzmann machines (BMs), probabilistic graphical models at the heart of deep learning. Our contributions touch on sampling, partition function estimation, optimization, and the learning of invariant representations. The thesis begins by presenting a new adaptive sampling algorithm that automatically adjusts the temperature of the simulated Markov chains in order to maintain a high convergence speed throughout learning. When used in the context of stochastic maximum likelihood (SML) learning, our algorithm yields greater robustness to the choice of learning rate as well as faster convergence. Our results are presented for BMs, but the method is general and applies to the learning of any probabilistic model that relies on Markov chain sampling. While the maximum likelihood gradient can be approximated by sampling, evaluating the log-likelihood requires an estimate of the partition function. Unlike traditional approaches that treat a given model as a black box, we propose instead to exploit the dynamics of learning by estimating the successive changes in log-partition incurred at each parameter update.
The estimation problem is reformulated as an inference problem similar to Kalman filtering, but on a two-dimensional graph whose dimensions correspond to the time axis and the temperature parameter. On the topic of optimization, we also present an algorithm that efficiently applies the natural gradient to Boltzmann machines with thousands of units. Until now, its adoption was limited by its high computational cost and memory requirements. Our algorithm, Metric-Free Natural Gradient (MFNG), avoids explicitly computing the Fisher information matrix (and its inverse) by exploiting a linear solver combined with an efficient matrix-vector product. The algorithm is promising: in terms of the number of function evaluations, MFNG converges faster than SML. Unfortunately, its implementation remains inefficient in terms of computation time. This work also explores the mechanisms underlying the learning of invariant representations. To this end, we use the family of "spike & slab" restricted Boltzmann machines (ssRBM), which we modify so that they can model binary and sparse distributions. The binary latent variables of the ssRBM can be made invariant to a vector subspace by associating with each of them a vector of continuous latent variables (called "slabs"). This results in increased invariance at the representation level and a better classification rate when little labeled data is available. We conclude the thesis with an ambitious topic: learning representations that can separate the factors of variation present in the input signal. We propose a solution based on a bilinear ssRBM (with two groups of latent factors) and formulate the problem as one of "pooling" in complementary vector subspaces.
Pro-inflammatory and angiogenic activities of VEGF and angiopoietins in murine sponge/Matrigel model
Abstract:
Dysregulation of the formation and integrity of blood vessels can lead to a pathological state, as observed in many ischemic diseases such as solid tumor growth, rheumatoid arthritis, psoriasis, retinopathies, and atherosclerosis. Consequently, the ability to modulate regional angiogenesis in patients suffering from ischemia is clinically relevant. A key element in the induction of pathological angiogenesis is the inflammation that precedes and accompanies the formation of new vessels. This phenomenon is demonstrated by increased vascular permeability and the recruitment of monocytes/macrophages and polymorphonuclear cells (neutrophils). In collaboration with other groups, we have shown that various growth factors, such as vascular endothelial growth factor and the angiopoietins, can not only promote angiogenesis but also induce various steps related to the inflammatory response, including the synthesis and release of inflammatory mediators and neutrophil migration. The objectives of our study were to determine whether vascular endothelial growth factor (VEGF) and the angiopoietins (Ang1 and Ang2) are able to promote the formation of new blood vessels over time, and to identify the presence of different inflammatory cells in this process. Sterilized polyvinyl alcohol sponges soaked with growth factor-depleted Matrigel (containing PBS, VEGF, Ang1, or Ang2 (200 ng/200 μl)) were inserted under the skin of anesthetized C57/Bl6 mice. The sponges were then removed on days 4, 7, 14, or 21 after the procedure for histological, immunohistological, and cytometric analyses.
The formation of new vessels was validated by Masson's trichrome staining and by histological and immunohistological analyses targeting endothelial cells (anti-CD31). In addition, vessel maturation was demonstrated by sequential staining for endothelial cells (anti-CD31) and smooth muscle cells (anti-alpha-actin). We followed the same procedure to characterize the recruitment of neutrophils (anti-MPO) and macrophages (anti-F4/80). To better delineate the different leukocyte subsets recruited into the sponges, we used flow cytometry on cell preparations isolated from these sponges. We observed that VEGF and the angiopoietins promote the recruitment of endothelial cells and the formation of new vessels more rapidly than PBS does. Once formed by day 7, these new vessels remain stable in number and do not undergo major reorganization of their surface area. The vessels mature through the recruitment of smooth muscle cells, which cover the neovessels. Furthermore, the angiogenic microenvironment is composed of inflammatory cells, mainly neutrophils and macrophages, along with a few B and T cells. Thus, VEGF, Ang1, and Ang2 each separately induce the formation and stabilization of new blood vessels, as well as the recruitment of inflammatory cells, with different potencies and a time-dependent action in a sponge/Matrigel model.
Abstract:
This thesis presents a methodology for linking Total Productive Maintenance (TPM) and Quality Function Deployment (QFD). The synergy of TPM and QFD led to the formation of a new maintenance model named Maintenance Quality Function Deployment (MQFD). This model was found to be powerful enough to overcome the drawbacks of TPM by taking customer voices into account. These customer voices are used to develop the house of quality. The outputs of the house of quality, which are in the form of technical languages, are submitted to top management for strategic decision making. The technical languages, which are concerned with enhancing maintenance quality, are strategically directed by top management towards their adoption through the eight TPM pillars. The TPM characteristics developed through the eight pillars are fed into the production system, where their implementation is focused on increasing the values of the maintenance quality parameters, namely overall equipment efficiency (OEE), mean time between failures (MTBF), mean time to repair (MTTR), performance quality, availability, and mean down time (MDT). The outputs from the production system are required to be reflected in the form of business values, namely improved maintenance quality, increased profit, upgraded core competence, and enhanced goodwill. A unique feature of the MQFD model is that it is not necessary to change or dismantle the existing processes for developing the house of quality and TPM projects, which may already be in practice in the company concerned. Thus, the MQFD model enables the tactical marriage of QFD and TPM. First, the literature was reviewed. The results of this review indicated that no activities had so far been reported on integrating QFD into TPM or vice versa. During the second phase, a survey was conducted in six companies in which TPM had been implemented.
The objective of this survey was to locate any traces of QFD implementation in the TPM programmes being implemented in these companies. The survey results indicated that no effort to integrate QFD into TPM had been made in these companies. After completing these two phases, the MQFD model was designed. The details are presented in this research work. Following this, exploratory studies on implementing the MQFD model in real-time environments were conducted. In addition, an empirical study was carried out to examine the receptivity of the MQFD model among practitioners and across diverse organizational cultures. Finally, a sensitivity analysis was conducted to find the hierarchy of the various factors influencing MQFD in a company. Throughout the research work, the theory and practice of MQFD were juxtaposed by presenting and publishing papers in scholarly communities and by conducting case studies in real-time scenarios.
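The maintenance quality parameters named in the abstract are related by standard textbook formulas, which the sketch below spells out. The abstract itself gives no formulas, so the definitions used here (availability as MTBF / (MTBF + MTTR), and overall equipment efficiency as the product of availability, performance rate, and quality rate) are the commonly used ones rather than the thesis's own. The function names and sample figures are hypothetical.

```python
def availability(mtbf_hours, mttr_hours):
    """Share of time the equipment is up: MTBF / (MTBF + MTTR)."""
    return mtbf_hours / (mtbf_hours + mttr_hours)

def overall_equipment_efficiency(avail, performance, quality):
    """OEE as the product of availability, performance rate, and quality rate."""
    return avail * performance * quality

# Invented figures: 90 h between failures, 10 h to repair,
# 95% performance rate, 99% quality rate.
example_availability = availability(90, 10)
example_oee = overall_equipment_efficiency(example_availability, 0.95, 0.99)
```

Because OEE is a product, a gain in any one factor (for example, a shorter repair time raising availability) lifts the overall figure proportionally.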
Abstract:
Geometric parameters of binary (1:1) PdZn and PtZn alloys with CuAu-L10 structure were calculated with a density functional method. Based on the total energies, the alloys are predicted to feature equal formation energies. Calculated surface energies of PdZn and PtZn alloys show that (111) and (100) surfaces exposing stoichiometric layers are more stable than (001) and (110) surfaces comprising alternating Pd (Pt) and Zn layers. The surface energy values of the alloys lie between the surface energies of the individual components, but they differ from their composition-weighted averages. Compared with the pure metals, the valence d-band widths and the Pd or Pt partial densities of states at the Fermi level are dramatically reduced in PdZn and PtZn alloys. The local valence d-band density of states of Pd and Pt in the alloys resembles that of metallic Cu, suggesting that a similar catalytic performance of these systems can be related to this similarity in the local electronic structures.
Effectiveness Of Feature Detection Operators On The Performance Of Iris Biometric Recognition System
Abstract:
Iris recognition is a highly efficient biometric identification technique with great future potential in the area of security systems. Its robustness and unobtrusiveness, in contrast to most currently deployed systems, make it a good candidate to replace most existing security systems. By making use of the distinctiveness of iris patterns, iris recognition systems obtain a unique mapping for each person. This person can then be identified by applying an appropriate matching algorithm. In this paper, Daugman's Rubber Sheet model is employed for iris normalization and unwrapping, a descriptive statistical analysis of different feature detection operators is performed, the extracted features are encoded using Haar wavelets, and Hamming distance is used as the matching algorithm for classification. The system was tested on the UBIRIS database. The Canny edge detection algorithm is found to be the best at extracting most of the iris texture. The success rate of feature detection using Canny is 81%, the False Accept Rate is 9%, and the False Reject Rate is 10%.
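The Hamming-distance matching step can be sketched as a fractional bit disagreement between two binary iris codes. This is a simplified illustration: it omits the noise masks and rotational shifts that practical iris matchers usually add, the function names are hypothetical, and the decision threshold of 0.32 is an assumed placeholder rather than a value from the paper.

```python
def hamming_distance(code_a, code_b):
    """Fraction of positions at which two equal-length bit strings differ."""
    if len(code_a) != len(code_b):
        raise ValueError("iris codes must have the same length")
    differing = sum(a != b for a, b in zip(code_a, code_b))
    return differing / len(code_a)

def is_match(code_a, code_b, threshold=0.32):
    """Accept the pair as the same iris when the distance is below the threshold.

    The default threshold is an illustrative assumption, not the paper's value.
    """
    return hamming_distance(code_a, code_b) < threshold

# Toy 8-bit codes; real iris codes are far longer.
enrolled = "10110010"
probe_same = "10110110"    # differs from enrolled in one bit
probe_other = "01001101"   # differs from enrolled in every bit
```

Lowering the threshold trades a lower False Accept Rate for a higher False Reject Rate, which is the balance the reported 9% and 10% figures reflect.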
Abstract:
Treating email filtering as a binary text classification problem, researchers have applied several statistical learning algorithms to email corpora with promising results. This paper examines the performance of a Naive Bayes classifier using different approaches to feature selection and tokenization on different email corpora.
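A minimal sketch of the kind of classifier described: a multinomial Naive Bayes model with add-one (Laplace) smoothing over word tokens. The tokenizer, the class name `NaiveBayes`, and the four training messages are assumptions made for illustration. The paper's actual tokenization and feature selection variants are not specified in the abstract.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """One possible tokenization (assumption): lowercase alphanumeric words."""
    return re.findall(r"[a-z0-9']+", text.lower())

class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = {c: Counter() for c in self.class_counts}
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            total_words = sum(self.word_counts[label].values())
            score = math.log(n_docs / total_docs)          # log prior
            for word in tokenize(text):
                if word in self.vocab:                      # skip unseen words
                    count = self.word_counts[label][word]
                    score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Invented four-message corpus.
model = NaiveBayes().fit(
    ["win cash prize now", "cheap prize offer",
     "meeting agenda attached", "lunch meeting tomorrow"],
    ["spam", "spam", "ham", "ham"],
)
```

Swapping `tokenize` for a different scheme (character n-grams, header-aware splitting) or restricting `vocab` to a selected subset of words are the two knobs that correspond to the tokenization and feature selection choices the paper compares.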
Abstract:
Thunderstorms, which result from vigorous convective activity, are among the most spectacular weather phenomena in the atmosphere. A common feature of the weather during the pre-monsoon season over the Indo-Gangetic Plain and northeast India is the outburst of severe local convective storms, commonly known as "Nor'westers" (as they move from northwest to southeast). Severe thunderstorms associated with thunder, squall lines, lightning, and hail cause extensive losses in agriculture, damage to structures, and loss of life. In this paper, sensitivity experiments have been conducted with the Non-hydrostatic Mesoscale Model (NMM) to test the impact of three microphysical schemes in capturing the severe thunderstorm event that occurred over Kolkata on 15 May 2009. The results show that the WRF-NMM model with the Ferrier microphysical scheme appears to reproduce the cloud and precipitation processes more realistically than the other schemes. We have also attempted to diagnose four severe thunderstorms that occurred during the pre-monsoon seasons of 2006, 2007, and 2008 through the simulated radar reflectivity fields from the NMM model with the Ferrier microphysics scheme, and validated the model results against Kolkata Doppler Weather Radar (DWR) observations. Composite radar reflectivity simulated by the WRF-NMM model clearly shows the severe thunderstorm movement as observed in DWR imagery, but fails to capture the observed intensity. The results of these analyses demonstrate the capability of the high-resolution WRF-NMM model in simulating severe thunderstorm events and indicate that the 3 km model improves upon current abilities to simulate severe thunderstorms over the east Indian region.
Abstract:
This thesis describes the development of a model-based vision system that exploits hierarchies of both object structure and object scale. The focus of the research is to use these hierarchies to achieve robust recognition based on effective organization and indexing schemes for model libraries. The goal of the system is to recognize parameterized instances of non-rigid model objects contained in a large knowledge base despite the presence of noise and occlusion. Robustness is achieved by developing a system that can recognize viewed objects that are scaled or mirror-image instances of the known models, or that contain component sub-parts with different relative scaling, rotation, or translation than in the models. The approach taken in this thesis is to develop an object shape representation that incorporates a component sub-part hierarchy (to allow for efficient and correct indexing into an automatically generated model library, as well as for relative parameterization among sub-parts) and a scale hierarchy (to allow for a general-to-specific recognition procedure). After analysis of the issues and inherent tradeoffs in the recognition process, a system is implemented using a representation based on significant contour curvature changes and a recognition engine based on geometric constraints of feature properties. Examples of the system's performance are given, followed by an analysis of the results. In conclusion, the system's benefits and limitations are presented.