897 results for Transformation-based semi-parametric estimators
Abstract:
The Enquête rétrospective sur les travailleurs sélectionnés au Québec (a retrospective survey of selected workers in Quebec) made it possible to analyse the education-employment relationship of immigrant women who arrived as principal applicants, and to examine their employment trajectories in comparison with those of their male counterparts. Particular attention is paid to the effects of gender and region of origin, as well as to the interaction between these two variables. Semi-parametric Cox models highlight how individual characteristics, together with training activities in the host society, affect over time the relative risks of obtaining a first job that matches one's pre-migration educational qualifications. Linear regressions then describe the determinants of salary after two years in the territory. The results show that access to qualified employment does not differ according to whether the immigrant is a man or a woman. Within-group differences do appear, however, according to region of origin, with a clear advantage for immigrants from Western Europe and the United States. Access to a first job (regardless of qualifications) and salary, for their part, reveal gender-based differences, to the disadvantage of women. Among women, entry into employment is similar across regional groups, whereas the groups of men are more heterogeneous. Moreover, certain individual characteristics, such as knowledge of French and admission category, affect immigrant men and immigrant women differently in access to a first job.
Abstract:
Multivariate lifetime data arise in various forms, including recurrent event data, when individuals are followed to observe the sequence of occurrences of a certain type of event, and correlated lifetime data, when an individual is followed for the occurrence of two or more types of events or when distinct individuals have dependent event times. In most studies there are covariates, such as treatments, group indicators, individual characteristics, or environmental conditions, whose relationship to lifetime is of interest. This leads to a consideration of regression models. The well-known Cox proportional hazards model and its variations based on marginal hazard functions, which are used in the literature to analyse multivariate survival data, are not sufficient to explain the complete dependence structure of a pair of lifetimes on the covariate vector. Motivated by this, in Chapter 2 we introduced a bivariate proportional hazards model using the vector hazard function of Johnson and Kotz (1975), in which the covariates under study have different effects on the two components of the vector hazard function. The proposed model is useful in real-life situations for studying the dependence structure of a pair of lifetimes on the covariate vector. The well-known partial likelihood approach is used to estimate the parameter vectors. We then introduced a bivariate proportional hazards model for gap times of recurrent events in Chapter 3. The model incorporates both marginal and joint dependence of the distribution of gap times on the covariate vector. In many fields of application, the mean residual life function is considered a superior concept to the hazard function. Motivated by this, in Chapter 4 we considered a new semi-parametric model, the bivariate proportional mean residual life model, to assess the relationship between mean residual life and covariates for gap times of recurrent events.
The counting process approach is used for the inference procedures for the gap times of recurrent events. In many survival studies, the distribution of lifetime may depend on the distribution of censoring time. In Chapter 5, we introduced a proportional hazards model for duration times and developed inference procedures under dependent (informative) censoring. In Chapter 6, we introduced a bivariate proportional hazards model for competing risks data under right censoring. The asymptotic properties of the estimators of the parameters of the different models developed in the previous chapters were studied. The proposed models were applied to various real-life situations.
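As a rough illustration of the partial likelihood approach mentioned in this abstract, the sketch below evaluates the standard univariate Cox partial log-likelihood. It is not the bivariate vector-hazard model of the thesis. The function name `cox_partial_loglik` and the toy data are assumptions made for illustration, and tied event times are assumed to be absent.

```python
import math


def cox_partial_loglik(beta, times, events, covariates):
    """Univariate Cox partial log-likelihood, assuming no tied event times.

    beta       -- list of regression coefficients
    times      -- observed times (event or censoring)
    events     -- 1 if the time is an observed event, 0 if right-censored
    covariates -- one list of covariate values per subject
    """
    # Linear predictor x_i' beta for every subject.
    eta = [sum(b * x for b, x in zip(beta, row)) for row in covariates]
    loglik = 0.0
    for i, (t_i, d_i) in enumerate(zip(times, events)):
        if not d_i:
            continue  # censored subjects contribute only through risk sets
        # Risk set: everyone still under observation at time t_i.
        risk = sum(math.exp(eta[j]) for j, t_j in enumerate(times) if t_j >= t_i)
        loglik += eta[i] - math.log(risk)
    return loglik


# Hypothetical toy data: three subjects, one binary covariate.
toy_times = [1.0, 2.0, 3.0]
toy_events = [1, 0, 1]
toy_x = [[0.0], [1.0], [0.0]]
print(cox_partial_loglik([0.0], toy_times, toy_events, toy_x))
```

In practice the coefficients would be obtained by maximising this function numerically. The sketch only shows how each observed event contributes its own linear predictor minus the log of the risk-set sum.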
Abstract:
In this paper we propose a cryptographic transformation based on matrix manipulations for image encryption. Substitution and diffusion operations, based on the matrix, facilitate fast conversion of plaintext and images into ciphertext and cipher images. The paper describes the encryption algorithm, discusses the simulation results and compares them with results obtained using the Advanced Encryption Standard (AES). It is shown that the proposed algorithm is capable of encrypting images eight times faster than AES.
Beyond infrastructure: the impact of public libraries on the quality of education
Abstract:
The literature on the quality of education has paid little attention to the role of public libraries among the determinants of educational performance. Public libraries are assets external to the student's school and home, but they form part of the surrounding social environment. The opening in late 2001 of three large libraries in Bogotá, known as megalibraries (megabibliotecas), allows us to analyse the impact of these initiatives on the quality of education in nearby schools. This impact would operate through mechanisms beyond the simple reduction in the cost of access to information: the libraries renewed public space by creating pleasant, education-friendly places, and they regularly offer recreational activities aimed at local residents. We use the distance from the school to the library as a proxy for the cost of access, applying Difference-in-Differences together with the Blinder-Oaxaca decomposition. We find that the libraries appear not to have had a significant impact on overall academic performance in the official SABER 11 examinations in the years following their implementation. We recommend analysing specific programmes that use the libraries for school activities, as well as other possible outcome variables such as attitudes towards study and aspirations to higher education.
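The Difference-in-Differences idea named in this abstract can be sketched with simple arithmetic: the change in the group exposed to the intervention minus the change in the comparison group. The sketch below is only that basic two-group, two-period estimator. The grouping into "near" and "far" schools, the function name `did_estimate`, and all scores are hypothetical and are not taken from the study, which additionally uses the Blinder-Oaxaca decomposition.

```python
from statistics import mean


def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """Two-group, two-period difference-in-differences on group means."""
    return (treated_post - treated_pre) - (control_post - control_pre)


# Hypothetical test scores (NOT data from the study).
near_before = [49.0, 51.0]   # schools close to a library, before opening
near_after = [53.0, 55.0]    # same schools, after opening
far_before = [47.0, 49.0]    # schools far from any library, before opening
far_after = [50.0, 52.0]     # same schools, after opening

effect = did_estimate(
    mean(near_before), mean(near_after), mean(far_before), mean(far_after)
)
print(effect)
```

The estimate is meaningful only under the usual parallel-trends assumption, namely that both groups would have evolved alike in the absence of the intervention.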
Abstract:
The work reported in this paper is motivated by the need to investigate general methods for pattern transformation. A formal definition of pattern transformation is provided, and four special cases are introduced, namely elementary and geometric transformations based on repositioning all or some of the agents in the pattern. The need for a mathematical tool and for simulations to visualize the behavior of a transformation method is highlighted. A mathematical method based on the Moebius transformation is proposed. The transformation method involves discretization of events for planning the paths of individual robots in a pattern. Simulations on a particle physics simulator are used to validate the feasibility of the proposed method.
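To make the Moebius transformation concrete, the sketch below applies the map f(z) = (az + b) / (cz + d), with ad - bc nonzero, to agent positions encoded as complex numbers. This is only the textbook map. The paper's path planning and event discretization are not reproduced, and the names `moebius` and `transform_pattern` and the square pattern are assumptions made for illustration.

```python
def moebius(z, a, b, c, d):
    """Apply the Moebius map f(z) = (a*z + b) / (c*z + d) to a complex point.

    Requires a*d - b*c != 0. A point at the pole z = -d/c raises
    ZeroDivisionError, because this sketch does not model the point at infinity.
    """
    if a * d - b * c == 0:
        raise ValueError("degenerate map: a*d - b*c must be nonzero")
    return (a * z + b) / (c * z + d)


def transform_pattern(points, a, b, c, d):
    """Map every agent position in a pattern through the same Moebius map."""
    return [moebius(z, a, b, c, d) for z in points]


# Hypothetical pattern: four agents on the corners of a square.
square = [1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j]

# Inversion z -> 1/z (a=0, b=1, c=1, d=0) moves each corner toward the origin.
print(transform_pattern(square, 0, 1, 1, 0))
```

Choosing c = 0 and d = 1 reduces the map to a scaling, rotation and translation, so simple geometric repositioning is a special case of the same formula.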
Abstract:
A modified radial basis function (RBF) neural network and its identification algorithm based on observational data with heterogeneous noise are introduced. The Box-Cox transformed system output is represented by the RBF neural network. To identify the model from observational data, the singular value decomposition of the full regression matrix, consisting of basis functions formed from the system input data, is first carried out. A new fast identification method is then developed using the Gauss-Newton algorithm to derive the required Box-Cox transformation, based on a maximum likelihood estimator (MLE) for a model base spanned by the eigenvectors associated with the largest eigenvalues. Finally, the Box-Cox transformation-based RBF neural network, with good generalisation and sparsity, is identified based on the derived optimal Box-Cox transformation and an orthogonal forward regression algorithm that uses a pseudo-PRESS statistic to select a sparse RBF model with good generalisation. The proposed algorithm and its efficacy are demonstrated with numerical examples.
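For readers unfamiliar with the Box-Cox transformation referred to in this abstract, the sketch below implements the standard one-parameter form, (y^lambda - 1) / lambda for nonzero lambda and log(y) for lambda = 0, together with its inverse. It does not reproduce the paper's Gauss-Newton estimation of lambda or the RBF network. The function names `box_cox` and `inv_box_cox` are assumptions made for illustration.

```python
import math


def box_cox(y, lam):
    """Standard one-parameter Box-Cox transform of a positive value y."""
    if y <= 0:
        raise ValueError("Box-Cox requires y > 0")
    if lam == 0:
        return math.log(y)
    return (y ** lam - 1.0) / lam


def inv_box_cox(z, lam):
    """Inverse of box_cox; valid when lam * z + 1 > 0 (or lam == 0)."""
    if lam == 0:
        return math.exp(z)
    return (lam * z + 1.0) ** (1.0 / lam)


# Hypothetical system outputs, transformed with lambda = 0.5.
outputs = [1.0, 4.0, 9.0]
transformed = [box_cox(y, 0.5) for y in outputs]
print(transformed)
print([inv_box_cox(z, 0.5) for z in transformed])
```

The transform is typically applied to stabilise the noise variance of the output before a model is fitted, and the inverse maps model predictions back to the original scale.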
Abstract:
We address the problem of automatically identifying and restoring damaged and contaminated images. We suggest a novel approach based on a semi-parametric model. This has two components, a parametric component describing known physical characteristics and a more flexible non-parametric component. The latter avoids the need for a detailed model for the sensor, which is often costly to produce and lacking in robustness. We assess our approach using an analysis of electroencephalographic images contaminated by eye-blink artefacts and highly damaged photographs contaminated by non-uniform lighting. These experiments show that our approach provides an effective solution to problems of this type.
Abstract:
In this article, we introduce a semi-parametric Bayesian approach based on Dirichlet process priors for the discrete calibration problem in binomial regression models. An interesting topic is the dosimetry problem related to the dose-response model. A hierarchical formulation is provided so that a Markov chain Monte Carlo approach is developed. The methodology is applied to simulated and real data.
Abstract:
This paper presents a methodology to estimate and identify different kinds of economic interaction, whenever these interactions can be established in the form of spatial dependence. First, we apply the semi-parametric approach of Chen and Conley (2001) to the estimation of reaction functions. Then, the methodology is applied to the analysis of financial providers in Thailand. Based on a sample of financial institutions, we provide an economic framework to test whether the actual spatial pattern is compatible with strategic competition (local interactions) or social planning (global interactions). Our estimates suggest that the provision of commercial banks and suppliers credit access is determined by spatial competition, while the Thai Bank of Agriculture and Agricultural Cooperatives is distributed as in a social planner problem.
Abstract:
The goal of this paper is to introduce a class of tree-structured models that combines aspects of regression trees and smooth transition regression models. The model is called the Smooth Transition Regression Tree (STR-Tree). The main idea relies on specifying a multiple-regime parametric model through a tree-growing procedure with smooth transitions among different regimes. Decisions about splits are entirely based on a sequence of Lagrange Multiplier (LM) tests of hypotheses.
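The notion of a smooth transition between regimes can be sketched for a single split. A logistic function of the splitting variable weights two regime values, so the prediction moves gradually from one regime to the other instead of jumping at a threshold. The sketch below is a generic one-split illustration, not the STR-Tree estimation or the LM testing sequence. The function names, the parameter names `gamma` (slope) and `c` (location), and the constant regime values are assumptions.

```python
import math


def logistic_transition(x, gamma, c):
    """Logistic weight in (0, 1); larger gamma gives a sharper transition at c."""
    return 1.0 / (1.0 + math.exp(-gamma * (x - c)))


def one_split_prediction(x, beta_low, beta_high, gamma, c):
    """Blend two regime values using the logistic weight of the split variable."""
    g = logistic_transition(x, gamma, c)
    return beta_low * (1.0 - g) + beta_high * g


# Hypothetical regimes: value 1.0 well below c = 2.0, value 3.0 well above it.
for x in (1.0, 2.0, 3.0):
    print(x, one_split_prediction(x, 1.0, 3.0, gamma=5.0, c=2.0))
```

As `gamma` grows, the weight approaches a step function and the model behaves like an ordinary regression-tree split. Small values of `gamma` give a gradual blend of the two regimes.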
Abstract:
Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)
Abstract:
Concept drift is a problem of increasing importance in machine learning and data mining. Data sets under analysis are no longer only static databases, but also data streams in which concepts and data distributions may not be stable over time. However, most learning algorithms produced so far are based on the assumption that data comes from a fixed distribution, so they are not suitable for handling concept drift. Moreover, some concept drift applications require fast response, which means an algorithm must always be (re)trained with the latest available data. But the process of labeling data is usually expensive and/or time consuming when compared to unlabeled data acquisition, thus only a small fraction of the incoming data may be effectively labeled. Semi-supervised learning methods may help in this scenario, as they use both labeled and unlabeled data in the training process. However, most of them are also based on the assumption that the data is static. Therefore, semi-supervised learning with concept drift is still an open challenge in machine learning. Recently, a particle competition and cooperation approach was used to realize graph-based semi-supervised learning from static data. In this paper, we extend that approach to handle data streams and concept drift. The result is a passive algorithm using a single classifier, which naturally adapts to concept changes without any explicit drift detection mechanism. Its built-in mechanisms provide a natural way of learning from new data, gradually forgetting older knowledge as older labeled data items become less influential on the classification of newer data items. Some computer simulations are presented, showing the effectiveness of the proposed method.
Abstract:
Semi-supervised learning is applied to classification problems where only a small portion of the data items is labeled. In these cases, the reliability of the labels is a crucial factor, because mislabeled items may propagate wrong labels to a large portion or even the entire data set. This paper aims to address this problem by presenting a graph-based (network-based) semi-supervised learning method, specifically designed to handle data sets with mislabeled samples. The method uses teams of walking particles, with competitive and cooperative behavior, for label propagation in the network constructed from the input data set. The proposed model is nature-inspired and it incorporates some features to make it robust to a considerable amount of mislabeled data items. Computer simulations show the performance of the method in the presence of different percentages of mislabeled data, in networks of different sizes and average node degrees. Importantly, these simulations reveal the existence of critical points of the mislabeled subset size, below which the network is free of wrong label contamination, but above which the mislabeled samples start to propagate their labels to the rest of the network. Moreover, numerical comparisons have been made between the proposed method and other representative graph-based semi-supervised learning methods using both artificial and real-world data sets. Interestingly, the proposed method performs increasingly better than the others as the percentage of mislabeled samples grows. © 2012 IEEE.
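To show what label propagation on a network means in its simplest form, the sketch below spreads a few seed labels to unlabeled nodes by a synchronous majority vote over neighbours. This is a deliberately generic stand-in, not the particle competition and cooperation method of the paper, and it has none of that method's robustness to mislabeled seeds. The function name `propagate_labels` and the six-node graph are assumptions made for illustration.

```python
def propagate_labels(adjacency, seed_labels, max_iters=100):
    """Spread seed labels by synchronous majority vote among labeled neighbours.

    adjacency   -- dict mapping each node to a list of neighbouring nodes
    seed_labels -- dict of initially labeled nodes; these labels never change
    """
    labels = dict(seed_labels)
    for _ in range(max_iters):
        updated = dict(labels)
        for node in adjacency:
            if node in seed_labels:
                continue
            # Count labels among neighbours, using the previous sweep only.
            counts = {}
            for nb in adjacency[node]:
                if nb in labels:
                    counts[labels[nb]] = counts.get(labels[nb], 0) + 1
            if counts:
                # Ties are broken by label order so the result is deterministic.
                updated[node] = max(sorted(counts), key=lambda k: counts[k])
        if updated == labels:
            break
        labels = updated
    return labels


# Hypothetical network: two triangles joined by the edge 2-3, one seed in each.
graph = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
print(propagate_labels(graph, {0: "A", 5: "B"}))
```

Because every unlabeled node simply copies its neighbours, a single wrong seed in this naive scheme can spread through a whole cluster, which is the failure mode the abstract describes.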
Abstract:
Concept drift, which refers to non-stationary learning problems over time, is of increasing importance in machine learning and data mining. Many concept drift applications require fast response, which means an algorithm must always be (re)trained with the latest available data. But the process of data labeling is usually expensive and/or time consuming when compared to the acquisition of unlabeled data, thus usually only a small fraction of the incoming data may be effectively labeled. Semi-supervised learning methods may help in this scenario, as they use both labeled and unlabeled data in the training process. However, most of them are based on the assumption that the data is static. Therefore, semi-supervised learning with concept drift is still an open and challenging task in machine learning. Recently, a particle competition and cooperation approach has been developed to realize graph-based semi-supervised learning from static data. We have extended that approach to handle data streams and concept drift. The result is a passive algorithm that uses a single classifier and adapts naturally to concept changes without any explicit drift detection mechanism. It has built-in mechanisms that provide a natural way of learning from new data, gradually "forgetting" older knowledge as older data items are no longer useful for the classification of newer data items. The proposed algorithm is applied to the KDD Cup 1999 network intrusion data, showing its effectiveness.