940 results for maximum likelihood analysis
Abstract:
The regime-switching GARCH model is the foundation of this thesis. This model offers rich dynamics for modelling financial data by combining a GARCH structure with time-varying parameters. This flexibility unfortunately gives rise to a path dependence problem, which has prevented maximum likelihood estimation of the model since its introduction nearly 20 years ago. The first half of this thesis provides a solution to this problem by developing two methodologies for computing the maximum likelihood estimator of the regime-switching GARCH model. The first proposed estimation technique is based on the Monte Carlo EM algorithm and on importance sampling, while the second generalizes the approximations of the model introduced over the last two decades, known as collapsing procedures. This generalization establishes a methodological link between these approximations and the particle filter. The discovery of this relationship is important because it justifies the validity of the collapsing approach for estimating the regime-switching GARCH model. The second half of this thesis is motivated by the financial crisis of the late 2000s, during which poor risk assessment within several financial companies led to numerous institutional failures. Using a broad range of 78 econometric models, including several generalizations of the regime-switching GARCH model, it is shown that model risk plays a very important role in the measurement and management of long-term investment risk in the context of segregated funds.
Although the financial literature has devoted considerable research to advancing econometric models in order to improve the pricing and hedging of financial products, approaches for measuring the effectiveness of a dynamic hedging strategy have evolved little. This thesis makes a methodological contribution in this area by proposing a regression-based statistical framework for better measuring this effectiveness.
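The path dependence problem mentioned in this abstract can be illustrated with a small sketch. It is not the thesis's estimation method: it assumes a textbook GARCH(1,1) recursion whose coefficients (omega, alpha, beta) depend on the current regime, and the function name, parameter values and returns are hypothetical. Because each conditional variance depends on the previous one, every distinct regime path produces its own variance, so an exact likelihood would have to sum over all K^T paths.

```python
from itertools import product


def variances_by_regime_path(returns, params, sigma2_0):
    """Return the final conditional variance for every possible regime path.

    params[s] = (omega, alpha, beta) for regime s (illustrative values only).
    The recursion assumed here is
        sigma2_t = omega_s + alpha_s * r_{t-1}**2 + beta_s * sigma2_{t-1}.
    """
    n_regimes = len(params)
    result = {}
    # Enumerate all n_regimes ** len(returns) regime paths.
    for path in product(range(n_regimes), repeat=len(returns)):
        sigma2 = sigma2_0
        for r, s in zip(returns, path):
            omega, alpha, beta = params[s]
            sigma2 = omega + alpha * r ** 2 + beta * sigma2
        result[path] = sigma2
    return result


# Two hypothetical regimes (calm, turbulent) and three past returns.
demo_params = [(0.1, 0.05, 0.9), (0.5, 0.2, 0.7)]
demo = variances_by_regime_path([0.5, -1.0, 0.3], demo_params, sigma2_0=1.0)
print(len(demo), "regime paths,", len(set(demo.values())), "distinct variances")
```

With T observations and K regimes the enumeration grows as K^T, which is why exact evaluation is impractical and why approaches such as the collapsing procedures or simulation-based methods described in the abstract are used instead.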
Abstract:
The large number of vehicles on the road network can cause congestion and safety problems. The road users of interest here are truck drivers carrying goods, who may drive non-compliant vehicles or take prohibited routes to save time. The transport of dangerous goods is regulated, and certain places, especially bridges and tunnels, are off-limits to it. To help enforce the laws in force, there is a roadside inspection system made up of fixed facilities and mobile patrols. The strategic deployment of these inspection resources relies on knowledge of truck drivers' behaviour, which we study through the analysis of their route choices. A route choice problem can be modelled using discrete choice theory, itself grounded in random utility theory. Treating this type of problem with this theory is complex. The models we use are such that we have to deal with correlation problems, since several routes are likely to share arcs. Moreover, since we work on the Quebec road network, the route choice may be made from a set of routes whose number is potentially infinite if routes with loops are considered. Finally, the study of choices made by a human is not trivial. With the chosen route choice model, we can compute an expression for the probability that a route is taken by the truck driver. We approached this behavioural study by first describing the collected data. The questionnaire used by inspectors collects data about the truck drivers, their vehicles and the inspection location.
Describing the observed data is an essential step, because it presents clearly to a potential analyst what is available for studying truck drivers' behaviour. The data observed during an inspection constitute what we call an observation. With the network attributes, it is possible to model the Quebec road network. A selection of certain attributes makes it possible to specify the utility function and consequently the function used to compute a truck driver's route choice probabilities. It then becomes possible to study behaviour on the basis of observations. Those from the field do not currently give us enough information, and even with a well-specified model, parameter estimation is not possible. Estimation is based on the maximum likelihood method. We have the tool, but we lack the raw material, namely observations, to continue the study. The idea is to proceed with synthetic observations. We perform estimations with complete observations and then, to get closer to real conditions, continue with partial observations. This is a major challenge. For the latter, we propose to use the results of Bierlaire and Frejinger (2008), combining them with those of Fosgerau, Frejinger and Karlström (2013). Although they are synthetic, the observations we use lead to results that enable us to make a concrete proposal that could help optimize the decisions of those responsible for roadside inspections. Indeed, we succeeded in estimating, on the real Quebec network and at a significance level of 0.05, the parameter values of a discrete route choice model, even when observations are partial.
These results will lead to recommendations on changes to make to the data collection questionnaire.
Abstract:
Identification and control of nonlinear dynamical systems are challenging problems for control engineers. The topic is equally relevant in communication, weather prediction, biomedical systems and even social systems, where nonlinearity is an integral part of the system behavior. Most real-world systems are nonlinear in nature, and nonlinear system identification/modeling has wide applications. The basic approach in analyzing nonlinear systems is to build a model from known behavior manifest in the form of system output. The problem of modeling boils down to computing a suitably parameterized model representing the process. The parameters of the model are adjusted to optimize a performance function based on the error between the given process output and the identified process/model output. While linear system identification is well established with many classical approaches, most of those methods cannot be directly applied to nonlinear system identification. The problem becomes more complex if the system is completely unknown and only the output time series is available. The blind recognition problem is the direct consequence of such a situation. The thesis concentrates on such problems. The capability of artificial neural networks to approximate many nonlinear input-output maps makes them particularly suitable for building a function for the identification of nonlinear systems where only the time series is available. The literature is rich with a variety of algorithms to train the neural network model. A comprehensive study of the computation of the model parameters using the different algorithms, and a comparison among them to choose the best technique, is still in demand from practical system designers and is not available in concise form in the literature. The thesis is thus an attempt to develop and evaluate some of the well-known algorithms and propose some new techniques in the context of blind recognition of nonlinear systems. It also attempts to establish the relative merits and demerits of the different approaches. Comprehensiveness is achieved by utilizing well-known evaluation techniques from statistics. The study concludes by providing the results of implementing the currently available, modified and newly introduced techniques for nonlinear blind system modeling, followed by a comparison of their performance. It is expected that such a comprehensive study and comparison can be of great relevance in many fields, including chemical, electrical, biological, financial and weather data analysis. Further, the results reported would be of immense help to practical system designers and analysts in selecting the most appropriate method based on the goodness of the model for the particular context.
Abstract:
In this thesis, likelihood depth, introduced by Mizera and Müller (2004), is used to develop (outlier-)robust estimators and tests for the unknown parameter of a continuous density function. The methods developed are then applied to three different distributions. For one-dimensional parameters, the likelihood depth of a parameter in the data set is computed as the minimum of the proportion of data for which the derivative of the log-likelihood function with respect to the parameter is non-negative and the proportion of data for which this derivative is non-positive. Thus the parameter with the greatest depth is the one for which both counts are equal. This parameter is initially chosen as the estimator, since likelihood depth is intended as a measure of how well a parameter fits the data set. Asymptotically, the parameter with the greatest depth is the one for which the probability that the derivative of the log-likelihood function with respect to the parameter is non-negative for an observation equals one half. If this is not the case for the underlying parameter, the estimator based on likelihood depth is biased. This thesis shows how this bias can be corrected so that the corrected estimators are consistent. To develop tests for the parameter, the simplicial likelihood depth developed by Müller (2005), which is a U-statistic, is used. It turns out that, for the same distributions for which likelihood depth yields biased estimators, the simplicial likelihood depth is an unbiased U-statistic. In particular, the asymptotic distribution is therefore known, and tests for various hypotheses can be formulated. However, for some hypotheses the shift in the depth leads to poor power of the corresponding test. Corrected tests are therefore introduced, and conditions are given under which they are consistent. The thesis consists of two parts.
The first part presents the general theory of the estimators and tests and establishes their consistency. In the second part, the theory is applied to three different distributions: the Weibull distribution and the Gaussian and Gumbel copulas. This shows how the methods of the first part can be used to derive (robust) consistent estimators and tests for the unknown parameter of the distribution. Overall, robust estimators and tests can be found for all three distributions using likelihood depth. On uncontaminated data, existing standard methods are in part superior, but the advantage of the new methods shows in contaminated data and data with outliers.
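The depth definition for a one-dimensional parameter can be made concrete with a short sketch. The exponential distribution with rate parameter is used here purely as an illustrative assumption (it is not one of the three distributions treated in the thesis), and all function names are hypothetical. For the exponential density the derivative of the log-likelihood with respect to the rate is 1/rate - x, so the deepest rate is the reciprocal of the sample median. Since the true median is ln(2)/rate, this estimator is biased by a factor of 1/ln(2), and multiplying by ln(2) is the kind of correction the abstract refers to.

```python
import math
import random
import statistics


def likelihood_depth(rate, data):
    """Likelihood depth of `rate` for an assumed exponential model.

    The score of one observation x is d/d(rate) log(rate * exp(-rate * x))
    = 1 / rate - x. The depth is the smaller of the share of non-negative
    scores and the share of non-positive scores.
    """
    scores = [1.0 / rate - x for x in data]
    n = len(scores)
    share_nonneg = sum(s >= 0 for s in scores) / n
    share_nonpos = sum(s <= 0 for s in scores) / n
    return min(share_nonneg, share_nonpos)


def deepest_rate(data):
    """Rate with maximal depth: the scores change sign at the sample median."""
    return 1.0 / statistics.median(data)


def corrected_rate(data):
    """Bias-corrected estimator for the exponential case (factor ln 2)."""
    return math.log(2.0) * deepest_rate(data)


random.seed(0)
sample = [random.expovariate(2.0) for _ in range(20000)]
print("deepest rate:  ", deepest_rate(sample))    # near 2 / ln 2, i.e. biased
print("corrected rate:", corrected_rate(sample))  # near the true rate 2
```

Because the estimator depends on the data only through the median, a few extreme observations barely move it, which is the robustness property the abstract emphasises.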
Abstract:
An analysis of historical Corona images, Landsat images, recent radar and Google Earth® images was conducted to determine land use and land cover changes of oasis settlements and surrounding rangelands at the fringe of the Altay Mountains from 1964 to 2008. For the Landsat datasets, supervised classification methods were used to test the suitability of the Maximum Likelihood Classifier with subsequent smoothing and the Sequential Maximum A Posteriori Classifier (SMAPC). The results show a trend typical for the steppe and desert regions of northern China. From 1964 to 2008 farmland strongly increased (+61%), while the area of grassland and forest in the floodplains decreased (-43%). The urban areas increased threefold and 400 ha of former agricultural land were abandoned. Farmland apparently affected by soil salinity decreased in size from 1990 (1180 ha) to 2008 (630 ha). The vegetated areas of the surrounding rangelands decreased, mainly as a result of overgrazing and drought events. The SMAPC with subsequent post-processing yielded the highest classification accuracy. However, the specific landscape characteristics of mountain oasis systems required labour-intensive post-processing. Further research is needed to test the use of ancillary information for an automated classification of the examined landscape features.
Abstract:
The present study investigates the systematics and evolution of the Neotropical genus Deuterocohnia Mez (Bromeliaceae). It provides a comprehensive taxonomic revision as well as phylogenetic analyses based on chloroplast and nuclear DNA sequences and presents a hypothesis on the evolution of the genus. A broad morphological, anatomical, biogeographical and ecological overview of the genus is given in the first part of the study. For morphological character assessment more than 700 herbarium specimens from 39 herbaria as well as living plant material in the field and in the living collections of botanical gardens were carefully examined. The arid habitats in which the species of Deuterocohnia grow are reflected in the morphological and anatomical characters of the species. Important characters for species delimitation were identified, such as the length of the inflorescence, the branching order, the density of flowers on partial inflorescences, the relation of the length of the primary bracts to that of the partial inflorescence, the sizes of floral bracts, sepals and petals, flower colour, the presence or absence of a pedicel, and the curvature of the stamina and the petals during anthesis. After scrutinizing the nomenclatural history of the taxa belonging to Deuterocohnia – including the genus Abromeitiella, synonymized in 1992 – 17 species, 4 subspecies and 4 varieties are accepted in the present revision. Taxonomic changes were made in the following cases: (I) New combinations: A. abstrusa (A. Cast.) N. Schütz is re-established – as defined by Castellanos (1931) – and transferred to D. abstrusa; D. brevifolia (Griseb.) M.A. Spencer & L.B. Sm. includes accessions of the former D. lorentziana (Mez) M.A. Spencer & L.B. Sm., which are not assigned to D. abstrusa; D. bracteosa W. Till is synonymized to D. strobilifera Mez; D. meziana Kuntze ex Mez var. carmineo-viridiflora Rauh is classified as a subspecies of D. meziana (ssp. carmineo-viridiflora (Rauh) N. Schütz); D. pedicellata W. Till is classified as a subspecies of D. meziana (ssp. pedicellata (W. Till) N. Schütz); D. scapigera (Rauh & L. Hrom.) M.A. Spencer & L.B. Sm. ssp. sanctae-crucis R. Vásquez & Ibisch is classified as a species (D. sanctae-crucis (R. Vásquez & Ibisch) N. Schütz); (II) New taxa: a new subspecies of D. meziana Kuntze ex Mez is established; a new variety of D. scapigera is established; (the new taxa will be validly published elsewhere); (III) New type: an epitype for D. longipetala was chosen. All other species were kept according to Spencer and Smith (1992) or – in the case of more recently described species – according to the protologue. Besides the nomenclatural notes and the detailed descriptions, information on distribution, habitat and ecology, etymology and taxonomic delimitation is provided for the genus and for each of its species. A key was constructed for the identification of currently accepted species, subspecies and varieties. The key is based on easily detectable morphological characters. The former synonymization of the genus Abromeitiella into Deuterocohnia (Spencer and Smith 1992) is re-evaluated in the present study. Morphological as well as molecular investigations revealed Deuterocohnia incl. Abromeitiella as being monophyletic, with some indications that a monophyletic Abromeitiella lineage arose from within Deuterocohnia. Thus the union of both genera is confirmed. The second part of the present thesis describes and discusses the molecular phylogenies and networks. Molecular analyses of three chloroplast intergenic spacers (rpl32-trnL, rps16-trnK, trnS-ycf3) were conducted with a sample set of 119 taxa. This set included 103 Deuterocohnia accessions from all 17 described species of the genus and 16 outgroup taxa from the remainder of Pitcairnioideae s.str. (Dyckia (8 sp.), Encholirium (2 sp.), Fosterella (4 sp.) and Pitcairnia (2 sp.)). 
With its high sampling density, the present investigation by far represents the most comprehensive molecular study of Deuterocohnia up till now. All data sets were analyzed separately as well as in combination, and various optimality criteria for phylogenetic tree construction were applied (Maximum Parsimony, Maximum Likelihood, Bayesian inferences and the distance method Neighbour Joining). Congruent topologies were generally obtained with different algorithms and optimality criteria, but individual clades received different degrees of statistical support in some analyses. The rps16-trnK locus was the most informative among the three spacer regions examined. The results of the chloroplast DNA analyses revealed a highly supported paraphyly of Deuterocohnia. Thus, the cpDNA trees divide the genus into two subclades (A and B), of which Deuterocohnia subclade B is sister to the included Dyckia and Encholirium accessions, and both together are sister to Deuterocohnia subclade A. To further examine the relationship between Deuterocohnia and Dyckia/Encholirium at the generic level, two nuclear low copy markers (PRK exon2-5 and PHYC exon1) were analysed with a reduced taxon set. This set included 22 Deuterocohnia accessions (including members of both cpDNA subclades), 2 Dyckia, 2 Encholirium and 2 Fosterella species. Phylogenetic trees were constructed as described above, and for comparison the same reduced taxon set was also analysed at the three cpDNA data loci. In contrast to the cpDNA results, the nuclear DNA data strongly supported the monophyly of Deuterocohnia, which takes a sister position to a clade of Dyckia and Encholirium samples. As morphology as well as nuclear DNA data generated in the present study and in a former AFLP analysis (Horres 2003) all corroborate the monophyly of Deuterocohnia, the apparent paraphyly displayed in cpDNA analyses is interpreted to be the consequence of a chloroplast capture event. 
This involves the introgression of the chloroplast genome from the common ancestor of the Dyckia/Encholirium lineage into the ancestor of Deuterocohnia subclade B species. The chloroplast haplotypes are not species-specific in Deuterocohnia. One haplotype was sometimes shared by several species, and the same species may harbour different haplotypes. The arrangement of haplotypes followed geographical patterns rather than taxonomic boundaries, which may indicate some residual gene flow among populations from different Deuterocohnia species. Phenotypic species coherence on the background of ongoing gene flow may then be maintained by sets of co-adapted alleles, as was suggested by the porous genome concept (Wu 2001, Palma-Silva et al. 2011). The results of the present study suggest the following scenario for the evolution of Deuterocohnia and its species. Deuterocohnia longipetala may be envisaged as a representative of the ancestral state within the genus. This is supported by (1) the wide distribution of this species; (2) the overlap in distribution area with species of Dyckia; (3) the laxly flowered inflorescences, which are also typical for Dyckia; (4) the yellow petals with a greenish tip, present in most other Deuterocohnia species. The following six extant lineages within Deuterocohnia might have independently been derived from this ancestral state with a few changes each: (I) D. meziana, D. brevispicata and D. seramisiana (Bolivia, lowland to montane areas, mostly reddish-greenish coloured, very laxly to very densely flowered); (II) D. strobilifera (Bolivia, high Andean mountains, yellow flowers, densely flowered); (III) D. glandulosa (Bolivia, montane areas, yellow-greenish flowers, densely flowered); (IV) D. haumanii, D. schreiteri, D. digitata, and D. chrysantha (Argentina, Chile, E Andean mountains and Atacama desert, yellow-greenish flowers, densely flowered); (V) D. 
recurvipetala (Argentina, foothills of the Andes, recurved yellow flowers, laxly flowered); (VI) D. gableana, D. scapigera, D. sanctae-crucis, D. abstrusa, D. brevifolia, D. lotteae (former Abromeitiella species, Bolivia, Argentina, higher Andean mountains, greenish-yellow flowers, inflorescence usually simple). Originating from the lower montane Andean regions, at least four lineages of the genus (I, II, IV, VI) adapted in part to higher altitudes by developing densely flowered partial inflorescences, shorter flowers and – in at least three lineages (II, IV, VI) – smaller rosettes, whereas species spreading into the lowlands (I, V) developed larger plants, laxly flowered, amply branched inflorescences and in part larger flowers (I).
Abstract:
Modeling and predicting co-occurrences of events is a fundamental problem of unsupervised learning. In this contribution we develop a statistical framework for analyzing co-occurrence data in a general setting where elementary observations are joint occurrences of pairs of abstract objects from two finite sets. The main challenge for statistical models in this context is to overcome the inherent data sparseness and to estimate the probabilities for pairs which were rarely observed or even unobserved in a given sample set. Moreover, it is often of considerable interest to extract grouping structure or to find a hierarchical data organization. A novel family of mixture models is proposed that explains the observed data by a finite number of shared aspects or clusters. This provides a common framework for statistical inference and structure discovery and also includes several recently proposed models as special cases. Adopting the maximum likelihood principle, EM algorithms are derived to fit the model parameters. We develop improved versions of EM which largely avoid overfitting problems and overcome the inherent locality of EM-based optimization. Among the broad variety of possible applications, e.g., in information retrieval, natural language processing, data mining, and computer vision, we have chosen document retrieval, the statistical analysis of noun/adjective co-occurrence and the unsupervised segmentation of textured images to test and evaluate the proposed algorithms.
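As a rough illustration of fitting a co-occurrence mixture by maximum likelihood, the sketch below runs plain EM on one simple member of this model family, assuming the joint probability factorises as P(x, y) = sum over k of P(k) P(x|k) P(y|k). This parameterisation, the function name and the toy counts are assumptions made for illustration. The improved EM variants mentioned in the abstract are not reproduced here.

```python
import math
import random


def fit_aspect_model(counts, n_aspects, n_iter=50, seed=0):
    """Plain EM for P(x, y) = sum_k P(k) P(x|k) P(y|k) on a count matrix.

    counts[x][y] is the number of times the pair (x, y) was observed.
    Returns (p_k, p_x_given_k, p_y_given_k, log_likelihood_trace).
    """
    rng = random.Random(seed)
    n_x, n_y = len(counts), len(counts[0])

    def normalized(values):
        total = sum(values)
        return [v / total for v in values]

    # Random positive initialisation so that the aspects can differentiate.
    p_k = normalized([1.0] * n_aspects)
    p_x = [normalized([rng.random() + 0.1 for _ in range(n_x)])
           for _ in range(n_aspects)]
    p_y = [normalized([rng.random() + 0.1 for _ in range(n_y)])
           for _ in range(n_aspects)]

    trace = []
    for _ in range(n_iter):
        acc_k = [0.0] * n_aspects
        acc_x = [[0.0] * n_x for _ in range(n_aspects)]
        acc_y = [[0.0] * n_y for _ in range(n_aspects)]
        log_lik = 0.0
        for x in range(n_x):
            for y in range(n_y):
                n_xy = counts[x][y]
                if n_xy == 0:
                    continue
                # E-step: posterior over aspects for the pair (x, y).
                joint = [p_k[k] * p_x[k][x] * p_y[k][y]
                         for k in range(n_aspects)]
                total = sum(joint)
                log_lik += n_xy * math.log(total)
                for k in range(n_aspects):
                    weight = n_xy * joint[k] / total
                    acc_k[k] += weight
                    acc_x[k][x] += weight
                    acc_y[k][y] += weight
        trace.append(log_lik)
        # M-step: renormalise the expected counts.
        p_k = normalized(acc_k)
        p_x = [normalized(row) for row in acc_x]
        p_y = [normalized(row) for row in acc_y]
    return p_k, p_x, p_y, trace


# Toy block-structured co-occurrence counts (hypothetical data).
toy_counts = [[5, 4, 0, 0], [6, 5, 0, 0], [0, 0, 7, 3], [0, 0, 4, 6]]
p_k, p_x, p_y, trace = fit_aspect_model(toy_counts, n_aspects=2)
print("aspect weights:", p_k)
print("log-likelihood: first", trace[0], "last", trace[-1])
```

The log-likelihood trace never decreases from one iteration to the next, which is the standard EM guarantee. The result still depends on the random start, the locality issue that the abstract says its improved versions address.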
Abstract:
Our goal in this paper is to assess the reliability and validity of egocentered network data using multilevel analysis (Muthen, 1989, Hox, 1993) under the multitrait-multimethod approach. The confirmatory factor analysis model for multitrait-multimethod data (Werts & Linn, 1970; Andrews, 1984) is used for our analyses. In this study we reanalyse part of the data from another study (Kogovšek et al., 2002) carried out on a representative sample of the inhabitants of Ljubljana. The traits used in our article are the name interpreters. We consider egocentered network data as hierarchical; therefore a multilevel analysis is required. We use Muthen's partial maximum likelihood approach, called the pseudobalanced solution (Muthen, 1989, 1990, 1994), which produces estimates close to maximum likelihood for large ego sample sizes (Hox & Maas, 2001). Several analyses are carried out in order to compare this multilevel analysis with classic methods of analysis such as those used in Kogovšek et al. (2002), who analysed the data only at the group (ego) level, considering averages of all alters within the ego. We show that some of the results obtained by classic methods are biased and that multilevel analysis provides more detailed information that greatly enriches the interpretation of the reliability and validity of hierarchical data. Within- and between-ego reliabilities and validities and other related quality measures are defined, computed and interpreted.
Abstract:
The crisis that broke out in the United States mortgage market in 2008 and spread throughout the entire financial system exposed the degree of interconnection that currently exists among institutions in the sector and their links with the productive sector. It also made evident the need to identify and characterize the systemic risk inherent in the system, so that regulators can pursue stability both for individual institutions and for the system as a whole. Using a model that combines the informational power of networks with a spatial autoregressive (panel-type) model, this paper shows the importance of adding to the micro-prudential approach (proposed in Basel II) a variable that captures the effect of being connected to other institutions, thereby carrying out a macro-prudential analysis (proposed in Basel III).
Abstract:
This paper studies the effect of health status on enrolment in the Contributory Regime, and the effect of public insurance (the Contributory Regime) and private insurance on the use of health services (outpatient consultations).
Abstract:
The rate of return to education in Bogotá is estimated for 1997 and 2003 using Heckman's methodology. The returns to education and to potential experience are found to be lower in 2003. Average labour income also declines.
Abstract:
We propose and estimate a financial distress model that explicitly accounts for the interactions or spill-over effects between financial institutions, through the use of a spatial continuity matrix that is built from financial network data of interbank transactions. This setup of the financial distress model allows for the empirical validation of the importance of network externalities in determining financial distress, in addition to institution-specific and macroeconomic covariates. The relevance of this specification is that it simultaneously incorporates micro-prudential factors (Basel 2) as well as macro-prudential and systemic factors (Basel 3) as determinants of financial distress. Results indicate that network externalities are an important determinant of the financial health of financial institutions. The parameter that measures the effect of network externalities is both economically and statistically significant, and its inclusion as a risk factor reduces the importance of firm-specific variables such as the size or degree of leverage of the financial institution. In addition, we analyze the policy implications of the network factor model for capital requirements and deposit insurance pricing.
Abstract:
Historically, it has been recognized that internal conflicts directly affect individual-level variables such as people's health, levels of schooling and the forced displacement of those affected. However, only in the last decade has academic research turned to rigorously documenting and quantifying the collateral effects of violence on individuals' living conditions. This study examines how exposure to the conflict in Colombia has affected people's labour market decisions. The identification strategy accounts for the well-known endogeneity problems between conflict and variables of economic activity and development, and yields results that are robust to internal migration and displacement. In terms of labour force participation and unemployment, heterogeneous effects by gender are found in response to the violence experienced. In particular, women's probability of labour force participation increases as a consequence of exposure to the conflict, while their probability of unemployment decreases. For men, the results show a lower probability of participation, the opposite of the effect for women, and an analogous effect in terms of unemployment. The study finds no differential effects in terms of labour informality.
Abstract:
This paper investigates the applications of capture–recapture methods to human populations. Capture–recapture methods are commonly used in estimating the size of wildlife populations but can also be used in epidemiology and social sciences, for estimating the prevalence of a particular disease or the size of the homeless population in a certain area. Here we focus on estimating the prevalence of infectious diseases. Several estimators of population size are considered: the Lincoln–Petersen estimator and its modified version, the Chapman estimator, Chao’s lower bound estimator, Zelterman’s estimator, McKendrick’s moment estimator and the maximum likelihood estimator. In order to evaluate these estimators, they are applied to real, three-source capture–recapture data. By conditioning on each of the sources of the three-source data, we have been able to compare the estimators with the true value that they are estimating. The Chapman and Chao estimators were compared in terms of their relative bias. A variance formula derived through conditioning is suggested for Chao’s estimator, and normal 95% confidence intervals are calculated for this and the Chapman estimator. We then compare the coverage of the respective confidence intervals. Furthermore, a simulation study is included to compare Chao’s and Chapman’s estimators. Results indicate that Chao’s estimator is less biased than Chapman’s estimator unless both sources are independent. Chao’s estimator also has the smaller mean squared error. Finally, the implications and limitations of the above methods are discussed, with suggestions for further development.
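Three of the estimators named above have simple closed forms for two sources, sketched below using their standard textbook formulas. The argument names (n1 and n2 for the number of cases on each list, m for the number on both) and the example counts are assumptions for illustration, not figures from the paper. For Chao's lower bound, the two-source setting is assumed to give f1 = n1 + n2 - 2m cases seen exactly once and f2 = m cases seen twice.

```python
def lincoln_petersen(n1, n2, m):
    """Lincoln-Petersen estimate of population size: n1 * n2 / m."""
    if m == 0:
        raise ValueError("no overlap between sources: estimate is undefined")
    return n1 * n2 / m


def chapman(n1, n2, m):
    """Chapman's modification, which stays finite even when m is 0."""
    return (n1 + 1) * (n2 + 1) / (m + 1) - 1


def chao_lower_bound(n1, n2, m):
    """Chao's lower bound n + f1**2 / (2 * f2) for two sources.

    n  = number of distinct cases observed,
    f1 = cases seen in exactly one source, f2 = cases seen in both.
    """
    if m == 0:
        raise ValueError("f2 is zero: this form of the bound is undefined")
    f1 = n1 + n2 - 2 * m
    f2 = m
    n_observed = n1 + n2 - m
    return n_observed + f1 ** 2 / (2 * f2)


# Hypothetical counts: 50 cases on list A, 40 on list B, 10 on both.
for estimator in (lincoln_petersen, chapman, chao_lower_bound):
    print(estimator.__name__, estimator(50, 40, 10))
```

With these counts the Lincoln–Petersen and Chapman estimates are close to each other, while Chao's bound is noticeably larger. How the estimators actually compare in bias and mean squared error is the question the paper examines.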
Abstract:
This article assesses the extent to which sampling variation affects findings about Malmquist productivity change derived using data envelopment analysis (DEA), in the first stage by calculating productivity indices and in the second stage by investigating the farm-specific change in productivity. Confidence intervals for Malmquist indices are constructed using Simar and Wilson's (1999) bootstrapping procedure. The main contribution of this article is to account in the second stage for the information provided by the first-stage bootstrap. The DEA SEs of the Malmquist indices given by bootstrapping are employed in an innovative heteroscedastic panel regression, using a maximum likelihood procedure. The application is to a sample of 250 Polish farms over the period 1996 to 2000. The confidence interval results suggest that the second half of the 1990s for Polish farms was characterized not so much by productivity regress as by stagnation. As for the determinants of farm productivity change, we find that the integration of the DEA SEs in the second-stage regression is significant in explaining a proportion of the variance in the error term. Although our heteroscedastic regression results differ from those of standard OLS in terms of significance and sign, they are consistent with theory and previous research.