912 resultados para Automatic classifier
Resumo:
This paper shows that countries characterized by a financial accelerator mechanism may reverse the usual finding of the literature -- flexible exchange rate regimes do a worse job of insulating open economies from external shocks. I obtain this result with a calibrated small open economy model that endogenizes foreign interest rates by linking them to the banking sector's foreign currency leverage. This relationship renders exchange rate policy more important compared to the usual exogeneity assumption. I find empirical support for this prediction using the Local Projections method. Finally, 2nd order approximation to the model finds larger welfare losses under flexible regimes.
Resumo:
Random Forests™ is reported to be one of the most accurate classification algorithms in complex data analysis. It shows excellent performance even when most predictors are noisy and the number of variables is much larger than the number of observations. In this thesis Random Forests was applied to a large-scale lung cancer case-control study. A novel way of automatically selecting prognostic factors was proposed. Also, synthetic positive control was used to validate Random Forests method. Throughout this study we showed that Random Forests can deal with large number of weak input variables without overfitting. It can account for non-additive interactions between these input variables. Random Forests can also be used for variable selection without being adversely affected by collinearities. ^ Random Forests can deal with the large-scale data sets without rigorous data preprocessing. It has robust variable importance ranking measure. Proposed is a novel variable selection method in context of Random Forests that uses the data noise level as the cut-off value to determine the subset of the important predictors. This new approach enhanced the ability of the Random Forests algorithm to automatically identify important predictors for complex data. The cut-off value can also be adjusted based on the results of the synthetic positive control experiments. ^ When the data set had high variables to observations ratio, Random Forests complemented the established logistic regression. This study suggested that Random Forests is recommended for such high dimensionality data. One can use Random Forests to select the important variables and then use logistic regression or Random Forests itself to estimate the effect size of the predictors and to classify new observations. ^ We also found that the mean decrease of accuracy is a more reliable variable ranking measurement than mean decrease of Gini. ^
Resumo:
This talk illustrates how results from various Stata commands can be processed efficiently for inclusion in customized reports. A two-step procedure is proposed in which results are gathered and archived in the first step and then tabulated in the second step. Such an approach disentangles the tasks of computing results (which may take long) and preparing results for inclusion in presentations, papers, and reports (which you may have to do over and over). Examples using results from model estimation commands and various other Stata commands such as tabulate, summarize, or correlate are presented. Users will also be shown how to dynamically link results into word processors or into LaTeX documents.
Resumo:
A plan to construct a canal through the Kra Isthmus in Southern Thailand has been proposed many times since the 17th century. The proposed canal would become an alternative route to the over-crowded Straits of Malacca. In this paper, we attempt to utilize a Geographical Information System (GIS) to calculate the realistic distances between ports that would be affected by the Kra Canal and to estimate the economic impact of the canal using a simulation model based on spatial economics. We find that China, India, Japan, and Europe gain the most from the construction of the canal, besides Thailand. On the other hand, the routes through the Straits of Malacca are largely beneficial to Malaysia, Brunei, and Indonesia, besides Singapore. Thus, it is beneficial for all ASEAN member countries that the Kra Canal and the Straits of Malacca coexist and complement one another.
Resumo:
This paper describes a preprocessing module for improving the performance of a Spanish into Spanish Sign Language (Lengua de Signos Espanola: LSE) translation system when dealing with sparse training data. This preprocessing module replaces Spanish words with associated tags. The list with Spanish words (vocabulary) and associated tags used by this module is computed automatically considering those signs that show the highest probability of being the translation of every Spanish word. This automatic tag extraction has been compared to a manual strategy achieving almost the same improvement. In this analysis, several alternatives for dealing with non-relevant words have been studied. Non-relevant words are Spanish words not assigned to any sign. The preprocessing module has been incorporated into two well-known statistical translation architectures: a phrase-based system and a Statistical Finite State Transducer (SFST). This system has been developed for a specific application domain: the renewal of Identity Documents and Driver's License. In order to evaluate the system a parallel corpus made up of 4080 Spanish sentences and their LSE translation has been used. The evaluation results revealed a significant performance improvement when including this preprocessing module. In the phrase-based system, the proposed module has given rise to an increase in BLEU (Bilingual Evaluation Understudy) from 73.8% to 81.0% and an increase in the human evaluation score from 0.64 to 0.83. In the case of SFST, BLEU increased from 70.6% to 78.4% and the human evaluation score from 0.65 to 0.82.