855 results for optimal feature selection


Relevance:

80.00% 80.00%

Publisher:

Abstract:

The study investigates the role of credit risk in a continuous time stochastic asset allocation model, since the traditional dynamic framework does not provide credit risk flexibility. The general model of the study extends the traditional dynamic efficiency framework by explicitly deriving the optimal value function for the infinite horizon stochastic control problem via a weighted volatility measure of market and credit risk. The model's optimal strategy was then compared to that obtained from a benchmark Markowitz-type dynamic optimization framework to determine which specification adequately reflects the optimal terminal investment returns and strategy under credit and market risks. The paper shows that an investor's optimal terminal return is lower than typically indicated under the traditional mean-variance framework during periods of elevated credit risk. Hence I conclude that, while the traditional dynamic mean-variance approach may indicate the ideal, in the presence of credit-risk it does not accurately reflect the observed optimal returns, terminal wealth and portfolio selection strategies.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

This prospective cohort study estimated how antibacterial resistance affected the time until clinical response. Relative rates of improvement and cure were estimated by proportional-hazards regression for 391 patients with culture-confirmed bacterial keratitis whose principal corneal isolate had its ciprofloxacin minimal inhibitory concentration (MIC) measured and who were treated with ciprofloxacin 0.3% solution or ointment. After adjusting for age and hypopyon status and stratifying by ulcer size, clinic, and ciprofloxacin formulation, the summary rate of clinical improvement with ciprofloxacin therapy was reduced by 42% (95% confidence limits [CL], 3%, 65%) among patients whose corneal isolate's ciprofloxacin MIC exceeded 1.0 μg/mL compared to those with more sensitive isolates. The summary rate of resolution to improvement and cure was reduced by 36% (95% CL, 11%, 53%) among corneal infections having a higher ciprofloxacin MIC. Rate ratios were modified by the size of the presenting corneal ulceration; for ulcer diameters of 4 mm or less and of more than 4 mm, improvement rate ratios were 0.56 (95% CL, 0.31, 1.02) and 0.65 (95% CL, 0.23, 1.80), respectively; resolution rate ratios were 0.63 (95% CL, 0.44, 0.91) and 0.67 (95% CL, 0.32, 1.39), respectively. Sensitivity analysis showed that the summary improvement rate ratio could be maximally overestimated by 24% (95% CL, −29%, 114%) because of informative censoring or by 33% (95% CL, −21%, 126%) from loss to follow-up. Based on reported corneal pharmacokinetics of topical ciprofloxacin, the probability of clinical improvement was 90% or more if the ratio of the achievable corneal ciprofloxacin concentration to the corneal isolate's ciprofloxacin MIC was above 8 or the ratio of the area under the 24-hour corneal concentration curve to the ciprofloxacin MIC was greater than 151.
This study suggests that corneal infections by bacteria having a higher ciprofloxacin MIC respond more slowly to ciprofloxacin treatment than those with a lower MIC. While the rate of clinical resolution is affected by patient age and clinical severity, antimicrobial susceptibility testing of corneal cultures can indicate the relative effectiveness of antibacterial therapy. A pharmacodynamic approach to treating bacterial keratitis offers the prospect of optimal antimicrobial selection and modification.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

The main purpose of a gene interaction network is to map the relationships between genes that remain out of sight when a genomic study is tackled. DNA microarrays allow the expression of thousands of genes to be measured at the same time. These data constitute the numeric seed for the induction of the gene networks. In this paper, we propose a new approach to build gene networks by means of Bayesian classifiers, variable selection and bootstrap resampling. The interactions induced by the Bayesian classifiers are based both on the expression levels and on the phenotype information of the supervised variable. Feature selection and bootstrap resampling add reliability and robustness to the overall process by removing false positive findings. The consensus among all the induced models produces a hierarchy of dependences and, thus, of variables. Biologists can define the depth level of the model hierarchy so the set of interactions and genes involved can vary from a sparse to a dense set. Experimental results show that these networks perform well on classification tasks. The biological validation matches previous biological findings and opens new hypotheses for future studies.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

Many diseases have a genetic origin, and a great effort is being made to detect the genes that are responsible for their onset. One of the most promising techniques is the analysis of genetic information through the use of complex network theory. Yet, a practical problem of this approach is its computational cost, which scales as the square of the number of features included in the initial dataset. In this paper, we propose the use of an iterative feature selection strategy to identify reduced subsets of relevant features, and show an application to the analysis of congenital Obstructive Nephropathy. Results demonstrate that, besides achieving a drastic reduction of the computational cost, the topologies of the obtained networks still hold all the relevant information, and are thus able to fully characterize the severity of the disease.
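To make the quadratic scaling mentioned in this abstract concrete: a network over n features has n(n-1)/2 candidate pairwise links. The sketch below only counts those pairs. The function name and the feature counts are hypothetical, chosen for illustration, and are not taken from the paper.

```python
def pairwise_links(n_features: int) -> int:
    """Count the unordered feature pairs a network over n_features would evaluate."""
    return n_features * (n_features - 1) // 2


# Hypothetical dataset sizes, for illustration only.
full = pairwise_links(10_000)    # all features kept
reduced = pairwise_links(100)    # after selecting a small subset
print(full, reduced, full // reduced)
```

Under these assumed sizes, cutting the feature count by a factor of 100 cuts the number of pairs by roughly a factor of 10,000, which is why selecting features before building the network pays off.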

Relevance:

80.00% 80.00%

Publisher:

Abstract:

In this paper, we analyze the performance of several well-known pattern recognition and dimensionality reduction techniques when applied to mass-spectrometry data for odor biometric identification. Motivated by the successful results of previous works capturing the odor from other parts of the body, this work attempts to evaluate the feasibility of identifying people by the odor emanated from the hands. By formulating this task according to a machine learning scheme, the problem is identified with a small-sample-size supervised classification problem in which the input data is formed by mass spectrograms from the hand odor of 13 subjects captured in different sessions. The high dimensionality of the data makes it necessary to apply feature selection and extraction techniques together with a simple classifier in order to improve the generalization capabilities of the model. Our experimental results achieve recognition rates over 85%, which reveals that there is discriminatory information in hand odor and points to body odor as a promising biometric identifier.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

Nonlinear analysis tools for studying and characterizing the dynamics of physiological signals have gained popularity, mainly because tracking sudden alterations of the inherent complexity of biological processes might be an indicator of altered physiological states. Typically, in order to perform an analysis with such tools, the physiological variables that describe the biological process under study are used to reconstruct the underlying dynamics of the biological processes. For that goal, a procedure called time-delay or uniform embedding is usually employed. Nonetheless, there is evidence of its inability to deal with non-stationary signals, such as those recorded from many physiological processes. To address this drawback, this paper evaluates the utility of non-conventional time series reconstruction procedures based on non-uniform embedding, applying them to automatic pattern recognition tasks. The paper compares a state-of-the-art non-uniform approach with a novel scheme which fuses embedding and feature selection at once, searching for better reconstructions of the dynamics of the system. Moreover, results are also compared with two classic uniform embedding techniques. Thus, the goal is to compare uniform and non-uniform reconstruction techniques, including the one proposed in this work, for pattern recognition in biomedical signal processing tasks. Once the state space is reconstructed, the scheme characterizes it with three classic nonlinear dynamic features (Largest Lyapunov Exponent, Correlation Dimension and Recurrence Period Density Entropy), while classification is carried out by means of a simple k-nn classifier. In order to test its generalization capabilities, the approach was tested with three different physiological databases (Speech Pathologies, Epilepsy and Heart Murmurs).
In terms of the accuracy obtained in automatically detecting the presence of pathologies, and for the three types of biosignals analyzed, the non-uniform techniques used in this work slightly outperformed the uniform methods, suggesting their usefulness for characterizing non-stationary biomedical signals in pattern recognition applications. Moreover, in view of the results obtained and its low computational load, the proposed technique appears applicable to the tasks under study.
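The difference between uniform and non-uniform embedding described in this abstract can be sketched in a few lines. This is a minimal illustration, not the paper's method: the function names, the forward-lag convention (x[t + lag]) and the example lags are all assumptions, and choosing good lags is outside its scope.

```python
from typing import List, Sequence, Tuple


def embed(signal: Sequence[float], lags: Sequence[int]) -> List[Tuple[float, ...]]:
    """Build state vectors (signal[t + lag] for each lag) for every valid t."""
    span = max(lags)
    return [tuple(signal[t + lag] for lag in lags)
            for t in range(len(signal) - span)]


def uniform_lags(dim: int, tau: int) -> List[int]:
    """Uniform (time-delay) embedding: lags are equally spaced multiples of tau."""
    return [i * tau for i in range(dim)]


signal = [0, 1, 2, 3, 4, 5]
uniform = embed(signal, uniform_lags(dim=3, tau=2))  # lags 0, 2, 4
non_uniform = embed(signal, [0, 1, 4])               # hand-picked, unequal lags
print(uniform)
print(non_uniform)
```

With uniform embedding a single delay tau fixes every coordinate, whereas the non-uniform variant lets each coordinate use its own lag, which is where a selection step over candidate lags could plug in.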

Relevance:

80.00% 80.00%

Publisher:

Abstract:

INTRODUCTION: Ostium secundum atrial septal defect is a congenital heart defect characterized by partial or total deficiency of the floor of the fossa ovalis, also called the septum primum. It accounts for 10 to 12% of all congenital heart diseases and is the most frequent in adulthood. Currently, percutaneous occlusion is the therapeutic method of choice in most major centers worldwide for defects with anatomical characteristics favorable to device implantation. Two-dimensional transesophageal echocardiography with color flow mapping is considered the gold-standard tool for anatomical assessment and monitoring during the procedure, and is crucial for optimal device selection. To this end, a sizing balloon is introduced and inflated across the defect so as to occlude it temporarily. The measurement of the waist seen on the balloon (stretched diameter) is used as the reference for choosing the device size. Recently, real-time three-dimensional transesophageal echocardiography has been used in this type of percutaneous intervention. In this study we evaluated its role in optimal device selection, taking into account the dimensions and geometry of the defect and the thickness of the rims of the interatrial septum. METHODS: Observational, prospective, non-randomized, single-arm study of a cohort of 33 adult patients with atrial septal defect who underwent percutaneous closure using a self-centering nitinol device (Cera ®, Lifetech Scientific, Shenzhen, China). We analyzed the largest and smallest diameters of the defect, its area, and the stretched diameter measured with a sizing balloon, obtained by both echocardiographic modalities. Defects were classified as elliptical or circular according to their geometry; the rims around the defect were classified as thick (>2 mm) or thin.
The selected device was equal to or up to 2 mm larger than the stretched diameter on two-dimensional transesophageal echocardiography (gold standard). A series of linear correlations was performed in an attempt to identify a variable that could replace the balloon-stretched diameter for optimal device choice. RESULTS: Mean age and weight were 42.1 ± 14.9 years and 66.0 ± 9.4 kg, respectively; 22 patients were female. There were no statistically significant differences between the two echocardiographic modalities in the largest, smallest, or stretched diameters of the defects. The correlation between the measurements obtained with the two methods was excellent (r > 0.90). The largest defect diameter obtained by three-dimensional transesophageal echocardiography was the variable that best correlated with the size of the selected device in the group as a whole (r = 0.89) and, especially, in the subgroups with elliptical geometry (r = 0.96) and with thick rims around the defect (r = 0.96). CONCLUSION: In this study of adults with ostium secundum atrial septal defects undergoing percutaneous occlusion with the Cera ® device, optimal device selection could be achieved using only the largest defect measurement obtained by real-time three-dimensional transesophageal echocardiography, especially in patients with elliptical defects and thick rims.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

This paper presents a preliminary study in which Machine Learning experiments applied to Opinion Mining in blogs have been carried out. We created and annotated a blog corpus in Spanish using EmotiBlog. We evaluated the utility of the labelled features, first by carrying out experiments with combinations of them and second by using feature selection techniques. We also deal with several problems, such as the noisy character of the input texts, the small size of the training set, the granularity of the annotation scheme and the language under study, Spanish, which has fewer resources than English. We obtained promising results considering that it is a preliminary study.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

Hypertrophic cardiomyopathy (HCM) is a cardiovascular disease where the heart muscle is partially thickened and blood flow is - potentially fatally - obstructed. It is one of the leading causes of sudden cardiac death in young people. Electrocardiography (ECG) and Echocardiography (Echo) are the standard tests for identifying HCM and other cardiac abnormalities. The American Heart Association has recommended using a pre-participation questionnaire for young athletes instead of ECG or Echo tests due to considerations of cost and time involved in interpreting the results of these tests by an expert cardiologist. Initially we set out to develop a classifier for automated prediction of young athletes’ heart conditions based on the answers to the questionnaire. Classification results and further in-depth analysis using computational and statistical methods indicated significant shortcomings of the questionnaire in predicting cardiac abnormalities. Automated methods for analyzing ECG signals can help reduce cost and save time in the pre-participation screening process by detecting HCM and other cardiac abnormalities. Therefore, the main goal of this dissertation work is to identify HCM through computational analysis of 12-lead ECG. ECG signals recorded on one or two leads have been analyzed in the past for classifying individual heartbeats into different types of arrhythmia as annotated primarily in the MIT-BIH database. In contrast, we classify complete sequences of 12-lead ECGs to assign patients into two groups: HCM vs. non-HCM. The challenges and issues we address include missing ECG waves in one or more leads and the dimensionality of a large feature-set. We address these by proposing imputation and feature-selection methods. We develop heartbeat-classifiers by employing Random Forests and Support Vector Machines, and propose a method to classify full 12-lead ECGs based on the proportion of heartbeats classified as HCM. 
The results from our experiments show that the classifiers developed using our methods perform well in identifying HCM. Thus the two contributions of this thesis are the utilization of computational and statistical methods for discovering shortcomings in a current screening procedure and the development of methods to identify HCM through computational analysis of 12-lead ECG signals.
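The record-level rule described in this abstract, labelling a full 12-lead ECG from the proportion of its heartbeats classified as HCM, can be sketched as follows. This is an illustration only: the function name, the boolean input format and the 0.5 threshold are assumptions, since the abstract does not state the cut-off used.

```python
from typing import Sequence


def classify_record(beat_is_hcm: Sequence[bool], threshold: float = 0.5) -> str:
    """Label a whole ECG record from per-heartbeat predictions.

    beat_is_hcm holds one boolean per heartbeat (True = classified as HCM).
    threshold is a hypothetical cut-off on the proportion of HCM beats.
    """
    if not beat_is_hcm:
        raise ValueError("record has no classified heartbeats")
    proportion = sum(beat_is_hcm) / len(beat_is_hcm)
    return "HCM" if proportion >= threshold else "non-HCM"


# Hypothetical per-beat outputs from a heartbeat classifier.
print(classify_record([True, True, False, True]))    # 3 of 4 beats flagged
print(classify_record([False, True, False, False]))  # 1 of 4 beats flagged
```

Aggregating over beats means a few misclassified heartbeats need not flip the patient-level decision; in practice the threshold would have to be tuned on validation data.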

Relevance:

80.00% 80.00%

Publisher:

Abstract:

Thesis (Ph.D.)--University of Washington, 2016-06

Relevance:

80.00% 80.00%

Publisher:

Abstract:

Document classification is a supervised machine learning process in which predefined category labels are assigned to documents based on a hypothesis derived from a training set of labelled documents. Documents cannot be directly interpreted by a computer system unless they have been modelled as a collection of computable features. Rogati and Yang [M. Rogati and Y. Yang, Resource selection for domain-specific cross-lingual IR, in SIGIR 2004: Proceedings of the 27th annual international conference on Research and Development in Information Retrieval, ACM Press, Sheffield, United Kingdom, pp. 154-161.] pointed out that the effectiveness of a document classification system may vary across domains. This implies that the quality of the document model contributes to the effectiveness of document classification. Conventionally, model evaluation is accomplished by comparing the effectiveness scores of classifiers on model candidates. However, this kind of evaluation method may encounter either under-fitting or over-fitting problems, because the effectiveness scores are restricted by the learning capacities of classifiers. We propose a model fitness evaluation method to determine whether a model is sufficient to distinguish positive and negative instances while still able to provide satisfactory effectiveness with a small feature subset. Our experiments demonstrate how the fitness of models is assessed. The results of our work contribute to research on feature selection, dimensionality reduction and document classification.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

In this paper we explore the use of text-mining methods for the identification of the author of a text. We apply the support vector machine (SVM) to this problem; because it can cope with half a million inputs, it requires no feature selection and can process the frequency vector of all words of a text. We performed a number of experiments with texts from a German newspaper. With nearly perfect reliability the SVM was able to reject other authors, and it detected the target author in 60–80% of the cases. In a second experiment, we ignored nouns, verbs and adjectives and replaced them by grammatical tags and bigrams. This resulted in slightly reduced performance. Author detection with SVMs on full word forms was remarkably robust even if the author wrote about different topics.
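The "frequency vector of all words of a text" used as SVM input in this abstract can be sketched as below. This is only an illustration: the whitespace-insensitive tokenizer, the relative-frequency normalization, the function names and the toy vocabulary are assumptions, not the paper's preprocessing, and the SVM training step itself is not shown.

```python
import re
from collections import Counter
from typing import Dict, List, Sequence


def word_frequencies(text: str) -> Dict[str, float]:
    """Relative frequency of every full word form in the text (case preserved)."""
    words = re.findall(r"\w+", text)
    counts = Counter(words)
    total = len(words)
    return {word: count / total for word, count in counts.items()} if total else {}


def to_vector(freqs: Dict[str, float], vocabulary: Sequence[str]) -> List[float]:
    """Project the frequencies onto a fixed vocabulary, one dimension per word."""
    return [freqs.get(word, 0.0) for word in vocabulary]


freqs = word_frequencies("der Hund und der Mann")
vector = to_vector(freqs, ["der", "Hund", "Katze"])  # hypothetical vocabulary
print(vector)
```

In a real corpus the vocabulary would contain every word seen in training, which is how the input dimension reaches the hundreds of thousands the abstract mentions.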

Relevance:

80.00% 80.00%

Publisher:

Abstract:

Conventionally, document classification research focuses on improving the learning capabilities of classifiers. Nevertheless, according to our observation, the effectiveness of classification is limited by the suitability of the document representation. Intuitively, the more features that are used in a representation, the more comprehensively documents are represented. However, if a representation contains too many irrelevant features, the classifier suffers not only from the curse of dimensionality but also from overfitting. To address this problem of the suitability of document representations, we present a classifier-independent approach to measure the effectiveness of document representations. Our approach utilises a labelled document corpus to estimate the distribution of documents in the feature space. By looking at documents in this way, we can clearly identify the contributions made by different features toward document classification. Some experiments have been performed to show how the effectiveness is evaluated. Our approach can be used as a tool to assist feature selection, dimensionality reduction and document classification.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

We present results that compare the performance of neural networks trained with two Bayesian methods, (i) the Evidence Framework of MacKay (1992) and (ii) a Markov Chain Monte Carlo method due to Neal (1996) on a task of classifying segmented outdoor images. We also investigate the use of the Automatic Relevance Determination method for input feature selection.