Abstract:
Adrenocortical tumors (ACT) in children under 15 years of age exhibit some clinical and biological features distinct from ACT in adults. Cell proliferation, hypertrophy and cell death in the adrenal cortex during the last months of gestation and the immediate postnatal period seem to be critical for the origin of ACT in children. Studies with large numbers of patients with childhood ACT have indicated a median age at diagnosis of about 4 years. In our institution, the median age was 3 years and 5 months, while the median age at first signs and symptoms was 2 years and 5 months (N = 72). Using the comparative genomic hybridization technique, we have reported a high frequency of 9q34 amplification in adenomas and carcinomas. This finding has been confirmed more recently by investigators in England. The differences in socioeconomic status, ethnic composition and region between patients in Southern Brazil and those in England indicate that these factors are not important determinants of 9q34 amplification. Candidate amplified genes mapped to this locus are currently being investigated, and Southern blot results obtained so far have ruled out amplification of the abl oncogene. Amplification of 9q34 has not been found to be related to tumor size, staging, or malignant histopathological features, nor does it seem to be responsible for the higher incidence of ACT observed in Southern Brazil, but it could be related to an embryonic origin of ACT.
Abstract:
The predominant type of liver alteration in asymptomatic or oligosymptomatic chronic male alcoholics (N = 169) admitted to a psychiatric hospital for detoxification was classified by two independent methods: liver palpation and multiple quadratic discriminant analysis (QDA), the latter applied to two parameters reported by the patient (duration of alcoholism and daily amount ingested) and to the data obtained from eight biochemical blood determinations (total bilirubin, alkaline phosphatase, glycemia, potassium, aspartate aminotransferase, albumin, globulin, and sodium). All 11 soft and sensitive, and all 13 firm and sensitive livers formed fully concordant groups as determined by QDA. Among the 22 soft and not sensitive livers, 95% were concordant by QDA grouping. Concordance rates were low (55%) in the 73 firm and not sensitive livers, and intermediate (76%) in the 50 not palpable livers. Prediction of the liver palpation characteristics by QDA was 95% correct for the firm and not sensitive livers and moderate for the other groups. On a preliminary basis, the variables considered most informative by QDA were the two anamnestic data and bilirubin levels, followed by alkaline phosphatase, glycemia and potassium, and then by aspartate aminotransferase and albumin. We conclude that, when biopsies would be too costly or potentially injurious to the patient, clinical data could be considered valid to guide patient care, at least in the three groups (soft, not sensitive; soft, sensitive; firm, sensitive livers) in which the two noninvasive procedures were highly concordant in the present study.
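The quadratic discriminant analysis used above can be sketched as follows. This is a minimal illustration on synthetic data (the feature values and group labels are invented, not the study's), using scikit-learn's QuadraticDiscriminantAnalysis in place of whatever QDA implementation the authors used.

```python
import numpy as np
from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis

rng = np.random.default_rng(0)

# Two synthetic "palpation groups" with different covariance structures,
# standing in for (duration of alcoholism, daily intake, bilirubin).
soft = rng.multivariate_normal([5, 100, 0.8], np.diag([1.0, 100.0, 0.01]), 30)
firm = rng.multivariate_normal([12, 250, 1.6], np.diag([4.0, 400.0, 0.09]), 30)

X = np.vstack([soft, firm])
y = np.array([0] * 30 + [1] * 30)  # 0 = soft/sensitive, 1 = firm/sensitive

# QDA fits one Gaussian with its own covariance per class (unlike LDA,
# which shares a single covariance matrix across all classes).
qda = QuadraticDiscriminantAnalysis().fit(X, y)
accuracy = qda.score(X, y)
print(f"training concordance: {accuracy:.0%}")
```

The per-class covariances are what allow QDA, unlike linear discriminant analysis, to separate groups whose spread differs as well as whose means differ.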
Abstract:
The objective of this thesis is to develop and further generalize the differential evolution based data classification method. For many years, evolutionary algorithms have been successfully applied to many classification tasks. Evolutionary algorithms are population-based, stochastic search algorithms that mimic natural selection and genetics. Differential evolution is an evolutionary algorithm that has gained popularity because of its simplicity and good observed performance. In this thesis a differential evolution classifier with a pool of distances is proposed, demonstrated and initially evaluated. The differential evolution classifier is a nearest prototype vector based classifier that applies a global optimization algorithm, differential evolution, to determine the optimal values for all free parameters of the classifier model during the training phase. The differential evolution classifier, which applies an individually optimized distance measure to each new data set to be classified, is generalized to cover a pool of distances. Instead of optimizing a single distance measure for the given data set, the selection of the optimal distance measure from a predefined pool of alternative measures is attempted systematically and automatically. Furthermore, instead of only selecting the optimal distance measure from a set of alternatives, an attempt is made to optimize the values of the possible control parameters related to the selected distance measure. Specifically, a pool of alternative distance measures is first created, and the differential evolution algorithm is then applied to select the optimal distance measure that yields the highest classification accuracy with the current data. After determining the optimal distance measures for the given data set together with their optimal parameters, all determined distance measures are aggregated to form a single total distance measure.
The total distance measure is applied to the final classification decisions. The actual classification process is still based on the nearest prototype vector principle: a sample belongs to the class represented by the nearest prototype vector when measured with the optimized total distance measure. During the training process the differential evolution algorithm determines the optimal class vectors, selects the optimal distance measures, and determines the optimal values for the free parameters of each selected distance measure. The results obtained with the above method confirm that the choice of distance measure is one of the most crucial factors for obtaining high classification accuracy. The results also demonstrate that it is possible to build a classifier that is able to select the optimal distance measure for the given data set automatically and systematically. After the optimal distance measures and their parameters have been found, the resulting distances are aggregated to form a total distance, which measures the deviation between the class vectors and the samples and is thus used to classify the samples. This thesis also discusses two types of aggregation operators, namely ordered weighted averaging (OWA) based multi-distances and generalized ordered weighted averaging (GOWA). These aggregation operators were applied in this work to the aggregation of the normalized distance values. The results demonstrate that a proper combination of aggregation operator and weight generation scheme plays an important role in obtaining good classification accuracy. The main outcomes of the work are six new generalized versions of the differential evolution classifier, all of which demonstrated good results in the classification tasks.
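The core idea of training a nearest-prototype classifier with differential evolution can be sketched as follows. This is a minimal illustration, not the thesis implementation: the prototype vectors are encoded as the DE parameter vector and the training misclassification rate is the cost function; the distance pool and its parameter optimization are omitted, with plain Euclidean distance standing in for the optimized total distance.

```python
import numpy as np
from scipy.optimize import differential_evolution

# Two well-separated synthetic classes in 2-D.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 1.0, (40, 2)), rng.normal(4.0, 1.0, (40, 2))])
y = np.array([0] * 40 + [1] * 40)
N_CLASSES, N_FEAT = 2, 2

def predict(prototypes, samples):
    # Nearest-prototype rule: each sample takes the class of the closest vector.
    d = np.linalg.norm(samples[:, None, :] - prototypes[None, :, :], axis=2)
    return d.argmin(axis=1)

def cost(flat):
    # DE minimizes the misclassification rate over the flattened prototypes.
    protos = flat.reshape(N_CLASSES, N_FEAT)
    return np.mean(predict(protos, X) != y)

result = differential_evolution(cost,
                                bounds=[(-2.0, 6.0)] * (N_CLASSES * N_FEAT),
                                seed=1, maxiter=50)
train_acc = 1.0 - cost(result.x)
print(f"training accuracy: {train_acc:.0%}")
```

Because differential evolution needs no gradients, the same loop works unchanged when the cost additionally encodes the choice of distance measure and its control parameters, as described above.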
Abstract:
The authors propose a clinical classification to monitor the evolution of tetanus patients, ranging from grade I to IV according to severity. It was applied on admission, and repeated on alternate days up to the 10th day, to patients aged 12 years or older admitted to the State University Hospital, Recife, Brazil. Patients were also classified upon admission according to three prognostic indicators to determine whether the proposed classification is in agreement with the traditionally used indicators. Upon admission, the distribution of the 64 patients among the different levels of the proposed classification was similar for the groups of better and worse prognosis according to the three indicators (P > 0.05), with most of the patients belonging to grades I and II of the proposed classification. In the later reclassifications, severe forms of tetanus (grades III and IV) were more frequent in the categories of worse prognosis, and these differences were statistically significant. There was a reduction in the proportion of mild forms (grades I and II) of tetanus over time for the categories of worse prognostic indicators (chi-square for trend: P = 0.00006, 0.03, and 0.00000), whereas no such trend was observed for the categories of better prognosis. This serially used classification reflected the prognosis of the traditional indicators and permitted comparison of the dynamics of the disease in different groups. Thus, it becomes a useful tool for monitoring patients by detecting clinical category changes over time, and for assessing responses to different therapeutic measures.
Abstract:
Feature extraction is the part of pattern recognition in which the sensor data are transformed into a form more suitable for the machine to interpret. The purpose of this step is also to reduce the amount of information passed to the next stages of the system, while preserving the information essential for discriminating the data into different classes. For instance, in image analysis the raw image intensities are vulnerable to various environmental effects, such as lighting changes, and feature extraction can be used as a means of detecting features that are invariant to certain types of illumination change. Finally, classification makes decisions based on the previously transformed data. The main focus of this thesis is on developing new methods for embedded feature extraction based on local non-parametric image descriptors. Feature analysis is also carried out for the selected image features. Low-level Local Binary Pattern (LBP) based features play the main role in the analysis. In the embedded domain, the pattern recognition system must usually meet strict performance constraints, such as high speed, compact size and low power consumption. The characteristics of the final system can be seen as a trade-off between these metrics, which is largely affected by the decisions made during the implementation phase. The implementation alternatives of LBP-based feature extraction are explored in the embedded domain in the context of focal-plane vision processors. In particular, the thesis demonstrates LBP extraction with the MIPA4k massively parallel focal-plane processor IC. Higher-level processing is also incorporated by means of a framework for implementing a single-chip face recognition system. Furthermore, a new method for determining optical flow based on LBPs, designed in particular for the embedded domain, is presented.
Inspired by some of the principles observed through the feature analysis of Local Binary Patterns, an extension to the well-known non-parametric rank transform is proposed, and its performance is evaluated in face recognition experiments with a standard dataset. Finally, an a priori model in which LBPs are seen as combinations of n-tuples is also presented.
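The basic 3x3 LBP operator underlying these features can be sketched in a few lines; this illustration uses plain NumPy rather than an embedded or focal-plane implementation, and omits the rotation-invariant and uniform-pattern variants.

```python
import numpy as np

def lbp_3x3(img):
    """Return the 8-bit LBP code image for the interior pixels of img."""
    c = img[1:-1, 1:-1]  # centre pixels
    # Neighbour offsets in clockwise order starting from the top-left.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = np.zeros_like(c, dtype=np.uint8)
    for bit, (dy, dx) in enumerate(offsets):
        # Shifted view of the image aligned with the centre pixels.
        neigh = img[1 + dy: img.shape[0] - 1 + dy,
                    1 + dx: img.shape[1] - 1 + dx]
        # Set the bit where the neighbour is at least as bright as the centre.
        code |= (neigh >= c).astype(np.uint8) << bit
    return code

img = np.array([[10, 10, 10],
                [10, 20, 10],
                [10, 10, 10]], dtype=np.uint8)
print(lbp_3x3(img))  # centre brighter than all neighbours -> code 0
```

Because each bit depends only on the sign of a local intensity difference, the code is unchanged by any monotonic grayscale transformation, which is the illumination invariance referred to above.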
Abstract:
In a serial feature-positive conditional discrimination procedure, the properties of a target stimulus A are defined by the presence or absence of a feature stimulus X preceding it. In the present experiment, composite features preceded targets associated with operant responses of two different topographies (right and left bar pressing); matching and non-matching-to-sample arrangements were also used. Five water-deprived Wistar rats were trained in 6 different trials: X-R→Ar and X-L→Al, in which X and A were visual stimuli of the same modality and reinforcement was contingent on pressing either the right (r) or left (l) bar whose light was on during the feature (matching-to-sample); Y-R→Bl and Y-L→Br, in which Y and B were auditory stimuli of the same modality and reinforcement was contingent on pressing the bar whose light was off during the feature (non-matching-to-sample); and A- and B- alone. After 100 training sessions, the animals were submitted to transfer tests with the targets used plus a new one (auditory click). Average percentages of stimuli with a response were measured. Acquisition occurred completely only for Y-L→Br+; however, complex associations were established along training. Transfer was not complete during the tests, since concurrent effects of extinction and response generalization also occurred. The results suggest the use of both simple conditioning and configurational strategies, favoring the most recent theories of conditional discrimination learning. The implications of the use of complex arrangements for discussing these theories are considered.
Abstract:
Since the times preceding the Second World War, the subject of aircraft tracking has been a core interest of both military and non-military aviation. Over the subsequent years, advances in both technology and radar configuration allowed users to deploy radar in numerous fields, such as over-the-horizon radar, ballistic missile early warning systems, and forward scatter fences. The last of these was arranged in a bistatic configuration. The bistatic radar has continuously re-emerged over the last eighty years because of its intriguing capabilities and its challenging configuration and formulation. The bistatic radar arrangement is used as the basis of all the analyses presented in this work. The aircraft tracking method based on VHF Doppler-only information, developed in the first part of this study, relies solely on Doppler frequency readings in relation to the time instances of their appearance. The corresponding inverse problem is solved by utilising a multistatic radar scenario with two receivers and one transmitter, using their frequency readings as the basis for aircraft trajectory estimation. The quality of the resulting trajectory is then compared with ground-truth information based on ADS-B data. The second part of the study deals with the development of a method for instantaneous Doppler curve extraction from within a VHF time-frequency representation of the transmitted signal, with three receivers and one transmitter, based on a priori knowledge of the probability density function of the first-order derivative of the Doppler shift, and on a system of blocks for identifying, classifying and predicting the Doppler signal. The extraction capabilities of this set-up are tested with a recorded TV signal and simulated synthetic spectrograms. Further analyses are devoted to more comprehensive testing of the capabilities of the extraction method.
Besides testing the method, aircraft classification is performed on the extracted bistatic radar cross-section profiles and the correlations between them for different types of aircraft. In order to properly estimate the profiles, the ADS-B aircraft location information is adjusted based on the extracted Doppler frequency and then used for bistatic radar cross-section estimation. The classification is based on seven types of aircraft grouped by their size into three classes.
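The geometry behind Doppler-only tracking can be illustrated with the standard bistatic Doppler relation, in which the shift is proportional to the rate of change of the total transmitter-target-receiver path length. This is a hedged sketch: the carrier frequency, positions and velocity below are invented for illustration and are not values from the study.

```python
import numpy as np

C = 299_792_458.0    # speed of light, m/s
F_TX = 100e6         # assumed VHF carrier frequency, Hz (illustrative)
WAVELENGTH = C / F_TX

def bistatic_doppler(target, velocity, tx, rx):
    """Doppler shift (Hz) from the rate of change of the tx-target-rx path."""
    u_tx = (target - tx) / np.linalg.norm(target - tx)  # unit vector tx->target
    u_rx = (target - rx) / np.linalg.norm(target - rx)  # unit vector rx->target
    range_rate = velocity @ (u_tx + u_rx)  # d/dt (R_tx + R_rx)
    return -range_rate / WAVELENGTH

tx = np.array([0.0, 0.0, 0.0])          # transmitter position, m
rx = np.array([50e3, 0.0, 0.0])         # receiver 50 km away on the x axis
target = np.array([40e3, 30e3, 10e3])   # aircraft position, m
velocity = np.array([200.0, 0.0, 0.0])  # 200 m/s along x

print(f"Doppler shift: {bistatic_doppler(target, velocity, tx, rx):.1f} Hz")
```

The inverse problem described above runs this relation backwards: given measured shifts at several receivers over time, the trajectory (positions and velocities) consistent with them is estimated. Note that a target moving along a bistatic iso-range ellipse, e.g. crossing the perpendicular bisector of the baseline, produces zero Doppler, which is one source of the problem's difficulty.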
Abstract:
The aim of this study was to analyze the clinical aspects, hearing evolution and efficacy of clinical treatment of patients with sudden sensorineural hearing loss (SSNHL). This was a prospective clinical study of 136 consecutive patients with SSNHL divided into three groups after diagnostic evaluation: patients with defined etiology (DE, N = 13, 10%), concurrent diseases (CD, N = 63, 46.04%) and idiopathic sudden sensorineural hearing loss (ISSHL, N = 60, 43.9%). Initial treatment consisted of prednisone and pentoxifylline. Clinical aspects and hearing evolution for up to 6 months were evaluated. In group CD, 73% of the patients presented metabolic decompensation at the initial evaluation, and this group was significantly older (53.80 years) than groups DE (41.93 years) and ISSHL (39.13 years). Comparison of the mean initial and final hearing loss of the three groups revealed a significant hearing improvement for group CD (P = 0.001) and group ISSHL (P = 0.001). Group DE did not present a significant difference in thresholds. The clinical classification for SSNHL allows the identification of significant differences regarding age, initial and final hearing impairment, and likelihood of response to therapy. Older age and the presence of coexisting disease were associated with a greater initial hearing impact and poorer hearing recovery after 6 months. Patients with defined etiology presented a much more limited response to therapy. The occurrence of decompensated metabolic and cardiovascular diseases, and the possibility of a first manifestation of autoimmune disease or cerebellopontine angle tumors, justify an adequate protocol for the investigation of SSNHL.
Abstract:
The objective of the present study was to evaluate the characteristics of acute kidney injury (AKI) in AIDS patients and the value of the RIFLE classification for predicting outcome. The study was conducted on AIDS patients admitted to an infectious diseases hospital in Brazil. The patients with AKI were classified according to the RIFLE classification: R (risk), I (injury), F (failure), L (loss), and E (end-stage renal disease). Univariate and multivariate analyses were used to evaluate the factors associated with AKI. A total of 532 patients with a mean age of 35 ± 8.5 years were included in this study. AKI was observed in 37% of the cases. Patients were classified as "R" (18%), "I" (7.7%) and "F" (11%). Independent risk factors for AKI were thrombocytopenia (OR = 2.9, 95%CI = 1.5-5.6, P < 0.001) and elevation of aspartate aminotransferase (AST) (OR = 3.5, 95%CI = 1.8-6.6, P < 0.001). Overall mortality was 25.7% and was higher among patients with AKI (40.2 vs 17%, P < 0.001). AKI was associated with death, and mortality increased according to RIFLE class: "R" (OR 2.4), "I" (OR 3.0) and "F" (OR 5.1), P < 0.001. AKI is a frequent complication in AIDS patients and is associated with increased mortality. The RIFLE classification is an important indicator of poor outcome in AIDS patients.
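The creatinine arm of the RIFLE staging used in the study can be sketched as a small function. This is a simplified illustration: the urine-output and GFR criteria and the L and E categories are omitted, keeping only the standard 1.5x, 2x and 3x baseline-creatinine thresholds.

```python
def rifle_stage(baseline_cr, current_cr):
    """Return 'R', 'I', 'F', or None (no AKI) by the creatinine criterion."""
    ratio = current_cr / baseline_cr
    if ratio >= 3.0:
        return "F"   # Failure: serum creatinine >= 3x baseline
    if ratio >= 2.0:
        return "I"   # Injury:  serum creatinine >= 2x baseline
    if ratio >= 1.5:
        return "R"   # Risk:    serum creatinine >= 1.5x baseline
    return None

print(rifle_stage(1.0, 2.2))  # prints "I"
```

The mortality gradient reported above (OR 2.4, 3.0 and 5.1 for "R", "I" and "F") is what makes this simple ordinal staging useful as a prognostic indicator.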
Abstract:
High-resolution proton nuclear magnetic resonance spectroscopy (¹H MRS) can be used to detect biochemical changes in vitro caused by distinct pathologies. It can reveal distinct metabolic profiles of brain tumors, although the accurate analysis and classification of different spectra remains a challenge. In this study, the pattern recognition method partial least squares discriminant analysis (PLS-DA) was used to classify 11.7 T ¹H MRS spectra of brain tissue extracts from patients with brain tumors into four classes (high-grade neuroglial, low-grade neuroglial, non-neuroglial, and metastasis) and a group of control brain tissue. PLS-DA revealed 9 metabolites as the most important for group differentiation: γ-aminobutyric acid, acetoacetate, alanine, creatine, glutamate/glutamine, glycine, myo-inositol, N-acetylaspartate, and choline compounds. Leave-one-out cross-validation showed that PLS-DA was efficient in group characterization. The metabolic patterns detected can be explained on the basis of previous multimodal studies of tumor metabolism and are consistent with neoplastic cell abnormalities, possibly related to high turnover, resistance to apoptosis, osmotic stress, and the tumor tendency to use alternative energetic pathways such as glycolysis and ketogenesis.
Abstract:
In vivo proton magnetic resonance spectroscopy (¹H-MRS) is a technique capable of assessing biochemical content and pathways in normal and pathological tissue. In the brain, ¹H-MRS complements the information given by magnetic resonance images. The main goal of the present study was to assess the accuracy of ¹H-MRS for the classification of brain tumors in a pilot study comparing results obtained by manual and semi-automatic quantification of metabolites. In vivo single-voxel ¹H-MRS was performed in 24 control subjects and 26 patients with brain neoplasms that included meningiomas, high-grade neuroglial tumors and pilocytic astrocytomas. Seven metabolite groups (lactate, lipids, N-acetyl-aspartate, glutamate and glutamine group, total creatine, total choline, myo-inositol) were evaluated in all spectra by two methods: a manual one consisting of integration of manually defined peak areas, and the advanced method for accurate, robust and efficient spectral fitting (AMARES), a semi-automatic quantification method implemented in the jMRUI software. Statistical methods included discriminant analysis and the leave-one-out cross-validation method. Both manual and semi-automatic analyses detected differences in metabolite content between tumor groups and controls (P < 0.005). The classification accuracy obtained with the manual method was 75% for high-grade neuroglial tumors, 55% for meningiomas and 56% for pilocytic astrocytomas, while for the semi-automatic method it was 78, 70, and 98%, respectively. Both methods classified all control subjects correctly. The study demonstrated that ¹H-MRS accurately differentiated normal from tumoral brain tissue and confirmed the superiority of the semi-automatic quantification method.
Abstract:
In experimental studies, several parameters, such as body weight, body mass index, adiposity index, and dual-energy X-ray absorptiometry, have commonly been used to demonstrate increased adiposity and investigate the mechanisms underlying obesity and sedentary lifestyles. However, these investigations have not classified the degree of adiposity nor defined adiposity categories for rats, such as normal, overweight, and obese. The aim of the study was to characterize the degree of adiposity in rats fed a high-fat diet using cluster analysis and to create adiposity intervals in an experimental model of obesity. Thirty-day-old male Wistar rats were fed a normal (n=41) or a high-fat (n=43) diet for 15 weeks. Obesity was defined based on the adiposity index; and the degree of adiposity was evaluated using cluster analysis. Cluster analysis allowed the rats to be classified into two groups (overweight and obese). The obese group displayed significantly higher total body fat and a higher adiposity index compared with those of the overweight group. No differences in systolic blood pressure or nonesterified fatty acid, glucose, total cholesterol, or triglyceride levels were observed between the obese and overweight groups. The adiposity index of the obese group was positively correlated with final body weight, total body fat, and leptin levels. Despite the classification of sedentary rats into overweight and obese groups, it was not possible to identify differences in the comorbidities between the two groups.
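The clustering step above can be sketched as follows. The abstract does not name the clustering algorithm, so k-means with k = 2 is used here as a plausible stand-in, applied to invented adiposity-index values.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(3)
# Invented adiposity-index values for 43 high-fat-diet rats.
adiposity = np.concatenate([rng.normal(6.0, 0.5, 20),   # overweight-like
                            rng.normal(9.5, 0.7, 23)])  # obese-like

km = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = km.fit_predict(adiposity.reshape(-1, 1))

# Name the cluster with the larger centroid "obese"; the boundary between
# the two clusters defines the adiposity interval separating the groups.
obese = km.cluster_centers_.ravel().argmax()
print("mean adiposity, obese group:",
      round(float(adiposity[labels == obese].mean()), 2))
```

In practice the cluster boundary, rather than an arbitrary cut-off, is what yields the adiposity intervals for classifying animals as overweight or obese.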
Abstract:
Personalized medicine will revolutionize our capabilities to combat disease. Working toward this goal, a fundamental task is the deciphering of genetic variants that are predictive of complex diseases. Modern studies, in the form of genome-wide association studies (GWAS), have afforded researchers the opportunity to reveal new genotype-phenotype relationships through the extensive scanning of genetic variants. These studies typically contain over half a million genetic features for thousands of individuals. Examining these data with methods other than univariate statistics is a challenging task requiring advanced algorithms that are scalable to the genome-wide level. In the future, next-generation sequencing (NGS) studies will contain an even larger number of common and rare variants. Machine learning-based feature selection algorithms have been shown to be able to effectively create predictive models for various genotype-phenotype relationships. This work explores the problem of selecting genetic variant subsets that are the most predictive of complex disease phenotypes through various feature selection methodologies, including filter, wrapper and embedded algorithms. The examined machine learning algorithms were demonstrated not only to be effective at predicting the disease phenotypes, but also to do so efficiently through the use of computational shortcuts. While much of the work could be run on high-end desktops, some of it was further extended so that it could be implemented on parallel computers, helping to ensure that the methods will also scale to NGS data sets. Further, these studies analyzed the relationships between various feature selection methods and demonstrated the need for careful testing when selecting an algorithm.
It was shown that there is no universally optimal algorithm for variant selection in GWAS; rather, methodologies need to be selected based on the desired outcome, such as the number of features to be included in the prediction model. It was also demonstrated that without proper model validation, for example using nested cross-validation, the models can yield overly optimistic prediction accuracies and decreased generalization ability. It is through the implementation and application of machine learning methods that one can extract predictive genotype-phenotype relationships and biological insights from genetic data sets.
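The nested cross-validation recommended above can be sketched with scikit-learn: the feature selection step is fitted inside each outer training fold, so selection leakage cannot inflate the reported accuracy. The data are synthetic, and a simple univariate filter plus logistic regression stand in for the thesis's methods.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.pipeline import Pipeline

# Synthetic stand-in for a genotype matrix: many features, few informative.
X, y = make_classification(n_samples=200, n_features=500, n_informative=10,
                           random_state=0)

# The pipeline couples selection and fitting, so both are re-fitted per fold.
pipe = Pipeline([("select", SelectKBest(f_classif)),
                 ("clf", LogisticRegression(max_iter=1000))])

# Inner loop tunes the number of selected features; the outer loop estimates
# the generalization accuracy of the whole select-then-fit procedure.
inner = GridSearchCV(pipe, {"select__k": [5, 20, 50]}, cv=3)
outer_scores = cross_val_score(inner, X, y, cv=5)
print(f"nested CV accuracy: {outer_scores.mean():.2f}")
```

Running the filter on all samples before cross-validating, by contrast, is exactly the leakage that produces the overly optimistic accuracies mentioned above.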
Abstract:
The subject of the thesis is automatic sentence compression with machine learning, such that the compressed sentences remain grammatical and retain their essential meaning. There are multiple possible uses for the compression of natural language sentences. In this thesis the focus is on the generation of television program subtitles, which are often compressed versions of the original script of the program. The main part of the thesis consists of machine learning experiments for automatic sentence compression using different approaches to the problem. The machine learning methods used in this work are linear-chain conditional random fields (CRFs) and support vector machines. We also examine which automatic text analysis methods provide useful features for the task. The data used for machine learning were supplied by Lingsoft Inc. and consist of Finnish-language subtitles in both compressed and uncompressed form. The models are compared to a baseline method taken from the literature, and comparisons are made both automatically and through human evaluation, because of the potentially subjective nature of the output. The best result is achieved using a CRF sequence classifier with a rich feature set. All of the text analysis methods help classification, and the most useful is morphological analysis.
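The sequence-labeling framing of compression can be illustrated as binary keep/drop classification of tokens. To keep the sketch dependency-light, a logistic regression over simple token features stands in for the linear-chain CRF, and the tiny training pairs are invented.

```python
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression

def token_features(tokens, i):
    # Deliberately simple features; the thesis uses a much richer set,
    # including morphological analyses.
    return {"word": tokens[i].lower(), "length": len(tokens[i]),
            "first": i == 0, "last": i == len(tokens) - 1}

# Invented parallel data: tokens with keep(1)/drop(0) labels, as would be
# derived by aligning an original sentence with its compressed subtitle.
sentences = [(["the", "very", "big", "dog", "barked", "loudly"],
              [1, 0, 1, 1, 1, 0]),
             (["a", "really", "small", "cat", "slept"],
              [1, 0, 1, 1, 1])]

X = [token_features(toks, i) for toks, _ in sentences for i in range(len(toks))]
y = [lab for _, labs in sentences for lab in labs]

vec = DictVectorizer()
clf = LogisticRegression(max_iter=1000).fit(vec.fit_transform(X), y)

tokens = ["the", "very", "small", "dog", "slept"]
keep = clf.predict(vec.transform([token_features(tokens, i)
                                  for i in range(len(tokens))]))
print(" ".join(t for t, k in zip(tokens, keep) if k))
```

A CRF improves on this per-token classifier by modeling dependencies between adjacent keep/drop decisions, which helps keep the compressed output grammatical.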
Abstract:
Enzyme technology is an ever-growing field of knowledge and, in recent years, this technology has raised renewed interest, due to the search for new paradigms in several productive processes. Lipases, esterases and cutinases are enzymes used in a wide range of processes involving synthesis and hydrolysis reactions. The objective of this work was to investigate and compare the specific lipase and esterase activities of five enzymes - four already classified as lipases and one classified as cutinase - in the presence of natural and synthetic substrates. All tested enzymes presented both esterase and lipase specific activities. The highest specific esterase activity was observed for Aspergillus 1068 lipase in natural substrate and for F. oxysporum cutinase in synthetic substrate, while the highest specific lipase activity was observed for Geotrichum sp. lipase in natural substrate and for F. oxysporum cutinase in synthetic substrate. These results display some interface-independent lipolytic activity for all lipases tested. This is in accordance with the rationale that a new and broader definition of lipases may be necessary.