940 results for MULTI-LABEL CLASSIFICATION


Relevance: 30.00%

Abstract:

Multi-element analysis of honey samples was carried out with the aim of developing a reliable method of tracing the origin of honey. Forty-two chemical elements were determined (Al, Cu, Pb, Zn, Mn, Cd, Tl, Co, Ni, Rb, Ba, Be, Bi, U, V, Fe, Pt, Pd, Te, Hf, Mo, Sn, Sb, P, La, Mg, I, Sm, Tb, Dy, Sd, Th, Pr, Nd, Tm, Yb, Lu, Gd, Ho, Er, Ce, Cr) by inductively coupled plasma mass spectrometry (ICP-MS). Then, three machine learning tools for classification and two for attribute selection were applied in order to prove that it is possible to use data mining tools to find the region where honey originated. Our results clearly demonstrate the potential of Support Vector Machine (SVM), Multilayer Perceptron (MLP) and Random Forest (RF) chemometric tools for honey origin identification. Moreover, the selection tools allowed a reduction from 42 trace element concentrations to only 5. (C) 2012 Elsevier Ltd. All rights reserved.

Relevance: 30.00%

Abstract:

Metronidazole is a BCS (Biopharmaceutics Classification System) class 1 drug, traditionally considered the drug of choice for treating infections caused by protozoa and anaerobic microorganisms. This study aimed to evaluate the bioequivalence of 2 different marketed 250 mg metronidazole immediate-release tablets. A randomized, open-label, 2 x 2 crossover study was performed in healthy Brazilian volunteers under fasting conditions with a 7-day washout period. The formulations were administered as a single oral dose and blood was sampled over 48 h. Metronidazole plasma concentrations were determined by a liquid chromatography mass spectrometry (LC-MS/MS) method. The plasma concentration vs. time profile was generated for each volunteer and the pharmacokinetic parameters C-max, T-max, AUC(0-t), AUC(0-infinity), k(e), and t(1/2) were calculated using a noncompartmental model. Bioequivalence between the pharmaceutical formulations was determined by calculating 90% confidence intervals (CIs) for the ratios of C-max, AUC(0-t), and AUC(0-infinity) values for test and reference using log-transformed data. Twenty-two healthy volunteers (11 men, 11 women; mean (SD) age, 28 (6.5) years [range, 21-45 years]; mean (SD) weight, 66 (9.3) kg [range, 51-81 kg]; mean (SD) height, 169 (6.5) cm [range, 156-186 cm]) were enrolled in and completed the study. The 90% CIs for C-max (0.92-1.06), AUC(0-t) (0.97-1.02), and AUC(0-infinity) (0.97-1.03) values for the test and reference products fell within the interval of 0.80-1.25 proposed by most regulatory agencies, including the Brazilian agency ANVISA. No clinically significant adverse effects were reported. Based on the pharmacokinetic analysis, it was concluded that the test 250 mg metronidazole formulation is bioequivalent to the reference product according to the Brazilian agency's requirements.
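The acceptance criterion in this abstract can be illustrated with a short check. This is only a minimal sketch, not the authors' analysis: it takes the 90% CI bounds reported in the abstract as given and tests whether each interval lies entirely within 0.80-1.25. The helper name `within_bioequivalence_limits` is hypothetical.

```python
# 90% confidence intervals for the test/reference ratios, as reported in the abstract.
reported_cis = {
    "C-max": (0.92, 1.06),
    "AUC(0-t)": (0.97, 1.02),
    "AUC(0-infinity)": (0.97, 1.03),
}


def within_bioequivalence_limits(ci, lower=0.80, upper=1.25):
    """Return True if the whole interval lies inside [lower, upper]."""
    low, high = ci
    return lower <= low and high <= upper


# Evaluate each pharmacokinetic parameter separately, then combine.
results = {name: within_bioequivalence_limits(ci) for name, ci in reported_cis.items()}
all_within_limits = all(results.values())

for name, ok in results.items():
    print(f"{name}: {'within' if ok else 'outside'} 0.80-1.25")
```

The check requires both bounds of an interval to fall inside the limits, so an interval that merely overlaps 0.80-1.25 does not pass.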

Relevance: 30.00%

Abstract:

Decision-tree induction algorithms are among the most popular techniques for dealing with classification problems. However, traditional decision-tree induction algorithms implement a greedy approach for node splitting that is inherently susceptible to convergence to local optima. Evolutionary algorithms can avoid the problems associated with a greedy search and have been successfully applied to the induction of decision trees. Previously, we proposed a lexicographic multi-objective genetic algorithm for decision-tree induction, named LEGAL-Tree. In this work, we propose extending this approach substantially, particularly w.r.t. two important evolutionary aspects: the initialization of the population and the fitness function. We carry out a comprehensive set of experiments to validate our extended algorithm. The experimental results suggest that it is able to outperform both traditional algorithms for decision-tree induction and another evolutionary algorithm in a variety of application domains.

Relevance: 30.00%

Abstract:

Semi-supervised learning is a classification paradigm in which just a few labeled instances are available for the training process. To compensate for this small amount of initial label information, the information provided by the unlabeled instances is also considered. In this paper, we propose a nature-inspired semi-supervised learning technique based on attraction forces. Instances are represented as points in a k-dimensional space, and the movement of data points is modeled as a dynamical system. As the system runs, data items with the same label cooperate with each other, and data items with different labels compete with each other to attract unlabeled points by applying a specific force function. In this way, all unlabeled data items can be classified when the system reaches its stable state. Stability analysis for the proposed dynamical system is performed and some heuristics are proposed for parameter setting. Simulation results show that the proposed technique achieves good classification results on artificial data sets and is comparable to well-known semi-supervised techniques on benchmark data sets.

Relevance: 30.00%

Abstract:

Given a large image set in which very few images have labels, how can we guess labels for the remaining majority? How can we spot images that need brand new labels different from the predefined ones? How can we summarize these data to route the user's attention to what really matters? Here we answer all these questions. Specifically, we propose QuMinS, a fast, scalable solution to two problems: (i) Low-labor labeling (LLL): given an image set in which very few images have labels, find the most appropriate labels for the rest; and (ii) Mining and attention routing: in the same setting, find clusters, the top-N_O outlier images, and the N_R images that best represent the data. Experiments on satellite images spanning up to 2.25 GB show that, in contrast to the state-of-the-art labeling techniques, QuMinS scales linearly with the data size and is up to 40 times faster than top competitors (GCap) while still achieving better or equal accuracy. It also spots images that potentially require unpredicted labels, and it works even with tiny initial label sets, i.e., nearly five examples. We also report a case study of our method's practical usage to show that QuMinS is a viable tool for automatic coffee crop detection from remote sensing images.

Relevance: 30.00%

Abstract:

This work proposes a novel texture descriptor based on fractal theory. The method is based on the Bouligand-Minkowski descriptors. We decompose the original image recursively into four equal parts. In each recursion step, we estimate the average and the deviation of the Bouligand-Minkowski descriptors computed over each part. We then extract entropy features from both the average and the deviation. The proposed descriptors are obtained by concatenating these measures. The method is tested in a classification experiment on well-known datasets, namely Brodatz and Vistex. The results demonstrate that the novel technique achieves better results than classical and state-of-the-art texture descriptors, such as Local Binary Patterns, Gabor wavelets and co-occurrence matrices.
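The recursive decomposition step can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the authors' method: the Bouligand-Minkowski descriptor is replaced by a hypothetical stand-in (`placeholder_descriptor`, the mean pixel intensity), the entropy step is omitted, and the image is assumed to be a 2D list whose side lengths stay even at every level. All function names are hypothetical.

```python
from statistics import mean, pstdev


def split_into_quadrants(block):
    """Split a 2D list with even height and width into four equal quadrants."""
    half_h, half_w = len(block) // 2, len(block[0]) // 2
    top, bottom = block[:half_h], block[half_h:]
    return [
        [row[:half_w] for row in top],
        [row[half_w:] for row in top],
        [row[:half_w] for row in bottom],
        [row[half_w:] for row in bottom],
    ]


def placeholder_descriptor(block):
    """Stand-in for a real texture descriptor: mean pixel intensity (assumption)."""
    return mean(value for row in block for value in row)


def multilevel_features(image, levels=2, descriptor=placeholder_descriptor):
    """Concatenate the mean and deviation of the descriptor over the parts at each level."""
    features = []
    blocks = [image]
    for _ in range(levels):
        # Each recursion step splits every current block into four equal parts.
        blocks = [quadrant for block in blocks for quadrant in split_into_quadrants(block)]
        values = [descriptor(block) for block in blocks]
        features.extend([mean(values), pstdev(values)])
    return features


# Hypothetical 4 x 4 image used only to exercise the code.
image = [
    [0, 0, 4, 4],
    [0, 0, 4, 4],
    [8, 8, 12, 12],
    [8, 8, 12, 12],
]
features = multilevel_features(image, levels=2)
```

With two levels the feature vector has four entries (one mean and one deviation per level); a real descriptor would return a vector per part rather than a single number.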

Relevance: 30.00%

Abstract:

This report presents an innovative satellite-based monitoring approach applied to the Iraqi Marshlands to survey the extent and distribution of marshland re-flooding and to assess the development of wetland vegetation cover. The study, conducted in collaboration with MEEO Srl, uses images from the (A)ATSR sensor onboard the ESA ENVISAT satellite to collect data at multi-temporal scales, and an analysis was adopted to observe the evolution of marshland re-flooding. The methodology uses a multi-temporal pixel-based approach based on classification maps produced by the classification tool SOIL MAPPER®. The catalogue of the classification maps is available as a web service through the Service Support Environment Portal (SSE, supported by ESA). The inundation of the Iraqi marshlands, which has been continuous since April 2003, is characterized by a high degree of variability, ad-hoc interventions and uncertainty. Given the security constraints and vastness of the Iraqi marshlands, as well as cost-effectiveness considerations, satellite remote sensing was the only viable tool to observe the changes taking place on a continuous basis. The proposed system (ALCS, AATSR LAND CLASSIFICATION SYSTEM) avoids direct use of the (A)ATSR images and instead applies LULCC evolution models directly to the "stock" of classified maps. This approach is made possible by the availability of a 13-year classified image database, conceived and implemented in the CARD project (http://earth.esa.int/rtd/Projects/#CARD). The approach presented here evolves toward an innovative, efficient and fast method to exploit the potential of multi-temporal LULCC analysis of (A)ATSR images.
The two main objectives of this work both involve an assessment: the first is to assess the modeling ability of the web application ALCS using AATSR images classified with SOIL MAPPER®, and the second is to evaluate the magnitude, character and extent of wetland rehabilitation.

Relevance: 30.00%

Abstract:

Communication is currently shifting from a centralized scenario, where communication media such as newspapers, radio and TV programs produce information and people are just consumers, to a completely different decentralized scenario, where everyone is potentially an information producer through social networks, blogs and forums that allow real-time worldwide information exchange. As a result of their widespread diffusion, these new instruments have started playing an important socio-economic role. They are the most used communication media and, as a consequence, they constitute the main source of information that enterprises, political parties and other organizations can rely on. Analyzing data stored in servers all over the world is feasible by means of Text Mining techniques such as Sentiment Analysis, which aims to extract opinions from huge amounts of unstructured text. This could make it possible to determine, for instance, the degree of user satisfaction with products, services, politicians and so on. In this context, this dissertation presents new Document Sentiment Classification methods based on the mathematical theory of Markov Chains. All these approaches rely on a Markov Chain based model, which is language independent and whose key features are simplicity and generality, which make it attractive compared with previous, more sophisticated techniques. Every discussed technique has been tested in both Single-Domain and Cross-Domain Sentiment Classification, comparing its performance with that of two previous works. The analysis shows that some of the examined algorithms produce results comparable with the best methods in the literature, for both single-domain and cross-domain tasks, in 2-class (i.e. positive and negative) Document Sentiment Classification.
However, there is still room for improvement: this work also indicates how performance could be enhanced, namely that a good novel feature selection process would be enough to outperform the state of the art. Furthermore, since some of the proposed approaches show promising results in 2-class Single-Domain Sentiment Classification, future work will validate these results in tasks with more than 2 classes.

Relevance: 30.00%

Abstract:

We conducted an explorative, cross-sectional, multi-centre study in order to identify the most common problems of people with any kind of (primary) sleep disorder in a clinical setting using the International Classification of Functioning, Disability and Health (ICF) as a frame of reference. Data were collected from patients using a structured face-to-face interview of 45-60 min duration. A case record form for health professionals containing the extended ICF Checklist, sociodemographic variables and disease-specific variables was used. The study centres collected data from 99 individuals with sleep disorders. The identified categories include 48 (32%) for body functions, 13 (9%) for body structures, 55 (37%) for activities and participation and 32 (22%) for environmental factors. 'Sleep functions' (100%) and 'energy and drive functions' (85%) were the most severely impaired second-level categories of body functions, followed by 'attention functions' (78%) and 'temperament and personality functions' (77%). With regard to the component activities and participation, patients felt most restricted in the categories of 'watching' (e.g. TV) (82%), 'recreation and leisure' (75%) and 'carrying out daily routine' (74%). Within the component environmental factors, the categories 'support of immediate family', 'health services, systems and policies' and 'products or substances for personal consumption [medication]' were the most important facilitators; 'time-related changes', 'light' and 'climate' were the most important barriers. The study identified a large variety of functional problems reflecting the complexity of sleep disorders. The ICF has the potential to provide a comprehensive framework for the description of functional health in individuals with sleep disorders in a clinical setting.

Relevance: 30.00%

Abstract:

Intravenous immunoglobulin (IVIG) is the first-line therapy for multifocal motor neuropathy (MMN). This open-label multi-centre study (NCT00701662) assessed the efficacy, safety, and convenience of subcutaneous immunoglobulin (SCIG) in patients with MMN over 6 months, as an alternative to IVIG. Eight MMN patients (42-66 years), on stable IVIG dosing, received weekly SCIG at doses equivalent to previous IVIG using a "smooth transition protocol". Primary efficacy endpoint was the change from baseline to week 24 in muscle strength. Disability, motor function, and health-related quality of life (HRQL) endpoints were also assessed. One patient deteriorated despite dose increase and was withdrawn. Muscle strength, disability, motor function, and health status were unchanged in all seven study completers who rated home treatment as extremely good. Four experienced 18 adverse events, of which only two were moderate. This study suggests that MMN patients with stable clinical course on regular IVIG can be switched to SCIG at the same monthly dose without deterioration and with a sustained overall improvement in HRQL.

Relevance: 30.00%

Abstract:

In 1998-2001 Finland suffered the most severe insect outbreak ever recorded, affecting over 500,000 hectares. The outbreak was caused by the common pine sawfly (Diprion pini L.) and has continued in the study area, Palokangas, ever since. To find a good method for monitoring this type of outbreak, this study examined the efficacy of multi-temporal ERS-2 and ENVISAT SAR imagery for estimating Scots pine (Pinus sylvestris L.) defoliation. Three methods were tested: unsupervised k-means clustering, supervised linear discriminant analysis (LDA) and logistic regression. In addition, I assessed whether harvested areas could be differentiated from the defoliated forest using the same methods. Two different speckle filters were used to determine the effect of filtering on the SAR imagery and subsequent results. Logistic regression performed best, producing a classification accuracy of 81.6% (kappa 0.62) with two classes (no defoliation, >20% defoliation). With two classes, LDA accuracy was at best 77.7% (kappa 0.54) and k-means accuracy 72.8% (kappa 0.46). In general, the largest speckle filter, a 5 x 5 image window, performed best. When additional classes were added, the accuracy usually degraded step by step. The results were good, but because of the restrictions of the study they should be confirmed with independent data before they can be considered reliable. The restrictions include the small size of the field data set and, thus, the problems with accuracy assessment (no separate testing data), as well as the lack of meteorological data from the imaging dates.
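The kappa values quoted alongside the accuracies are presumably Cohen's kappa, which corrects the observed agreement for the agreement expected by chance. A minimal sketch of the standard formula, kappa = (p_o - p_e) / (1 - p_e), is given below. The confusion matrix is a hypothetical example, not the study's data.

```python
def cohens_kappa(confusion):
    """Cohen's kappa for a square confusion matrix (rows: reference, columns: predicted)."""
    size = len(confusion)
    total = sum(sum(row) for row in confusion)
    if total == 0:
        raise ValueError("confusion matrix contains no samples")
    # Observed agreement: share of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(size)) / total
    # Chance agreement: product of the marginal proportions, summed over classes.
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(size)) for j in range(size)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)


# Hypothetical two-class matrix (no defoliation vs. defoliation), for illustration only.
example = [
    [40, 10],
    [10, 40],
]
kappa = cohens_kappa(example)  # accuracy 0.80, chance agreement 0.50, kappa about 0.6
```

This shows why an accuracy of around 80% with two balanced classes corresponds to a kappa of roughly 0.6: half of the agreement would be expected by chance alone.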

Relevance: 30.00%

Abstract:

Background and purpose: Breast cancer continues to be a health problem for women, representing 28 percent of all female cancers and remaining one of the leading causes of death for women. Breast cancer incidence rates become substantial before the age of 50. After menopause, breast cancer incidence rates continue to increase with age, creating a long-lasting source of concern (Harris et al., 1992). Mammography, a technique for the detection of breast tumors in their nonpalpable stage when they are most curable, has taken on considerable importance as a public health measure. The lifetime risk of breast cancer is approximately 1 in 9 and is spread over many decades. Recommendations are that screening be periodic in order to detect cancer at early stages. These recommendations are largely not followed. Not only are most women not getting regular mammograms, but this is particularly the case among older women, for whom regular mammography has been proven to reduce mortality by approximately 30 percent. The purpose of this project was to increase our understanding of factors that are associated with stage of readiness to obtain subsequent mammograms. A secondary purpose of this research was to suggest further conceptual considerations toward the extension of the Transtheoretical Model (TTM) of behavior change to repeat screening mammography. Methods: A sample (n = 1,222) of women 50 years and older in a large multi-specialty clinic in Houston, Texas was surveyed by mail questionnaire regarding their previous screening experience and stage of readiness to obtain repeat screening. A computerized database, maintained on all women who undergo mammography at the clinic, was used to identify women who were eligible for the project. The major statistical technique employed to select the significant variables and to examine the main and interaction effects of independent variables on dependent variables was polychotomous stepwise logistic regression.
A prediction model for each stage of readiness definition was estimated. The expected probabilities for stage of readiness were calculated to assess the magnitude and direction of significant predictors. Results: Analysis showed that both ways of defining stage of readiness for obtaining a screening mammogram were associated with specific constructs, including decisional balance and processes of change. Conclusions: The results of the present study demonstrate that the TTM appears to translate to repeat mammography screening. Findings in the current study also support findings of previous studies suggesting that stage of readiness is associated with respondent decisional balance and the processes of change.

Relevance: 30.00%

Abstract:

Anticancer drugs typically are administered in the clinic in the form of mixtures, sometimes called combinations. Only in rare cases, however, are mixtures approved as drugs. Rather, research on mixtures tends to occur after single drugs have been approved. The goal of this research project was to develop modeling approaches that would encourage rational preclinical mixture design. To this end, a series of models were developed. First, several QSAR classification models were constructed to predict the cytotoxicity, oral clearance, and acute systemic toxicity of drugs. The QSAR models were applied to a set of over 115,000 natural compounds in order to identify promising ones for testing in mixtures. Second, an improved method was developed to assess synergistic, antagonistic, and additive effects between drugs in a mixture. This method, dubbed the MixLow method, is similar to the Median-Effect method, the de facto standard for assessing drug interactions. The primary difference between the two is that the MixLow method uses a nonlinear mixed-effects model to estimate parameters of concentration-effect curves, rather than an ordinary least squares procedure. Parameter estimators produced by the MixLow method were more precise than those produced by the Median-Effect Method, and coverage of Loewe index confidence intervals was superior. Third, a model was developed to predict drug interactions based on scores obtained from virtual docking experiments. This represents a novel approach for modeling drug mixtures and was more useful for the data modeled here than competing approaches. The model was applied to cytotoxicity data for 45 mixtures, each composed of up to 10 selected drugs. One drug, doxorubicin, was a standard chemotherapy agent and the others were well-known natural compounds including curcumin, EGCG, quercetin, and rhein. 
Predictions of synergism/antagonism were made for all possible fixed-ratio mixtures, cytotoxicities of the 10 best-scoring mixtures were tested, and drug interactions were assessed. Predicted and observed responses were highly correlated (r² = 0.83). Results suggested that some mixtures allowed up to an 11-fold reduction of doxorubicin concentrations without sacrificing efficacy. Taken together, the models developed in this project present a general approach to rational design of mixtures during preclinical drug development.
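The abstract refers to the Loewe index without defining it. In its common form, the index for a mixture is the sum over drugs of d_i / D_i, where d_i is the concentration of drug i in the mixture that produces a given effect and D_i is the concentration of drug i alone that produces the same effect. Values below 1 suggest synergism, values near 1 additivity, and values above 1 antagonism. The sketch below only evaluates that sum; it does not reproduce the MixLow estimation procedure, and the function names, tolerance and concentrations are hypothetical.

```python
def loewe_index(mixture_concs, single_agent_concs):
    """Sum of d_i / D_i over the drugs in a mixture, at a fixed effect level."""
    if len(mixture_concs) != len(single_agent_concs):
        raise ValueError("both sequences must list the same drugs")
    if any(conc <= 0 for conc in single_agent_concs):
        raise ValueError("single-agent concentrations must be positive")
    return sum(d / D for d, D in zip(mixture_concs, single_agent_concs))


def classify_interaction(index, tolerance=0.1):
    """Label the interaction; the tolerance band around 1 is an arbitrary assumption."""
    if index < 1 - tolerance:
        return "synergism"
    if index > 1 + tolerance:
        return "antagonism"
    return "additive"


# Hypothetical two-drug mixture: each drug is present at a quarter of the
# concentration it would need on its own to reach the same effect.
index = loewe_index([1.0, 2.0], [4.0, 8.0])
label = classify_interaction(index)
```

In practice the index is reported with a confidence interval, as the abstract notes, rather than compared against a fixed tolerance.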

Relevance: 30.00%

Abstract:

It is well accepted that tumorigenesis is a multi-step process involving aberrant functioning of genes regulating cell proliferation, differentiation, apoptosis, genome stability, angiogenesis and motility. To obtain a full understanding of tumorigenesis, it is necessary to collect information on all aspects of cell activity. Recent advances in high-throughput technologies allow biologists to generate massive amounts of data, more than might have been imagined decades ago. These advances have made it possible to launch comprehensive projects such as The Cancer Genome Atlas (TCGA) and the International Cancer Genome Consortium (ICGC), which systematically characterize the molecular fingerprints of cancer cells using gene expression, methylation, copy number, microRNA and SNP microarrays as well as next-generation sequencing assays interrogating somatic mutation, insertion, deletion, translocation and structural rearrangements. Given the massive amount of data, a major challenge is to integrate information from multiple sources and formulate testable hypotheses. This thesis focuses on developing methodologies for integrative analyses of genomic assays profiled on the same set of samples. We have developed several novel methods for integrative biomarker identification and cancer classification. We introduce a regression-based approach to identify biomarkers predictive of therapy response or survival by integrating multiple assays, including gene expression, methylation and copy number data, through penalized regression. To identify key cancer-specific genes accounting for multiple mechanisms of regulation, we have developed the integIRTy software, which provides robust and reliable inferences about gene alteration by automatically adjusting for sample heterogeneity as well as technical artifacts using Item Response Theory.
To cope with the increasing need for accurate cancer diagnosis and individualized therapy, we have developed a robust and powerful algorithm called SIBER to systematically identify bimodally expressed genes using next-generation RNAseq data. We have shown that prediction models built from these bimodal genes have the same accuracy as models built from all genes. Further, prediction models with dichotomized gene expression measurements based on their bimodal shapes still perform well. The effectiveness of outcome prediction using discretized signals paves the way for more accurate and interpretable cancer classification by integrating signals from multiple sources.