373 results for unsupervised


Relevance:

10.00% 10.00%

Publisher:

Abstract:

Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq)

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Graduate Program in Veterinary Medicine - FCAV

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Graduate Program in Computer Science - IBILCE

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Background: Large gene expression studies, such as those conducted using DNA arrays, often provide millions of different pieces of data. To address the problem of analyzing such data, we describe a statistical method, which we have called ‘gene shaving’. The method identifies subsets of genes with coherent expression patterns and large variation across conditions. Gene shaving differs from hierarchical clustering and other widely used methods for analyzing gene expression studies in that genes may belong to more than one cluster, and the clustering may be supervised by an outcome measure. The technique can be ‘unsupervised’, that is, the genes and samples are treated as unlabeled, or partially or fully supervised by using known properties of the genes or samples to assist in finding meaningful groupings. Results: We illustrate the use of the gene shaving method to analyze gene expression measurements made on samples from patients with diffuse large B-cell lymphoma. The method identifies a small cluster of genes whose expression is highly predictive of survival. Conclusions: The gene shaving method is a potentially useful tool for exploration of gene expression data and identification of interesting clusters of genes worth further investigation.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

The Restinga of Marambaia is an emerged sand bar located between Sepetiba Bay and the South Atlantic Ocean, on the south-east coast of Brazil. The objective of this study was to observe the geomorphologic evolution of the coastal zone of the Restinga of Marambaia using multitemporal satellite images acquired by multiple sensors from 1975 to 2004. The images were digitally segmented by a region-growing algorithm and submitted to an unsupervised classification procedure (ISOSEG), followed by a raster edit based on visual interpretation. The image time series showed a general trend of decrease in the total sand bar area, with values varying from 80.61 km² in 1975 to 78.15 km² in 2004. The total area calculated from the 1975 and 1978 Landsat MSS data was shown to be overestimated in relation to the Landsat TM, Landsat ETM+, and CBERS-2 CCD data. These differences can also be associated with the relatively poorer spatial resolution of the MSS data, nominally 79 m, against the 20 m of the CCD data and 30 m of the TM and ETM+ data. The estimated width of the central portion of the sand bar varied from 158 m (1975) to 100 m (2004). The formation of a spit in the northern region of the study area was visually observed. The area of the spit was estimated, with values varying from 0.82 km² (1975) to 0.55 km² (2004).

Relevance:

10.00% 10.00%

Publisher:

Abstract:

The attributes describing a data set may often be arranged in meaningful subsets, each of which corresponds to a different aspect of the data. An unsupervised algorithm (SCAD) that simultaneously performs fuzzy clustering and aspect weighting was proposed in the literature. However, SCAD may fail and halt under certain conditions. To fix this problem, its steps are modified and then reordered to reduce the number of parameters that must be set by the user. In this paper we prove that each step of the resulting algorithm, named ASCAD, globally minimizes its cost function with respect to the argument being optimized. The asymptotic analysis of ASCAD leads to a time complexity which is the same as that of fuzzy c-means. A hard version of the algorithm and a novel validity criterion that considers aspect weights in order to estimate the number of clusters are also described. The proposed method is assessed over several artificial and real data sets.
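
For orientation, the fuzzy c-means baseline that the abstract uses as its complexity reference can be sketched as below. This is only the standard textbook fuzzy c-means iteration, not SCAD or ASCAD: aspect weighting is omitted entirely, and the function names, the fuzzifier `m = 2`, and the fixed iteration count are illustrative assumptions rather than anything taken from the paper.

```python
import math


def fcm_memberships(points, centers, m=2.0):
    """Standard fuzzy c-means membership update; row i holds point i's memberships."""
    power = 2.0 / (m - 1.0)
    memberships = []
    for p in points:
        dists = [math.dist(p, c) for c in centers]
        if any(d == 0.0 for d in dists):
            # The point coincides with one or more centers: share membership among them.
            hits = [1.0 if d == 0.0 else 0.0 for d in dists]
            total = sum(hits)
            memberships.append([h / total for h in hits])
            continue
        memberships.append(
            [1.0 / sum((dk / dj) ** power for dj in dists) for dk in dists]
        )
    return memberships


def fcm_centers(points, memberships, n_clusters, m=2.0):
    """Center update: mean of the points weighted by membership**m."""
    dim = len(points[0])
    centers = []
    for k in range(n_clusters):
        weights = [row[k] ** m for row in memberships]
        total = sum(weights)
        centers.append(
            [sum(w * p[d] for w, p in zip(weights, points)) / total for d in range(dim)]
        )
    return centers


def fuzzy_c_means(points, init_centers, m=2.0, iterations=50):
    """Alternate the two updates for a fixed number of iterations (assumed stopping rule)."""
    centers = [list(c) for c in init_centers]
    for _ in range(iterations):
        u = fcm_memberships(points, centers, m)
        centers = fcm_centers(points, u, len(centers), m)
    return centers, fcm_memberships(points, centers, m)
```

Each iteration visits every point-cluster pair, so one pass costs on the order of points × clusters² × dimensions in this naive form; that per-iteration cost is the fuzzy c-means baseline against which the abstract compares ASCAD.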

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Thiosemicarbazones are cruzain inhibitors which have been identified as potential antitrypanosomal agents. In this work, several molecular properties were calculated at the density functional theory (DFT)/B3LYP/6-311G* level for a set of 44 thiosemicarbazones. Unsupervised and supervised pattern recognition techniques (hierarchical cluster analysis, principal component analysis, kth-nearest neighbors, and soft independent modeling by class analogy) were used to obtain structure-activity relationship models, which are able to classify unknown compounds according to their activities. The chemometric analyses performed here revealed that 12 descriptors can be considered responsible for the discrimination between high and low activity compounds. Classification models were validated with an external test set, showing that predictive classifications were achieved with the selected variable set. The results obtained here are in good agreement with previous findings from the literature, suggesting that our models can be useful in further investigations of the molecular determinants of antichagasic activity. (C) 2012 Wiley Periodicals, Inc.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Semi-supervised learning techniques have gained increasing attention in the machine learning community as a result of two main factors: (1) the available data is increasing exponentially; (2) the task of data labeling is cumbersome and expensive, involving human experts in the process. In this paper, we propose a network-based semi-supervised learning method inspired by the modularity greedy algorithm, which was originally applied to unsupervised learning. Changes have been made to the process of modularity maximization so as to adapt the model to propagate labels throughout the network. Furthermore, a network reduction technique is introduced, as well as an extensive analysis of its impact on the network. Computer simulations are performed for artificial and real-world databases, providing a quantitative basis for the performance of the proposed method.
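
As background for the modularity maximization mentioned above, the quantity being maximized can be sketched as follows. This computes only the standard Newman modularity Q of a given partition of an undirected, unweighted graph; it does not implement the greedy merging procedure, the label propagation, or the network reduction proposed in the paper, and the function name and input layout are assumptions made for illustration.

```python
from collections import defaultdict


def modularity(edges, community):
    """Newman modularity Q for an undirected, unweighted graph partition.

    edges: iterable of (u, v) pairs, one per undirected edge.
    community: dict mapping each node to its community label.
    """
    m = 0                              # total number of edges
    degree = defaultdict(int)          # node degree
    internal = defaultdict(int)        # edges with both ends in the same community
    for u, v in edges:
        m += 1
        degree[u] += 1
        degree[v] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1

    total_degree = defaultdict(int)    # sum of degrees per community
    for node, k in degree.items():
        total_degree[community[node]] += k

    # Q = sum over communities of (internal edge fraction - expected fraction).
    return sum(
        internal[c] / m - (total_degree[c] / (2 * m)) ** 2 for c in total_degree
    )
```

For two triangles joined by a single edge, placing each triangle in its own community gives Q = 5/14, while placing all six nodes in one community gives Q = 0; a greedy scheme repeatedly applies the merge that increases Q the most.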

Relevance:

10.00% 10.00%

Publisher:

Abstract:

In protein databases there is a substantial number of proteins whose structure has been determined but which lack function annotation. Understanding the relationship between function and structure can be useful to predict function on a large scale. We have analyzed the similarities in global physicochemical parameters for a set of enzymes which were classified according to the four Enzyme Commission (EC) hierarchical levels. Using relevance theory we introduced a distance between proteins in the space of physicochemical characteristics. This was done by minimizing a cost function of the metric tensor built to reflect the EC classification system. Using an unsupervised clustering method on a set of 1025 enzymes, we obtained no relevant clustering formation compatible with EC classification. The distance distributions between enzymes from the same EC group and from different EC groups were compared by histograms. Such analysis was also performed using sequence alignment similarity as a distance. Our results suggest that global structure parameters are not sufficient to segregate enzymes according to EC hierarchy. This indicates that features essential for function are local rather than global. Consequently, methods for predicting function based on global attributes should not obtain high accuracy in predicting the main EC classes without relying on similarities between enzymes from training and validation datasets. Furthermore, these results are consistent with a substantial number of studies suggesting that function evolves fundamentally by recruitment, i.e., the same protein motif or fold can be used to perform different enzymatic functions and a few specific amino acids (AAs) are actually responsible for enzyme activity. These essential amino acids should belong to active sites, and an effective method for predicting function should be able to recognize them. (C) 2012 Elsevier Ltd. All rights reserved.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Background Falling in older age is a major public health concern due to its costly and disabling consequences. However, very few randomised controlled trials (RCTs) have been conducted in developing countries, in which population ageing is expected to be particularly substantial in coming years. This article describes the design of an RCT to evaluate the effectiveness of a multifactorial falls prevention program in reducing the rate of falls in community-dwelling older people. Methods/design Multicentre parallel-group RCT involving 612 community-dwelling men and women aged 60 years and over who have fallen at least once in the previous year. Participants will be recruited in multiple settings in Sao Paulo, Brazil and will be randomly allocated to a control group or an intervention group. The usual care control group will undergo a fall risk factor assessment and be referred to their clinicians with the risk assessment report so that individual modifiable risk factors can be managed without any specific guidance. The intervention group will receive a 12-week Multifactorial Falls Prevention Program consisting of: individualised medical management of modifiable risk factors; a group-based, supervised balance training exercise program plus an unsupervised home-based exercise program; and an educational/behavioral intervention. Both groups will receive a leaflet containing general information about fall prevention strategies. Primary outcome measures will be the rate of falls and the proportion of fallers recorded by monthly falls diaries and telephone calls over a 12-month period. Secondary outcome measures will include risk of falling, fall-related self-efficacy score, measures of balance, mobility and strength, fall-related health services use and independence with daily tasks.
Data will be analysed using the intention-to-treat principle. The incidence of falls in the intervention and control groups will be calculated and compared using negative binomial regression analysis. Discussion This study is the first trial to be conducted in Brazil to evaluate the effectiveness of an intervention to prevent falls. If the program is proven to reduce falls, this study has the potential to benefit older adults and assist health care practitioners and policy makers to implement and promote effective falls prevention interventions. Trial registration ClinicalTrials.gov (NCT01698580)

Relevance:

10.00% 10.00%

Publisher:

Abstract:

[ES] Spam, or unsolicited bulk email, is a threat that affects email and other electronic communication media. Its high volume of circulation causes considerable losses of time and money. A solution to this problem is presented: a hybrid intelligent anti-spam filtering system based on unsupervised artificial neural networks (ANNs). It consists of a preprocessing stage and a processing stage, based on different computing models: programmed (with two phases: manual and computational) and neural (using Kohonen self-organizing maps, SOM), respectively. The system was optimized using, as its data corpus, ham from “Enron Email” and spam from two different sources. Its quality and performance are analyzed using several metrics.
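
To make the neural (SOM) stage more concrete, a single Kohonen training step can be sketched as below. This is a generic textbook update, not the filtering system described in the abstract: the 1-D chain of units, the Gaussian neighbourhood, the learning rate, and the function names are all illustrative assumptions, and the preprocessing that would turn an email into a numeric vector is not shown.

```python
import math


def best_matching_unit(weights, x):
    """Index of the unit whose weight vector is closest to the input x."""
    return min(range(len(weights)), key=lambda i: math.dist(weights[i], x))


def som_step(weights, x, learning_rate=0.5, radius=1.0):
    """One Kohonen update on a 1-D chain of units.

    weights[i] is the weight vector of unit i; the unit index doubles as its
    position on the map. Returns the winning index and the updated weights.
    """
    bmu = best_matching_unit(weights, x)
    updated = []
    for i, w in enumerate(weights):
        # Gaussian neighbourhood: units near the winner on the map move more.
        h = math.exp(-((i - bmu) ** 2) / (2.0 * radius ** 2))
        updated.append(
            [wj + learning_rate * h * (xj - wj) for wj, xj in zip(w, x)]
        )
    return bmu, updated
```

Repeating this step over many inputs while shrinking `learning_rate` and `radius` is the usual way such a map organizes itself without labels; the labelled ham and spam corpora would then serve to evaluate the resulting map.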

Relevance:

10.00% 10.00%

Publisher:

Abstract:

[ES] The aim of the Kinship Verification in the Wild Evaluation (held in conjunction with the 2015 IEEE International Conference on Automatic Face and Gesture Recognition, Ljubljana, Slovenia) was to evaluate different kinship verification algorithms. For this task, two datasets were made available and three possible experimental protocols (unsupervised, image-restricted, and image-unrestricted) were designed. Five institutions submitted their results to the evaluation: (i) Politecnico di Torino, Italy; (ii) LIRIS-University of Lyon, France; (iii) Universidad de Las Palmas de Gran Canaria, Spain; (iv) Nanjing University of Aeronautics and Astronautics, China; and (v) Bar Ilan University, Israel. Most of the participants tackled the image-restricted challenge, and the experimental results demonstrated better kinship verification performance than the baseline methods provided by the organizers.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

The purpose of this Thesis is to develop a robust and powerful method to classify galaxies from large surveys, in order to establish and confirm the connections between the principal observational parameters of the galaxies (spectral features, colours, morphological indices), and to help unveil the evolution of these parameters from z ~ 1 to the local Universe. Within the framework of the zCOSMOS-bright survey, and making use of its large database of objects (~10,000 galaxies in the redshift range 0 < z ≲ 1.2) and its great reliability in redshift and spectral property determinations, we first adopt and extend the classification cube method, developed by Mignoli et al. (2009), which exploits the bimodal properties of galaxies (spectral, photometric and morphological) separately and then combines the three subclassifications. We use this classification method as a test for a newly devised statistical classification, based on Principal Component Analysis and the Unsupervised Fuzzy Partition clustering method (PCA+UFP), which is able to define the galaxy population by exploiting its natural global bimodality, considering up to 8 different properties simultaneously. The PCA+UFP analysis is a very powerful and robust tool to probe the nature and the evolution of galaxies in a survey. It allows the classification of galaxies to be defined with less uncertainty, and it is flexible enough to be adapted to different parameters: being a fuzzy classification, it avoids the problems due to a hard classification, such as the classification cube presented in the first part of the article. The PCA+UFP method can be easily applied to different datasets: it does not rely on the nature of the data, and for this reason it can be successfully employed with other observables (magnitudes, colours) or derived properties (masses, luminosities, SFRs, etc.). The agreement between the two classification cluster definitions is very high.
"Early" and "late" type galaxies are well defined by the spectral, photometric and morphological properties, both when these are considered separately and then combined (classification cube) and when they are treated as a whole (PCA+UFP cluster analysis). Differences arise in the definition of outliers: the classification cube is much more sensitive to single measurement errors or misclassifications in one property than the PCA+UFP cluster analysis, in which errors are "averaged out" during the process. This method allowed us to observe the downsizing effect taking place in the PC spaces: the migration from the blue cloud towards the red clump happens at higher redshifts for galaxies of larger mass. The determination of the transition mass M_cross is in significant agreement with other values in the literature.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

3D video-fluoroscopy is an accurate but cumbersome technique to estimate natural or prosthetic human joint kinematics. This dissertation proposes innovative methodologies to improve the reliability and usability of 3D fluoroscopic analysis. Being based on direct radiographic imaging of the joint, and avoiding the soft tissue artefact that limits the accuracy of skin marker based techniques, fluoroscopic analysis has a potential accuracy of the order of mm/deg or better. It can provide fundamental information for clinical and methodological applications but, notwithstanding the number of methodological protocols proposed in the literature, time-consuming user interaction is still required to obtain consistent results. This user dependency has prevented a reliable quantification of the actual accuracy and precision of the methods and, consequently, slowed down the translation to clinical practice. The objective of the present work was to speed up this process by introducing methodological improvements in the analysis. In the thesis, the fluoroscopic analysis was characterized in depth, in order to evaluate its pros and cons and to provide reliable solutions to overcome its limitations. To this aim, an analytical approach was followed. The major sources of error were isolated with preliminary in-silico studies as: (a) geometric distortion and calibration errors, (b) the resolution of 2D images and 3D models, (c) incorrect contour extraction, (d) bone model symmetries, (e) optimization algorithm limitations, (f) user errors. The effect of each source of error was quantified and verified with a preliminary in-vivo study on the elbow joint. The dominant source of error was identified as the limited extent of the convergence domain of the local optimization algorithms, which forced the user to manually specify the starting pose for the estimation process.
To solve this problem, two different approaches were followed. To increase the convergence basin of the optimal pose, the local approach used sequential alignments of the 6 degrees of freedom in order of sensitivity, or a geometrical feature-based estimation of the initial conditions for the optimization; the global approach used an unsupervised memetic algorithm to optimally explore the search domain. The performance of the technique was evaluated with a series of in-silico studies and validated in-vitro with a phantom-based comparison with a radiostereometric gold standard. The accuracy of the method is joint-dependent. For the intact knee joint, the new unsupervised algorithm guaranteed a maximum error lower than 0.5 mm for in-plane translations, 10 mm for out-of-plane translation, and 3 deg for rotations in a mono-planar setup, and lower than 0.5 mm for translations and 1 deg for rotations in a bi-planar setup. The bi-planar setup is best suited when accurate results are needed, such as for methodological research studies. The mono-planar analysis may be enough for clinical applications when analysis time and cost are an issue. A further reduction of the user interaction was obtained for prosthetic joint kinematics. A mixed region-growing and level-set segmentation method was proposed; it halved the analysis time, delegating the computational burden to the machine. In-silico and in-vivo studies demonstrated that the reliability of the new semiautomatic method was comparable to a user-defined manual gold standard. The improved fluoroscopic analysis was finally applied to a first in-vivo methodological study on foot kinematics. Preliminary evaluations showed that the presented methodology represents a feasible gold standard for the validation of skin marker based foot kinematics protocols.
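
The general idea behind a memetic algorithm, a population-based global search in which every candidate is refined by a local search, can be sketched as below. This is a toy illustration under stated assumptions, not the dissertation's pose-estimation method: the cost function is supplied by the caller and stands in for the image-matching cost, and the coordinate-wise local search, averaging crossover, Gaussian mutation, population size, and all function names are hypothetical choices.

```python
import random


def local_search(x, cost, step=0.1, iters=20):
    """Coordinate-wise descent; the step is halved whenever no move improves."""
    best, best_cost = list(x), cost(x)
    for _ in range(iters):
        improved = False
        for i in range(len(best)):
            for delta in (step, -step):
                cand = list(best)
                cand[i] += delta
                c = cost(cand)
                if c < best_cost:
                    best, best_cost, improved = cand, c, True
        if not improved:
            step /= 2.0
    return best, best_cost


def memetic_minimize(cost, bounds, pop_size=20, generations=30, seed=0):
    """Evolutionary loop in which every individual is refined by local search."""
    rng = random.Random(seed)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    for _ in range(generations):
        # Refine all candidates locally, then keep the better half as parents.
        refined = sorted((local_search(ind, cost) for ind in pop), key=lambda t: t[1])
        elite = [ind for ind, _ in refined[: pop_size // 2]]
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            # Averaging crossover plus small Gaussian mutation (assumed operators).
            children.append([(ai + bi) / 2.0 + rng.gauss(0.0, 0.1) for ai, bi in zip(a, b)])
        pop = elite + children
    return min((local_search(ind, cost) for ind in pop), key=lambda t: t[1])
```

Because the population is seeded across the whole search domain, no user-specified starting pose is needed, which is the property the abstract attributes to the global approach; in a real 6-degree-of-freedom setting `bounds` would have six entries.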