838 results for text and data mining


Relevance:

100.00% 100.00%

Publisher:

Abstract:

Integrated master's dissertation in Information Systems Engineering and Management

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Recently, there has been a growing interest in the field of metabolomics, reflected in a remarkable growth in experimental techniques, available data and related biological applications. Indeed, techniques such as Nuclear Magnetic Resonance, Gas or Liquid Chromatography, Mass Spectrometry, and Infrared and UV-visible spectroscopies have provided extensive datasets that can help in tasks such as biological and biomedical discovery, biotechnology and drug development. However, as happens with other omics data, the analysis of metabolomics datasets poses multiple challenges, both in terms of methodologies and in the development of appropriate computational tools. Indeed, none of the available software tools addresses the multiplicity of existing techniques and data analysis tasks. In this work, we make available a novel R package, named specmine, which provides a set of methods for metabolomics data analysis, including data loading in different formats, pre-processing, metabolite identification, univariate and multivariate data analysis, machine learning, and feature selection. Importantly, the implemented methods provide adequate support for the analysis of data from diverse experimental techniques, integrating a large set of functions from several R packages in a powerful yet simple-to-use environment. The package, already available on CRAN, is accompanied by a web site where users can deposit datasets, scripts and analysis reports to be shared with the community, promoting the efficient sharing of metabolomics data analysis pipelines.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Magdeburg, Univ., Faculty of Computer Science, Diss., 2013

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This volume contains publications of the 1st International Conference on Applied Innovations in IT (ICAIIT), which took place in Koethen on March 25th, 2013. The conference is devoted to problems of applied research in the fields of mechanical and economical engineering, automation and communications, as well as data mining. The research results can be of interest to researchers and development engineers who deal with the theoretical basis and the application of knowledge in the respective areas.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

We investigated the following aspects of the biology of a population of Cnemidophorus vacariensis Feltrim & Lema, 2000 during the four seasons: thermal biology, relationship with the thermal environment, daily and seasonal activity, population structure and growth rate. Cnemidophorus vacariensis is restricted to rocky outcrops of the "campos de cima da serra" grasslands on the Araucaria Plateau, southern Brazil, and is currently listed as regionally and nationally threatened with extinction. Data were collected from October 2004 through September 2007 in the state of Rio Grande do Sul. Sampling was conducted randomly from 8:00 a.m. to 6:00 p.m. The capture-mark-recapture method was employed. The lizards were captured by hand, and their cloacal temperature, sex, snout-vent length (SVL), mass, and the temperature of their microhabitat (substrate temperature and air temperature) were recorded. Individuals were then marked by toe-clipping and released at the site of capture. Body temperatures were obtained for 175 individuals, activity data for 96 individuals, and data on population structure and growth for 59 individuals. All data were obtained monthly, at different times of the day. The average body temperature of Cnemidophorus vacariensis was 23.84°C, ranging between 9.6 and 38.2°C. Temperatures ranged between 21 and 29°C. Correlations with the external heat sources, substrate and air, were positive and significant, and the correlation was greater between the lizards' temperature and the temperature of the substrate (thigmothermic species). The relatively low body temperatures of individuals are associated with the climate of their environment (altitude up to 1,400 m), with large variations in temperature throughout the day and the year, and low temperatures in winter. The average body temperature observed for C. vacariensis was low when compared with that of phylogenetically related species, suggesting that the thermal biology of this species reflects adaptations to the temperate region where it lives. The monthly rates of activity of lizards were related to monthly variations in the ambient temperatures. Our data suggest that the daily and seasonal activity of C. vacariensis results from the interaction between two factors: changes in the environmental temperature and the relationship between individuals and their thermal environment. The population structure of C. vacariensis varied throughout the study period, with maximum biomass in January and maximum density in February (recruitment period). The sex ratio diverged from the expected 1:1. The growth analysis showed a negative relationship between the growth rate of individuals and the SVL, revealing that young individuals grow faster than adults, a typical pattern for short-lived species. The population studied showed a seasonal and cyclical variation associated with the reproductive cycle. The life strategy of C. vacariensis seems to include adaptations to the seasonal variations in temperature typical of its environment.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The algorithmic approach to data modelling has developed rapidly in recent years; in particular, methods based on data mining and machine learning have been used in a growing number of applications. These methods follow a data-driven methodology, aiming at providing the best possible generalization and predictive abilities instead of concentrating on the properties of the data model. One of the most successful groups of such methods is known as Support Vector algorithms. Following the fruitful developments in applying Support Vector algorithms to spatial data, this paper introduces a new extension of the traditional support vector regression (SVR) algorithm. This extension allows for the simultaneous modelling of environmental data at several spatial scales. The joint influence of environmental processes presenting different patterns at different scales is here learned automatically from data, providing the optimum mixture of short- and large-scale models. The method is adaptive to the spatial scale of the data. With this advantage, it can provide efficient means to model local anomalies that may typically arise at an early phase of an environmental emergency. However, the proposed approach still requires some prior knowledge of the possible existence of such short-scale patterns. This is a possible limitation of the method for its implementation in early warning systems. The purpose of this paper is to present the multi-scale SVR model and to illustrate its use with an application to the mapping of Cs137 activity given the measurements taken in the region of Briansk following the Chernobyl accident.
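
As an illustration of the multi-scale idea in the abstract above, the sketch below builds a kernel as a weighted sum of Gaussian kernels, one per spatial scale. This is only one common way to combine scales and is an assumption here, not necessarily the formulation used in the paper. The function names, the two length scales and the fixed weights are all hypothetical; in the paper the mixture is described as learned from data, whereas this sketch fixes it by hand and does not train an SVR.

```python
import math

def gaussian_kernel(x, y, length_scale):
    """Gaussian (RBF) kernel between two points for one length scale."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * length_scale ** 2))

def multi_scale_kernel(x, y, length_scales, weights):
    """Weighted sum of Gaussian kernels, one per spatial scale (assumed form)."""
    return sum(w * gaussian_kernel(x, y, s)
               for s, w in zip(length_scales, weights))

# Hypothetical setup: a short-range and a long-range component, equal weights.
SCALES = (1.0, 50.0)
WEIGHTS = (0.5, 0.5)

# Similarity of the origin to a nearby point and to a distant point.
near = multi_scale_kernel((0.0, 0.0), (0.5, 0.0), SCALES, WEIGHTS)
far = multi_scale_kernel((0.0, 0.0), (20.0, 0.0), SCALES, WEIGHTS)
print(near, far)
```

With non-negative weights, a sum of valid kernels is itself a valid kernel, so a function of this shape could in principle be passed to any kernel-based regressor. At the distant point the short-range term has essentially vanished and only the long-range component contributes.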

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This report details the steps and processes carried out to build an application that enables the cross-referencing of genetic data from information held in remote databases. It presents an in-depth study of the content and structure of the remote NCBI and KEGG databases, documenting a data mining effort aimed at extracting from them the information needed to develop the genetic data cross-referencing application. Finally, it describes the programs, scripts and graphical environments that were implemented to build and subsequently deploy the application providing the cross-referencing functionality that is the subject of this final-year project.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

SUMMARY

Species distribution models (SDMs) are nowadays an essential tool in the research fields of ecology and conservation biology. By combining observations of species occurrence or abundance with information on the environmental characteristics of the observation sites, they can provide information on the ecology of species, predict their distributions across the landscape or extrapolate them to other spatial or time frames. The advent of SDMs, supported by geographic information systems (GIS), new developments in statistical models and constantly increasing computational capacities, has revolutionized the way ecologists can comprehend species distributions in their environment. SDMs have brought the tool that allows describing species' realized niches across a multivariate environmental space and predicting their spatial distribution. Predictions, in the form of probabilistic maps showing the potential distribution of the species, are an irreplaceable means to inform every single unit of a territory about its biodiversity potential. SDMs and the corresponding spatial predictions can be used to plan conservation actions for particular species, to design field surveys, to assess the risks related to the spread of invasive species, to select reserve locations and design reserve networks, and ultimately, to forecast distributional changes according to scenarios of climate and/or land use change.

By assessing the effect of several factors on model performance and on the accuracy of spatial predictions, this thesis aims at improving techniques and data available for distribution modelling and at providing the best possible information to conservation managers to support their decisions and action plans for the conservation of biodiversity in Switzerland and beyond. Several monitoring programs have been put in place from the national to the global scale, and different sources of data now exist and are becoming available to researchers who want to model species distribution. However, because of the lack of means, data are often not gathered at an appropriate resolution, are sampled only over limited areas, are not spatially explicit or do not provide sound biological information. A typical example of this is data on 'habitat' (sensu biota). Even though this is essential information for effective conservation planning, it often has to be approximated from land use, the closest available information. Moreover, data are often not sampled according to an established sampling design, which can lead to biased samples and consequently to spurious modelling results. Understanding the sources of variability linked to the different phases of the modelling process and their importance is crucial in order to evaluate the final distribution maps that are to be used for conservation purposes.

The research presented in this thesis was essentially conducted within the framework of the Landspot Project, a project supported by the Swiss National Science Foundation. The main goal of the project was to assess the possible contribution of pre-modelled 'habitat' units to model the distribution of animal species, in particular butterfly species, across Switzerland. While pursuing this goal, different aspects of data quality, sampling design and modelling process were addressed and improved, and implications for conservation discussed. The main 'habitat' units considered in this thesis are grassland and forest communities of natural and anthropogenic origin as defined in the typology of habitats for Switzerland. These communities are mainly defined at the phytosociological level of the alliance. For the time being, no comprehensive map of such communities is available at the national scale and at fine resolution. As a first step, it was therefore necessary to create distribution models and maps for these communities across Switzerland and thus to gather and collect the necessary data. In order to reach this first objective, several new developments were necessary, such as the definition of expert models, the classification of the Swiss territory in environmental domains, the design of an environmentally stratified sampling of the target vegetation units across Switzerland, the development of a database integrating a decision-support system assisting in the classification of the relevés, and the downscaling of the land use/cover data from 100 m to 25 m resolution.

The main contributions of this thesis to the discipline of species distribution modelling (SDM) are assembled in four main scientific papers. In the first, published in Journal of Biogeography, different issues related to the modelling process itself are investigated. First, the effect of five different stepwise selection methods on model performance, stability and parsimony is assessed, using data of the forest inventory of the State of Vaud. The same paper also assesses: the effect of weighting absences to ensure a prevalence of 0.5 prior to model calibration; the effect of limiting absences beyond the environmental envelope defined by presences; four different methods for incorporating spatial autocorrelation; and finally, the effect of integrating predictor interactions. The results made it possible to enhance the GRASP tool (Generalized Regression Analysis and Spatial Predictions), which now incorporates new selection methods and the possibility of dealing with interactions among predictors as well as spatial autocorrelation. The contribution of different sources of remotely sensed information to species distribution models was also assessed. The second paper (to be submitted) explores the combined effects of sample size and data post-stratification on the accuracy of models using data on grassland distribution across Switzerland collected within the framework of the Landspot project and supplemented with other important vegetation databases. For the stratification of the data, different spatial frameworks were compared. In particular, environmental stratification by Swiss Environmental Domains was compared to geographical stratification either by biogeographic regions or political states (cantons). The third paper (to be submitted) assesses the contribution of pre-modelled vegetation communities to the modelling of fauna. It is a two-step approach that combines the disciplines of community ecology and spatial ecology and integrates their corresponding concepts of habitat. Vegetation communities are modelled first per se, and these 'habitat' units are then used to model animal species habitat. A case study is presented with grassland communities and butterfly species. Different ways of integrating vegetation information in the models of butterfly distribution were also evaluated. Finally, a glimpse at climate change is given in the fourth paper, recently published in Ecological Modelling. This paper proposes a conceptual framework for analysing range shifts, namely a catalogue of the possible patterns of change in the distribution of a species along elevational or other environmental gradients and an improved quantitative methodology to identify and objectively describe these patterns. The methodology was developed using data from the Swiss national common breeding bird survey and the article presents results concerning the observed shifts in the elevational distribution of breeding birds in Switzerland.

The overall objective of this thesis is to improve species distribution models as potential inputs for different conservation tools (e.g. 
red lists, ecological networks, risk assessment of the spread of invasive species, vulnerability assessment in the context of climate change). While no conservation issues or tools are directly tested in this thesis, the importance of the proposed improvements made in species distribution modelling is discussed in the context of the selection of reserve networks.

RÉSUMÉ (translated from the French)

Species distribution models (SDMs) are today an essential tool in the research fields of ecology and conservation biology. By combining observations of species presence or abundance with information on the environmental characteristics of the observation sites, these models can provide information on the ecology of species, predict their distribution across the landscape or extrapolate it in space and time. The deployment of SDMs, supported by geographic information systems (GIS), new developments in statistical models and the constant increase in computing capacity, has revolutionized the way ecologists can understand the distribution of species in their environment. SDMs have provided the tool that makes it possible to describe the realized niche of species in a multivariate environmental space and to predict their spatial distribution. Predictions, in the form of probabilistic maps showing the potential distribution of the species, are an irreplaceable means of informing each unit of the territory about its potential biodiversity. SDMs and the corresponding spatial predictions can be used to plan conservation measures for particular species, to design sampling plans, to assess the risks related to the spread of invasive species, to choose the location of reserves and link them into networks, and finally, to forecast changes in distribution under scenarios of climate and/or land use change. By assessing the effect of several factors on model performance and on the accuracy of spatial predictions, this thesis aims to improve the techniques and data available for species distribution modelling and to provide the best possible information to managers to support their decisions and action plans for the conservation of biodiversity in Switzerland and beyond. Several monitoring programmes have been set up from the national to the global scale, and different data sources are now available to researchers who want to model species distributions. However, owing to a lack of means, data are often collected at an inappropriate resolution, are sampled over limited areas, are not spatially explicit or do not provide sufficient ecological information. A typical example is data on 'habitat' (sensu biota). Even though this is essential information for effective conservation measures, it is often approximated by land use, the closest available information. Moreover, data are often not sampled according to an established sampling design, which biases the samples and consequently the modelling results. Understanding the sources of variability linked to the different phases of the modelling process is crucial in order to evaluate the use of predicted distribution maps for conservation purposes.

The research presented in this thesis was essentially conducted within the framework of the Landspot project, a project supported by the Swiss National Science Foundation. The main objective of this project was to assess the contribution of pre-modelled 'habitat' units to modelling the distribution of animal species, in particular butterflies, across Switzerland. While pursuing this objective, different aspects relating to data quality, sampling design and the modelling process are addressed and improved, and their implications for species conservation discussed. The main 'habitats' considered in this thesis are grassland and forest communities of natural and anthropogenic origin as defined in the typology of habitats of Switzerland. These communities are mainly defined at the phytosociological level of the alliance. For the time being, no map of the distribution of these communities is available at the national scale and at fine resolution. As a first step, it was therefore necessary to create distribution models of these communities across Switzerland and consequently to collect the necessary data. To reach this first objective, several new developments were necessary, such as the definition of expert models, the classification of the Swiss territory into environmental domains, the design of an environmentally stratified sampling of the target vegetation units throughout Switzerland, the creation of a database integrating a decision-support system for the classification of relevés, and the downscaling of land cover data from 100 m to 25 m resolution.

The main contributions of this thesis to the discipline of species distribution modelling (SDM) are assembled in four scientific papers. In the first paper, published in the Journal of Biogeography, different questions related to the modelling process are studied using data from the forest inventory of the State of Vaud. First, the effects of five stepwise selection methods on model performance, stability and parsimony are assessed. The same paper also assesses: the effect of weighting absences to ensure a prevalence of 0.5 during model calibration; the effect of limiting absences beyond the envelope defined by presences; four different methods for incorporating spatial autocorrelation; and finally, the effect of integrating interactions between factors. The results presented in this paper made it possible to improve the GRASP tool, which now incorporates new selection methods and the possibility of handling interactions between explanatory variables as well as spatial autocorrelation. The contribution of different sources of remotely sensed data was also assessed. The second paper (to be submitted) explores the combined effects of sample size and post-stratification on model accuracy. The data used here concern the distribution of grasslands in Switzerland, collected within the framework of the Landspot project and supplemented by other sources. For the stratification of the data, different spatial frameworks were compared. In particular, environmental stratification by the environmental domains of Switzerland was compared with geographical stratification by biogeographic regions or by cantons. The third paper (to be submitted) assesses the contribution of pre-modelled plant communities to the modelling of fauna. It is a two-step approach that combines the disciplines of community ecology and spatial ecology by integrating their respective concepts of 'habitat'. Plant communities are modelled first, and these 'habitat' units are then used to model animal species. A case study is presented with grassland communities and butterfly species. Different ways of integrating vegetation information into the models of butterfly distribution are evaluated. Finally, a nod to climate change is given in the last paper, published in Ecological Modelling. This paper proposes a conceptual framework for analysing changes in species distributions, which includes in particular a catalogue of the different possible forms of change along an elevational or other environmental gradient, and an improved quantitative method to identify and describe these shifts. This methodology was developed using data from the monitoring of common breeding birds, and the paper presents results concerning the observed shifts in the elevational distribution of breeding birds in Switzerland.

The general objective of this thesis is to improve species distribution models as a possible source of information for different conservation tools (for example, red lists, ecological networks, risk assessment of the spread of invasive species, assessment of species vulnerability in the context of climate change). Although these conservation questions are not directly tested in this thesis, the importance of the proposed improvements to species distribution modelling is discussed at the end of this work in the context of the selection of reserve networks.
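
The summary above mentions weighting absences so that prevalence is 0.5 before model calibration. A minimal sketch of the arithmetic follows, assuming each presence keeps weight 1 and every absence receives the same weight; how the thesis or GRASP actually implements the weighting is not stated here, and the function names and counts are hypothetical.

```python
def absence_weight(n_presences, n_absences):
    """Weight per absence so that the weighted absences total the presences."""
    if n_presences <= 0 or n_absences <= 0:
        raise ValueError("need at least one presence and one absence")
    return n_presences / n_absences

def weighted_prevalence(n_presences, n_absences, w_absence):
    """Prevalence when each presence weighs 1 and each absence weighs w_absence."""
    weighted_absences = n_absences * w_absence
    return n_presences / (n_presences + weighted_absences)

# Hypothetical sample: 40 presences and 160 absences (raw prevalence 0.2).
w = absence_weight(40, 160)
prev = weighted_prevalence(40, 160, w)
print(w, prev)
```

With these counts each absence is down-weighted to a quarter of a presence, so both classes carry the same total weight during calibration.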

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This book combines geostatistics and global mapping systems to present an up-to-the-minute study of environmental data. Featuring numerous case studies, the reference covers model-dependent (geostatistics) and data-driven (machine learning algorithms) analysis techniques such as risk mapping, conditional stochastic simulations, descriptions of spatial uncertainty and variability, artificial neural networks (ANN) for spatial data, Bayesian maximum entropy (BME), and more.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Imaging mass spectrometry (IMS) is an emergent and innovative approach for measuring the composition, abundance and regioselectivity of molecules within an investigated area of fixed dimension. Although it provides unprecedented molecular information compared with conventional MS techniques, enhancement of protein signature by IMS is still necessary and challenging. This paper demonstrates the combination of conventional organic washes with an optimized aqueous-based buffer for tissue section preparation before matrix-assisted laser desorption/ionization (MALDI) IMS of proteins. Based on a 500 mM ammonium formate in water-acetonitrile (9:1; v/v, 0.1% trifluoroacetic acid, 0.1% Triton) solution, this buffer wash was shown to significantly enhance protein signature by profiling and IMS (~fourfold) when used after organic washes (70% EtOH followed by 90% EtOH), improving the quality and number of ion images obtained from mouse kidney and 14-day mouse fetus whole-body tissue sections, while maintaining a reproducibility similar to that of conventional tissue rinsing. Although some protein losses were observed, data mining demonstrated that they were primarily low-abundance signals and that the number of new peaks found is greater with the described procedure. The proposed buffer has thus proved highly efficient for tissue section preparation, providing novel and complementary information for direct on-tissue MALDI analysis compared with conventional organic rinsing alone.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

InterPro, an integrated documentation resource of protein families, domains and functional sites, was created in 1999 as a means of amalgamating the major protein signature databases into one comprehensive resource. PROSITE, Pfam, PRINTS, ProDom, SMART and TIGRFAMs have been manually integrated and curated and are available in InterPro for text- and sequence-based searching. The results are provided in a single format that rationalises the results that would be obtained by searching the member databases individually. The latest release of InterPro contains 5629 entries describing 4280 families, 1239 domains, 95 repeats and 15 post-translational modifications. Currently, the combined signatures in InterPro cover more than 74% of all proteins in SWISS-PROT and TrEMBL, an increase of nearly 15% since the inception of InterPro. New features of the database include improved searching capabilities and enhanced graphical user interfaces for visualisation of the data. The database is available via a webserver (http://www.ebi.ac.uk/interpro) and anonymous FTP (ftp://ftp.ebi.ac.uk/pub/databases/interpro).
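
The release figures quoted above can be cross-checked with a few lines of arithmetic. The snippet below is only a consistency check on the numbers in the abstract (5629 entries split into families, domains, repeats and post-translational modifications); the variable names are illustrative.

```python
# Entry counts by type, as quoted in the abstract.
entry_counts = {
    "families": 4280,
    "domains": 1239,
    "repeats": 95,
    "post-translational modifications": 15,
}

# The per-type counts should add up to the stated number of entries.
total = sum(entry_counts.values())

# Share of each entry type, in percent of all entries.
shares = {name: round(100 * n / total, 1) for name, n in entry_counts.items()}

print(total)
for name, pct in shares.items():
    print(f"{name}: {pct}%")
```

The four categories sum exactly to the 5629 entries stated, with families making up roughly three quarters of the release.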

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This work aims to implement a data mining project in the area of igneous petrology, a specialty within classical geology.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Consumer reviews, opinions and shared experiences in the use of a product are a powerful source of information about consumer preferences that can be used in recommender systems. Despite the importance and value of such information, there is no comprehensive mechanism that formalizes the opinion selection and retrieval process and the utilization of retrieved opinions, owing to the difficulty of extracting information from text data. In this paper, a new recommender system built on consumer product reviews is proposed. A prioritizing mechanism is developed for the system. The proposed approach is illustrated using the case study of a recommender system for digital cameras.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Data concerning the relationship between hepatitis B virus (HBV) genotypes and liver histology are scarce. The aim of this study was to compare HBV non-B and non-C genotypes according to demographic features, clinical status, HBV-DNA levels and liver histology in Rio de Janeiro. One hundred twenty-one consecutive chronic HBV-infected patients were enrolled during a two-year period and data were prospectively collected. Sera were tested for HBV genotyping using restriction fragment length polymorphism. Liver biopsy was obtained from patients with either increased alanine aminotransferase (ALT) or HBV-DNA levels. Genotype A was the most common, found in 82 (68%) patients, followed by F in 19 (15%), D in 17 (14%), B in one (1%) and C in two (2%). There was no association between HBV genotypes A, D and F and gender (p = 0.37), age (p = 0.78), race (p = 0.22), mode of infection (p = 0.94), HB "e" antigen status (p = 0.37) or HBV-DNA levels (p = 0.47). The ALT levels were lower in genotype D (75%) compared with A (47%) and F (55%) (p = 0.05). Liver biopsy showed lower inflammation [histological activity index (HAI) = 4] and fibrosis (F) (= 0) scores in genotype D than in genotypes A (HAI = 5, p < 0.001; F = 2, p = 0.008) or F (HAI = 5, p = 0.009; F = 2, p = 0.01). Genotype A was the most prevalent in chronic HBV-infected patients, and genotype D patients presented with less intense liver disease.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Human immunodeficiency virus (HIV)-positive patients have a greater prevalence of coinfection with human papillomavirus (HPV) of high oncogenic risk. Indeed, the presence of the virus favours intraepithelial squamous cell lesion progression and may induce cancer. The aim of this study was to evaluate the prevalence of HPV infection, distribution of HPV types and risk factors among HIV-positive patients. Cervical samples from 450 HIV-positive patients were analysed with regard to oncotic cytology, colposcopy and HPV presence and type by means of polymerase chain reaction and sequencing. The results were analysed by comparing demographic data and data relating to HPV and HIV infection. The prevalence of HPV was 47.5%. Among the HPV-positive samples, 59% included viral types of high oncogenic risk. Multivariate analysis showed an association between HPV infection and the presence of cytological alterations (p = 0.003), age greater than or equal to 35 years (p = 0.002), number of partners greater than three (p = 0.002), CD4+ lymphocyte count < 200/mm3 (p = 0.041) and alcohol abuse (p = 0.004). Although high-risk HPV was present in the majority of the lesions studied, the low frequency of HPV 16 (3.3%), the low occurrence of cervical lesions and the preserved immunological state in most of the HIV-positive patients were factors that may explain the low occurrence of precancerous cervical lesions in this population.