966 results for Probabilistic Model
Abstract:
Access to mobile telephony in Colombia shows certain particularities compared with other countries. During the first years of this decade, a new communication alternative emerged in Colombia: the sale of mobile phone minutes on the streets and in small businesses. This paper analyzes the main characteristics of those who use this mode of communication, based on a survey of low-income users and non-users. A probabilistic model is used to explain the characteristics of the people who use it, and it finds that people on prepaid plans and those living in small cities are more likely to use this alternative. In addition, those who are with the dominant operator also tend to use this service more markedly. These results seem to indicate that the price differentials between off-net and on-net calls, as well as between prepaid and postpaid plans, are what fueled the emergence of this activity.
Body mass index as a standard of living measure: a different interpretation for the case of Colombia
Abstract:
We analyze the Body Mass Index (BMI) in a way distinct from its traditional use, which lets us use it as a proxy for standard of living in the case of Colombia. Our approach focuses on how far people are from the normal range rather than on each individual's score, which lets us treat extreme cases such as severe thinness and obesity equally. We use a probabilistic model (Ordered Probit) that evaluates the probability of being within the normal range or at another level. We find that socioeconomic variables have a significant effect on the dependent variable and that the effects are not linear. In addition, people with difficulty walking and adults are less likely to have a normal BMI.
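As a rough illustration of how an ordered probit turns a linear index into category probabilities, the following minimal sketch may help. The cutpoints, the value of the linear index, and the three-level outcome are hypothetical and are not taken from the study.

```python
import math

def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def ordered_probit_probs(linear_index, cutpoints):
    """Return P(y = k) for each ordered category k.

    P(y = k) = Phi(c_k - x'b) - Phi(c_{k-1} - x'b), with c_0 = -inf
    and c_K = +inf. `cutpoints` must be sorted in increasing order.
    """
    bounds = [-math.inf] + list(cutpoints) + [math.inf]
    cdf = [norm_cdf(b - linear_index) for b in bounds]
    return [cdf[k + 1] - cdf[k] for k in range(len(bounds) - 1)]

# Hypothetical example: three ordered levels of distance from the normal range.
probs = ordered_probit_probs(linear_index=0.3, cutpoints=[0.0, 1.0])
for level, p in zip(["normal range", "moderately outside", "far outside"], probs):
    print(f"{level}: {p:.3f}")
```

The category probabilities always sum to one, and shifting the linear index moves probability mass from the lower categories to the higher ones.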
Abstract:
Early detection of breast cancer (BC) with mammography may cause overdiagnosis and overtreatment, detecting tumors that would otherwise remain undiagnosed during a lifetime. The aims of this study were: first, to model invasive BC incidence trends in Catalonia (Spain) taking into account reproductive and screening data; and second, to quantify the extent of BC overdiagnosis. We modeled the incidence of invasive BC using a Poisson regression model. Explanatory variables were: age at diagnosis and cohort characteristics (completed fertility rate, percentage of women who use mammography at age 50, and year of birth). This model was also used to estimate the background incidence in the absence of screening. We used a probabilistic model to estimate the expected BC incidence if women in the population used mammography as reported in health surveys. The difference between the observed and expected cumulative incidences provided an estimate of overdiagnosis. Incidence of invasive BC increased, especially in cohorts born from 1940 to 1955. The biggest increase was observed in these cohorts between the ages of 50 and 65 years, where the final BC incidence rates more than doubled the initial ones. Dissemination of mammography was significantly associated with BC incidence and overdiagnosis. Our estimates of overdiagnosis ranged from 0.4% to 46.6%, for women born around 1935 and 1950, respectively. Our results support the existence of overdiagnosis in Catalonia attributed to mammography usage, and the limited malignant potential of some tumors may play an important role. Women should be better informed about this risk. Research should be oriented towards personalized screening and risk assessment tools.
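The comparison of observed and expected cumulative incidence can be sketched as follows. This is only an illustration: it assumes overdiagnosis is expressed as the excess of observed over expected cumulative incidence relative to the expected value, and the function names, the five-year age groups, and the rates are hypothetical, not the study's data.

```python
def cumulative_incidence(rates_per_100k, years_per_group=5):
    """Sum age-specific annual rates (per 100,000) over the age groups."""
    return sum(rate * years_per_group for rate in rates_per_100k) / 100_000

def overdiagnosis_percent(observed_rates, expected_rates, years_per_group=5):
    """Relative excess of observed over expected cumulative incidence, in %.

    Assumption: the expected rates come from a model of incidence
    in the absence of the screening effect being quantified.
    """
    observed = cumulative_incidence(observed_rates, years_per_group)
    expected = cumulative_incidence(expected_rates, years_per_group)
    return 100.0 * (observed - expected) / expected

# Made-up annual rates per 100,000 for three consecutive age groups.
observed = [220.0, 260.0, 280.0]
expected = [200.0, 230.0, 250.0]
print(f"{overdiagnosis_percent(observed, expected):.1f}% excess incidence")
```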
Abstract:
We present a well-dated, high-resolution, ~45 kyr lake sediment record reflecting regional temperature and precipitation change in the continental interior of the Southern Hemisphere (SH) tropics of South America. The study site is Laguna La Gaiba (LLG), a large lake (95 km²) hydrologically linked to the Pantanal, an immense, seasonally flooded basin and the world's largest tropical wetland (135,000 km²). Lake-level changes at LLG are therefore reflective of regional precipitation. We infer past fluctuations in precipitation at this site through changes in: i) pollen-inferred extent of flood-tolerant forest; ii) relative abundance of terra firme humid tropical forest versus seasonally dry tropical forest pollen types; and iii) proportions of deep- versus shallow-water diatoms. A probabilistic model, based on plant family and genus climatic optima, was used to generate quantitative estimates of past temperature from the fossil pollen data. Our temperature reconstruction demonstrates rising temperature (by 4 °C) at 19.5 kyr BP, synchronous with the onset of deglacial warming in the central Andes, strengthening the evidence that climatic warming in the SH tropics preceded deglacial warming in the Northern Hemisphere (NH) by at least 5 kyr. We provide unequivocal evidence that the climate at LLG was markedly drier during the last glacial period (45.0–12.2 kyr BP) than during the Holocene, contrasting with SH tropical Andean and Atlantic records that demonstrate a strengthening of the South American summer monsoon during the global Last Glacial Maximum (~21 kyr BP), in tune with the ~20 kyr precession orbital cycle. Holocene climate conditions occurred as early as 12.8–12.2 kyr BP, when increased precipitation in the Pantanal catchment caused heightened flooding and rising lake levels in LLG.
In contrast to this strong geographic variation in LGM precipitation across the continent, expansion of tropical dry forest between 10 and 3 kyr BP at LLG strengthens the body of evidence for widespread early–mid Holocene drought across tropical South America.
Abstract:
The present study investigates the parsing of pre-nominal relative clauses (RCs) in children for the first time with a real-time methodology that reveals moment-to-moment processing patterns as the sentence unfolds. A self-paced listening experiment with Turkish-speaking children (aged 5–8) and adults showed that both groups display signs of processing cost in both subject and object RCs at different points in the utterance when integrating cues that are uninformative (i.e., ambiguous in function) and that are structurally and probabilistically unexpected. Both groups show processing facilitation as soon as the morphosyntactic dependencies are completed, and they parse the unbounded dependencies rapidly using the morphosyntactic cues rather than waiting for the clause-final filler. These findings show that five-year-old children pattern with adults in processing the morphosyntactic cues incrementally and in forming expectations about the rest of the utterance on the basis of the probabilistic model of their language.
Abstract:
The aim of this work is to test the application of a probabilistic graphical model, generically known as Bayesian networks, to develop computational models that can help in understanding problems and/or in forecasting economic variables. To this end, a problem widely addressed in the literature was chosen, and the established theoretical and experimental results were compared with those obtained using the proposed technique. Accordingly, a model was built to classify the trend of "country risk" for Brazil from a database of macroeconomic and financial variables. The EMBI+ (Emerging Markets Bond Index Plus) was adopted as the risk measure because it is an indicator widely used by the market.
Abstract:
Black Sigatoka (Mycosphaerella fijiensis) threatens commercial banana plantations in all producing areas of the world and causes quantitative and qualitative damage to production, leading to serious financial losses. It is necessary to study the vulnerability of plants at different stages of development and the climatic conditions favorable to the occurrence of the disease. The objective of this work was to develop a probabilistic model based on polynomial functions that represents the risk of occurrence of black Sigatoka as a function of the vulnerability arising from factors intrinsic to the plant and to the environment. A case study was carried out in a commercial banana plantation in Jacupiranga, Vale do Ribeira, SP, considering weekly monitoring of the state of disease progression, time series of meteorological data, and remote sensing data. Georeferenced maps of black Sigatoka risk were generated for different times of the year. A model for estimating disease progression from satellite images was obtained with a coefficient of determination R² of 0.9. The methodology was developed to detect times and places that combine conditions favorable to the occurrence of black Sigatoka and can be applied, with appropriate adjustments, in different localities to assess the risk of the disease in banana-producing regions.
Abstract:
Female broiler breeder productivity is based on the principles of thermal comfort, which are directly related to the microclimate inside the housing. This research aimed to monitor the behavior of female broiler breeders, using radio-frequency technology with injectable transponders and readers, in the different microclimates existing inside a small-scale distorted housing model. Eight birds with electronic identification were used. Three readers were placed at three different points inside the model: on the floor of the nest, in the passage beside the lateral wall, and below the water facility. Dry bulb (DBT), wet bulb (WBT) and black globe (BGT) temperatures were measured continuously. The results indicate a distinct behavioral pattern of the birds regarding environmental exposure during the experiment. Three probabilistic models of behavior were developed from the recorded data: a probabilistic model for passage use, FP = 1.10 - 0.244 ln(DBT); a probabilistic model for water facility use, FB = 0.398 + 0.00866 DBT; and a probabilistic model for nest use, FN = 2.22 - 0.272 DBT + 0.011 DBT² - 0.000144 DBT³.
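The three fitted equations reported in the abstract can be evaluated directly, as in this minimal sketch. The function names and the example temperatures are illustrative choices; the abstract does not state the units of DBT or the temperature range over which the fits are valid, so the outputs should be read with that caveat.

```python
import math

def passage_use(dbt):
    """FP = 1.10 - 0.244 ln(DBT), as reported in the abstract."""
    return 1.10 - 0.244 * math.log(dbt)

def water_use(dbt):
    """FB = 0.398 + 0.00866 DBT, as reported in the abstract."""
    return 0.398 + 0.00866 * dbt

def nest_use(dbt):
    """FN = 2.22 - 0.272 DBT + 0.011 DBT^2 - 0.000144 DBT^3."""
    return 2.22 - 0.272 * dbt + 0.011 * dbt**2 - 0.000144 * dbt**3

# Illustrative dry bulb temperatures (assumed values, not from the study).
for dbt in (22.0, 25.0, 28.0):
    print(f"DBT={dbt}: FP={passage_use(dbt):.3f}, "
          f"FB={water_use(dbt):.3f}, FN={nest_use(dbt):.3f}")
```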
Abstract:
Graduate Program in Cartographic Sciences - FCT
Abstract:
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)
Abstract:
Next-generation sequencers such as the Illumina and SOLiD platforms generate large amounts of data, commonly more than 10 gigabytes of text files. In particular, the SOLiD platform allows multiple samples to be sequenced in a single run (called a multiplex run) by means of a tagging system called Barcode. This feature requires a computational process to separate the data by sample, because the sequencer delivers the mixture of all samples in a single output. This process must be reliable in order to avoid mix-ups that could compromise subsequent analyses. In this context, the present work proposes the development of a probabilistic model capable of characterizing the tagging system used in multiplex sequencing. The results obtained corroborated the adequacy of the resulting model, which makes it possible, among other things, to identify failures in some step of the sequencing process; to adapt and develop new sample preparation protocols; to assign a Degree of Confidence to the generated data; and to guide a filtering process that respects the characteristics of each sequencing run, without arbitrarily discarding useful sequences.
Abstract:
Graduate Program in Agronomy (Agricultural Entomology) - FCAV
Abstract:
Background: One goal of gene expression profiling is to identify signature genes that robustly distinguish different types or grades of tumors. Several tumor classifiers based on expression profiling have been proposed using microarray techniques. Due to important differences in the probabilistic models of microarray and SAGE technologies, it is important to develop suitable techniques to select specific genes from SAGE measurements. Results: A new framework to select specific genes that distinguish different biological states based on the analysis of SAGE data is proposed. The new framework applies the bolstered error for the identification of strong genes that separate the biological states in a feature space defined by the gene expression of a training set. Credibility intervals defined from a probabilistic model of SAGE measurements are used to identify the genes that distinguish the different states with more reliability among all gene groups selected by the strong genes method. A score that takes into account the credibility and the bolstered error values in order to rank the groups of considered genes is proposed. Results obtained using SAGE data from gliomas are presented, corroborating the introduced methodology. Conclusion: The model representing counting data, such as SAGE, provides additional statistical information that allows a more robust analysis. The additional statistical information provided by the probabilistic model is incorporated in the methodology described in the paper. The introduced method is suitable for identifying signature genes that lead to a good separation of the biological states using SAGE and may be adapted for other counting methods such as Massive Parallel Signature Sequencing (MPSS) or the recent Sequencing-By-Synthesis (SBS) technique. Some of the genes identified by the proposed method may be useful for generating classifiers.
Abstract:
Background: A large number of probabilistic models used in sequence analysis assign non-zero probability values to most input sequences. The most common way to decide when a given probability is sufficient is Bayesian binary classification, where the probability of the model characterizing the sequence family of interest is compared to that of an alternative probability model. A null model can be used as the alternative model. This is the scoring technique used by sequence analysis tools such as HMMER, SAM and INFERNAL. The most prevalent null models are position-independent residue distributions, which include the uniform distribution, the genomic distribution, the family-specific distribution and the target sequence distribution. This paper presents a study to evaluate the impact of the choice of a null model on the final result of classifications. In particular, we are interested in minimizing the number of false predictions in a classification. This is a crucial issue for reducing the costs of biological validation. Results: For all the tests, the target null model presented the lowest number of false positives when using random sequences as a test. The study was performed on DNA sequences using GC content as the measure of content bias, but the results should also be valid for protein sequences. To broaden the application of the results, the study was performed using randomly generated sequences. Previous studies were performed on amino acid sequences, using only one probabilistic model (HMM) and a specific benchmark, and lack more general conclusions about the performance of null models. Finally, a benchmark test with P. falciparum confirmed these results. Conclusions: Of the evaluated models, the best suited for classification are the uniform model and the target model. However, the use of the uniform model presents a GC bias that can cause more false positives for candidate sequences with extreme compositional bias, a characteristic not described in previous studies.
In these cases the target model is more dependable for biological validation due to its higher specificity.
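The effect of the null model on a log-odds score can be illustrated with a small sketch. It is not the scoring code of HMMER, SAM or INFERNAL: the family model here is a hypothetical position-independent AT-rich distribution, the pseudocount is an assumption, and the sequence is made up.

```python
import math
from collections import Counter

ALPHABET = "ACGT"

def log2_prob(seq, dist):
    """Log2 probability of a sequence under a position-independent model."""
    return sum(math.log2(dist[base]) for base in seq)

def uniform_null():
    """Uniform null model: every residue is equally likely."""
    return {base: 1.0 / len(ALPHABET) for base in ALPHABET}

def target_null(seq, pseudocount=1.0):
    """Null model estimated from the composition of the target sequence."""
    counts = Counter(seq)
    total = len(seq) + pseudocount * len(ALPHABET)
    return {base: (counts[base] + pseudocount) / total for base in ALPHABET}

def log_odds(seq, model, null):
    """Score in bits: log2 P(seq | model) - log2 P(seq | null)."""
    return log2_prob(seq, model) - log2_prob(seq, null)

# Hypothetical AT-rich family model and a made-up AT-rich candidate.
family_model = {"A": 0.4, "T": 0.4, "C": 0.1, "G": 0.1}
candidate = "ATATATTTAAAT"

score_uniform = log_odds(candidate, family_model, uniform_null())
score_target = log_odds(candidate, family_model, target_null(candidate))
print(f"uniform null: {score_uniform:.2f} bits")
print(f"target null:  {score_target:.2f} bits")
```

In this toy case the candidate scores well against the uniform null only because of its composition, while the target null removes that advantage, which is in line with the compositional bias the abstract describes.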
Abstract:
Precipitation retrieval over high latitudes, particularly snowfall retrieval over ice and snow, using satellite-based passive microwave spectrometers, is currently an unsolved problem. The challenge results from the large variability of microwave emissivity spectra for snow and ice surfaces, which can mimic, to some degree, the spectral characteristics of snowfall. This work investigates a new snowfall detection algorithm specific to high-latitude regions, based on a combination of active and passive sensors able to discriminate between snowing and non-snowing areas. The space-borne Cloud Profiling Radar (on CloudSat), the Advanced Microwave Sounding Units A and B (on NOAA-16) and the infrared spectrometer MODIS (on AQUA) have been co-located for 365 days, from October 1st, 2006 to September 30th, 2007. CloudSat products have been used as truth to calibrate and validate all the proposed algorithms. The methodological approach can be summarised in two steps. In a first step, an empirical search for a threshold, aimed at discriminating the case of no snow, was performed, following Kongoli et al. [2003]. Since this single-channel approach did not produce appropriate results, a more statistically sound approach was attempted. Two different techniques, which allow the probability above and below a Brightness Temperature (BT) threshold to be computed, have been used on the available data. The first technique is based upon a Logistic Distribution to represent the probability of Snow given the predictors. The second technique, called the Bayesian Multivariate Binary Predictor (BMBP), is a fully Bayesian technique that does not require any hypothesis on the shape of the probabilistic model (such as, for instance, the Logistic) and only requires the estimation of the BT thresholds.
The results obtained show that both proposed methods are able to discriminate snowing and non-snowing conditions over the Polar regions with a probability of correct detection larger than 0.5, highlighting the importance of a multispectral approach.
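The first technique, a logistic model for the probability of snow given brightness-temperature predictors, can be sketched as follows. The coefficients, the intercept, the choice of two channels, and the BT values are all hypothetical; the abstract does not report the fitted values.

```python
import math

def snow_probability(brightness_temps, intercept, coefficients):
    """Logistic model: P(snow | BT) = 1 / (1 + exp(-(b0 + sum(b_i * BT_i))))."""
    z = intercept + sum(c * bt for c, bt in zip(coefficients, brightness_temps))
    # Numerically stable evaluation of the logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

# Hypothetical two-channel example with made-up coefficients (BT in kelvin).
p = snow_probability([250.0, 265.0], intercept=2.0, coefficients=[-0.05, 0.04])
print(f"P(snow) = {p:.3f}")
```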