932 results for predictive power


Relevance:

60.00%

Publisher:

Abstract:

Machine learning and scientometrics are the scientific disciplines covered in this dissertation. Machine learning deals with the construction and study of algorithms that can learn from data, whereas scientometrics is mainly concerned with the analysis of science from a quantitative perspective. Advances in machine learning now provide the mathematical and statistical tools for working properly with the vast amount of scientometric data stored in bibliographic databases. In this context, the use of novel machine learning methods in scientometric applications is the focus of this dissertation. The dissertation proposes new machine learning contributions that could shed light on the area of scientometrics. These contributions are divided into three parts. First, several supervised cost-(in)sensitive models are learned to predict the scientific success of articles and researchers. Cost-sensitive models do not aim to maximize classification accuracy but to minimize the expected total cost of classification errors. In this context, publishers of scientific journals could have a tool capable of predicting the future citation count of an article before it is published, whereas promotion committees could predict the annual increase in the h-index of researchers within the first few years. These predictive models could pave the way for new assessment systems. Second, several probabilistic graphical models are learned to exploit and discover new relationships among the vast number of existing bibliometric indices. In this context, the scientific community could measure how some indices influence others in probabilistic terms and perform evidence propagation and abductive inference to answer bibliometric questions. The scientific community could also uncover which bibliometric indices have higher predictive power. This is a multi-output regression problem in which the role of each variable, predictive or response, is unknown beforehand. The resulting indices could be very useful for prediction purposes: once their values are known, knowledge of any other index value provides no further information for predicting the remaining bibliometric indices. Third, a scientometric study of Spanish computer science research is performed under the publish-or-perish culture. This study is based on a cluster analysis methodology that characterizes research activity in terms of productivity, visibility, quality, prestige and international collaboration. It also analyzes the effects of collaboration on productivity and visibility under different circumstances.
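The difference between maximizing accuracy and minimizing expected cost can be illustrated with a minimal Python sketch. The class labels, probabilities and cost matrix below are hypothetical values chosen for illustration; they are not taken from the dissertation.

```python
# Hypothetical cost matrix: COST[true_label][predicted_label].
# All numbers are illustrative assumptions, not values from the dissertation.
COST = {
    "highly_cited": {"highly_cited": 0.0, "rarely_cited": 5.0},
    "rarely_cited": {"highly_cited": 1.0, "rarely_cited": 0.0},
}

def expected_cost(probs, predicted):
    """Expected cost of predicting `predicted` given class probabilities."""
    return sum(p * COST[true][predicted] for true, p in probs.items())

def most_probable(probs):
    """Cost-insensitive decision: pick the most probable class."""
    return max(probs, key=probs.get)

def min_cost(probs):
    """Cost-sensitive decision: pick the class with the lowest expected cost."""
    return min(probs, key=lambda label: expected_cost(probs, label))

# Assumed posterior probabilities for one article.
probs = {"highly_cited": 0.3, "rarely_cited": 0.7}
accuracy_choice = most_probable(probs)   # "rarely_cited"
cost_choice = min_cost(probs)            # "highly_cited"
```

With these assumed costs, missing a highly cited article is five times as expensive as a false alarm, so the cost-sensitive rule flags the article even though the other class is more probable.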

Relevance:

60.00%

Publisher:

Abstract:

The human β2-adrenergic receptor gene has multiple single-nucleotide polymorphisms (SNPs), but the relevance of chromosomally phased SNPs (haplotypes) is not known. The phylogeny and the in vitro and in vivo consequences of variations in the 5′ upstream and ORF were delineated in a multiethnic reference population and an asthmatic cohort. Thirteen SNPs were found organized into 12 haplotypes out of the theoretically possible 8,192 combinations. Deep divergence in the distribution of some haplotypes was noted in Caucasian, African-American, Asian, and Hispanic-Latino ethnic groups with >20-fold differences among the frequencies of the four major haplotypes. The relevance of the five most common β2-adrenergic receptor haplotype pairs was determined in vivo by assessing the bronchodilator response to β agonist in asthmatics. Mean responses by haplotype pair varied by >2-fold, and response was significantly related to the haplotype pair (P = 0.007) but not to individual SNPs. Expression vectors representing two of the haplotypes differing at eight of the SNP loci and associated with divergent in vivo responsiveness to agonist were used to transfect HEK293 cells. β2-adrenergic receptor mRNA levels and receptor density in cells transfected with the haplotype associated with the greater physiologic response were ≈50% greater than those transfected with the lower response haplotype. The results indicate that the unique interactions of multiple SNPs within a haplotype ultimately can affect biologic and therapeutic phenotype and that individual SNPs may have poor predictive power as pharmacogenetic loci.
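The figure of 8,192 follows if each of the 13 SNPs is treated as biallelic, which gives 2^13 possible allele combinations. A short Python check under that assumption:

```python
from itertools import product

# Assumption: every SNP has exactly two alleles, so a haplotype is one
# allele choice at each of the 13 loci.
n_snps = 13
theoretical = 2 ** n_snps

# Enumerate the combinations explicitly as a cross-check.
enumerated = sum(1 for _ in product("01", repeat=n_snps))

observed = 12  # number of haplotypes reported in the abstract
fraction_observed = observed / theoretical
```

Under this assumption, the 12 observed haplotypes are well under 1% of the combinations that are theoretically possible.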

Relevance:

60.00%

Publisher:

Abstract:

Recent experiments have exposed significant discrepancies between experimental data and predictive models for DNA structure. These results strongly suggest that DNA structural parameters incorporated in the models are not always sufficient to account for the influence of sequence context and of specific ion effects. In an attempt to evaluate these two effects, we have investigated repetitive DNA sequences with the sequence motif GAGAG.CTCTC located in different helical phasing arrangements with respect to poly(A) tracts and GGGCCC.GGGCCC sequence motifs. Methods used are ligase-mediated cyclization and gel mobility experiments along with DNase I cutting and chemical probe studies. The results provide new evidence for curvature in poly(A) tracts. They also show that the sequence context in which bending and flexible sequence elements are found is an important aspect of sequence-dependent DNA conformation. Although dinucleotide models generally have good predictive power, this work demonstrates that in some instances sequence elements larger than the dinucleotide must be taken into account, and hence it provides a starting point for the appropriate modification and refinement of existing structural models for DNA.

Relevance:

60.00%

Publisher:

Abstract:

The purpose of this research was to apply direct ablation plasma spectroscopic techniques, including spark-induced breakdown spectroscopy (SIBS) and laser-induced breakdown spectroscopy (LIBS), to a variety of environmental matrices. These were applied to two different analytical problems. SIBS instrumentation was adapted in order to develop a fieldable monitor for the measurement of carbon in soil. SIBS spectra in the 200 nm to 400 nm region of several soils were collected, and the neutral carbon line (247.85 nm) was compared to total carbon concentration determined by standard dry combustion analysis. Additionally, Fe and Si were evaluated in a multivariate model in order to determine their impacts on the model's predictive power for total carbon concentrations. The results indicate that SIBS is a viable method to quantify total carbon levels in soils, with a good correlation obtained between measured and predicted carbon in soils. These results indicate that multivariate analysis can be used to construct a calibration model for SIBS soil spectra, and SIBS is a promising method for the determination of total soil carbon. SIBS was also applied to the study of biological warfare agent simulants. Elemental compositions (determined independently) of bioaerosol samples were compared to the SIBS atomic (Ca, Al, Fe and Si) and molecular (CN, N2 and OH) emission signals. Results indicate a linear relationship between the temporally integrated emission strength and the concentration of the associated element. Finally, LIBS signals of hematite were analyzed under low pressures of pure CO2 and compared with signals acquired with a mixture of CO2, N2 and Ar, which is representative of the Martian atmosphere. This research was in response to the potential use of LIBS instrumentation on the Martian surface and to the challenges associated with these measurements. 
Changes in Ca, Fe and Al lineshapes observed in the LIBS spectra at different gas compositions and pressures were studied. It was observed that the size of the plasma formed on the hematite changed in a non-linear way as a function of decreasing pressure in a CO2 atmosphere and a simulated Martian atmosphere.

Relevance:

60.00%

Publisher:

Abstract:

We examined the psychometric properties of the School Attitude Assessment Survey–Revised in a Spanish population (n = 1,398). Confirmatory factor analysis procedures supported the instrument’s five-factor structure. The results of discriminant analysis demonstrated the predictive power of the School Attitude Assessment Survey–Revised scales as regards academic performance. Implications for education and assessment are discussed.

Relevance:

60.00%

Publisher:

Abstract:

The aim of this study was to analyze the predictive capacity of academic self-efficacy for the dimensions of self-concept in a sample of 860 Chilean students. Logistic regression analysis revealed that academic self-efficacy was a positive and significant predictor of the academic scales (Mathematics, Verbal and General Academic), the non-academic scales (Physical Abilities, Physical Appearance, Opposite-Sex Relations, Same-Sex Relations, Parent Relations, Honesty-Trustworthiness) and the Self-Esteem scale, but not of the Emotional Stability scale. This predictive relationship was strongest for the academic scales and self-esteem.

Relevance:

60.00%

Publisher:

Abstract:

Final project of the Integrated Master's Degree in Medicine, Faculdade de Medicina, Universidade de Lisboa, 2014

Relevance:

60.00%

Publisher:

Abstract:

Global current account imbalances widened before the 2007/2008 crisis and have narrowed since. While the post-crisis adjustment of European current account deficits was in line with global developments (though more forceful), European current account surpluses defied global trends and increased. We use panel econometric models to analyse the determinants of medium-term current account balances. Our results confirm that higher fiscal balances, higher GDP per capita, more rapidly aging populations, larger net foreign assets, larger oil rents and better legal systems increase the medium-term current account balance, while a larger growth differential and a higher old-age dependency ratio reduce it. European current account surpluses became excessive during the past twelve years according to our estimates, while they were in line with model predictions in the preceding three decades. Generally, the gap between the actual current account and its fitted value in the model has a strong predictive power for future current account changes. Excess deficits adjust more forcefully than excess surpluses. However, in the 2004-07 period, excess imbalances were amplified, which was followed by a forceful correction in 2008-15, with the exception of European surpluses.

Relevance:

60.00%

Publisher:

Abstract:

Species distribution models (SDMs) predict species occurrence based on statistical relationships with environmental conditions. The R package biomod2, which includes 10 different SDM techniques and 10 different evaluation methods, was used in this study. Macroalgae are the main biomass producers in Potter Cove, King George Island (Isla 25 de Mayo), Antarctica, and they are sensitive to climate change factors such as suspended particulate matter (SPM). Macroalgae presence and absence data were used to test SDM suitability and, simultaneously, to assess the environmental response of macroalgae as well as to model four scenarios of distribution shifts by varying SPM conditions due to climate change. According to the averaged evaluation scores of the Relative Operating Characteristic (ROC) and the True Skill Statistic (TSS) by model, methods based on a multitude of decision trees, such as Random Forest and Classification Tree Analysis, reached the highest predictive power, followed by generalized boosted models (GBM) and maximum-entropy approaches (Maxent). The final ensemble model used 135 of 200 calculated models (TSS > 0.7) and identified hard substrate and SPM as the most influential parameters, followed by distance to glacier, total organic carbon (TOC), bathymetry and slope. The climate change scenarios show an invasive reaction of the macroalgae in the case of less SPM and a retreat of the macroalgae in the case of higher assumed SPM values.
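For readers unfamiliar with TSS, it is defined as sensitivity plus specificity minus one, computed from a presence/absence confusion matrix. The sketch below is plain Python, not biomod2 (which is an R package). The model names and confusion-matrix counts are made up for illustration; only the 0.7 selection threshold comes from the abstract.

```python
def tss(tp, fn, tn, fp):
    """True Skill Statistic = sensitivity + specificity - 1."""
    sensitivity = tp / (tp + fn)   # fraction of presences predicted as presence
    specificity = tn / (tn + fp)   # fraction of absences predicted as absence
    return sensitivity + specificity - 1.0

# Hypothetical confusion-matrix counts (tp, fn, tn, fp) for three fitted models.
models = {
    "rf_run1": (46, 4, 42, 8),
    "cta_run1": (45, 5, 43, 7),
    "gbm_run1": (35, 15, 30, 20),
}

scores = {name: tss(*counts) for name, counts in models.items()}

# Keep only models above the threshold the abstract reports for the ensemble.
selected = sorted(name for name, score in scores.items() if score > 0.7)
```

A score of 1 means perfect discrimination and a score of 0 means performance no better than chance, so a cut-off of 0.7 retains only the better-performing runs for the ensemble.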

Relevance:

60.00%

Publisher:

Abstract:

Accurate estimates of body mass in fossil taxa are fundamental to paleobiological reconstruction. Predictive equations derived from correlation with craniodental and body mass data in extant taxa are the most commonly used, but they can be unreliable for species whose morphology departs widely from that of living relatives. Estimates based on proximal limb-bone circumference data are more accurate but are inapplicable where postcranial remains are unknown. In this study we assess the efficacy of predicting body mass in Australian fossil marsupials by using an alternative correlate, endocranial volume. Body mass estimates for a species with highly unusual craniodental anatomy, the Pleistocene marsupial lion (Thylacoleo carnifex), fall within the range determined on the basis of proximal limb-bone circumference data, whereas estimates based on dental data are highly dubious. For all marsupial taxa considered, allometric relationships have small confidence intervals, and percent prediction errors are comparable to those of the best predictors using craniodental data. Although application is limited in some respects, this method may provide a useful means of estimating body mass for species with atypical craniodental or postcranial morphologies and taxa unrepresented by postcranial remains. A trend toward increased encephalization may constrain the method's predictive power with respect to many, but not all, placental clades.
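As a rough illustration of how an allometric predictor and its percent prediction error can be computed, the Python sketch below fits a log-log regression of body mass on endocranial volume. The data points are hypothetical, and the error formula used here (absolute difference divided by the predicted value) is one common definition; the study may have used a different variant.

```python
import math

def fit_loglog(x, y):
    """Least-squares fit of log10(y) on log10(x); returns (intercept, slope)."""
    lx = [math.log10(v) for v in x]
    ly = [math.log10(v) for v in y]
    mx, my = sum(lx) / len(lx), sum(ly) / len(ly)
    sxy = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    sxx = sum((a - mx) ** 2 for a in lx)
    slope = sxy / sxx
    return my - slope * mx, slope

def predict_mass(ecv, intercept, slope):
    """Back-transform the log-log prediction to the original scale."""
    return 10 ** (intercept + slope * math.log10(ecv))

def percent_prediction_error(observed, predicted):
    """Absolute percent difference between observed and predicted values."""
    return abs(observed - predicted) / predicted * 100.0

# Hypothetical extant-species data: endocranial volume (ml), body mass (kg).
ecv = [5.0, 20.0, 60.0, 150.0]
mass = [1.0, 8.0, 40.0, 140.0]

intercept, slope = fit_loglog(ecv, mass)
errors = [
    percent_prediction_error(m, predict_mass(v, intercept, slope))
    for v, m in zip(ecv, mass)
]
mean_ppe = sum(errors) / len(errors)
```

The fitted equation can then be applied to the endocranial volume of a fossil specimen, with the mean percent prediction error in extant taxa serving as a guide to how much the estimate can be trusted.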

Relevance:

60.00%

Publisher:

Abstract:

Objective: The Temptation and Restraint Inventory (TRI) is commonly used to measure drinking restraint in relation to problem drinking behavior. However, as yet the TRI has not been validated in a clinical group with alcohol dependence. Method: Male (n = 111) and female (n = 57) inpatients with DSM-IV diagnosed alcohol dependence completed the TRI and measures of problem drinking severity, including the Alcohol Dependence Scale and the quantity, frequency and week total of alcohol consumed. Results: The factor structure of the TRI was replicated in the alcohol dependent sample. Cognitive Emotional Preoccupation (CEP), one of the two higher order factors of the TRI, demonstrated sound predictive power toward all dependence severity indices. The other higher order factor, Cognitive Behavioral Control (CBC), was related to frequency of drinking. There was limited support for the CEP/CBC interactional model of drinking restraint. Conclusions: Although the construct validity of the TRI was sound, the measure appears more useful in understanding the development, maintenance and severity of alcohol-related problems in nondependent drinkers. The TRI may show promise in detecting either continuous drinking or heavy episodic type dependent drinkers.

Relevance:

60.00%

Publisher:

Abstract:

Many studies on birds focus on the collection of data through an experimental design, suitable for investigation in a classical analysis of variance (ANOVA) framework. Although many findings are confirmed by one or more experts, expert information is rarely used in conjunction with the survey data to enhance the explanatory and predictive power of the model. We explore this neglected aspect of ecological modelling through a study on Australian woodland birds, focusing on the potential impact of different intensities of commercial cattle grazing on bird density in woodland habitat. We examine a number of Bayesian hierarchical random effects models, which cater for overdispersion and a high frequency of zeros in the data using WinBUGS and explore the variation between and within different grazing regimes and species. The impact and value of expert information is investigated through the inclusion of priors that reflect the experience of 20 experts in the field of bird responses to disturbance. Results indicate that expert information moderates the survey data, especially in situations where there are little or no data. When experts agreed, credible intervals for predictions were tightened considerably. When experts failed to agree, results were similar to those evaluated in the absence of expert information. Overall, we found that without expert opinion our knowledge was quite weak. The fact that the survey data is quite consistent, in general, with expert opinion shows that we do know something about birds and grazing and we could learn a lot faster if we used this approach more in ecology, where data are scarce. Copyright (c) 2005 John Wiley & Sons, Ltd.

Relevance:

60.00%

Publisher:

Abstract:

Background. Adolescents' intentions to smoke are generally regarded as a valid and reliable predictor of subsequent smoking. This association is largely based on research with adults and needs a more detailed analysis for adolescents. Methods. Data on intentions and smoking status were collected as part of a longitudinal, birth-cohort study when the study members were 9, 11, 13, 15, 18, and 21 years of age. Results. The results showed that intention to smoke only had an important predictive power in the subgroup of previous nonsmokers. Among those already smoking (on a monthly basis or greater), previous level of smoking was a more important predictor of future behavior than intention to smoke. In addition, the effect of positive intention to smoke was nonlinear over age and had the greatest effect at age 15. Conclusion. The results indicated that in adolescence, measurement of intentions to smoke or not smoke cannot be assumed to be a general predictor of behavior at a later age for all groups of adolescents. (C) 2004 The Institute For Cancer Prevention and Elsevier Inc. All rights reserved.

Relevance:

60.00%

Publisher:

Abstract:

Background: There is a recognized need to move from mortality to morbidity outcome predictions following traumatic injury. However, there are few morbidity outcome prediction scoring methods and these fail to incorporate important comorbidities or cofactors. This study aims to develop and evaluate a method that includes such variables. Methods: This was a consecutive case series registered in the Queensland Trauma Registry that consented to a prospective 12-month telephone conducted follow-up study. A multivariable statistical model was developed relating Trauma Registry data to trichotomized 12-month post-injury outcome (categories: no limitations, minor limitations and major limitations). Cross-validation techniques using successive single hold-out samples were then conducted to evaluate the model's predictive capabilities. Results: In total, 619 participated, with 337 (54%) experiencing no limitations, 101 (16%) experiencing minor limitations and 181 (29%) experiencing major limitations 12 months after injury. The final parsimonious multivariable statistical model included whether the injury was in the lower extremity body region, injury severity, age, length of hospital stay, pulse at admission and whether the participant was admitted to an intensive care unit. This model explained 21% of the variability in post-injury outcome. Predictively, 64% of those with no limitations, 18% of those with minor limitations and 37% of those with major limitations were correctly identified. Conclusion: Although carefully developed, this statistical model lacks the predictive power necessary for its use as a basis of a useful prognostic tool. Further research is required to identify variables other than those routinely used in the Trauma Registry to develop a model with the necessary predictive utility.
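Cross-validation with successive single hold-out samples is commonly known as leave-one-out cross-validation. The Python sketch below shows the procedure and the per-category rates of correct identification. It uses a nearest-neighbour rule on a single made-up severity score as a stand-in for the study's multivariable model, so both the classifier and the data are assumptions for illustration only.

```python
from collections import Counter

def nearest_neighbour_predict(train, x):
    """Stand-in model: label of the training case whose score is closest to x."""
    return min(train, key=lambda case: abs(case[0] - x))[1]

def leave_one_out(cases):
    """Hold out each case in turn, fit on the rest, and tally correct predictions."""
    hits, totals = Counter(), Counter()
    for i, (x, label) in enumerate(cases):
        train = cases[:i] + cases[i + 1:]   # every case except the held-out one
        totals[label] += 1
        if nearest_neighbour_predict(train, x) == label:
            hits[label] += 1
    return {label: hits[label] / totals[label] for label in totals}

# Hypothetical (severity score, 12-month outcome category) pairs.
cases = [
    (4, "none"), (5, "none"), (6, "none"), (9, "none"),
    (10, "minor"), (16, "minor"),
    (17, "major"), (22, "major"), (25, "major"),
]

per_class_rate = leave_one_out(cases)
```

Reporting the rate separately for each outcome category, as the abstract does, exposes weaknesses that a single overall accuracy figure would hide, such as poor identification of the smallest category.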

Relevance:

60.00%

Publisher:

Abstract:

As field determinations take much effort, it would be useful to be able to predict easily the coefficients describing the functional response of free-living predators, the function relating food intake rate to the abundance of food organisms in the environment. As a means of easily parameterising an individual-based model of shorebird Charadriiformes populations, we attempted this for shorebirds eating macro-invertebrates. Intake rate is measured as the ash-free dry mass (AFDM) per second of active foraging; i.e. excluding time spent on digestive pauses and other activities, such as preening. The present and previous studies show that the general shape of the functional response in shorebirds eating approximately the same size of prey across the full range of prey density is a decelerating rise to a plateau, thus approximating the Holling type II ('disc equation') formulation. But field studies confirmed that the asymptote was not set by handling time, as assumed by the disc equation, because only about half the foraging time was spent in successfully or unsuccessfully attacking and handling prey, the rest being devoted to searching. A review of 30 functional responses showed that intake rate in free-living shorebirds varied independently of prey density over a wide range, with the asymptote being reached at very low prey densities (< 150 prey m⁻²). Accordingly, most of the many studies of shorebird intake rate have probably been conducted at or near the asymptote of the functional response, suggesting that equations that predict intake rate should also predict the asymptote. A multivariate analysis of 468 'spot' estimates of intake rates from 26 shorebirds identified ten variables, representing prey and shorebird characteristics, that accounted for 81 % of the variance in logarithm-transformed intake rate. 
But four variables accounted for almost as much (77.3 %), these being bird size, prey size, whether the bird was an oystercatcher Haematopus ostralegus eating mussels Mytilus edulis, or breeding. The four-variable equation under-predicted, on average, the observed 30 estimates of the asymptote by 11.6%, but this discrepancy was reduced to 0.2% when two suspect estimates from one early study in the 1960s were removed. The equation therefore predicted the observed asymptote very successfully in 93 % of cases. We conclude that the asymptote can be reliably predicted from just four easily measured variables. Indeed, if the birds are not breeding and are not oystercatchers eating mussels, reliable predictions can be obtained using just two variables, bird and prey sizes. A multivariate analysis of 23 estimates of the half-asymptote constant suggested they were smaller when prey were small but greater when the birds were large, especially in oystercatchers. The resulting equation could be used to predict the half-asymptote constant, but its predictive power has yet to be tested. As well as predicting the asymptote of the functional response, the equations will enable research workers engaged in many areas of shorebird ecology and behaviour to estimate intake rate without the need for conventional time-consuming field studies, including species for which it has not yet proved possible to measure intake rate in the field.
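The textbook disc equation referred to above, and the two quantities the abstract sets out to predict (the asymptote and the half-asymptote constant), can be sketched in a few lines of Python. The parameter values are hypothetical, and intake is expressed in prey items per second rather than AFDM. The abstract notes that in the field the asymptote was not in fact set by handling time alone, so this shows only the idealised form.

```python
def holling_type_ii(density, attack_rate, handling_time):
    """Holling's disc equation: intake = a*N / (1 + a*h*N)."""
    return attack_rate * density / (1.0 + attack_rate * handling_time * density)

# Hypothetical parameters, chosen only to show the shape of the curve.
a = 0.02   # area searched per second of foraging (m^2/s)
h = 10.0   # handling time per prey item (s)

asymptote = 1.0 / h              # plateau approached at high prey density
half_constant = 1.0 / (a * h)    # density at which intake is half the asymptote

# Intake rate at increasing prey densities (prey per m^2).
curve = [(n, holling_type_ii(n, a, h)) for n in (0, 5, 50, 500, 5000)]
```

With these values the half-asymptote constant is 5 prey per m², and intake is already within about 10% of the plateau at 50 prey per m², which mirrors the abstract's point that the asymptote is reached at very low prey densities.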