889 results for continuous and discrete variables
Abstract:
Context There is contradictory information regarding the prognostic importance of adipocytokines and of hepatic and inflammatory biomarkers for the incidence of type 2 diabetes. The objective was to assess the prognostic relevance of adipocytokine and inflammatory markers (C-reactive protein – CRP; interleukin-1beta – IL-1β; interleukin-6 – IL-6; tumour necrosis factor-α – TNF-α; leptin and adiponectin) and gamma-glutamyl transpeptidase (γGT) for the incidence of type 2 diabetes. Methods Prospective, population-based study including 3,842 non-diabetic participants (43.3% men, age range 35 to 75 years), followed for an average of 5.5 years (2003–2008). The endpoint was the occurrence of type 2 diabetes. Results 208 participants (5.4%, 66 women) developed type 2 diabetes during follow-up. On univariate analysis, participants who developed type 2 diabetes had significantly higher baseline levels of IL-6, CRP, leptin and γGT, and lower levels of adiponectin, than participants who remained free of type 2 diabetes. After adjusting for a validated type 2 diabetes risk score, only the associations with adiponectin remained significant: odds ratio (95% confidence interval) 0.97 (0.64–1.47), 0.84 (0.55–1.30) and 0.64 (0.40–1.03) for the second, third and fourth gender-specific quartiles, respectively (P-value for trend = 0.05). Adding each marker to a validated type 2 diabetes risk score (including age, family history of type 2 diabetes, height, waist circumference, resting heart rate, presence of hypertension, HDL cholesterol, triglycerides, fasting glucose and serum uric acid) did not improve the area under the ROC curve or the net reclassification index; similar findings were obtained when the markers were combined, when the markers were used as continuous (log-transformed) variables or when gender-specific quartiles were used.
Conclusion Decreased adiponectin levels are associated with an increased risk for incident type 2 diabetes, but they seem to add little information regarding the risk of developing type 2 diabetes to a validated risk score.
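The comparison of areas under the ROC curve mentioned in the Results can be illustrated with a minimal sketch. The function name `roc_auc` and all risk values below are hypothetical and are not taken from the study; the sketch only shows that a marker which shifts predicted risks without changing how cases rank against non-cases leaves the AUC unchanged.

```python
def roc_auc(case_scores, control_scores):
    """AUC as the probability that a random case outranks a random control.

    Ties count as half a win (Mann-Whitney formulation).
    """
    wins = 0.0
    for c in case_scores:
        for n in control_scores:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical predicted risks from a baseline risk score (not study data).
base_cases = [0.30, 0.45, 0.60, 0.80]
base_controls = [0.10, 0.20, 0.35, 0.50]

# Hypothetical risks after adding a marker: values move, ordering does not.
ext_cases = [0.32, 0.47, 0.66, 0.85]
ext_controls = [0.08, 0.18, 0.36, 0.52]

auc_base = roc_auc(base_cases, base_controls)
auc_ext = roc_auc(ext_cases, ext_controls)
print(auc_base, auc_ext)
```

With these made-up values both AUCs equal 0.8125 (13 of 16 case-control pairs are ranked correctly), which is the kind of "no improvement" pattern the abstract reports.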
Abstract:
The present study investigates the relation of perceived arousal (continuous self-rating), autonomic nervous system activity (heart rate, heart rate variability) and musical characteristics (sound intensity, musical rhythm) upon listening to a complex musical piece. Twenty amateur musicians listened to two performances of Chopin's "Tristesse" with different rhythmic shapes. Besides conventional statistical methods for analyzing psychophysiological reactions (heart rate, respiration rate) and musical variables, semblance analysis was used. Perceived arousal correlated strongly with sound intensity; heart rate showed only a partial response to changes in sound intensity. Larger changes in heart rate were caused by the version with more rhythmic tension. The low-/high-frequency ratio of heart rate variability increased, whereas the high-frequency component decreased, during music listening. We conclude that autonomic nervous system activity can be modulated not only by sound intensity but also by the interpreter's use of rhythmic tension. Semblance analysis enables us to track the subtle correlations between musical and physiological variables.
Abstract:
We show that exotic phases arise in generalized lattice gauge theories known as quantum link models in which classical gauge fields are replaced by quantum operators. While these quantum models with discrete variables have a finite-dimensional Hilbert space per link, the continuous gauge symmetry is still exact. An efficient cluster algorithm is used to study these exotic phases. The (2+1)-d system is confining at zero temperature with a spontaneously broken translation symmetry. A crystalline phase exhibits confinement via multi-stranded strings between charge-anti-charge pairs. A phase transition between two distinct confined phases is weakly first order and has an emergent spontaneously broken approximate SO(2) global symmetry. The low-energy physics is described by a (2+1)-d RP(1) effective field theory, perturbed by a dangerously irrelevant SO(2) breaking operator, which prevents the interpretation of the emergent pseudo-Goldstone boson as a dual photon. This model is an ideal candidate to be implemented in quantum simulators to study phenomena that are not accessible using Monte Carlo simulations, such as the real-time evolution of the confining string and the real-time dynamics of the pseudo-Goldstone boson.
Abstract:
INTRODUCTION Patients admitted to intensive care following surgery for faecal peritonitis present particular challenges in terms of clinical management and risk assessment. Collaborating surgical and intensive care teams need shared perspectives on prognosis. We aimed to determine the relationship between dynamic assessment of trends in selected variables and outcomes. METHODS We analysed trends in physiological and laboratory variables during the first week of intensive care unit (ICU) stay in 977 patients at 102 centres across 16 European countries. The primary outcome was 6-month mortality. Secondary endpoints were ICU, hospital and 28-day mortality. For each trend, Cox proportional hazards (PH) regression analyses, adjusted for age and sex, were performed for each endpoint. RESULTS Trends over the first 7 days of the ICU stay independently associated with 6-month mortality were worsening thrombocytopaenia (mortality: hazard ratio (HR) = 1.02; 95% confidence interval (CI), 1.01 to 1.03; P <0.001) and renal function (total daily urine output: HR =1.02; 95% CI, 1.01 to 1.03; P <0.001; Sequential Organ Failure Assessment (SOFA) renal subscore: HR = 0.87; 95% CI, 0.75 to 0.99; P = 0.047), maximum bilirubin level (HR = 0.99; 95% CI, 0.99 to 0.99; P = 0.02) and Glasgow Coma Scale (GCS) SOFA subscore (HR = 0.81; 95% CI, 0.68 to 0.98; P = 0.028). Changes in renal function (total daily urine output and renal component of the SOFA score), GCS component of the SOFA score, total SOFA score and worsening thrombocytopaenia were also independently associated with secondary outcomes (ICU, hospital and 28-day mortality). We detected the same pattern when we analysed trends on days 2, 3 and 5. 
Dynamic trends in all other measured laboratory and physiological variables, and in radiological findings, changes in respiratory support, renal replacement therapy and inotrope and/or vasopressor requirements failed to be retained as independently associated with outcome in multivariate analysis. CONCLUSIONS Only deterioration in renal function, thrombocytopaenia and SOFA score over the first 2, 3, 5 and 7 days of the ICU stay were consistently associated with mortality at all endpoints. These findings may help to inform clinical decision making in patients with this common cause of critical illness.
Abstract:
Recurrent wheezing or asthma is a common problem in children that has increased considerably in prevalence in the past few decades. The causes and underlying mechanisms are poorly understood and it is thought that a number of distinct diseases causing similar symptoms are involved. Due to the lack of a biologically founded classification system, children are classified according to their observed disease-related features (symptoms, signs, measurements) into phenotypes. The objectives of this PhD project were a) to develop tools for analysing phenotypic variation of a disease, and b) to examine phenotypic variability of wheezing among children by applying these tools to existing epidemiological data. A combination of graphical methods (multivariate correspondence analysis) and statistical models (latent variables models) was used. In a first phase, a model for discrete variability (latent class model) was applied to data on symptoms and measurements from an epidemiological study to identify distinct phenotypes of wheezing. In a second phase, the modelling framework was expanded to include continuous variability (e.g. along a severity gradient) and combinations of discrete and continuous variability (factor models and factor mixture models). The third phase focused on validating the methods using simulation studies. The main body of this thesis consists of 5 articles (3 published, 1 submitted and 1 to be submitted) including applications, methodological contributions and a review. The main findings and contributions were: 1) The application of a latent class model to epidemiological data (symptoms and physiological measurements) yielded plausible phenotypes of wheezing with distinguishing characteristics that have previously been used as phenotype-defining characteristics. 2) A method was proposed for including responses to conditional questions (e.g.
questions on severity or triggers of wheezing are asked only to children with wheeze) in multivariate modelling. 3) A panel of clinicians was set up to agree on a plausible model for wheezing diseases. The model can be used to generate datasets for testing the modelling approach. 4) A critical review of methods for defining and validating phenotypes of wheeze in children was conducted. 5) The simulation studies showed that a parsimonious parameterisation of the models is required to identify the true underlying structure of the data. The developed approach can deal with some challenges of real-life cohort data such as variables of mixed mode (continuous and categorical), missing data and conditional questions. If carefully applied, the approach can be used to identify whether the underlying phenotypic variation is discrete (classes), continuous (factors) or a combination of these. These methods could help improve precision of research into causes and mechanisms and contribute to the development of a new classification of wheezing disorders in children and other diseases which are difficult to classify.
Abstract:
For adolescents, unprotected sexual intercourse is the primary cause of sexually transmitted disease (STD), including infection with Human Immunodeficiency Virus (HIV), the virus that causes Acquired Immunodeficiency Syndrome (AIDS), and of pregnancy. Although many studies on adolescent sexual behavior have addressed racial/ethnic differences, few have examined the relation between race/ethnicity and sexual behavior while controlling for other sociocultural and psychosocial variables. The purpose of this study is to examine the relationship of racial/ethnic categories and selected sociocultural and psychosocial variables with reported adolescent sexual risk-taking and preventive behavior. A self-administered questionnaire was used to collect information from 3132 students in a Texas school district (Section 3.5.2). The instrument contained approximately 100 questions on demographic characteristics, sexual behavior, and psychosocial determinants of sexual behavior. Based on the findings of this study, the following major conclusions are made: (1) There are differences in reported sexual risk-taking and preventive behavior among Black, Hispanic and White adolescents in this study. The stratified analysis by gender further suggests significant gender differences in reported sexual behavior among the three racial/ethnic groups. (2) Gender, living arrangement, academic grades, and language spoken at home modified the association between reported sexual risk-taking and preventive behavior and race/ethnicity in this study. This suggests that these sociocultural variables should be considered in future research and practice involving multicultural populations. (3) There are differences in selected psychosocial determinants among the three racial/ethnic groups and between males and females. These differences were consistent with the reported sexual risk-taking and preventive behaviors by race/ethnicity and gender for adolescents in this study.
The findings support the consideration of psychosocial determinants in research and interventions addressing adolescent sexual behavior among different racial/ethnic groups. Based on the results of this study, two recommendations for practice are made. First, health professionals developing interventions for adolescents of different cultural backgrounds and genders need to be familiar with the specific sociocultural and psychosocial factors that will reduce risky sexual behavior and promote protective behavior. Second, the need for immediate, realistic, and continuous HIV/STD and pregnancy prevention programs for children and adolescents should be considered.
Abstract:
Many of the microbial processes that contribute to agroecosystem fertility and nutrient cycling occur in the soil. Nutrient cycling depends critically on soil microbiological activity, which in turn is mediated by the structure and functioning of the soil microbiota. In this context, the objective of this work was to determine whether microbial activity can be a good indicator of land-use intensity, by analysing: 1- whether differences in land-use intensity are related to differences in microbiological activity, estimated through soil respiration and enzymatic activity; and 2- the possible relationships between these microbiological variables and the physicochemical variables. Between 2008 and 2010, quarterly samplings were carried out in fields of Buenos Aires province on Argiudoll soils under different uses: 1- continuous intensive agriculture, 2- recent agriculture, and 3- naturalized grasslands. Three sampling sites were selected as replicates for each land use, with 5 samples per date and replicate. Microbial activity was evaluated by measuring soil respiration and nitrogenase enzyme activity, and physicochemical variables were analysed. Both the microbiological and the physicochemical variables were analysed using the Kruskal-Wallis test (P < 0.05). The association between physicochemical and microbiological variables was explored using the non-parametric (Spearman) correlation coefficient. The different uses of the same soil showed differences in microbiological activity. Soil respiration was significantly higher in naturalized grasslands than in the agricultural systems. Nitrogenase activity was significantly higher in naturalized grasslands than under continuous agriculture and did not differ significantly from recent agriculture.
The physicochemical variables were less consistent in detecting differences between uses. Significant correlations were detected between microbiological activity and some of the physicochemical variables. The results show that microbiological activity can be useful for differentiating land-use intensities.
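The two rank-based methods named in this abstract (the Kruskal-Wallis test across land uses and the Spearman correlation between variables) can be sketched in plain Python. This is an illustrative sketch, not the authors' analysis: the function names are hypothetical, the H statistic is computed without the tie correction, and no p-value is derived.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0  # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic (no tie correction) for a list of samples."""
    pooled = [x for g in groups for x in g]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    return 12.0 / (n * (n + 1)) * total - 3.0 * (n + 1)


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors.

    Assumes neither input is constant (otherwise the denominator is zero).
    """
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)
```

For three hypothetical groups `[[1, 2, 3], [4, 5, 6], [7, 8, 9]]` (for example, made-up respiration readings for three land uses), `kruskal_wallis_h` gives H = 7.2; in practice H would be compared against a chi-squared distribution with (number of groups − 1) degrees of freedom.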
Abstract:
We have performed quantitative X-ray diffraction (qXRD) analysis of 157 grab or core-top samples from the western Nordic Seas (WNS) between ~57°-75°N and 5° to 45°W. The RockJock Vs6 analysis includes non-clay (20) and clay (10) mineral species in the <2 mm size fraction that sum to 100 weight %. The data matrix was reduced to 9 and 6 variables, respectively, by excluding minerals with low weight % and by grouping into larger groups, such as the alkali and plagioclase feldspars. Because of its potential dual origins, calcite was placed outside of the sum. We initially hypothesized that a combination of regional bedrock outcrops and transport associated with drift-ice, meltwater plumes, and bottom currents would result in 6 clusters defined by "similar" mineral compositions. The hypothesis was tested by use of a fuzzy k-means clustering algorithm, and key minerals were identified by step-wise Discriminant Function Analysis. Key minerals in defining the clusters include quartz, pyroxene, muscovite, and amphibole. With 5 clusters, 87.5% of the observations are correctly classified. The geographic distributions of the five k-means clusters compare reasonably well with the original hypothesis. The close spatial relationship between bedrock geology and discrete cluster membership stresses the importance of this variable at both the WNS scale and a more local scale in NE Greenland.
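The fuzzy k-means step can be sketched as follows. This is a generic textbook formulation of fuzzy k-means (also called fuzzy c-means) with fuzzifier m = 2, not the authors' software; the function names, the two-mineral sample compositions and the starting centres are all hypothetical.

```python
import math


def memberships(point, centers, m=2.0):
    """Degree of membership of one sample in each cluster (sums to 1)."""
    dists = [math.dist(point, c) for c in centers]
    if any(d == 0.0 for d in dists):
        # Sample coincides with one or more centres: share membership there.
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((di / dk) ** power for dk in dists) for di in dists]


def update_centers(points, u, m=2.0):
    """Recompute each centre as the membership-weighted mean of the samples."""
    dim = len(points[0])
    centers = []
    for j in range(len(u[0])):
        weights = [row[j] ** m for row in u]
        total = sum(weights)
        centers.append(tuple(
            sum(w * p[d] for w, p in zip(weights, points)) / total
            for d in range(dim)
        ))
    return centers


def fuzzy_kmeans(points, centers, m=2.0, iterations=50):
    """Alternate membership and centre updates for a fixed number of rounds."""
    for _ in range(iterations):
        u = [memberships(p, centers, m) for p in points]
        centers = update_centers(points, u, m)
    return centers, [memberships(p, centers, m) for p in points]


# Hypothetical (quartz wt %, pyroxene wt %) pairs, not values from the study.
samples = [(60.0, 5.0), (58.0, 7.0), (20.0, 30.0), (22.0, 28.0)]
centers, u = fuzzy_kmeans(samples, [(50.0, 10.0), (30.0, 25.0)])
hard_labels = [max(range(len(row)), key=lambda j: row[j]) for row in u]
print(hard_labels)
```

Taking the largest membership per sample, as in `hard_labels`, is one way to obtain the discrete cluster membership that can then be mapped geographically or passed to a discriminant analysis.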
Abstract:
The relationship between urban structure and mobility has been studied for more than 70 years. The urban environment includes multiple dimensions, for example: urban structure, land uses, and the distribution of various facilities (shops, schools, restaurant areas, parking, etc.). A review of the existing literature in this context reveals different analyses, methodologies, geographical scales and dimensions, of both mobility and urban structure. In this sense, it is a much-studied but very complex relationship, on which there is so far no consensus about which dimension of the urban environment influences which dimension of mobility, or what the appropriate way to represent this relationship is. In order to answer these research questions, this thesis has the following general objectives: (1) To contribute to a better understanding of the complex relationship between urban structure and mobility; and (2) To understand the role of latent attributes in the relationship between urban environment and mobility. The specific objective of the thesis is to analyse the influence of the urban environment on two dimensions of mobility: number of trips and type of tour. Given the complexity of the relationship between urban environment and mobility, the thesis aims to contribute to a better understanding of it by using 3 geographical scales for the variables and by analysing the influence of unobserved effects on mobility. The analysis uses a database made up of three types of data: (1) A mobility survey carried out during 2006 and 2007. A total of 943 surveys were obtained in 3 neighbourhoods of Madrid: Chamberí, Pozuelo and Algete. (2) Municipal information from the Instituto Nacional de Estadística: this information is linked to the origins and destinations of the trips recorded in the survey.
And (3) Information georeferenced in Arc-GIS on the households participating in the survey: the database contains information on street structure and the location of schools, parking, medical centres and restaurants. Between- and within-group correlation was analysed, and 4 attribute cases were modelled under the ordinal logit structure. Self-selection is then evaluated through the joint estimation of the choices of neighbourhood type and number of trips. The choice of neighbourhood type consists of 3 alternatives: CBD, Urban and Suburban, according to the area of residence recorded in the surveys. The choice of number of trips consists of 4 ordinal categories: 0 trips, 1-2 trips, 3-4 trips and 5 or more trips. Starting from the best specification of the ordinal logit model, a joint mixed-ordinal model was developed. The results indicate that the exogenous variables require an exhaustive correlation analysis in order to avoid biased results. It was determined that it is important to measure the attributes of the built environment (BE) where the trip takes place, but municipal information is also highly explanatory of individual mobility. Therefore, the perception of destination areas at the municipal level is considered important. In the context of self-selection, it is important to model the decisions jointly. Self-selection exists, since the jointly estimated parameters are significant. However, only certain attributes of the urban environment are equally important for the choice of residential area and for trip frequency. To analyse the propensity to travel, a hybrid model was developed, consisting of a latent variable, an indicator and a discrete choice model. The latent variable is called “Propensity to Travel”, whose indicator in the measurement equation is the number of trips; the discrete choice is the type of tour.
The choice model consists of 5 alternatives, according to the hierarchy of activities established in the thesis: HOME, no trips made during the study day; HWH, a tour whose main activity is work or study, with no intermediate stops; HWHs, the same tour if the individual makes intermediate stops; HOH, a tour whose main activity is other than work or study, with no intermediate stops; and HOHs, where intermediate stops are made. To arrive at the best model specification, substantial work was carried out considering different model structures and three types of estimation. In this way, consistent and efficient parameters were obtained. The results show that modelling tours has an advantage over modelling trips, since it overcomes the limitations of space and time by linking the trips made by the same person on the study day. The propensity to travel (PT) exists and is specific to each type of tour. The parameters estimated in the hybrid model were significant and different for each tour-type alternative. Finally, the thesis verifies that hybrid models represent an improvement over traditional discrete choice models, yielding consistent and more robust parameters. In terms of transport policy, it has been shown that urban environment attributes are more important than LOS (Level of Service) in the generation of multi-stage tours. This thesis represents the first empirical analysis of the relationship between tour types and the propensity to travel. The Propensity to Travel concept was developed exclusively for the thesis. Likewise, the development of a joint RC-Number of trips model based on three measurement scales is an innovation in the comparison of geographical scales, which had not previously been done in the modelling of self-selection.
The relationship between the built environment (BE) and travel behaviour (TB) has been studied in a number of cases, using several methods (aggregate and disaggregate approaches) and different focuses (trip frequency, automobile use, vehicle miles travelled, and so on). Travel is generated by the need to undertake activities and obtain services, and there is a general consensus that urban components affect TB. However, research is still needed to better understand which components of travel behaviour are affected most, and by which urban components. To fill this gap, the present dissertation has two main objectives: (1) to contribute to a better understanding of the relationship between travel demand and the urban environment; and (2) to develop an econometric model for estimating travel demand with urban environment attributes. To this end, the thesis involved exhaustive research and computation of land-use variables in order to find the best representation of the BE for modelling trip frequency. In particular, two empirical analyses are carried out: 1. Estimation of three dimensions of travel demand using dimensions of the urban environment. We compare different travel dimensions and geographical scales, and we measure the contribution of self-selection using joint models. 2. Development of a hybrid model, that is, an integrated latent variable and discrete choice model. The implementation of hybrid models is new in the analysis of land use and travel behaviour. BE and TB interact explicitly, which provides rich information about a specific individual decision process. All empirical analyses use a database from a survey conducted in 2006 and 2007 in Madrid. Spatial attributes describing the neighbourhood environment are derived from different data sources: the National Institute of Statistics (INE; administrative units: municipality and district) and GIS (circular units).
INE provides raw data for spatial units such as municipalities and districts. Constructing census units is trivial, as the census bureau provides tables that readily define districts and municipalities. Constructing circular units requires us to determine the radius and to associate the spatial information with our households. The first empirical part analyses trip frequency by applying an ordered logit model. This part studies the effect of socio-economic, transport and land-use characteristics on two travel dimensions: trip frequency and type of tour. In particular, land use is defined in terms of types of neighbourhoods and types of dwellers. Three neighbourhood representations are explored, and three ways of constructing neighbourhood attributes are described. In particular, administrative units and a circular-unit representation are examined as representations of the neighbourhood. Ordered logit models are applied; while these models are well known, intensive work was required to construct the spatial attributes. The second empirical analysis consists of the development of an innovative econometric model that considers a latent variable called “propensity to travel”, together with a choice model for the type of tour. The first two specifications of the ordinal models help to estimate this latent variable. The latent variable is unobserved, but its manifestations, called “indicators”, are observed; the probability of choosing a tour alternative is then conditional on the latent variable. Since the latent variable is unknown, we integrate over its distribution. Four “sets of best variables” are specified, following the specification obtained from the correlation analysis. The results show that the relative importance of socio-economic (SE) variables versus BE variables depends on how the BE variables are measured. We found that each of the three spatial scales has its own qualities and drawbacks.
Spatial scales play an important role in predicting travel demand because of the variability of measures at trip origins and destinations within the same administrative unit (municipality, district and so on). Larger units produce less variation in the data, but this does not affect certain variables, such as public transport supply, that are more significant at the municipality level. By contrast, land-use measures are more efficient at the district level. Self-selection in this context is weak; thus, the influence of BE attributes is genuine. The results of the hybrid model show that unobserved factors affect the choice of tour complexity. The latent variable used in this model is the propensity to travel, which is explained by socio-economic aspects and neighbourhood attributes. The results show that neighbourhood attributes have a significant impact on the choice of tour type, both directly and through the propensity to travel. The propensity to travel has a different impact depending on the structure of each tour and increases the probability of choosing more complex tours, such as tours with many intermediate stops. The integration of the choice and latent variable models shows that omitting important perceptions and attitudes leads to inconsistent estimates. The results also indicate that goodness of fit improves when the latent variable is added, in both sequential and simultaneous estimation. There are significant differences in sensitivity to the latent variable across alternatives. In general, as expected, the hybrid models show a major improvement in goodness of fit compared with a classical discrete choice model that does not incorporate latent effects. The integrated model leads to a more detailed analysis of the behavioural process. In summary, the effect of built environment characteristics on trip frequency is analysed in depth.
In particular, we tried to better understand how land-use characteristics can be defined and measured, and which of these measures really have an impact on trip frequency. We also tested the superiority of the hybrid choice model (HCM) in this field. We conclude that the HCM shows a major improvement in goodness of fit compared with a classical discrete choice model that does not incorporate latent effects. Consequently, the application of the HCM shows the importance of the latent variable (LV) in the decision on tour complexity. People are more elastic to built environment attributes than to level of service. Thus, policy should aim to develop more mixed-use areas, combining workplaces with commercial retail.
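As a rough illustration of the ordered (ordinal) logit structure used for trip frequency, the sketch below turns a single utility value and three thresholds into probabilities for the four categories named in the abstract (0 trips, 1-2 trips, 3-4 trips, 5 or more trips). The function names, the utility value and the thresholds are hypothetical; none of them are estimates from the thesis.

```python
import math


def logistic(z):
    """Standard logistic cumulative distribution function."""
    return 1.0 / (1.0 + math.exp(-z))


def ordered_logit_probs(utility, thresholds):
    """Category probabilities for an ordered logit model.

    P(y <= j) = logistic(threshold_j - utility); each category probability
    is the difference between consecutive cumulative probabilities.
    Thresholds must be in increasing order.
    """
    cumulative = [logistic(t - utility) for t in thresholds] + [1.0]
    probs, previous = [], 0.0
    for c in cumulative:
        probs.append(c - previous)
        previous = c
    return probs


categories = ["0 trips", "1-2 trips", "3-4 trips", "5 or more trips"]
# Hypothetical utility (e.g. a weighted sum of socio-economic and
# built-environment attributes) and hypothetical thresholds.
probs = ordered_logit_probs(utility=0.4, thresholds=[-1.0, 0.5, 2.0])
for name, p in zip(categories, probs):
    print(f"{name}: {p:.3f}")
```

Raising the utility shifts probability mass towards the higher trip-frequency categories while the thresholds stay fixed, which is what makes the model suitable for ordered outcomes.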
Abstract:
Neuronal morphology is a key feature in the study of brain circuits, as it is highly related to information processing and functional identification. Neuronal morphology affects the process of integration of inputs from other neurons and determines the neurons which receive the output of the neurons. Different parts of the neurons can operate semi-independently according to the spatial location of the synaptic connections. As a result, there is considerable interest in the analysis of the microanatomy of nervous cells since it constitutes an excellent tool for better understanding cortical function. However, the morphologies, molecular features and electrophysiological properties of neuronal cells are extremely variable. Except for some special cases, this variability makes it hard to find a set of features that unambiguously define a neuronal type. In addition, there are distinct types of neurons in particular regions of the brain. This morphological variability makes the analysis and modeling of neuronal morphology a challenge. Uncertainty is a key feature in many complex real-world problems. Probability theory provides a framework for modeling and reasoning with uncertainty. Probabilistic graphical models combine statistical theory and graph theory to provide a tool for managing domains with uncertainty. In particular, we focus on Bayesian networks, the most commonly used probabilistic graphical model. In this dissertation, we design new methods for learning Bayesian networks and apply them to the problem of modeling and analyzing morphological data from neurons. The morphology of a neuron can be quantified using a number of measurements, e.g., the length of the dendrites and the axon, the number of bifurcations, the direction of the dendrites and the axon, etc. These measurements can be modeled as discrete or continuous data. The continuous data can be linear (e.g., the length or the width of a dendrite) or directional (e.g., the direction of the axon). 
These data may follow complex probability distributions and may not fit any known parametric distribution. Modeling this kind of problem using hybrid Bayesian networks with discrete, linear and directional variables poses a number of challenges regarding learning from data, inference, etc. In this dissertation, we propose a method for modeling and simulating basal dendritic trees from pyramidal neurons using Bayesian networks to capture the interactions between the variables in the problem domain. A complete set of variables is measured from the dendrites, and a learning algorithm is applied to find the structure and estimate the parameters of the probability distributions included in the Bayesian networks. Then, a simulation algorithm is used to build the virtual dendrites by sampling values from the Bayesian networks, and a thorough evaluation is performed to show the model’s ability to generate realistic dendrites. In this first approach, the variables are discretized so that discrete Bayesian networks can be learned and simulated. Then, we address the problem of learning hybrid Bayesian networks with different kinds of variables. Mixtures of polynomials have been proposed as a way of representing probability densities in hybrid Bayesian networks. We present a method for learning mixture-of-polynomials approximations of one-dimensional, multidimensional and conditional probability densities from data. The method is based on basis spline interpolation, where a density is approximated as a linear combination of basis splines. The proposed algorithms are evaluated using artificial datasets. We also use the proposed methods as a non-parametric density estimation technique in Bayesian network classifiers. Next, we address the problem of including directional data in Bayesian networks. These data have some special properties that rule out the use of classical statistics.
Therefore, different distributions and statistics, such as the univariate von Mises and the multivariate von Mises–Fisher distributions, should be used to deal with this kind of information. In particular, we extend the naive Bayes classifier to the case where the conditional probability distributions of the predictive variables given the class follow either of these distributions. We consider the simple scenario, where only directional predictive variables are used, and the hybrid case, where discrete, Gaussian and directional distributions are mixed. The classifier decision functions and their decision surfaces are studied at length. Artificial examples are used to illustrate the behavior of the classifiers. The proposed classifiers are empirically evaluated over real datasets. We also study the problem of interneuron classification. An extensive group of experts is asked to classify a set of neurons according to their most prominent anatomical features. A web application is developed to retrieve the experts’ classifications. We compute agreement measures to analyze the consensus between the experts when classifying the neurons. Using Bayesian networks and clustering algorithms on the resulting data, we investigate the suitability of the anatomical terms and neuron types commonly used in the literature. Additionally, we apply supervised learning approaches to automatically classify interneurons using the values of their morphological measurements. Then, a methodology for building a model which captures the opinions of all the experts is presented. First, one Bayesian network is learned for each expert, and we propose an algorithm for clustering Bayesian networks corresponding to experts with similar behaviors. Then, a Bayesian network which represents the opinions of each group of experts is induced. Finally, a consensus Bayesian multinet which models the opinions of the whole group of experts is built. 
A thorough analysis of the consensus model identifies different behaviors between the experts when classifying the interneurons in the experiment. A set of characterizing morphological traits for the neuronal types can be defined by performing inference in the Bayesian multinet. These findings are used to validate the model and to gain some insights into neuron morphology. Finally, we study a classification problem where the true class label of the training instances is not known. Instead, a set of class labels is available for each instance. This is inspired by the neuron classification problem, where a group of experts is asked to individually provide a class label for each instance. We propose a novel approach for learning Bayesian networks using count vectors which represent the number of experts who selected each class label for each instance. These Bayesian networks are evaluated using artificial datasets from supervised learning problems. Summary Neuronal morphology is a key feature in the study of brain circuits, since it is closely related to information processing and functional roles. Neuronal morphology affects the integration of input signals and determines which neurons receive the outputs of other neurons. The different parts of a neuron can operate semi-independently depending on the spatial location of the synaptic connections. There is therefore considerable interest in analyzing the microanatomy of nerve cells, since it is an excellent tool for better understanding how the cerebral cortex works. However, the morphological, molecular and electrophysiological properties of neuronal cells are extremely variable. Except in some special cases, this morphological variability makes it difficult to define a set of features that clearly distinguish a neuronal type. 
In addition, there are different types of neurons in particular regions of the brain. Neuronal variability makes the analysis and modeling of neuronal morphology a major scientific challenge. Uncertainty is a key property of many real-world problems. Probability theory provides a framework for modeling and reasoning under uncertainty. Probabilistic graphical models combine statistical theory and graph theory to provide a tool for working under uncertainty. In particular, we focus on Bayesian networks, the most widely used probabilistic graphical model. In this thesis we have designed new methods for learning Bayesian networks, inspired by and applied to the problem of modeling and analyzing morphological data from neurons. The morphology of a neuron can be quantified using a series of measurements, for example, the length of the dendrites and the axon, the number of bifurcations, the direction of the dendrites and the axon, etc. These measurements can be modeled as continuous or discrete data. In turn, continuous data can be linear (for example, the length or width of a dendrite) or directional (for example, the direction of the axon). These data may follow very complex probability distributions and may not fit any known parametric distribution. Modeling this kind of problem with hybrid Bayesian networks that include discrete, linear and directional variables poses a number of challenges regarding learning from data, inference, etc. This thesis proposes a method for modeling and simulating basal dendritic trees of pyramidal neurons using Bayesian networks to capture the interactions between the variables of the problem. 
To that end, a large set of variables is measured from the dendrites, and a learning algorithm is applied to learn the structure and estimate the parameters of the probability distributions that make up the Bayesian networks. Next, a simulation algorithm is used to build virtual dendrites by sampling values from the Bayesian networks. Finally, a thorough evaluation is carried out to verify the model’s ability to generate realistic dendrites. In this first approach, the variables were discretized so that the Bayesian networks could be learned and sampled. Next, we address the problem of learning Bayesian networks with different types of variables. Mixtures of polynomials are a method for representing probability densities in hybrid Bayesian networks. We present a method for learning approximations of one-dimensional, multidimensional and conditional densities from data using mixtures of polynomials. The method is based on spline interpolation, which approximates a density as a linear combination of splines. The proposed algorithms are evaluated using artificial datasets. In addition, mixtures of polynomials are used as a non-parametric density estimation method for classifiers based on Bayesian networks. We then study the problem of including directional information in Bayesian networks. This type of data has a number of special characteristics that prevent the use of classical statistical techniques. Therefore, specific statistics and probability distributions, such as the univariate von Mises distribution and the multivariate von Mises–Fisher distribution, must be used to handle this type of information. 
Specifically, in this thesis we extend the naive Bayes classifier to the case where the conditional probability distributions of the predictor variables given the class follow one of these distributions. We study the base case, in which only directional variables are used, and the hybrid case, in which discrete, linear and directional variables are mixed. The classifiers are also studied from a theoretical point of view, deriving their decision functions and the associated decision surfaces. The behavior of the classifiers is illustrated using artificial datasets. In addition, the classifiers are evaluated empirically using real datasets. We also study the problem of interneuron classification. We develop a web application that allows a group of experts to classify a set of neurons according to their most prominent morphological features. Agreement measures are used to analyze the consensus among the experts when classifying the neurons. The suitability of the anatomical terms and neuronal types frequently used in the literature is investigated through the analysis of Bayesian networks and the application of clustering algorithms. In addition, supervised learning techniques are applied to automatically classify the interneurons from their morphological measurements. Next, a methodology is presented for building a model that captures the opinions of all the experts. First, a Bayesian network is generated for each expert, and an algorithm is proposed for clustering the Bayesian networks that correspond to experts with similar behaviors. Then, a Bayesian network that models the opinion of each group of experts is induced. Finally, a Bayesian multinet that models the opinions of the whole set of experts is built. 
The analysis of the consensus model makes it possible to identify different behaviors among the experts when classifying the neurons. It also makes it possible to extract a set of relevant morphological features for each neuronal type by performing inference with the Bayesian multinet. These findings are used to validate the model and provide relevant information about neuronal morphology. Finally, we study a classification problem in which the class label of the training data is uncertain. Instead, a set of labels is available for each instance. This problem is inspired by the neuron classification problem, in which a group of experts individually provides a class label for each instance. A method is proposed for learning Bayesian networks using count vectors, which represent the number of experts who select each class label for each instance. These Bayesian networks are evaluated using artificial datasets from supervised learning problems.
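The directional naive Bayes classifier mentioned in this abstract can be illustrated with a minimal sketch. It covers only the univariate von Mises case with directional predictors, and it is not the thesis's implementation: the class and function names are hypothetical, and the concentration parameter is estimated with a common closed-form approximation (an assumption) rather than exact maximum likelihood.

```python
import math

def bessel_i0(x):
    """Modified Bessel function of the first kind, order 0, by power series."""
    total, term, k = 1.0, 1.0, 0
    while term > 1e-16 * total:
        k += 1
        term *= (x / (2.0 * k)) ** 2
        total += term
    return total

def fit_von_mises(angles):
    """Estimate mean direction mu and concentration kappa from angles in radians."""
    c = sum(math.cos(a) for a in angles) / len(angles)
    s = sum(math.sin(a) for a in angles) / len(angles)
    mu = math.atan2(s, c)
    # Mean resultant length, capped so that kappa stays finite (assumption).
    r = min(math.hypot(c, s), 0.999)
    # Closed-form approximation to the maximum likelihood estimate of kappa.
    kappa = r * (2.0 - r * r) / (1.0 - r * r)
    return mu, kappa

def log_pdf(angle, mu, kappa):
    """Log-density of the von Mises distribution."""
    return kappa * math.cos(angle - mu) - math.log(2.0 * math.pi * bessel_i0(kappa))

class VonMisesNaiveBayes:
    """Hypothetical naive Bayes classifier with one von Mises per feature and class."""

    def fit(self, rows, labels):
        self.params, self.log_prior = {}, {}
        for label in set(labels):
            members = [r for r, l in zip(rows, labels) if l == label]
            self.log_prior[label] = math.log(len(members) / len(rows))
            self.params[label] = [fit_von_mises(col) for col in zip(*members)]
        return self

    def predict(self, row):
        def score(label):
            return self.log_prior[label] + sum(
                log_pdf(x, mu, kappa)
                for x, (mu, kappa) in zip(row, self.params[label]))
        return max(self.params, key=score)

# Toy data: class "a" clusters around 0 rad (wrapping past 2*pi), class "b" around pi.
rows = [[0.1], [6.2], [0.3], [6.0], [0.2], [3.0], [3.3], [2.9], [3.2], [3.1]]
labels = ["a"] * 5 + ["b"] * 5
model = VonMisesNaiveBayes().fit(rows, labels)
print(model.predict([6.1]), model.predict([3.05]))
```

The toy data show why classical statistics are ruled out for such data: the arithmetic mean of the class "a" angles is about 2.6 rad, close to class "b", whereas the circular mean direction stays near 0.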
Abstract:
This Thesis analyzes the possibilities that speech technologies currently offer for detecting clinical pathologies associated with the upper airway. The study of speech, which traditionally covers both production and the process of transforming the message and the signals involved, from the sender until reaching the receiver, offers an alternative way of studying these pathologies. The fact that the emitted signal contains not only this message but also information about the speaker has motivated the development of systems aimed at identifying and verifying speakers’ identity. This work has recently received a new impetus, turning both towards the characterization of traits that are common to several speakers and towards the differences between recordings of the same speaker. The former are especially relevant to this Thesis, since these traits could reveal the presence of characteristics related to a certain condition common to several speakers, independent of their identity. Such is the case addressed in this Thesis, where the traits identified would be related to one particular pathology directly linked to the physical system of speech production. The case of Sleep Apnea-Hypopnea Syndrome (SAHS) is paradigmatic. It is a pathology with a high prevalence worldwide, which increases with age. Patients with this pathology experience episodes of involuntary cessation of breathing during sleep, which last several seconds and recur throughout the night, preventing proper rest. In obstructive apnea, these episodes are due to the inability to keep an open path through the airway, so that the air flow is interrupted. 
At present, these patients are diagnosed through a polysomnographic study, which focuses on analyzing apnea episodes during sleep and requires the patient to stay in hospital for one night. The complexity and high cost of these procedures, together with growing waiting lists, have shown the need for rapid detection techniques which, although they might not achieve such high rates, would make it possible to reorganize waiting lists according to the severity of the pathology in each patient. Among others, diagnostic imaging systems and the anthropometric characterization of patients have revealed the existence of anatomical patterns that would directly influence speech. Studies of how SAHS affects speech have been scarce, and some of them even contradictory. Nevertheless, the existence of specific patterns related to articulation, phonation and resonance has been known since the late 1980s. Their description was difficult to exploit in an automatic recognition system, but it pointed to the existence of a link between voice and SAHS. In recent years, automatic processing techniques have enabled the development of automatic systems that can already identify significant differences in the speech of SAHS patients, and that distinguish them from healthy speakers. By contrast, little is known about the connection between these new results, those obtained in the past, and the pathogenesis of SAHS. 
This Thesis continues the work carried out in this field by specifically considering: the study of how SAHS affects patients’ speech, the improvement of automatic classification rates, and the combination of the information obtained with the predictors used by clinical specialists in their preliminary assessments. The first two tasks pose symbiotic but different problems. While studying the connection between SAHS and speech requires bounded models that can be easily interpreted, recognition systems rely on a large number of dimensions for the characterization and subsequent identification of patterns. Thus, the first task should allow us to make progress on the second, as should the incorporation of the predictors used by clinical specialists. The Thesis studies both continuous and sustained speech, in order to exploit the synergies and differences between them. The analysis of continuous speech took as its starting point a scheme that had already been evaluated previously, on which the evaluation and optimization of the speech representation was addressed, as well as the characterization of the specific patterns associated with SAHS. This revealed the connection between SAHS and the fundamental elements of the speech signal: the formants. The results obtained show that the success of these systems is due, fundamentally, to the ability of these representations to describe those components while leaving out the noisy dimensions or those with little discriminative power. The resulting scheme offers an error rate below 18%, using classifiers notably less complex than those described in the state of the art and a single short voice recording. 
Regarding the connection between SAHS and the observed patterns, it was necessary to consider inter- and intra-group differences, focusing on the speaker’s characteristic articulation and replacing the complex classification models with the study of spectral averages. The result points clearly to certain regions of the frequency axis, suggesting the existence of a systematic narrowing of the tract section in the oropharynx region, already anticipated in the pathogenesis of this syndrome. As for sustained speech, the studies carried out on continuous speech were reproduced on recordings of the sustained vowel /a/. The results are qualitatively analogous to the previous ones, although in this case the classification rates turn out to be lower. In order to understand this result, the study of spectral averages and of inter- and intra-group variability was reproduced. Both studies showed important differences from the previous ones that could explain these results. However, sustained speech offers other opportunities by establishing a controlled environment for studying phonation, which had also been identified as a source of information for detecting SAHS. Its study showed that, in the available dataset, there are no variations that could easily be associated with phonation. Only those dimensions that describe the distribution of energy along the frequency axis showed significant differences, pointing, once again, towards spectral resonances. Having analyzed the previous results, the Thesis addresses the fusion of both sources of information into a single classification system. This makes it possible to improve classification rates, under the hypothesis that the information present in continuous speech and in sustained speech is fundamentally different. 
This task was carried out through a simple fusion scheme that obtained 88.6% correct classification (an error rate of 11.4%), which represents a significant improvement over the state of the art. Finally, combining this classifier with the predictors used by clinical specialists gave a rate of 91.3% (an error rate of 8.7%), which is within the range offered by more costly and intrusive schemes that, unlike the proposed one, cannot be used in the preliminary assessment of patients. All in all, the Thesis offers a clear view of the relationship between SAHS and speech, showing the degree of maturity reached by speech technology in the characterization and detection of SAHS, demonstrating that its use for patient assessment would already be possible, and leaving the door open to future research that continues the work begun here. ABSTRACT This Thesis explores the potential of speech technologies for the detection of clinical disorders connected to the upper airway. The study of speech traditionally covers both the production process and post processing of the signals involved, from the speaker up to the listener, offering an alternative path to study these pathologies. The fact that utterances embed not just the encoded message but also information about the speaker has motivated the development of automatic systems oriented to the identification and verification of the speaker’s identity. These have recently been boosted and reoriented either towards the characterization of traits that are common to several speakers, or to the differences between records of the same speaker collected under different conditions. The first are particularly relevant to this Thesis as these patterns could reveal the presence of features that are related to a common condition shared among different speakers, regardless of their identity. 
Such is the case faced in this Thesis, where the traits identified would relate to a particular pathology, directly connected to the speech production system. The Obstructive Sleep Apnea syndrome (OSA) is a paradigmatic case for analysis. It is a disorder with high prevalence among adults, affecting a larger number of them as they grow older. Patients suffering from this disorder experience episodes of involuntary cessation of breath during sleep that may last a few seconds and recur throughout the night, preventing proper rest. In the case of obstructive apnea, these episodes are related to the collapse of the pharynx, which interrupts the air flow. Currently, OSA diagnosis is done through a polysomnographic study, which focuses on the analysis of apnea episodes during sleep, requiring the patient to stay at the hospital for the whole night. The complexity and high cost of the procedures involved, combined with the waiting lists, have evidenced the need for screening techniques, which perhaps would not achieve outstanding performance rates but would allow clinicians to reorganize these lists, ranking patients according to the severity of their condition. Among others, imaging diagnosis and anthropometric characterization of patients have evidenced the existence of anatomical patterns related to OSA that have direct influence on speech. Contributions devoted to the study of how this disorder affects speech are scarce and somewhat contradictory. However, the existence of specific patterns related to articulation, phonation and resonance has been known since the late 1980s. At that time these descriptions were virtually useless for the development of an automatic system, but they pointed to the existence of a link between speech and OSA. In recent years automatic processing techniques have evolved and are now able to identify significant differences in the speech of OSA patients when compared to records from healthy subjects. 
Nevertheless, little is known about the connection between these new results, those published in the past, and the pathogenesis of the OSA syndrome. This Thesis aims to progress beyond the previous research done in this area by addressing: the study of how OSA affects patients’ speech, the enhancement of automatic OSA classification based on speech analysis, and its integration with the information embedded in the predictors generally used by clinicians in preliminary patient examinations. The first two tasks, though they may appear symbiotic at first, are quite different. While studying the connection between speech and OSA requires simple narrow models that can be easily interpreted, classification requires larger models including a large number of dimensions for the characterization and subsequent identification of the observed patterns. In any case, it is clear that any progress made in the first task should allow us to improve our performance on the second one, and that the incorporation of the predictors used by clinicians should contribute in this same direction. The Thesis considers both continuous and sustained speech analysis, to exploit the synergies and differences between them. For continuous speech analysis, a conventional speech processing scheme, designed and evaluated before this Thesis, was taken as a baseline. On top of this initial system several alternative representations of the speech information were proposed, optimized and tested to select those more suitable for the characterization of OSA-specific patterns. Evidence was found of a connection between OSA and the fundamental constituents of speech: the formants. Experimental results showed that the success of the proposed solution is well explained by the ability of speech representations to describe these specific OSA-related components, ignoring the noisy ones as well as those presenting low discrimination capabilities. 
The resulting scheme obtained an error rate below 18%, with a classification scheme significantly less complex than those described in the literature and operating on a single short speech record. Regarding the connection between OSA and the observed patterns, it was necessary to consider inter- and intra-group differences for this analysis, and to focus on the articulation, replacing the complex classification models by the long-term average spectra. Results clearly point to certain regions on the frequency axis, suggesting the existence of a systematic narrowing in the vocal tract section at the oropharynx. This was already described in the pathogenesis of this syndrome. Regarding sustained speech, experiments similar to those conducted on continuous speech were reproduced on sustained phonations of the vowel /a/. Results were qualitatively similar to the previous ones, though in this case performance rates were found to be noticeably lower. To derive further knowledge from this result, experiments on the long-term average spectra and intra- and inter-group variability ratios were also reproduced on sustained speech records. Results of both experiments showed significant differences from the previous ones obtained from continuous speech, which could explain the differences observed in performance. However, sustained speech also provided the opportunity to study phonation within the controlled framework it provides. Phonation was also identified in the literature as a source of information for the detection of OSA. In this study it was found that, for the available dataset, no systematic differences related to phonation could be found between the two groups of speakers. Only those dimensions which relate to energy distribution along the frequency axis provided significant differences, pointing once again towards resonant components. 
Once classification schemes on both continuous and sustained speech were developed, the Thesis addressed their combination into a single classification system. Under the assumption that the information in continuous and sustained speech is fundamentally different, it should be possible to successfully merge the two of them. This was tested through a simple fusion scheme which obtained 88.6% correct classification (11.4% error rate), which represents a significant improvement over the state of the art. Finally, the combination of this classifier with the variables used by clinicians obtained 91.3% accuracy (8.7% error rate). This is within the range of alternative, but costly and intrusive schemes, which unlike the one proposed cannot be used in the preliminary assessment of patients’ condition. In the end, this Thesis has shed new light on the underlying connection between OSA and speech, and evidenced the degree of maturity reached by speech technology in OSA characterization and detection, leaving the door open for future research which shall continue in the multiple directions that have been pointed out and left as future work.
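The abstract does not say how its "simple fusion scheme" works. Purely as an illustration of what score-level fusion of two classifiers can look like, the sketch below takes a weighted sum of two scores and thresholds it. The function names, the weight and the threshold are all hypothetical assumptions, not the method used in the Thesis.

```python
def fuse_scores(score_continuous, score_sustained, weight=0.5):
    """Weighted sum of two classifier scores (assumed to be on comparable scales)."""
    return weight * score_continuous + (1.0 - weight) * score_sustained

def is_positive(score_continuous, score_sustained, weight=0.5, threshold=0.0):
    """Flag a speaker as positive when the fused score exceeds the threshold."""
    return fuse_scores(score_continuous, score_sustained, weight) > threshold

def error_rate(predictions, truths):
    """Fraction of predictions that differ from the reference labels."""
    wrong = sum(1 for p, t in zip(predictions, truths) if p != t)
    return wrong / len(truths)

# Hypothetical scores: the continuous-speech system is confident, the sustained one is not.
print(fuse_scores(2.0, -1.0))   # 0.5
print(is_positive(2.0, -1.0))   # True
```

In practice the weight and threshold would be tuned on held-out data, and the scores from the two systems would usually be normalized before being combined.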
Abstract:
In many problems of Elasticity, defining the optimal geometry of the boundary according to a given objective function is an issue of great interest. Two typical examples are finding the shape of a hole in the middle of a plate subjected to an arbitrary loading such that the stresses along the hole minimize some functional, and finding the optimal middle curve of a concrete vault for a tunnel along which a uniform minimum compression is obtained. In these two examples the objective functional depends on the geometry of the boundary, which can be either a curve (in 2D problems) or a surface (in 3D problems). Typically, optimization is achieved by means of an iterative process that requires computing the gradients of the objective function with respect to the design variables. Gradients can be computed in a variety of ways, although adjoint methods, either continuous or discrete, are the most efficient ones when they are applied in different technical branches. In this paper the continuous adjoint method is introduced in a systematic way for this type of problem, and a simple illustrative example, namely finding the optimal shape of a tunnel vault immersed in a linearly elastic terrain, is presented.
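To make the idea of adjoint gradients concrete, here is a minimal sketch of the discrete adjoint method on a toy problem. It is not the continuous formulation introduced in the paper, and the problem and every name in it are assumptions chosen for illustration. Two springs in series with stiffnesses p1 and p2 carry an end load, the equilibrium equations are K(p) u = f, and the objective J is the end displacement. One extra linear solve, K^T lambda = dJ/du, yields the gradient with respect to both design variables through dJ/dp_i = -lambda^T (dK/dp_i) u.

```python
def solve2(a, b):
    """Solve the 2x2 linear system a x = b by Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [(b[0] * a[1][1] - a[0][1] * b[1]) / det,
            (a[0][0] * b[1] - b[0] * a[1][0]) / det]

def stiffness(p):
    """Stiffness matrix of two springs in series, fixed at one end."""
    p1, p2 = p
    return [[p1 + p2, -p2], [-p2, p2]]

def objective(p, load):
    """End displacement under a load applied at the free end."""
    return solve2(stiffness(p), [0.0, load])[1]

def adjoint_gradient(p, load):
    """Gradient of the end displacement with respect to (p1, p2)."""
    k = stiffness(p)
    u = solve2(k, [0.0, load])                    # state solve
    k_t = [[k[0][0], k[1][0]], [k[0][1], k[1][1]]]
    lam = solve2(k_t, [0.0, 1.0])                 # adjoint solve, rhs = dJ/du
    dk_dp = ([[1.0, 0.0], [0.0, 0.0]],            # dK/dp1
             [[1.0, -1.0], [-1.0, 1.0]])          # dK/dp2
    grad = []
    for dk in dk_dp:
        dk_u = [dk[0][0] * u[0] + dk[0][1] * u[1],
                dk[1][0] * u[0] + dk[1][1] * u[1]]
        grad.append(-(lam[0] * dk_u[0] + lam[1] * dk_u[1]))
    return grad

print(adjoint_gradient((2.0, 4.0), 8.0))  # analytic values are -F/p1**2 and -F/p2**2
```

Since the end displacement is F/p1 + F/p2, the exact gradient is (-F/p1^2, -F/p2^2), which the adjoint result reproduces. A finite-difference gradient would need one extra state solve per design variable, which is where the adjoint approach saves work as the number of variables grows.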
Abstract:
La investigación de esta tesis se centra en el estudio de técnicas geoestadísticas y su contribución a una mayor caracterización del binomio factores climáticos-rendimiento de un cultivo agrícola. El inexorable vínculo entre la variabilidad climática y la producción agrícola cobra especial relevancia en estudios sobre el cambio climático o en la modelización de cultivos para dar respuesta a escenarios futuros de producción mundial. Es información especialmente valiosa en sistemas operacionales de monitoreo y predicción de rendimientos de cultivos Los cuales son actualmente uno de los pilares operacionales en los que se sustenta la agricultura y seguridad alimentaria mundial; ya que su objetivo final es el de proporcionar información imparcial y fiable para la regularización de mercados. Es en este contexto, donde se quiso dar un enfoque alternativo a estudios, que con distintos planteamientos, analizan la relación inter-anual clima vs producción. Así, se sustituyó la dimensión tiempo por la espacio, re-orientando el análisis estadístico de correlación interanual entre rendimiento y factores climáticos, por el estudio de la correlación inter-regional entre ambas variables. Se utilizó para ello una técnica estadística relativamente nueva y no muy aplicada en investigaciones similares, llamada regresión ponderada geográficamente (GWR, siglas en inglés de “Geographically weighted regression”). Se obtuvieron superficies continuas de las variables climáticas acumuladas en determinados periodos fenológicos, que fueron seleccionados por ser factores clave en el desarrollo vegetativo de un cultivo. Por ello, la primera parte de la tesis, consistió en un análisis exploratorio sobre comparación de Métodos de Interpolación Espacial (MIE). 
Partiendo de la hipótesis de que existe la variabilidad espacial de la relación entre factores climáticos y rendimiento, el objetivo principal de esta tesis, fue el de establecer en qué medida los MIE y otros métodos geoestadísticos de regresión local, pueden ayudar por un lado, a alcanzar un mayor entendimiento del binomio clima-rendimiento del trigo blando (Triticum aestivum L.) al incorporar en dicha relación el componente espacial; y por otro, a caracterizar la variación de los principales factores climáticos limitantes en el crecimiento del trigo blando, acumulados éstos en cuatro periodos fenológicos. Para lleva a cabo esto, una gran carga operacional en la investigación de la tesis consistió en homogeneizar y hacer los datos fenológicos, climáticos y estadísticas agrícolas comparables tanto a escala espacial como a escala temporal. Para España y los Bálticos se recolectaron y calcularon datos diarios de precipitación, temperatura máxima y mínima, evapotranspiración y radiación solar en las estaciones meteorológicas disponibles. Se dispuso de una serie temporal que coincidía con los mismos años recolectados en las estadísticas agrícolas, es decir, 14 años contados desde 2000 a 2013 (hasta 2011 en los Bálticos). Se superpuso la malla de información fenológica de cuadrícula 25 km con la ubicación de las estaciones meteorológicas con el fin de conocer los valores fenológicos en cada una de las estaciones disponibles. Hecho esto, para cada año de la serie temporal disponible se calcularon los valores climáticos diarios acumulados en cada uno de los cuatro periodos fenológicos seleccionados P1 (ciclo completo), P2 (emergencia-madurez), P3 (floración) y P4 (floraciónmadurez). Se calculó la superficie interpolada por el conjunto de métodos seleccionados en la comparación: técnicas deterministas convencionales, kriging ordinario y cokriging ordinario ponderado por la altitud. 
Seleccionados los métodos más eficaces, se calculó a nivel de provincias las variables climatológicas interpoladas. Y se realizaron las regresiones locales GWR para cuantificar, explorar y modelar las relaciones espaciales entre el rendimiento del trigo y las variables climáticas acumuladas en los cuatro periodos fenológicos. Al comparar la eficiencia de los MIE no destaca una técnica por encima del resto como la que proporcione el menor error en su predicción. Ahora bien, considerando los tres indicadores de calidad de los MIE estudiados se han identificado los métodos más efectivos. En el caso de la precipitación, es la técnica geoestadística cokriging la más idónea en la mayoría de los casos. De manera unánime, la interpolación determinista en función radial (spline regularizado) fue la técnica que mejor describía la superficie de precipitación acumulada en los cuatro periodos fenológicos. Los resultados son más heterogéneos para la evapotranspiración y radiación. Los métodos idóneos para estas se reparten entre el Inverse Distance Weighting (IDW), IDW ponderado por la altitud y el Ordinary Kriging (OK). También, se identificó que para la mayoría de los casos en que el error del Ordinary CoKriging (COK) era mayor que el del OK su eficacia es comparable a la del OK en términos de error y el requerimiento computacional de este último es mucho menor. Se pudo confirmar que existe la variabilidad espacial inter-regional entre factores climáticos y el rendimiento del trigo blando tanto en España como en los Bálticos. La herramienta estadística GWR fue capaz de reproducir esta variabilidad con un rendimiento lo suficientemente significativo como para considerarla una herramienta válida en futuros estudios. No obstante, se identificaron ciertas limitaciones en la misma respecto a la información que devuelve el programa a nivel local y que no permite desgranar todo el detalle sobre la ejecución del mismo. 
The indicators and phenological periods that best reproduced the spatial variability of yield in Spain and the Baltic countries lent further credibility to the results and to the effectiveness of GWR, since they were in line with agronomic knowledge of soft wheat cultivation in Mediterranean and northern European farming systems. Thus, in Spain, the most robust indicator was the Climatic Water Balance accumulated over the growth period (between emergence and maturity), although the key stage of flowering was identified as the period in which the accumulated climate variables gave the GWR model its greatest explanatory power. In the Baltic countries, however, where the main limiting factor for agriculture is the low number of effective growing days, the most effective indicator was radiation accumulated over the whole growth cycle (between emergence and maturity). For irrigated wheat, no combination could explain more than 30% of the yield variation in Spain. Demonstrating that the inter-regional relationship between yield and the main climate variables behaves heterogeneously could contribute to one of the greatest challenges facing operational crop yield monitoring and forecasting systems today: reducing the spatial scale of prediction from the national to the regional level. ABSTRACT This thesis explores geostatistical techniques and their contribution to a better characterization of the relationship between climate factors and agricultural crop yields. The crucial link between climate variability and crop production plays a key role in climate change research as well as in crop modelling for future global production scenarios. 
This information is particularly important for operational crop monitoring and forecasting systems. These systems are currently among the most fundamental operational tools on which global agriculture and food security rely, with the ultimate aim of providing neutral and reliable information for food market regulation, thus avoiding financial speculation on staple foods. Within this context, the present thesis aims to provide an alternative approach to the existing body of research examining the relationship between inter-annual climate and production. To that end, the temporal dimension was replaced by the spatial dimension, re-orienting the statistical analysis from the inter-annual relationship between crop yields and climate factors to an inter-regional correlation between these two variables. Geographically weighted regression (GWR), a relatively new statistical technique that has rarely been used in previous research on this topic, was used in the current study. Continuous surfaces of the climate variables accumulated in specific phenological periods were obtained. These periods were selected because they are key stages in the vegetative development of the crop. Accordingly, the first part of this thesis presents an exploratory analysis comparing diverse spatial interpolation methods (SIMs) and alternative geostatistical methodologies. Given the premise that the relationship between climate factors and crop production varies spatially, the primary aim of this thesis was to examine the extent to which the SIMs and other geostatistical methods of local regression (which are integrated tools of GIS software) are useful in relating crop production and climate variables. The usefulness of these methods was examined in two ways: on the one hand, how they could help to achieve a better understanding of the climate-yield relationship of soft wheat (Triticum aestivum L.) 
by incorporating the spatial component in the examination of the above-mentioned relationship. On the other hand, how they help to characterize the key climate factors limiting soft wheat growth, which were analysed in four phenological periods. To achieve this aim, a large part of the operational workload of this thesis consisted of homogenizing the phenological data, climate data and agricultural statistics and making them comparable at both spatial and temporal scales. For Spain and the Baltic countries, daily data on precipitation, maximum and minimum temperature, evapotranspiration and solar radiation from the available meteorological stations were gathered and calculated. The resulting time series covered the same years as the agricultural statistics, namely 14 years from 2000 to 2013 (until 2011 for the Baltic countries). The 25 km phenological grid was overlaid on the locations of the meteorological stations in order to obtain the phenological values at each of the available stations. Following this procedure, the daily accumulated climate values for each of the four selected phenological periods were calculated for each year, namely P1 (complete cycle), P2 (emergence-maturity), P3 (flowering) and P4 (flowering-maturity). The interpolated surface was then calculated using the set of methodologies selected for the comparison: conventional deterministic techniques, ordinary kriging and ordinary cokriging weighted by altitude. Once the most effective methods had been selected, the interpolated climate variables were calculated at province level. Local GWR regressions were run to quantify, examine and model the spatial relationships between soft wheat yield and the climate variables accumulated in each of the four selected phenological periods. 
The comparison among the SIMs revealed that no particular technique stood out in terms of prediction accuracy. However, when the three quality indicators of the compared SIMs were considered, some methodologies proved more effective than others. Regarding precipitation, cokriging was the most suitable geostatistical technique in the majority of cases. Deterministic radial basis function interpolation (regularized spline) was the most accurate technique for describing the accumulated precipitation surface in all four phenological periods. Results were more heterogeneous for evapotranspiration and radiation. The most appropriate techniques for these variables were Inverse Distance Weighting (IDW), altitude-weighted IDW and Ordinary Kriging (OK). Furthermore, it was found that in the majority of cases where the Ordinary CoKriging (COK) error was larger than that of OK, its efficacy was comparable to that of OK in terms of error, while the computational demands of the latter were much lower. The existence of spatial inter-regional variability between climate factors and soft wheat yield was confirmed for both Spain and the Baltic countries. The GWR statistical tool reproduced this variability well enough to be considered a valid tool for future studies. Nevertheless, this tool also had some limitations regarding the information the programme returns at the local level, which did not allow for a detailed breakdown of its procedure. The indicators and phenological periods that best reproduced the spatial variability of yields in Spain and the Baltic countries lent further credibility to the results and to the effectiveness of the GWR statistical tool, because they were consistent with agronomic knowledge about soft wheat cultivation under Mediterranean and northern European agricultural systems. 
Thus, for Spain, the most robust indicator was the Climatic Water Balance accumulated throughout the growing period (between emergence and maturity), although flowering was identified as the key stage in which the accumulated climate variables gave the GWR model its greatest explanatory power. For the Baltic countries, where the main limiting agricultural factor is the low number of days of effective growth, the most effective indicator was the radiation accumulated throughout the entire growing cycle (between emergence and maturity). For irrigated soft wheat, no combination was capable of explaining more than 30% of the yield variation in Spain. Demonstrating that the inter-regional relationship between crop yield and key climate variables is heterogeneous within a country could contribute to one of the greatest challenges that operational crop monitoring and forecasting systems face nowadays. The present findings suggest that the solution may lie in downscaling the spatial target scale from a national to a regional level.
Resumo:
In recent years, several explanatory models have been developed which attempt to analyse the predictive worth of various factors in relation to academic achievement, as well as the direct and indirect effects that they produce. The aim of this study was to examine a structural model incorporating various cognitive and motivational variables which influence student achievement in the two basic core skills in the Spanish curriculum: Spanish Language and Mathematics. These variables included differential aptitudes, specific self-concept, goal orientations, effort and learning strategies. The sample comprised 341 Spanish students in their first year of Compulsory Secondary Education. Various tests and questionnaires were used to assess each student, and Structural Equation Modelling (SEM) was employed to study the relationships in the initial model. The proposed model obtained a satisfactory fit for the two subjects studied, and all the relationships hypothesised were significant. The variable with the most explanatory power regarding academic achievement was mathematical and verbal aptitude. Also notable was the direct influence of specific self-concept on achievement, goal-orientation and effort, as was the mediatory effect that effort and learning strategies had between academic goals and final achievement.
Resumo:
We demonstrate that the process of generating smooth transitions can be viewed as a natural result of the filtering operations implied in the generation of discrete-time series observations from the sampling of data from an underlying continuous time process that has undergone a process of structural change. In order to focus discussion, we utilize the problem of estimating the location of abrupt shifts in some simple time series models. This approach will permit us to address salient issues relating to distortions induced by the inherent aggregation associated with discrete-time sampling of continuous time processes experiencing structural change. We also address the issue of how time irreversible structures may be generated within the smooth transition processes. (c) 2005 Elsevier Inc. All rights reserved.