936 resultados para Mixed Type Variables Clustering
Resumo:
This paper investigates the effects on open-seat races in the United States House of Representatives. This project focuses on the influence that the House leadership exerts on races. Generally, the leadership influences race through spending by party organizations and leadership visits. During each election cycle, national party organizations spend millions of dollars to get their candidates into office. I have developed a multiple regression model that measures different types of spending from the Democratic Congressional Campaign Committee, the National Republican Congressional Committee, and the Republican National Committee and the effects of these spending types on the election results. Also, the study examines the number of visits by each party’s leadership to each race. I introduced control variables that account for the year, the competitiveness of each race, and the individual candidate fundraising. In terms of statistical significance, the results were mixed showing one type of party spending to be highly influential in the outcome of the race. Competitiveness and individual candidate fundraising also achieved statistical significance. The study also includes a qualitative investigation of leadership visits and individual case studies in order to understand better the way in which the data interact in real campaigns.
Resumo:
Purpose of the study. This study had two components. The first component of the study was the development and implementation of an infrastructure that integrated Promotores who teach diabetes self-management into a community clinic. The second component was a six-month randomized clinical trial (RCT) designed to test the effectiveness of the Promotores in changing knowledge, beliefs, and HbA1c levels among Mexican American patients with type 2 diabetes. ^ Methods. Starfield's adaptation of the Donbedian structure, process, and outcome methodology was used to develop a clinic infrastructure that allowed the integration of Promotores as diabetes educators. The RCT of the culturally sensitive Promotores-led 10-week diabetes self-management program compared the outcomes of 63 patients in the intervention group with 68 patients in a wait-list, usual care control group. Participants were Mexican Americans, at least 18 years of age, with type 2 diabetes, who were patients at a Federally Qualified Health Center on the Texas-Mexico border. At baseline, three months, and six months, data were collected using the Diabetes Knowledge Questionnaire (DKQ, the Health Beliefs Questionnaire (HBQ, and HbA1c levels were drawn by the clinic laboratory. A mixed model methodology was used to analyze the data. ^ Results. The infrastructure to support a Promotores-led diabetes self-management course designed in concert with administration, the physicians, and the CDE, resulted in (1) employment of Promotores to teach diabetes self-management courses; (2) integration of provider and nurse oversight of course design and implementation; (3) management of Promotora training, and the development of teaching competencies and skills; (4) coordination of care through communication and documentation policies and procedures; (5) utilization of quality control mechanisms to maintain patient safety; and (6) promotion of a culturally competent approach to the educational process. The RCT resulted in a significant improvement in the intervention group's DKQ scores over time (F [1, 129] = 4.77, p = 0.0308), and in treatment by time (F [2, 168] = 5.85, p = 0.0035). Neither the HBQ scores nor the HbA1c changed over time. However, the baseline HbA1c was 7.49, almost at the therapeutic level. The DKQ, HBQ, and HbA1c results were significantly affected by age; the DKQ and HbA1c by years with diabetes. ^ Conclusions. The clinic model provides a systematic approach to safely address the educational needs of large numbers of patients with type 2 diabetes who live in communities that suffer from a lack of health care professionals. The Promotores-led diabetes self-management course improved the knowledge of patients with diabetes and may be a culturally sensitive strategy for meeting patient educational needs. The low baseline HbA1c levels in this border community suggested that patients in this Federally Qualified Health Center on the Texas-Mexico border were experiencing good medical management of their diabetes. ^
Resumo:
Objective. The purpose of the study is to provide a holistic depiction of behavioral & environmental factors contributing to risky sexual behaviors among predominantly high school educated, low-income African Americans residing in urban areas of Houston, TX utilizing the Theory of Gender and Power, Situational/Environmental Variables Theory, and Sexual Script Theory. ^ Methods. A cross-sectional study was conducted via questionnaires among 215 Houston area residents, 149 were women and 66 were male. Measures used to assess behaviors of the population included a history of homelessness, use of crack/cocaine among several other illicit drugs, the type of sexual partner, age of participant, age of most recent sex partner, whether or not participants sought health care in the last 12 months, knowledge of partner's other sexual activities, symptoms of depression, and places where partner's were met. In an effort to determine risk of sexual encounters, a risk index employing the variables used to assess condom use was created categorizing sexual encounters as unsafe or safe. ^ Results. Variables meeting the significance level of p<.15 for the bivariate analysis of each theory were entered into a binary logistic regression analysis. The block for each theory was significant, suggesting that the grouping assignments of each variable by theory were significantly associated with unsafe sexual behaviors. Within the regression analysis, variables such as sex for drugs/money, low income, and crack use demonstrated an effect size of ≥±1, indicating that these variables had a significant effect on unsafe sexual behavioral practices. ^ Conclusions. Variables assessing behavior and environment demonstrated a significant effect when categorized by relation to designated theories. ^
Resumo:
Diabetic nephropathy is the most common cause of end-stage renal disease (ESRD) in the United States. African-Americans and patients with type 1 diabetes (T1D) are at increased risk. We studied the rate and factors that influenced progression of glomerular filtration rate (GFR) in 401 African-American T1D patients who were followed for 6 years through the observational cohort New Jersey 725 study. Patients with ESRD and/or GFR<20 ml/min were excluded. The mean (SD) baseline GFR was 106.8 (27.04) ml/min and it decreased by 13.8 (mean, SD 32.2) ml/min during the 6-year period (2.3 ml/min/year). In patients with baseline macroproteinuria, GFR decreased by 31.8 (39.0) ml/min (5.3 ml/min/year) compared to 8.2 (mean, SD 27.6) ml/min (1.3 ml/min/year) in patients without it (p<0.00001). Six-year GFR fell to <20 ml/min in 5.25% of all patients, but in 16.8% of macroproteinuric patients.^ A model including baseline GFR, proteinuria category and hypertension category, explained 35% of the 6-year GFR variability (p<0.0001). After adjustment for other variables in the model, 6-year GFR was 24.9 ml/min lower in macroproteinuric patients than in those without proteinuria (p=0.0001), and 12.6 ml/min lower in patients with treated but uncontrolled hypertension compared to normotensive patients (p=0.003). In this sample of patients, with an elevated mean glycosylated hemoglobin of 12.4%, glycemic control did not independently influence GFR deterioration, nor did BMI, cholesterol, gender, age at diabetes onset or socioeconomic level.^ Taken together, our findings suggest that proteinuria and hypertension are the most important factors associated with GFR deterioration in African-American T1D patients.^
Resumo:
Detection of multidrug-resistant tuberculosis (MDR-TB), a frequent cause of treatment failure, takes 2 or more weeks to identify by culture. RIF-resistance is a hallmark of MDR-TB, and detection of mutations in the rpoB gene of Mycobacterium tuberculosis using molecular beacon probes with real-time quantitative polymerase chain reaction (qPCR) is a novel approach that takes ≤2 days. However, qPCR identification of resistant isolates, particularly for isolates with mixed RIF-susceptible and RIF-resistant bacteria, is reader dependent and limits its clinical use. The aim of this study was to develop an objective, reader-independent method to define rpoB mutants using beacon qPCR. This would facilitate the transition from a research protocol to the clinical setting, where high-throughput methods with objective interpretation are required. For this, DNAs from 107 M. tuberculosis clinical isolates with known susceptibility to RIF by culture-based methods were obtained from 2 regions where isolates have not previously been subjected to evaluation using molecular beacon qPCR: the Texas–Mexico border and Colombia. Using coded DNA specimens, mutations within an 81-bp hot spot region of rpoB were established by qPCR with 5 beacons spanning this region. Visual and mathematical approaches were used to establish whether the qPCR cycle threshold of the experimental isolate was significantly higher (mutant) compared to a reference wild-type isolate. Visual classification of the beacon qPCR required reader training for strains with a mixture of RIF-susceptible and RIF-resistant bacteria. Only then had the visual interpretation by an experienced reader had 100% sensitivity and 94.6% specificity versus RIF-resistance by culture phenotype and 98.1% sensitivity and 100% specificity versus mutations based on DNA sequence. The mathematical approach was 98% sensitive and 94.5% specific versus culture and 96.2% sensitive and 100% specific versus DNA sequence. Our findings indicate the mathematical approach has advantages over the visual reading, in that it uses a Microsoft Excel template to eliminate reader bias or inexperience, and allows objective interpretation from high-throughput analyses even in the presence of a mixture of RIF-resistant and RIF-susceptible isolates without the need for reader training.^
Resumo:
Cross-sectional designs, longitudinal designs in which a single cohort is followed over time, and mixed-longitudinal designs in which several cohorts are followed for a shorter period are compared by their precision, potential for bias due to age, time and cohort effects, and feasibility. Mixed longitudinal studies have two advantages over longitudinal studies: isolation of time and age effects and shorter completion time. Though the advantages of mixed-longitudinal studies are clear, choosing an optimal design is difficult, especially given the number of possible combinations of the number of cohorts and number of overlapping intervals between cohorts. The purpose of this paper is to determine the optimal design for detecting differences in group growth rates.^ The type of mixed-longitudinal study appropriate for modeling both individual and group growth rates is called a "multiple-longitudinal" design. A multiple-longitudinal study typically requires uniform or simultaneous entry of subjects, who are each observed till the end of the study.^ While recommendations for designing pure-longitudinal studies have been made by Schlesselman (1973b), Lefant (1990) and Helms (1991), design recommendations for multiple-longitudinal studies have never been published. It is shown that by using power analyses to determine the minimum number of occasions per cohort and minimum number of overlapping occasions between cohorts, in conjunction with a cost model, an optimal multiple-longitudinal design can be determined. An example of systolic blood pressure values for cohorts of males and cohorts of females, ages 8 to 18 years, is given. ^
Resumo:
The performance of the Hosmer-Lemeshow global goodness-of-fit statistic for logistic regression models was explored in a wide variety of conditions not previously fully investigated. Computer simulations, each consisting of 500 regression models, were run to assess the statistic in 23 different situations. The items which varied among the situations included the number of observations used in each regression, the number of covariates, the degree of dependence among the covariates, the combinations of continuous and discrete variables, and the generation of the values of the dependent variable for model fit or lack of fit.^ The study found that the $\rm\ C$g* statistic was adequate in tests of significance for most situations. However, when testing data which deviate from a logistic model, the statistic has low power to detect such deviation. Although grouping of the estimated probabilities into quantiles from 8 to 30 was studied, the deciles of risk approach was generally sufficient. Subdividing the estimated probabilities into more than 10 quantiles when there are many covariates in the model is not necessary, despite theoretical reasons which suggest otherwise. Because it does not follow a X$\sp2$ distribution, the statistic is not recommended for use in models containing only categorical variables with a limited number of covariate patterns.^ The statistic performed adequately when there were at least 10 observations per quantile. Large numbers of observations per quantile did not lead to incorrect conclusions that the model did not fit the data when it actually did. However, the statistic failed to detect lack of fit when it existed and should be supplemented with further tests for the influence of individual observations. Careful examination of the parameter estimates is also essential since the statistic did not perform as desired when there was moderate to severe collinearity among covariates.^ Two methods studied for handling tied values of the estimated probabilities made only a slight difference in conclusions about model fit. Neither method split observations with identical probabilities into different quantiles. Approaches which create equal size groups by separating ties should be avoided. ^
Resumo:
Las hormonas vegetales son capaces de controlar el desarrollo reproductivo, desde la diferenciación floral hasta los últimos estadios del desarrollo de los frutos. En particular, la etapa de fructificación y desarrollo depende del contenido endógeno de estas sustancias, y es posible manipular la iniciación del desarrollo del fruto por aplicación externa de hormonas. Previamente se evaluó el proceso de fructificación y desarrollo en el cultivo de tomate en invernadero en respuesta a la aplicación de b-NOA y AG3 en dosis fijas: se observó sensibilidad diferencial dependiendo del genotipo y tipo de regulador. El objetivo de este trabajo fue establecer dosis y momento óptimo para la aplicación de b-NOA y AG3 como formas de mejorar la fructificación y el desarrollo de frutos partenocárpicos. Como factores se consideraron tipo de regulador -b-NOA y AG3- en dosis y momentos de aplicación variables. Empleando ovarios no polinizados como sistema experimental fue posible concluir que la aplicación de 40 ppm de b-NOA a 7 días post antesis ofrece las mayores ventajas desde el punto de vista del rendimiento y menor impacto fisiológico, sin alterar el período de desarrollo de los frutos.
Resumo:
Measures of agro-ecosystems genetic variability are essential to sustain scientific-based actions and policies tending to protect the ecosystem services they provide. To build the genetic variability datum it is necessary to deal with a large number and different types of variables. Molecular marker data is highly dimensional by nature, and frequently additional types of information are obtained, as morphological and physiological traits. This way, genetic variability studies are usually associated with the measurement of several traits on each entity. Multivariate methods are aimed at finding proximities between entities characterized by multiple traits by summarizing information in few synthetic variables. In this work we discuss and illustrate several multivariate methods used for different purposes to build the datum of genetic variability. We include methods applied in studies for exploring the spatial structure of genetic variability and the association of genetic data to other sources of information. Multivariate techniques allow the pursuit of the genetic variability datum, as a unifying notion that merges concepts of type, abundance and distribution of variability at gene level.
Resumo:
High-resolution palynological analysis on annually laminated sediments of Sihailongwan Maar Lake (SHL) provides new insights into the Holocene vegetation and climate dynamics of NE China. The robust chronology of the presented record is based on varve counting and AMS radiocarbon dates from terrestrial plant macro-remains. In addition to the qualitative interpretation of the pollen data, we provide quantitative reconstructions of vegetation and climate based on the method of biomization and weighted averaging partial least squares regression (WA-PLS) technique, respectively. Power spectra were computed to investigate the frequency domain distribution of proxy signals and potential natural periodicities. Pollen assemblages, pollen-derived biome scores and climate variables as well as the cyclicity pattern indicate that NE China experienced significant changes in temperature and moisture conditions during the Holocene. Within the earliest phase of the Holocene, a large-scale reorganization of vegetation occurred, reflecting the reconstructed shift towards higher temperatures and precipitation values and the initial Holocene strengthening and northward expansion of the East Asian summer monsoon (EASM). Afterwards, summer temperatures remain at a high level, whereas the reconstructed precipitation shows an increasing trend until approximately 4000 cal. yr BP. Since 3500 cal. yr BP, temperature and precipitation values decline, indicating moderate cooling and weakening of the EASM. A distinct periodicity of 550-600 years and evidence of a Mid-Holocene transition from a temperature-triggered to a predominantly moisture-triggered climate regime are derived from the power spectra analysis. The results obtained from SHL are largely consistent with other palaeoenvironmental records from NE China, substantiating the regional nature of the reconstructed vegetation and climate patterns. However, the reconstructed climate changes contrast with the moisture evolution recorded in S China and the mid-latitude (semi-)arid regions of N China. Whereas a clear insolation-related trend of monsoon intensity over the Holocene is lacking from the SHL record, variations in the coupled atmosphere-Pacific Ocean system can largely explain the reconstructed changes in NE China.
Resumo:
The efficiency of the biological pump of carbon to the deep ocean depends largely on the biologically mediated export of carbon from the surface ocean and its remineralization with depth. Global satellite studies have primarily focused on chlorophyll concentration and net primary production (NPP) to understand the role of phytoplankton in these processes. Recent satellite retrievals of phytoplankton composition now allow for the size of phytoplankton cells to be considered. Here, we improve understanding of phytoplankton size structure impacts on particle export, remineralization and transfer. Particulate organic carbon (POC) flux observations from sediment traps and 234Th are compiled across the global ocean. Annual climatologies of NPP, percent microplankton, and POC flux at four time series locations and within biogeochemical provinces are constructed, and sinking velocities are calculated to align surface variables with POC flux at depth. Parameters that characterize POC flux vs. depth (export flux ratio, labile fraction, remineralization length scale) are then fit to the aligned dataset. Times of the year dominated by different size compositions are identified and fit separately in regions of the ocean where phytoplankton cell size showed enough dynamic range over the annual cycle. Considering all data together, our findings support the paradigm of high export flux but low transfer efficiency in more productive regions and vice versa for oligotrophic regions. However, when parsing by dominant size class, we find periods dominated by small cells to have both greater export flux and lower transfer efficiency than periods when large cells comprise a greater proportion of the phytoplankton community.
Resumo:
La relación entre la estructura urbana y la movilidad ha sido estudiada desde hace más de 70 años. El entorno urbano incluye múltiples dimensiones como por ejemplo: la estructura urbana, los usos de suelo, la distribución de instalaciones diversas (comercios, escuelas y zonas de restauración, parking, etc.). Al realizar una revisión de la literatura existente en este contexto, se encuentran distintos análisis, metodologías, escalas geográficas y dimensiones, tanto de la movilidad como de la estructura urbana. En este sentido, se trata de una relación muy estudiada pero muy compleja, sobre la que no existe hasta el momento un consenso sobre qué dimensión del entorno urbano influye sobre qué dimensión de la movilidad, y cuál es la manera apropiada de representar esta relación. Con el propósito de contestar estas preguntas investigación, la presente tesis tiene los siguientes objetivos generales: (1) Contribuir al mejor entendimiento de la compleja relación estructura urbana y movilidad. y (2) Entender el rol de los atributos latentes en la relación entorno urbano y movilidad. El objetivo específico de la tesis es analizar la influencia del entorno urbano sobre dos dimensiones de la movilidad: número de viajes y tipo de tour. Vista la complejidad de la relación entorno urbano y movilidad, se pretende contribuir al mejor entendimiento de la relación a través de la utilización de 3 escalas geográficas de las variables y del análisis de la influencia de efectos inobservados en la movilidad. Para el análisis se utiliza una base de datos conformada por tres tipos de datos: (1) Una encuesta de movilidad realizada durante los años 2006 y 2007. Se obtuvo un total de 943 encuestas, en 3 barrios de Madrid: Chamberí, Pozuelo y Algete. (2) Información municipal del Instituto Nacional de Estadística: dicha información se encuentra enlazada con los orígenes y destinos de los viajes recogidos en la encuesta. Y (3) Información georeferenciada en Arc-GIS de los hogares participantes en la encuesta: la base de datos contiene información respecto a la estructura de las calles, localización de escuelas, parking, centros médicos y lugares de restauración. Se analizó la correlación entre e intra-grupos y se modelizaron 4 casos de atributos bajo la estructura ordinal logit. Posteriormente se evalúa la auto-selección a través de la estimación conjunta de las elecciones de tipo de barrio y número de viajes. La elección del tipo de barrio consta de 3 alternativas: CBD, Urban y Suburban, según la zona de residencia recogida en las encuestas. Mientras que la elección del número de viajes consta de 4 categorías ordinales: 0 viajes, 1-2 viajes, 3-4 viajes y 5 o más viajes. A partir de la mejor especificación del modelo ordinal logit. Se desarrolló un modelo joint mixed-ordinal conjunto. Los resultados indican que las variables exógenas requieren un análisis exhaustivo de correlaciones con el fin de evitar resultados sesgados. ha determinado que es importante medir los atributos del BE donde se realiza el viaje, pero también la información municipal es muy explicativa de la movilidad individual. Por tanto, la percepción de las zonas de destino a nivel municipal es considerada importante. En el contexto de la Auto-selección (self-selection) es importante modelizar conjuntamente las decisiones. La Auto-selección existe, puesto que los parámetros estimados conjuntamente son significativos. Sin embargo, sólo ciertos atributos del entorno urbano son igualmente importantes sobre la elección de la zona de residencia y frecuencia de viajes. Para analizar la Propensión al Viaje, se desarrolló un modelo híbrido, formado por: una variable latente, un indicador y un modelo de elección discreta. La variable latente se denomina “Propensión al Viaje”, cuyo indicador en ecuación de medida es el número de viajes; la elección discreta es el tipo de tour. El modelo de elección consiste en 5 alternativas, según la jerarquía de actividades establecida en la tesis: HOME, no realiza viajes durante el día de estudio, HWH tour cuya actividad principal es el trabajo o estudios, y no se realizan paradas intermedias; HWHs tour si el individuo reaiza paradas intermedias; HOH tour cuya actividad principal es distinta a trabajo y estudios, y no se realizan paradas intermedias; HOHs donde se realizan paradas intermedias. Para llegar a la mejor especificación del modelo, se realizó un trabajo importante considerando diferentes estructuras de modelos y tres tipos de estimaciones. De tal manera, se obtuvieron parámetros consistentes y eficientes. Los resultados muestran que la modelización de los tours, representa una ventaja sobre la modelización de los viajes, puesto que supera las limitaciones de espacio y tiempo, enlazando los viajes realizados por la misma persona en el día de estudio. La propensión al viaje (PT) existe y es específica para cada tipo de tour. Los parámetros estimados en el modelo híbrido resultaron significativos y distintos para cada alternativa de tipo de tour. Por último, en la tesis se verifica que los modelos híbridos representan una mejora sobre los modelos tradicionales de elección discreta, dando como resultado parámetros consistentes y más robustos. En cuanto a políticas de transporte, se ha demostrado que los atributos del entorno urbano son más importantes que los LOS (Level of Service) en la generación de tours multi-etapas. la presente tesis representa el primer análisis empírico de la relación entre los tipos de tours y la propensión al viaje. El concepto Propensity to Travel ha sido desarrollado exclusivamente para la tesis. Igualmente, el desarrollo de un modelo conjunto RC-Number of trips basado en tres escalas de medida representa innovación en cuanto a la comparación de las escalas geográficas, que no había sido hecha en la modelización de la self-selection. The relationship between built environment (BE) and travel behaviour (TB) has been studied in a number of cases, using several methods - aggregate and disaggregate approaches - and different focuses – trip frequency, automobile use, and vehicle miles travelled and so on. Definitely, travel is generated by the need to undertake activities and obtain services, and there is a general consensus that urban components affect TB. However researches are still needed to better understand which components of the travel behaviour are affected most and by which of the urban components. In order to fill the gap in the research, the present dissertation faced two main objectives: (1) To contribute to the better understanding of the relationship between travel demand and urban environment. And (2) To develop an econometric model for estimating travel demand with urban environment attributes. With this purpose, the present thesis faced an exhaustive research and computation of land-use variables in order to find the best representation of BE for modelling trip frequency. In particular two empirical analyses are carried out: 1. Estimation of three dimensions of travel demand using dimensions of urban environment. We compare different travel dimensions and geographical scales, and we measure self-selection contribution following the joint models. 2. Develop a hybrid model, integrated latent variable and discrete choice model. The implementation of hybrid models is new in the analysis of land-use and travel behaviour. BE and TB explicitly interact and allow richness information about a specific individual decision process For all empirical analysis is used a data-base from a survey conducted in 2006 and 2007 in Madrid. Spatial attributes describing neighbourhood environment are derived from different data sources: National Institute of Statistics-INE (Administrative: municipality and district) and GIS (circular units). INE provides raw data for such spatial units as: municipality and district. The construction of census units is trivial as the census bureau provides tables that readily define districts and municipalities. The construction of circular units requires us to determine the radius and associate the spatial information to our households. The first empirical part analyzes trip frequency by applying an ordered logit model. In this part is studied the effect of socio-economic, transport and land use characteristics on two travel dimensions: trip frequency and type of tour. In particular the land use is defined in terms of type of neighbourhoods and types of dwellers. Three neighbourhood representations are explored, and described three for constructing neighbourhood attributes. In particular administrative units are examined to represent neighbourhood and circular – unit representation. Ordered logit models are applied, while ordinal logit models are well-known, an intensive work for constructing a spatial attributes was carried out. On the other hand, the second empirical analysis consists of the development of an innovative econometric model that considers a latent variable called “propensity to travel”, and choice model is the choice of type of tour. The first two specifications of ordinal models help to estimate this latent variable. The latent variable is unobserved but the manifestation is called “indicators”, then the probability of choosing an alternative of tour is conditional to the probability of latent variable and type of tour. Since latent variable is unknown we fit the integral over its distribution. Four “sets of best variables” are specified, following the specification obtained from the correlation analysis. The results evidence that the relative importance of SE variables versus BE variables depends on how BE variables are measured. We found that each of these three spatial scales has its intangible qualities and drawbacks. Spatial scales play an important role on predicting travel demand due to the variability in measures at trip origin/destinations within the same administrative unit (municipality, district and so on). Larger units will produce less variation in data; but it does not affect certain variables, such as public transport supply, that are more significant at municipality level. By contrast, land-use measures are more efficient at district level. Self-selection in this context, is weak. Thus, the influence of BE attributes is true. The results of the hybrid model show that unobserved factors affect the choice of tour complexity. The latent variable used in this model is propensity to travel that is explained by socioeconomic aspects and neighbourhood attributes. The results show that neighbourhood attributes have indeed a significant impact on the choice of the type of tours either directly and through the propensity to travel. The propensity to travel has a different impact depending on the structure of each tour and increases the probability of choosing more complex tours, such as tours with many intermediate stops. The integration of choice and latent variable model shows that omitting important perception and attitudes leads to inconsistent estimates. The results also indicate that goodness of fit improves by adding the latent variable in both sequential and simultaneous estimation. There are significant differences in the sensitivity to the latent variable across alternatives. In general, as expected, the hybrid models show a major improvement into the goodness of fit of the model, compared to a classical discrete choice model that does not incorporate latent effects. The integrated model leads to a more detailed analysis of the behavioural process. Summarizing, the effect that built environment characteristics on trip frequency studied is deeply analyzed. In particular we tried to better understand how land use characteristics can be defined and measured and which of these measures do have really an impact on trip frequency. We also tried to test the superiority of HCM on this field. We can concluded that HCM shows a major improvement into the goodness of fit of the model, compared to classical discrete choice model that does not incorporate latent effects. And consequently, the application of HCM shows the importance of LV on the decision of tour complexity. People are more elastic to built environment attributes than level of services. Thus, policy implications must take place to develop more mixed areas, work-places in combination with commercial retails.
Resumo:
Abstract Air pollution is a big threat and a phenomenon that has a specific impact on human health, in addition, changes that occur in the chemical composition of the atmosphere can change the weather and cause acid rain or ozone destruction. Those are phenomena of global importance. The World Health Organization (WHO) considerates air pollution as one of the most important global priorities. Salamanca, Gto., Mexico has been ranked as one of the most polluted cities in this country. The industry of the area led to a major economic development and rapid population growth in the second half of the twentieth century. The impact in the air quality is important and significant efforts have been made to measure the concentrations of pollutants. The main pollution sources are locally based plants in the chemical and power generation sectors. The registered concerning pollutants are Sulphur Dioxide (SO2) and particles on the order of ∼10 micrometers or less (PM10). The prediction in the concentration of those pollutants can be a powerful tool in order to take preventive measures such as the reduction of emissions and alerting the affected population. In this PhD thesis we propose a model to predict concentrations of pollutants SO2 and PM10 for each monitoring booth in the Atmospheric Monitoring Network Salamanca (REDMAS - for its spanish acronym). The proposed models consider the use of meteorological variables as factors influencing the concentration of pollutants. The information used along this work is the current real data from REDMAS. In the proposed model, Artificial Neural Networks (ANN) combined with clustering algorithms are used. The type of ANN used is the Multilayer Perceptron with a hidden layer, using separate structures for the prediction of each pollutant. The meteorological variables used for prediction were: Wind Direction (WD), wind speed (WS), Temperature (T) and relative humidity (RH). Clustering algorithms, K-means and Fuzzy C-means, are used to find relationships between air pollutants and weather variables under consideration, which are added as input of the RNA. Those relationships provide information to the ANN in order to obtain the prediction of the pollutants. The results of the model proposed in this work are compared with the results of a multivariate linear regression and multilayer perceptron neural network. The evaluation of the prediction is calculated with the mean absolute error, the root mean square error, the correlation coefficient and the index of agreement. The results show the importance of meteorological variables in the prediction of the concentration of the pollutants SO2 and PM10 in the city of Salamanca, Gto., Mexico. The results show that the proposed model perform better than multivariate linear regression and multilayer perceptron neural network. The models implemented for each monitoring booth have the ability to make predictions of air quality that can be used in a system of real-time forecasting and human health impact analysis. Among the main results of the development of this thesis we can cite: A model based on artificial neural network combined with clustering algorithms for prediction with a hour ahead of the concentration of each pollutant (SO2 and PM10) is proposed. A different model was designed for each pollutant and for each of the three monitoring booths of the REDMAS. A model to predict the average of pollutant concentration in the next 24 hours of pollutants SO2 and PM10 is proposed, based on artificial neural network combined with clustering algorithms. Model was designed for each booth of the REDMAS and each pollutant separately. Resumen La contaminación atmosférica es una amenaza aguda, constituye un fenómeno que tiene particular incidencia sobre la salud del hombre. Los cambios que se producen en la composición química de la atmósfera pueden cambiar el clima, producir lluvia ácida o destruir el ozono, fenómenos todos ellos de una gran importancia global. La Organización Mundial de la Salud (OMS) considera la contaminación atmosférica como una de las más importantes prioridades mundiales. Salamanca, Gto., México; ha sido catalogada como una de las ciudades más contaminadas en este país. La industria de la zona propició un importante desarrollo económico y un crecimiento acelerado de la población en la segunda mitad del siglo XX. Las afectaciones en el aire son graves y se han hecho importantes esfuerzos por medir las concentraciones de los contaminantes. Las principales fuentes de contaminación son fuentes fijas como industrias químicas y de generación eléctrica. Los contaminantes que se han registrado como preocupantes son el Bióxido de Azufre (SO2) y las Partículas Menores a 10 micrómetros (PM10). La predicción de las concentraciones de estos contaminantes puede ser una potente herramienta que permita tomar medidas preventivas como reducción de emisiones a la atmósfera y alertar a la población afectada. En la presente tesis doctoral se propone un modelo de predicción de concentraci ón de los contaminantes más críticos SO2 y PM10 para cada caseta de monitorización de la Red de Monitorización Atmosférica de Salamanca (REDMAS). Los modelos propuestos plantean el uso de las variables meteorol ógicas como factores que influyen en la concentración de los contaminantes. La información utilizada durante el desarrollo de este trabajo corresponde a datos reales obtenidos de la REDMAS. En el Modelo Propuesto (MP) se aplican Redes Neuronales Artificiales (RNA) combinadas con algoritmos de agrupamiento. La RNA utilizada es el Perceptrón Multicapa con una capa oculta, utilizando estructuras independientes para la predicción de cada contaminante. Las variables meteorológicas disponibles para realizar la predicción fueron: Dirección de Viento (DV), Velocidad de Viento (VV), Temperatura (T) y Humedad Relativa (HR). Los algoritmos de agrupamiento K-means y Fuzzy C-means son utilizados para encontrar relaciones existentes entre los contaminantes atmosféricos en estudio y las variables meteorológicas. Dichas relaciones aportan información a las RNA para obtener la predicción de los contaminantes, la cual es agregada como entrada de las RNA. Los resultados del modelo propuesto en este trabajo son comparados con los resultados de una Regresión Lineal Multivariable (RLM) y un Perceptrón Multicapa (MLP). La evaluación de la predicción se realiza con el Error Medio Absoluto, la Raíz del Error Cuadrático Medio, el coeficiente de correlación y el índice de acuerdo. Los resultados obtenidos muestran la importancia de las variables meteorológicas en la predicción de la concentración de los contaminantes SO2 y PM10 en la ciudad de Salamanca, Gto., México. Los resultados muestran que el MP predice mejor la concentración de los contaminantes SO2 y PM10 que los modelos RLM y MLP. Los modelos implementados para cada caseta de monitorizaci ón tienen la capacidad para realizar predicciones de calidad del aire, estos modelos pueden ser implementados en un sistema que permita realizar la predicción en tiempo real y analizar el impacto en la salud de la población. Entre los principales resultados obtenidos del desarrollo de esta tesis podemos citar: Se propone un modelo basado en una red neuronal artificial combinado con algoritmos de agrupamiento para la predicción con una hora de anticipaci ón de la concentración de cada contaminante (SO2 y PM10). Se diseñó un modelo diferente para cada contaminante y para cada una de las tres casetas de monitorización de la REDMAS. Se propone un modelo de predicción del promedio de la concentración de las próximas 24 horas de los contaminantes SO2 y PM10, basado en una red neuronal artificial combinado con algoritmos de agrupamiento. Se diseñó un modelo para cada caseta de monitorización de la REDMAS y para cada contaminante por separado.