958 results for multivariate binary data
Abstract:
Standard methods for the analysis of linear latent variable models often rely on the assumption that the vector of observed variables is normally distributed. This normality assumption (NA) plays a crucial role in assessing optimality of estimates, in computing standard errors, and in designing an asymptotic chi-square goodness-of-fit test. The asymptotic validity of NA inferences when the data deviate from normality has been called asymptotic robustness. In the present paper we extend previous work on asymptotic robustness to a general context of multi-sample analysis of linear latent variable models, with a latent component of the model allowed to be fixed across (hypothetical) sample replications, and with the asymptotic covariance matrix of the sample moments not necessarily finite. We will show that, under certain conditions, the matrix $\Gamma$ of asymptotic variances of the analyzed sample moments can be substituted by a matrix $\Omega$ that is a function only of the cross-product moments of the observed variables. The main advantage of this is that inferences based on $\Omega$ are readily available in standard software for covariance structure analysis, and do not require computing sample fourth-order moments. An illustration with simulated data in the context of regression with errors in variables will be presented.
Abstract:
BACKGROUND: Circulating 25-hydroxyvitamin D [25(OH)D] concentration is inversely associated with peripheral arterial disease and hypertension. Vascular remodeling may play a role in this association; however, data relating vitamin D level to specific remodeling biomarkers among ESRD patients are sparse. We tested whether 25(OH)D concentration is associated with markers of vascular remodeling and inflammation in African American ESRD patients. METHODS: We conducted a cross-sectional study among ESRD patients receiving maintenance hemodialysis within Emory University-affiliated outpatient hemodialysis units. Demographic, clinical and dialysis treatment data were collected via direct patient interview and review of patients' records at the time of enrollment, and each patient gave blood samples. Associations between 25(OH)D and biomarker concentrations were estimated in univariate analyses using Pearson's correlation coefficients and in multivariate analyses using linear regression models. 25(OH)D concentration was entered in multivariate linear regression models as a continuous variable and as a binary variable (<15 ng/ml and ≥15 ng/ml). Adjusted estimated concentrations of biomarkers were compared between 25(OH)D groups using analysis of variance (ANOVA). Finally, results were stratified by vascular access type. RESULTS: Among 91 patients, mean (standard deviation) 25(OH)D concentration was 18.8 (9.6) ng/ml, and was low (<15 ng/ml) in 43% of patients. In univariate analyses, low 25(OH)D was associated with lower serum calcium, higher serum phosphorus, and higher LDL concentrations. 25(OH)D concentration was inversely correlated with MMP-9 concentration (r = -0.29, p = 0.004). In multivariate analyses, MMP-9 concentration remained negatively associated with 25(OH)D concentration (P = 0.03), and the concentration of anti-inflammatory IL-10 was positively correlated with 25(OH)D concentration (P = 0.04). CONCLUSIONS: Plasma MMP-9 and circulating 25(OH)D concentrations are significantly and inversely associated among ESRD patients. This finding may suggest a potential mechanism by which low circulating 25(OH)D functions as a cardiovascular risk factor.
Abstract:
We consider two fundamental properties in the analysis of two-way tables of positive data: the principle of distributional equivalence, one of the cornerstones of correspondence analysis of contingency tables, and the principle of subcompositional coherence, which forms the basis of compositional data analysis. For an analysis to be subcompositionally coherent, it suffices to analyse the ratios of the data values. The usual approach to dimension reduction in compositional data analysis is to perform principal component analysis on the logarithms of ratios, but this method does not obey the principle of distributional equivalence. We show that by introducing weights for the rows and columns, the method achieves this desirable property. This weighted log-ratio analysis is theoretically equivalent to spectral mapping, a multivariate method developed almost 30 years ago for displaying ratio-scale data from biological activity spectra. The close relationship between spectral mapping and correspondence analysis is also explained, as well as their connection with association modelling. The weighted log-ratio methodology is applied here to frequency data in linguistics and to chemical compositional data in archaeology.
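The weighted log-ratio analysis described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' code: it assumes the row and column weights are the table's masses (margins divided by the grand total), which the abstract does not spell out, and the function name `weighted_lra` and the small table are hypothetical.

```python
import numpy as np

def weighted_lra(N):
    """Weighted log-ratio analysis of a strictly positive two-way table.

    Assumption: weights are the row and column masses of the table.
    """
    N = np.asarray(N, dtype=float)
    P = N / N.sum()                 # table rescaled to sum to 1
    r = P.sum(axis=1)               # row masses
    c = P.sum(axis=0)               # column masses
    L = np.log(P)
    row_means = L @ c               # weighted mean of each row
    col_means = r @ L               # weighted mean of each column
    grand_mean = r @ L @ c
    # Weighted double-centring of the log-transformed table
    Y = L - row_means[:, None] - col_means[None, :] + grand_mean
    # Weight by the square roots of the masses, then decompose
    S = np.sqrt(r)[:, None] * Y * np.sqrt(c)[None, :]
    U, sv, Vt = np.linalg.svd(S, full_matrices=False)
    row_coords = (U * sv) / np.sqrt(r)[:, None]   # row principal coordinates
    return row_coords, sv, Y, r, c

# Hypothetical 4 x 3 table of positive counts
table = np.array([[10.0, 4.0, 6.0],
                  [3.0, 8.0, 5.0],
                  [7.0, 2.0, 9.0],
                  [1.0, 6.0, 4.0]])
row_coords, sv, Y, r, c = weighted_lra(table)
print(np.round(sv, 4))
```

One way to see distributional equivalence at work in this sketch is to split a column into two proportional columns: the non-zero singular values should stay the same, which would not hold in general for the unweighted version.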
Abstract:
1. Aim - Concerns over how global change will influence species distributions, in conjunction with increased emphasis on understanding niche dynamics in evolutionary and community contexts, highlight the growing need for robust methods to quantify niche differences between or within taxa. We propose a statistical framework to describe and compare environmental niches from occurrence and spatial environmental data.
2. Location - Europe, North America, South America.
3. Methods - The framework applies kernel smoothers to densities of species occurrence in gridded environmental space to calculate metrics of niche overlap and test hypotheses regarding niche conservatism. We use this framework and simulated species with predefined distributions and amounts of niche overlap to evaluate several ordination and species distribution modeling techniques for quantifying niche overlap. We illustrate the approach with data on two well-studied invasive species.
4. Results - We show that niche overlap can be accurately detected with the framework when variables driving the distributions are known. The method is robust to known and previously undocumented biases related to the dependence of species occurrences on the frequency of environmental conditions that occur across geographic space. The use of a kernel smoother makes the process of moving from geographical space to multivariate environmental space independent of both sampling effort and arbitrary choice of resolution in environmental space. However, the use of ordination and species distribution model techniques for selecting, combining and weighting variables on which niche overlap is calculated provides contrasting results.
5. Main conclusions - The framework meets the increasing need for robust methods to quantify niche differences. It is appropriate to study niche differences between species, subspecies or intraspecific lineages that differ in their geographical distributions. Alternatively, it can be used to measure the degree to which the environmental niche of a species or intraspecific lineage has changed over time.
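The idea of smoothing occurrences on an environmental grid and then measuring overlap can be sketched in a few lines. This is a simplified illustration under stated assumptions: a single environmental axis, a Gaussian kernel with a hand-picked bandwidth, and Schoener's D as the overlap metric (the abstract does not name the metric). The sketch also ignores how frequently each environmental condition is available, which the abstract identifies as a source of bias. All names and the simulated occurrences are hypothetical.

```python
import numpy as np

def smoothed_occupancy(values, grid, bandwidth):
    """Gaussian-kernel density of occurrences on a grid, rescaled to sum to 1."""
    values = np.asarray(values, dtype=float)
    z = (grid[:, None] - values[None, :]) / bandwidth
    density = np.exp(-0.5 * z ** 2).sum(axis=1)
    return density / density.sum()

def schoener_d(p1, p2):
    """Schoener's D: 1 for identical occupancies, 0 for no shared cells."""
    return 1.0 - 0.5 * np.abs(p1 - p2).sum()

# Simulated occurrences of two hypothetical species along one gradient
rng = np.random.default_rng(1)
grid = np.linspace(-5.0, 15.0, 200)
species_a = rng.normal(3.0, 1.0, size=300)
species_b = rng.normal(5.0, 1.0, size=300)

p_a = smoothed_occupancy(species_a, grid, bandwidth=0.5)
p_b = smoothed_occupancy(species_b, grid, bandwidth=0.5)
print(round(schoener_d(p_a, p_b), 3))
```

Because the densities are evaluated on a fixed grid after smoothing, the overlap value depends much less on how many occurrences were sampled than a raw cell-count comparison would.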
Abstract:
A biplot, which is the multivariate generalization of the two-variable scatterplot, can be used to visualize the results of many multivariate techniques, especially those that are based on the singular value decomposition. We consider data sets consisting of continuous-scale measurements, their fuzzy coding and the biplots that visualize them, using a fuzzy version of multiple correspondence analysis. Of special interest is the way quality of fit of the biplot is measured, since it is well known that regular (i.e., crisp) multiple correspondence analysis seriously underestimates this measure. We show how the results of fuzzy multiple correspondence analysis can be defuzzified to obtain estimated values of the original data, and prove that this implies an orthogonal decomposition of variance. This permits a measure of fit to be calculated in the familiar form of a percentage of explained variance, which is directly comparable to the corresponding fit measure used in principal component analysis of the original data. The approach is motivated initially by its application to a simulated data set, showing how the fuzzy approach can lead to diagnosing nonlinear relationships, and finally it is applied to a real set of meteorological data.
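Fuzzy coding and defuzzification can be illustrated with a small sketch. It assumes triangular membership functions defined by a set of hinge points, which is one common choice and is not stated in the abstract; the function names and hinge values are hypothetical.

```python
import numpy as np

def fuzzy_code(x, hinges):
    """Code a continuous variable into fuzzy categories.

    Assumption: triangular membership functions with the given hinge points.
    Each row of the result is non-negative and sums to 1.
    """
    h = np.asarray(hinges, dtype=float)
    x = np.clip(np.asarray(x, dtype=float), h[0], h[-1])
    Z = np.zeros((x.size, h.size))
    for k in range(h.size - 1):
        inside = (x >= h[k]) & (x <= h[k + 1])
        t = (x[inside] - h[k]) / (h[k + 1] - h[k])
        Z[inside, k] = 1.0 - t       # membership in the lower category
        Z[inside, k + 1] = t         # membership in the upper category
    return Z

def defuzzify(Z, hinges):
    """Estimate the original values as membership-weighted hinge points."""
    return Z @ np.asarray(hinges, dtype=float)

hinges = [0.0, 5.0, 10.0]            # hypothetical low / medium / high hinges
x = np.array([0.0, 2.5, 5.0, 7.5, 10.0])
Z = fuzzy_code(x, hinges)
print(Z)
print(defuzzify(Z, hinges))
```

With triangular memberships, defuzzifying the exact coded values recovers the original data; after a low-dimensional approximation of the coded matrix the recovered values would only be estimates, which is what makes an explained-variance measure of fit meaningful.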
Abstract:
The goal of this paper is to estimate time-varying covariance matrices. Since the covariance matrix of financial returns is known to change through time and is an essential ingredient in risk measurement, portfolio selection, and tests of asset pricing models, this is a very important problem in practice. Our model of choice is the Diagonal-Vech version of the Multivariate GARCH(1,1) model. The problem is that the estimation of the general Diagonal-Vech model is numerically infeasible in dimensions higher than 5. The common approach is to estimate more restrictive models which are tractable but may not conform to the data. Our contribution is to propose an alternative estimation method that is numerically feasible, produces positive semi-definite conditional covariance matrices, and does not impose unrealistic a priori restrictions. We provide an empirical application in the context of international stock markets, comparing the new estimator to a number of existing ones.
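For orientation, the Diagonal-Vech GARCH(1,1) recursion itself (not the estimation method proposed in the paper) can be sketched as below. Each element of the conditional covariance matrix follows its own GARCH(1,1) equation, so the update uses elementwise products. The parameter matrices here are hypothetical values chosen to be positive semi-definite, in which case each filtered matrix is positive semi-definite as well; simulated returns stand in for real data.

```python
import numpy as np

def diagonal_vech_filter(returns, C, A, B, H0):
    """Filter H_t = C + A * (e_{t-1} e_{t-1}') + B * H_{t-1}, with * elementwise.

    returns has shape (T, n); the result has shape (T, n, n) and starts at H0.
    """
    returns = np.asarray(returns, dtype=float)
    H = [np.asarray(H0, dtype=float)]
    for e in returns[:-1]:
        H.append(C + A * np.outer(e, e) + B * H[-1])
    return np.stack(H)

# Hypothetical parameters for n = 3 assets (all three matrices are PSD)
n = 3
ones = np.ones((n, n))
C = 1e-6 * (np.eye(n) + ones)
A = 0.05 * ones + 0.02 * np.eye(n)
B = 0.90 * ones

rng = np.random.default_rng(0)
returns = rng.normal(scale=0.01, size=(250, n))   # simulated returns
H = diagonal_vech_filter(returns, C, A, B, np.cov(returns, rowvar=False))
print(H.shape, np.linalg.eigvalsh(H).min() > 0)
```

The recursion is cheap; the difficulty the abstract refers to lies in estimating the free elements of C, A and B jointly while keeping the resulting matrices positive semi-definite.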
Abstract:
The singular value decomposition and its interpretation as a linear biplot has proved to be a powerful tool for analysing many forms of multivariate data. Here we adapt biplot methodology to the specific case of compositional data consisting of positive vectors each of which is constrained to have unit sum. These relative variation biplots have properties relating to special features of compositional data: the study of ratios, subcompositions and models of compositional relationships. The methodology is demonstrated on a data set consisting of six-part colour compositions in 22 abstract paintings, showing how the singular value decomposition can achieve an accurate biplot of the colour ratios and how possible models interrelating the colours can be diagnosed.
Abstract:
The classical binary classification problem is investigated when it is known in advance that the posterior probability function (or regression function) belongs to some class of functions. We introduce and analyze a method which effectively exploits this knowledge. The method is based on minimizing the empirical risk over a carefully selected "skeleton" of the class of regression functions. The skeleton is a covering of the class based on a data-dependent metric, especially fitted for classification. A new scale-sensitive dimension is introduced which is more useful for the studied classification problem than other, previously defined, dimension measures. This fact is demonstrated by performance bounds for the skeleton estimate in terms of the new dimension.
Abstract:
We continue the development of a method for the selection of a bandwidth or a number of design parameters in density estimation. We provide explicit non-asymptotic density-free inequalities that relate the $L_1$ error of the selected estimate with that of the best possible estimate, and study in particular the connection between the richness of the class of density estimates and the performance bound. For example, our method allows one to pick the bandwidth and kernel order in the kernel estimate simultaneously and still assure that for {\it all densities}, the $L_1$ error of the corresponding kernel estimate is not larger than about three times the error of the estimate with the optimal smoothing factor and kernel plus a constant times $\sqrt{\log n/n}$, where $n$ is the sample size, and the constant only depends on the complexity of the family of kernels used in the estimate. Further applications include multivariate kernel estimates, transformed kernel estimates, and variable kernel estimates.
Abstract:
Electroencephalography (EEG) is one of the most widely used techniques for studying the brain. It records the electrical signals produced in the human cortex through electrodes placed on the head. The technique has some limitations when recordings are made, however; the main one is known as artifacts, which are unwanted signals that mix with the EEG signals. The aim of this master's thesis is to present three new artifact-cleaning methods that can be applied to EEG. They are based on Multivariate Empirical Mode Decomposition, a new technique used for signal processing. The proposed cleaning methods are applied to simulated EEG data containing artifacts (eye blinks), and once the cleaning procedures have been applied, the results are compared with EEG data without blinks to assess the improvement they provide. Subsequently, two of the three proposed cleaning methods are applied to real EEG data. The conclusion drawn from the work is that two of the new cleaning procedures can be used to preprocess real data to remove eye blinks.
Abstract:
BACKGROUND: Patients who have acute coronary syndromes with or without ST-segment elevation have high rates of major vascular events. We evaluated the efficacy of early clopidogrel administration (300 mg, <24 hours) when given with aspirin in such patients. METHODS: We included 30,243 patients who had an acute coronary syndrome with or without ST-segment elevation. Data on early clopidogrel administration were available for 24,463 (81%). Some 15,525 (51%) of the total cohort were administered clopidogrel within 24 h of admission. RESULTS: In-hospital death occurred in 2.9% of the patients in the early clopidogrel group treated with primary percutaneous coronary intervention (PCI) and in 11.4% of the patients in the group with neither primary PCI nor early clopidogrel. The unadjusted clopidogrel odds ratio (OR) for mortality was 0.31 (95% confidence interval 0.27-0.34; p <0.001). Incidence of major adverse cardiac events (MACE) was 4.1% in the early clopidogrel group treated with primary PCI and 13.5% in the group with neither primary PCI nor early clopidogrel (OR 0.35, confidence interval 0.32-0.39, p <0.001). Early clopidogrel administration and PCI were the only treatments that lowered mortality in multivariate analysis. CONCLUSIONS: The early administration of the anti-platelet agent clopidogrel in patients with acute coronary syndromes with or without ST-segment elevation has a beneficial effect on mortality and major adverse cardiac events. The lower mortality rate and incidence of MACE emerged with a combination of primary PCI and early clopidogrel administration.
Abstract:
The integration of specific institutions for teacher education into the higher education system represents a milestone in Swiss educational policy and has broad implications. This thesis explores organizational and institutional change resulting from this policy reform, and attempts to assess structural change in terms of differentiation and convergence within the system of higher education. Key issues that are dealt with are, on the one hand, the adoption of a research function by the newly conceptualized institutions of teacher education, and on the other, the positioning of the new institutions within the higher education system. Drawing on actor-centred approaches to differentiation, this dissertation discusses system-level specificities of tertiarized teacher education and asks how this affects institutional configurations and actor constellations. On the basis of qualitative and quantitative empirical data, a comparative analysis has been carried out including case studies of four universities of teacher education as well as multivariate regression analysis of micro-level data on students' educational choices. The study finds that the process of system integration and the adaptation to the research function by the various institutions have unfolded differently depending on the institutional setting and the specific actor constellations. The new institutions have clearly made a strong push to position themselves as a new institutional type and to find their identity beyond the traditional binary divide which assigns the universities of teacher education to the college sector. Potential conflicts have been identified in divergent cognitive and normative orientations and perceptions of researchers, teacher educators, policy-makers, teachers, and students as to the mission and role of the new type of higher education institution.
Abstract:
With the trend in molecular epidemiology towards both genome-wide association studies and complex modelling, the need for large sample sizes to detect small effects and to allow for the estimation of many parameters within a model continues to increase. Unfortunately, most methods of association analysis have been restricted to either a family-based or a case-control design, resulting in the lack of synthesis of data from multiple studies. Transmission disequilibrium-type methods for detecting linkage disequilibrium from family data were developed as an effective way of preventing the detection of association due to population stratification. Because these methods condition on parental genotype, however, they have precluded the joint analysis of family and case-control data, although methods for case-control data may not protect against population stratification and do not allow for familial correlations. We present here an extension of a family-based association analysis method for continuous traits that will simultaneously test for, and if necessary control for, population stratification. We further extend this method to analyse binary traits (and therefore family and case-control data together) and to estimate genetic effects in the population accurately, even when using an ascertained family sample. Finally, we present the power of this binary extension for both family-only and joint family and case-control data, and demonstrate the accuracy of the association parameter and variance components in an ascertained family sample.
Abstract:
The spatial variability of soil and plant properties exerts great influence on the yield of agricultural crops. This study analyzed the spatial variability of the fertility of a Humic Rhodic Hapludox with Arabica coffee, using principal component analysis, cluster analysis and geostatistics in combination. The experiment was carried out in an area under Coffea arabica L., variety Catucai 20/15 - 479. The soil was sampled at a depth of 0.20 m, at 50 points of a sampling grid. The following chemical properties were determined: P, K+, Ca2+, Mg2+, Na+, S, Al3+, pH, H + Al, SB, t, T, V, m, OM, Na saturation index (SSI), remaining phosphorus (P-rem), and micronutrients (Zn, Fe, Mn, Cu and B). The data were analyzed with descriptive statistics, followed by principal component and cluster analyses. Geostatistics were used to check and quantify the degree of spatial dependence of properties, represented by principal components. The principal component analysis allowed a dimensional reduction of the problem, providing interpretable components, with little information loss. Despite the characteristic information loss of principal component analysis, the combination of this technique with geostatistical analysis was efficient for the quantification and determination of the structure of spatial dependence of soil fertility. In general, the availability of soil mineral nutrients was low and the levels of acidity and exchangeable Al were high.
Abstract:
In the State of Rio Grande do Sul, the municipality of Pelotas is responsible for 90% of peach production due to its suitable climate and soil conditions. However, new studies aimed at improving fruit quality and increasing yield are needed. The aim of this study was to evaluate the relationship between soil physical properties and peach plant properties in 2010 and 2011 using multivariate canonical correlation. The experiment was conducted in a peach orchard located in the municipality of Morro Redondo, RS, Brazil, where an experimental grid of 101 plants was established. In a trench dug beside each of the 101 plants, soil samples were collected to determine silt, clay, and sand contents, soil density, total porosity, macroporosity, microporosity, and volumetric water content in the 0.00-0.10 and 0.10-0.20 m layers, as well as the depth of the A horizon. In each plant and in each year, the following properties were assessed: trunk diameter, fruit size and number of fruits per plant, average weight of the fruit per plant, fruit pulp firmness, Brix content, and yield from the orchard. Exploratory analysis of the data was undertaken by descriptive statistics, and the relationships between the physical properties of the soil and of the plant were assessed by canonical correlation analysis. The results showed that the clay and microporosity variables exhibited the highest canonical cross-loadings with the plant properties in the soil layers assessed, and that the mean weight of the fruit per plant had the highest canonical loadings within the plant group in both years.
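Canonical correlation analysis between two groups of variables, as used in this study, can be sketched with a QR decomposition followed by a singular value decomposition. This is a generic textbook-style computation, not the authors' procedure: the data below are simulated (101 rows to mirror the 101 plants, with three hypothetical soil variables and two hypothetical plant variables), and the sketch assumes both centred data matrices have full column rank.

```python
import numpy as np

def canonical_correlations(X, Y):
    """Canonical correlations between the columns of X and the columns of Y.

    Assumption: both centred matrices have full column rank.
    """
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    Qx, _ = np.linalg.qr(Xc)        # orthonormal basis of the X column space
    Qy, _ = np.linalg.qr(Yc)        # orthonormal basis of the Y column space
    s = np.linalg.svd(Qx.T @ Qy, compute_uv=False)
    return np.clip(s, 0.0, 1.0)     # guard against rounding just above 1

# Simulated data: 101 plants, 3 soil variables, 2 plant variables
rng = np.random.default_rng(0)
soil = rng.normal(size=(101, 3))
plant = soil[:, :2] @ np.array([[0.8, 0.1], [0.2, 0.5]]) + rng.normal(size=(101, 2))
print(np.round(canonical_correlations(soil, plant), 3))
```

The function returns as many correlations as the smaller group has variables; loadings and cross-loadings such as those reported in the abstract would require the canonical variates as well, which this sketch does not compute.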