Biblioteca Digital

991 resultados para CATEGORICAL-DATA

Large-Area Photo-Mosaics Using Global Alignment and Navigation Data

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Seafloor imagery is a rich source of data for the study of biological and geological processes. Among several applications, still images of the ocean floor can be used to build image composites referred to as photo-mosaics. Photo-mosaics provide a wide-area visual representation of the benthos, and enable applications as diverse as geological surveys, mapping and detection of temporal changes in the morphology of biodiversity. We present an approach for creating globally aligned photo-mosaics using 3D position estimates provided by navigation sensors available in deep water surveys. Without image registration, such navigation data does not provide enough accuracy to produce useful composite images. Results from a challenging data set of the Lucky Strike vent field at the Mid Atlantic Ridge are reported

The agreement between ipsative and normative questionnaires using compositional data analysis techniques

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The main instrument used in psychological measurement is the self-report questionnaire. One of its majordrawbacks however is its susceptibility to response biases. A known strategy to control these biases hasbeen the use of so-called ipsative items. Ipsative items are items that require the respondent to makebetween-scale comparisons within each item. The selected option determines to which scale the weight ofthe answer is attributed. Consequently in questionnaires only consisting of ipsative items everyrespondent is allotted an equal amount, i.e. the total score, that each can distribute differently over thescales. Therefore this type of response format yields data that can be considered compositional from itsinception.Methodological oriented psychologists have heavily criticized this type of item format, since the resultingdata is also marked by the associated unfavourable statistical properties. Nevertheless, clinicians havekept using these questionnaires to their satisfaction. This investigation therefore aims to evaluate bothpositions and addresses the similarities and differences between the two data collection methods. Theultimate objective is to formulate a guideline when to use which type of item format.The comparison is based on data obtained with both an ipsative and normative version of threepsychological questionnaires, which were administered to 502 first-year students in psychology accordingto a balanced within-subjects design. Previous research only compared the direct ipsative scale scoreswith the derived ipsative scale scores. The use of compositional data analysis techniques also enables oneto compare derived normative score ratios with direct normative score ratios. The addition of the secondcomparison not only offers the advantage of a better-balanced research strategy. In principle it also allowsfor parametric testing in the evaluation

Alternative ways to estimate change points in multinomial sequences. An application to an authorship attribution problem

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The statistical analysis of literary style is the part of stylometry that compares measurable characteristicsin a text that are rarely controlled by the author, with those in other texts. When thegoal is to settle authorship questions, these characteristics should relate to the author’s style andnot to the genre, epoch or editor, and they should be such that their variation between authors islarger than the variation within comparable texts from the same author.For an overview of the literature on stylometry and some of the techniques involved, see for exampleMosteller and Wallace (1964, 82), Herdan (1964), Morton (1978), Holmes (1985), Oakes (1998) orLebart, Salem and Berry (1998).Tirant lo Blanc, a chivalry book, is the main work in catalan literature and it was hailed to be“the best book of its kind in the world” by Cervantes in Don Quixote. Considered by writterslike Vargas Llosa or Damaso Alonso to be the first modern novel in Europe, it has been translatedseveral times into Spanish, Italian and French, with modern English translations by Rosenthal(1996) and La Fontaine (1993). The main body of this book was written between 1460 and 1465,but it was not printed until 1490.There is an intense and long lasting debate around its authorship sprouting from its first edition,where its introduction states that the whole book is the work of Martorell (1413?-1468), while atthe end it is stated that the last one fourth of the book is by Galba (?-1490), after the death ofMartorell. Some of the authors that support the theory of single authorship are Riquer (1990),Chiner (1993) and Badia (1993), while some of those supporting the double authorship are Riquer(1947), Coromines (1956) and Ferrando (1995). For an overview of this debate, see Riquer (1990).Neither of the two candidate authors left any text comparable to the one under study, and thereforediscriminant analysis can not be used to help classify chapters by author. By using sample textsencompassing about ten percent of the book, and looking at word length and at the use of 44conjunctions, prepositions and articles, Ginebra and Cabos (1998) detect heterogeneities that mightindicate the existence of two authors. By analyzing the diversity of the vocabulary, Riba andGinebra (2000) estimates that stylistic boundary to be near chapter 383.Following the lead of the extensive literature, this paper looks into word length, the use of the mostfrequent words and into the use of vowels in each chapter of the book. Given that the featuresselected are categorical, that leads to three contingency tables of ordered rows and therefore tothree sequences of multinomial observations.Section 2 explores these sequences graphically, observing a clear shift in their distribution. Section 3describes the problem of the estimation of a suden change-point in those sequences, in the followingsections we propose various ways to estimate change-points in multinomial sequences; the methodin section 4 involves fitting models for polytomous data, the one in Section 5 fits gamma modelsonto the sequence of Chi-square distances between each row profiles and the average profile, theone in Section 6 fits models onto the sequence of values taken by the first component of thecorrespondence analysis as well as onto sequences of other summary measures like the averageword length. In Section 7 we fit models onto the marginal binomial sequences to identify thefeatures that distinguish the chapters before and after that boundary. Most methods rely heavilyon the use of generalized linear models

Local WMR navigation with monocular data

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This article presents recent WMR (wheeled mobile robot) navigation experiences using local perception knowledge provided by monocular and odometer systems. A local narrow perception horizon is used to plan safety trajectories towards the objective. Therefore, monocular data are proposed as a way to obtain real time local information by building two dimensional occupancy grids through a time integration of the frames. The path planning is accomplished by using attraction potential fields, while the trajectory tracking is performed by using model predictive control techniques. The results are faced to indoor situations by using the lab available platform consisting in a differential driven mobile robot

Validation of order rank scales based on compositional data analysis: a proposal

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Usually, psychometricians apply classical factorial analysis to evaluate construct validity of order rankscales. Nevertheless, these scales have particular characteristics that must be taken into account: totalscores and rank are highly relevant

Assessing the Precision of Compositional Data in a Stratified Double Stage Cluster Sample: Application to the Swiss Earnings Structure Survey

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Precision of released figures is not only an important quality feature of official statistics,it is also essential for a good understanding of the data. In this paper we show a casestudy of how precision could be conveyed if the multivariate nature of data has to betaken into account. In the official release of the Swiss earnings structure survey, the totalsalary is broken down into several wage components. We follow Aitchison's approachfor the analysis of compositional data, which is based on logratios of components. Wefirst present diferent multivariate analyses of the compositional data whereby the wagecomponents are broken down by economic activity classes. Then we propose a numberof ways to assess precision

Schistosomiasis risk mapping in the state of Minas Gerais, Brazil, using a decision tree approach, remote sensing data and sociological indicators

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Schistosomiasis mansoni is not just a physical disease, but is related to social and behavioural factors as well. Snails of the Biomphalaria genus are an intermediate host for Schistosoma mansoni and infect humans through water. The objective of this study is to classify the risk of schistosomiasis in the state of Minas Gerais (MG). We focus on socioeconomic and demographic features, basic sanitation features, the presence of accumulated water bodies, dense vegetation in the summer and winter seasons and related terrain characteristics. We draw on the decision tree approach to infection risk modelling and mapping. The model robustness was properly verified. The main variables that were selected by the procedure included the terrain's water accumulation capacity, temperature extremes and the Human Development Index. In addition, the model was used to generate two maps, one that included risk classification for the entire of MG and another that included classification errors. The resulting map was 62.9% accurate.

Why Public Health Agencies Cannot Depend on Good Laboratory Practices as a Criterion for Selecting Data: The Case of Bisphenol A

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In their safety evaluations of bisphenol A (BPA), the U.S. Food and Drug Administration (FDA) and a counterpart in Europe, the European Food Safety Authority (EFSA), have given special prominence to two industry-funded studies that adhered to standards defined by Good Laboratory Practices (GLP). These same agencies have given much less weight in risk assessments to a large number of independently replicated non-GLP studies conducted with government funding by the leading experts in various fields of science from around the world. OBJECTIVES: We reviewed differences between industry-funded GLP studies of BPA conducted by commercial laboratories for regulatory purposes and non-GLP studies conducted in academic and government laboratories to identify hazards and molecular mechanisms mediating adverse effects. We examined the methods and results in the GLP studies that were pivotal in the draft decision of the U.S. FDA declaring BPA safe in relation to findings from studies that were competitive for U.S. National Institutes of Health (NIH) funding, peer-reviewed for publication in leading journals, subject to independent replication, but rejected by the U.S. FDA for regulatory purposes. DISCUSSION: Although the U.S. FDA and EFSA have deemed two industry-funded GLP studies of BPA to be superior to hundreds of studies funded by the U.S. NIH and NIH counterparts in other countries, the GLP studies on which the agencies based their decisions have serious conceptual and methodologic flaws. In addition, the U.S. FDA and EFSA have mistakenly assumed that GLP yields valid and reliable scientific findings (i.e., "good science"). Their rationale for favoring GLP studies over hundreds of publically funded studies ignores the central factor in determining the reliability and validity of scientific findings, namely, independent replication, and use of the most appropriate and sensitive state-of-the-art assays, neither of which is an expectation of industry-funded GLP research. CONCLUSIONS: Public health decisions should be based on studies using appropriate protocols with appropriate controls and the most sensitive assays, not GLP. Relevant NIH-funded research using state-of-the-art techniques should play a prominent role in safety evaluations of chemicals.

Reliability and validity of physiological data obtained within a cycle-run transition test in age-group triathletes.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This study examined the validity and reliability of a sequential "Run-Bike-Run" test (RBR) in age-group triathletes. Eight Olympic distance (OD) specialists (age 30.0 ± 2.0 years, mass 75.6 ± 1.6 kg, run VO2max 63.8 ± 1.9 ml· kg(-1)· min(-1), cycle VO2peak 56.7 ± 5.1 ml· kg(-1)· min(-1)) performed four trials over 10 days. Trial 1 (TRVO2max) was an incremental treadmill running test. Trials 2 and 3 (RBR1 and RBR2) involved: 1) a 7-min run at 15 km· h(-1) (R1) plus a 1-min transition to 2) cycling to fatigue (2 W· kg(-1) body mass then 30 W each 3 min); 3) 10-min cycling at 3 W· kg(-1) (Bsubmax); another 1-min transition and 4) a second 7-min run at 15 km· h(-1) (R2). Trial 4 (TT) was a 30-min cycle - 20-min run time trial. No significant differences in absolute oxygen uptake (VO2), heart rate (HR), or blood lactate concentration ([BLA]) were evidenced between RBR1 and RBR2. For all measured physiological variables, the limits of agreement were similar, and the mean differences were physiologically unimportant, between trials. Low levels of test-retest error (i.e. ICC <0.8, CV<10%) were observed for most (logged) measurements. However [BLA] post R1 (ICC 0.87, CV 25.1%), [BLA] post Bsubmax (ICC 0.99, CV 16.31) and [BLA] post R2 (ICC 0.51, CV 22.9%) were least reliable. These error ranges may help coaches detect real changes in training status over time. Moreover, RBR test variables can be used to predict discipline specific and overall TT performance. Cycle VO2peak, cycle peak power output, and the change between R1 and R2 (deltaR1R2) in [BLA] were most highly related to overall TT distance (r = 0.89, p < 0. 01; r = 0.94, p < 0.02; r = 0.86, p < 0.05, respectively). The percentage of TR VO2max at 15 km· h(-1), and deltaR1R2 HR, were also related to run TT distance (r = -0.83 and 0.86, both p < 0.05).

Lung function in idiopathic pulmonary fibrosis--extended analyses of the IFIGENIA trial.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

BACKGROUND The randomized placebo-controlled IFIGENIA-trial demonstrated that therapy with high-dose N-acetylcysteine (NAC) given for one year, added to prednisone and azathioprine, significantly ameliorates (i.e. slows down) disease progression in terms of vital capacity (VC) (+9%) and diffusing capacity (DLco) (+24%) in idiopathic pulmonary fibrosis (IPF). To better understand the clinical implications of these findings we performed additional, explorative analyses of the IFGENIA data set. METHODS We analysed effects of NAC on VC, DLco, a composite physiologic index (CPI), and mortality in the 155 study-patients. RESULTS In trial completers the functional indices did not change significantly with NAC, whereas most indices deteriorated with placebo; in non-completers the majority of indices worsened but decline was generally less pronounced in most indices with NAC than with placebo. Most categorical analyses of VC, DLco and CPI also showed favourable changes with NAC. The effects of NAC on VC, DLco and CPI were significantly better if the baseline CPI was 50 points or lower. CONCLUSION This descriptive analysis confirms and extends the favourable effects of NAC on lung function in IPF and emphasizes the usefulness of VC, DLco, and the CPI for the evaluation of a therapeutic effect. Most importantly, less progressed disease as indicated by a CPI of 50 points or lower at baseline was more responsive to therapy in this study.

A compositional approach to stable isotope data analysis

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Isotopic data are currently becoming an important source of information regardingsources, evolution and mixing processes of water in hydrogeologic systems. However, itis not clear how to treat with statistics the geochemical data and the isotopic datatogether. We propose to introduce the isotopic information as new parts, and applycompositional data analysis with the resulting increased composition. Results areequivalent to downscale the classical isotopic delta variables, because they are alreadyrelative (as needed in the compositional framework) and isotopic variations are almostalways very small. This methodology is illustrated and tested with the study of theLlobregat River Basin (Barcelona, NE Spain), where it is shown that, though verysmall, isotopic variations comp lement geochemical principal components, and help inthe better identification of pollution sources

CODAPACK3D. A new version of Compositional Data Package

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The statistical analysis of compositional data should be treated using logratios of parts,which are difficult to use correctly in standard statistical packages. For this reason afreeware package, named CoDaPack was created. This software implements most of thebasic statistical methods suitable for compositional data.In this paper we describe the new version of the package that now is calledCoDaPack3D. It is developed in Visual Basic for applications (associated with Excel©),Visual Basic and Open GL, and it is oriented towards users with a minimum knowledgeof computers with the aim at being simple and easy to use.This new version includes new graphical output in 2D and 3D. These outputs could bezoomed and, in 3D, rotated. Also a customization menu is included and outputs couldbe saved in jpeg format. Also this new version includes an interactive help and alldialog windows have been improved in order to facilitate its use.To use CoDaPack one has to access Excel© and introduce the data in a standardspreadsheet. These should be organized as a matrix where Excel© rows correspond tothe observations and columns to the parts. The user executes macros that returnnumerical or graphical results. There are two kinds of numerical results: new variablesand descriptive statistics, and both appear on the same sheet. Graphical output appearsin independent windows. In the present version there are 8 menus, with a total of 38submenus which, after some dialogue, directly call the corresponding macro. Thedialogues ask the user to input variables and further parameters needed, as well aswhere to put these results. The web site http://ima.udg.es/CoDaPack contains thisfreeware package and only Microsoft Excel© under Microsoft Windows© is required torun the software.Kew words: Compositional data Analysis, Software

Adaptability in action : Using personality, interest, and values data to help clients increase their emotional, social, and cognitive career meta-capacities

Relevância:

20.00% 20.00%

Publicador:

Compositional data and Simpson's paradox

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Simpson's paradox, also known as amalgamation or aggregation paradox, appears whendealing with proportions. Proportions are by construction parts of a whole, which canbe interpreted as compositions assuming they only carry relative information. TheAitchison inner product space structure of the simplex, the sample space of compositions, explains the appearance of the paradox, given that amalgamation is a nonlinearoperation within that structure. Here we propose to use balances, which are specificelements of this structure, to analyse situations where the paradox might appear. Withthe proposed approach we obtain that the centre of the tables analysed is a naturalway to compare them, which avoids by construction the possibility of a paradox.Key words: Aitchison geometry, geometric mean, orthogonal projection

A Homogeneous Set-Theoretical Frame for Clustering Fuzzy Relational Data

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Our purpose is to provide a set-theoretical frame to clustering fuzzy relational data basically based on cardinality of the fuzzy subsets that represent objects and their complementaries, without applying any crisp property. From this perspective we define a family of fuzzy similarity indexes which includes a set of fuzzy indexes introduced by Tolias et al, and we analyze under which conditions it is defined a fuzzy proximity relation. Following an original idea due to S. Miyamoto we evaluate the similarity between objects and features by means the same mathematical procedure. Joining these concepts and methods we establish an algorithm to clustering fuzzy relational data. Finally, we present an example to make clear all the process

«
1
2
...
59
60
61
62
63
64
65
66
67
»