Biblioteca Digital

927 results for Statistical packages

A semantic modelling approach to knowledge based statistical software

Relevance: 70.00%

Abstract:

The topic of this thesis is the development of knowledge based statistical software. The shortcomings of conventional statistical packages are discussed to illustrate the need to develop software which is able to exhibit a greater degree of statistical expertise, thereby reducing the misuse of statistical methods by those not well versed in the art of statistical analysis. Some of the issues involved in the development of knowledge based software are presented and a review is given of some of the systems that have been developed so far. The majority of these have moved away from conventional architectures by adopting what can be termed an expert systems approach. The thesis then proposes an approach which is based upon the concept of semantic modelling. By representing some of the semantic meaning of data, it is conceived that a system could examine a request to apply a statistical technique and check whether the use of the chosen technique is semantically sound, i.e. whether the results obtained will be meaningful. Current systems, in contrast, can only perform what can be considered syntactic checks. The prototype system that has been implemented to explore the feasibility of such an approach is presented; it has been designed as an enhanced variant of a conventional style statistical package. This involved developing a semantic data model to represent some of the statistically relevant knowledge about data and identifying sets of requirements that should be met for the application of the statistical techniques to be valid. The areas of statistics covered in the prototype are measures of association and tests of location.
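
The abstract does not describe the thesis's actual data model, so the following is only a minimal sketch of what a semantic (as opposed to syntactic) check could look like. It assumes that each variable carries a measurement scale and that each technique has a minimum required scale. The names `SCALE_ORDER`, `MIN_SCALE` and `check_request`, and the requirements table itself, are hypothetical.

```python
# Measurement scales ordered from weakest to strongest (an assumed encoding).
SCALE_ORDER = {"nominal": 0, "ordinal": 1, "interval": 2, "ratio": 3}

# Hypothetical minimum scale required by a few techniques.
MIN_SCALE = {
    "pearson_correlation": "interval",
    "spearman_correlation": "ordinal",
    "t_test": "interval",
    "median_test": "ordinal",
}


def check_request(technique, variables, scales):
    """Return a list of semantic problems; an empty list means no objection."""
    required = MIN_SCALE[technique]
    problems = []
    for name in variables:
        if SCALE_ORDER[scales[name]] < SCALE_ORDER[required]:
            problems.append(
                f"{name} is {scales[name]}; {technique} needs at least {required}"
            )
    return problems


# A numerically coded postcode passes a syntactic check (it is a number)
# but fails this semantic one.
scales = {"postcode": "nominal", "income": "ratio", "rank": "ordinal"}
print(check_request("pearson_correlation", ["postcode", "income"], scales))
print(check_request("spearman_correlation", ["rank", "income"], scales))
```

A real system of the kind described would need a much richer model than a single scale attribute; the sketch only shows where such a check sits relative to the request.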

Creation of software for evaluating the quality of analytical results generated in the laboratory

Relevance: 60.00%

Abstract:

Method validation is one of the fundamental pillars of quality assurance in analytical laboratories, as reflected in the ISO/IEC 17025 standard. It is therefore a topic that must be addressed in the curricula of current and future degrees in Chemistry. There is a large body of literature on method validation, but it is often little used because of the evident difficulty of processing all the available information and applying it to the laboratory and to specific problems. Another limitation in this field is the lack of software adapted to the needs of the laboratory. Many of the statistical routines used in method validation are adaptations built in Microsoft Excel or come bundled in very large statistical packages with a high degree of complexity. For this reason, the aim of the project was to produce software for method validation and quality assurance of analytical results that incorporates only the necessary routines. Specifically, the software includes the statistical functions needed to verify the accuracy and assess the precision of an analytical method. The programming language chosen was Java, version 6. Software development consisted of the following stages: requirements gathering, requirements analysis, modular design of the software, programming of the program's functions and of the graphical interface, creation of integration tests and testing with real users, and, finally, deployment of the software (creation of the installer and distribution of the software).

New Features of CoDaPack. An Userfriendly Compositional Data Package

Relevance: 60.00%

Abstract:

The statistical analysis of compositional data is commonly used in geological studies. As is well-known, compositions should be treated using logratios of parts, which are difficult to use correctly in standard statistical packages. In this paper we describe the new features of our freeware package, named CoDaPack, which implements most of the basic statistical methods suitable for compositional data. An example using real data is presented to illustrate the use of the package.
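
To illustrate the logratio idea mentioned in the abstract, the sketch below computes the centred log-ratio (clr) transform, one common logratio transformation, in plain Python. It is not CoDaPack code; the function name `clr` and the sample composition are assumptions made for illustration.

```python
import math


def clr(parts):
    """Centred log-ratio transform of one composition (all parts must be > 0)."""
    if any(p <= 0 for p in parts):
        raise ValueError("clr is only defined for strictly positive parts")
    logs = [math.log(p) for p in parts]
    mean_log = sum(logs) / len(logs)  # log of the geometric mean of the parts
    return [value - mean_log for value in logs]


# Hypothetical three-part composition, given once as proportions and once
# as percentages; the clr coordinates do not depend on the closure constant.
as_proportions = clr([0.2, 0.3, 0.5])
as_percentages = clr([20.0, 30.0, 50.0])
print(as_proportions)
print(as_percentages)
```

The clr coordinates sum to zero by construction, which is one reason standard packages that assume unconstrained variables need care when applied to them.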

CODAPACK3D. A new version of Compositional Data Package

Relevance: 60.00%

Abstract:

The statistical analysis of compositional data should be treated using logratios of parts, which are difficult to use correctly in standard statistical packages. For this reason a freeware package, named CoDaPack, was created. This software implements most of the basic statistical methods suitable for compositional data. In this paper we describe the new version of the package, which is now called CoDaPack3D. It is developed in Visual Basic for Applications (associated with Excel©), Visual Basic and OpenGL, and it is oriented towards users with a minimum knowledge of computers, with the aim of being simple and easy to use. This new version includes new graphical output in 2D and 3D. These outputs can be zoomed and, in 3D, rotated. A customization menu is also included and outputs can be saved in jpeg format. The new version also includes an interactive help, and all dialog windows have been improved in order to facilitate its use. To use CoDaPack one has to access Excel© and introduce the data in a standard spreadsheet. These should be organized as a matrix where Excel© rows correspond to the observations and columns to the parts. The user executes macros that return numerical or graphical results. There are two kinds of numerical results: new variables and descriptive statistics, and both appear on the same sheet. Graphical output appears in independent windows. In the present version there are 8 menus, with a total of 38 submenus which, after some dialogue, directly call the corresponding macro. The dialogues ask the user to input variables and further parameters needed, as well as where to put these results. The web site http://ima.udg.es/CoDaPack contains this freeware package and only Microsoft Excel© under Microsoft Windows© is required to run the software. Keywords: Compositional data Analysis, Software

Incongruence between test statistics and P values in medical papers

Relevance: 60.00%

Abstract:

Given an observed test statistic and its degrees of freedom, one may compute the observed P value with most statistical packages. It is unknown to what extent test statistics and P values are congruent in published medical papers. Methods: We checked the congruence of statistical results reported in all the papers of volumes 409–412 of Nature (2001) and a random sample of 63 results from volumes 322–323 of BMJ (2001). We also tested whether the frequencies of the last digit of a sample of 610 test statistics deviated from a uniform distribution (i.e., equally probable digits). Results: 11.6% (21 of 181) and 11.1% (7 of 63) of the statistical results published in Nature and BMJ respectively during 2001 were incongruent, probably mostly due to rounding, transcription, or type-setting errors. At least one such error appeared in 38% and 25% of the papers of Nature and BMJ, respectively. In 12% of the cases, the significance level might change one or more orders of magnitude. The frequencies of the last digit of statistics deviated from the uniform distribution and suggested digit preference in rounding and reporting. Conclusions: this incongruence of test statistics and P values is another example that statistical practice is generally poor, even in the most renowned scientific journals, and that quality of papers should be more controlled and valued.
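
The kind of check described above can be sketched with the Python standard library alone for the two simplest cases, a z statistic and a chi-square statistic with 1 degree of freedom. This is a simplified illustration, not the authors' procedure: the function names are hypothetical, and the congruence rule (agreement after rounding to the reported number of decimals) is an assumption that ignores rounding of the test statistic itself.

```python
import math
from statistics import NormalDist


def p_two_sided_z(z):
    """Two-sided P value for a standard normal test statistic."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def p_chi2_1df(x2):
    """P value for a chi-square statistic with 1 degree of freedom.

    A chi-square variable with 1 df is the square of a standard normal,
    so its upper tail equals the two-sided normal tail at sqrt(x2).
    """
    return p_two_sided_z(math.sqrt(x2))


def is_congruent(reported_p, recomputed_p, decimals):
    """Assumed rule: both values agree once rounded to `decimals` places."""
    return round(recomputed_p, decimals) == round(reported_p, decimals)


# Hypothetical reported results: (statistic, reported P, decimals reported).
for z, reported, decimals in [(1.96, 0.05, 2), (1.96, 0.01, 2)]:
    recomputed = p_two_sided_z(z)
    print(z, reported, round(recomputed, 4), is_congruent(reported, recomputed, decimals))
```

Statistics with other reference distributions (t, F, chi-square with more degrees of freedom) need the corresponding distribution functions from a statistical package.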

The Total Deviation Index estimated by Tolerance Intervals to evaluate the concordance of measurement devices

Relevance: 60.00%

Abstract:

Background: In an agreement assay, it is of interest to evaluate the degree of agreement between the different methods (devices, instruments or observers) used to measure the same characteristic. We propose in this study a technical simplification for inference about the total deviation index (TDI) estimate to assess agreement between two devices of normally-distributed measurements and describe its utility to evaluate inter- and intra-rater agreement if more than one reading per subject is available for each device. Methods: We propose to estimate the TDI by constructing a probability interval of the difference in paired measurements between devices, and thereafter, we derive a tolerance interval (TI) procedure as a natural way to make inferences about probability limit estimates. We also describe how the proposed method can be used to compute bounds of the coverage probability. Results: The approach is illustrated in a real case example where the agreement between two instruments, a handle mercury sphygmomanometer device and an OMRON 711 automatic device, is assessed in a sample of 384 subjects where measures of systolic blood pressure were taken twice by each device. A simulation study procedure is implemented to evaluate and compare the accuracy of the approach to two already established methods, showing that the TI approximation produces accurate empirical confidence levels which are reasonably close to the nominal confidence level. Conclusions: The method proposed is straightforward since the TDI estimate is derived directly from a probability interval of a normally-distributed variable in its original scale, without further transformations. Thereafter, a natural way of making inferences about this estimate is to derive the appropriate TI. Constructions of TI based on normal populations are implemented in most standard statistical packages, thus making it simpler for any practitioner to implement our proposal to assess agreement.
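
As a rough illustration of the quantity involved, the sketch below computes a plug-in TDI under normality: the smallest bound k such that a proportion p of paired differences D satisfies |D| <= k when D is normal with a given mean and standard deviation. It covers only the point estimate, not the tolerance interval inference proposed in the paper. The function name `tdi_normal`, the bisection approach and the example values are assumptions.

```python
from statistics import NormalDist


def tdi_normal(mu, sigma, p=0.90, iterations=100):
    """Bound k with P(|D| <= k) = p for D ~ Normal(mu, sigma), by bisection."""
    dist = NormalDist(mu, sigma)

    def coverage(k):
        # Probability that the difference falls inside [-k, k].
        return dist.cdf(k) - dist.cdf(-k)

    low, high = 0.0, abs(mu) + 10.0 * sigma  # coverage(high) is essentially 1
    for _ in range(iterations):
        mid = (low + high) / 2.0
        if coverage(mid) < p:
            low = mid
        else:
            high = mid
    return (low + high) / 2.0


# Hypothetical summary of paired differences between two devices (mmHg).
mean_diff, sd_diff = 1.5, 6.0
print(tdi_normal(mean_diff, sd_diff, p=0.90))
```

With a zero mean difference the result reduces to the familiar normal quantile times the standard deviation, which gives a quick sanity check on the routine.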

An introduction to exploratory analysis of multivariate data

Relevance: 60.00%

Abstract:

The modern technological ability to handle large amounts of information confronts the chemist with the necessity to re-evaluate the statistical tools he routinely uses. Multivariate statistics furnishes theoretical bases for analyzing systems involving large numbers of variables. The mathematical calculations required for these systems are no longer an obstacle due to the existence of statistical packages that furnish multivariate analysis options. Here the basic concepts of two multivariate statistical techniques that have received broad acceptance for treating chemical data, principal component analysis and hierarchical cluster analysis, are discussed.

The R statistical environment: advantages of its use in teaching and research.

Relevance: 60.00%

Abstract:

Abstract based on that of the publication

Variance estimation with highly stratified sampling designs with unequal probabilities

Relevance: 60.00%

Abstract:

It is common practice to design a survey with a large number of strata. However, in this case the usual techniques for variance estimation can be inaccurate. This paper proposes a variance estimator for estimators of totals. The method proposed can be implemented with standard statistical packages without any specific programming, as it involves simple techniques of estimation, such as regression fitting.

Variance estimation with Chao's sampling scheme

Relevance: 60.00%

Abstract:

We show that the Hájek (Ann. Math Statist. (1964) 1491) variance estimator can be used to estimate the variance of the Horvitz–Thompson estimator when the Chao sampling scheme (Chao, Biometrika 69 (1982) 653) is implemented. This estimator is simple and can be implemented with any statistical package. We consider a numerical and an analytic method to show that this estimator can be used. A series of simulations supports our findings.
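
The abstract does not give the formula, so the following sketch uses the form in which the Hájek variance estimator is commonly written, with weights (1 - pi_i) and a weighted mean B of the expanded values y_i / pi_i. That form, the function names and the sample data are assumptions to be checked against the paper; the sampling scheme itself is not implemented here.

```python
def horvitz_thompson_total(y, pi):
    """Horvitz-Thompson estimate of a population total."""
    return sum(yi / p for yi, p in zip(y, pi))


def hajek_variance(y, pi):
    """Hajek-type variance estimate for the Horvitz-Thompson total.

    Assumed form: n/(n-1) * sum((1 - pi_i) * (y_i/pi_i - B)**2), where B is
    the (1 - pi_i)-weighted mean of the expanded values y_i/pi_i.
    """
    n = len(y)
    weights = [1.0 - p for p in pi]
    expanded = [yi / p for yi, p in zip(y, pi)]
    b = sum(w * e for w, e in zip(weights, expanded)) / sum(weights)
    return n / (n - 1) * sum(w * (e - b) ** 2 for w, e in zip(weights, expanded))


# Hypothetical sample of 4 units with unequal first-order inclusion probabilities.
y = [12.0, 7.5, 20.0, 3.0]
pi = [0.5, 0.3, 0.8, 0.1]
print(horvitz_thompson_total(y, pi), hajek_variance(y, pi))
```

Only first-order inclusion probabilities are needed, which is why an estimator of this type can be computed in a general-purpose package. With equal probabilities n/N it reduces to the usual simple random sampling formula N^2 (1 - n/N) s^2 / n.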

Comparing diagnostic tests with missing data

Relevance: 60.00%

Abstract:

When missing data occur in studies designed to compare the accuracy of diagnostic tests, a common, though naive, practice is to base the comparison of sensitivity, specificity, as well as of positive and negative predictive values on some subset of the data that fits into methods implemented in standard statistical packages. Such methods are usually valid only under the strong missing completely at random (MCAR) assumption and may generate biased and less precise estimates. We review some models that use the dependence structure of the completely observed cases to incorporate the information of the partially categorized observations into the analysis and show how they may be fitted via a two-stage hybrid process involving maximum likelihood in the first stage and weighted least squares in the second. We indicate how computational subroutines written in R may be used to fit the proposed models and illustrate the different analysis strategies with observational data collected to compare the accuracy of three distinct non-invasive diagnostic methods for endometriosis. The results indicate that even when the MCAR assumption is plausible, the naive partial analyses should be avoided.

Survival study of a cohort of people aged 60 and over in the municipality of Botucatu (SP), Brazil

Relevance: 60.00%

Abstract:

The proportional increase in the number of elderly people in the population has motivated studies aimed at improving the quality of life of this age group through social policies, among them health planning. To identify mortality risks for the population aged sixty and over, a survival study was carried out by tracing, in 1992, the elderly participants of a self-reported morbidity survey conducted in the city of Botucatu in 1983/84. Of these, 89.6% were located. Survival curves were calculated with the Kaplan-Meier method, and risk analysis used Cox multiple regression, fitting the model by adding variables in blocks. For men, the following variable categories were independently associated with increased mortality: age 70 years and over: Hazard Ratio (HR)=2.4 (1.6 - 3.7); salary below one minimum wage: HR=2.2 (1.3 - 3.8); having other income: HR=2.2 (1.3 - 3.9); being the head of the family or the head's spouse: HR=2.3 (1.2 - 2.4); reported diseases of the circulatory system: HR=1.6 (1.1 - 2.4); reported diabetes mellitus: HR=3.0 (1.3 - 7.0). For women, the associated categories were age 70 years and over: HR=4.6 (3.0 - 7.1); reported diabetes mellitus: HR=3.0 (1.7 - 5.3); and having other income: HR=2.0 (1.1 - 4.0).
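
For readers unfamiliar with the Kaplan-Meier method named in the abstract, a minimal product-limit sketch in plain Python follows. The function name `kaplan_meier` and the follow-up data are hypothetical and have nothing to do with the Botucatu cohort; the Cox regression step is not covered.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times:  follow-up time for each subject
    events: 1 if the death was observed at that time, 0 if censored
    Returns a list of (time, survival probability) at each observed death time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all subjects whose follow-up ends at the same time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # deaths and censored subjects both leave the risk set
    return curve


# Hypothetical follow-up in years; 0 marks a subject censored (lost or still alive).
times = [2, 3, 3, 5, 8, 8, 9]
events = [1, 1, 0, 1, 0, 1, 0]
for t, s in kaplan_meier(times, events):
    print(t, round(s, 3))
```

Censored subjects contribute to the risk set up to their last follow-up time and are then removed without lowering the curve, which is what distinguishes this estimate from a simple proportion surviving.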
