868 results for data analysis software
Abstract:
As GIS (geographic information systems) become more common and more user-friendly, WM-data has seen that customers would be interested in linking information from their operations to a map view. This makes it easier to take in how the information is distributed geographically across an area, for example in order to arrange more efficient transports. WM-data, for whom this work was carried out, intends to produce a prototype that can be shown to customers and other stakeholders to demonstrate that this is feasible by integrating already existing systems. In this work the prototype was developed with the forest industry and its inventories as the focus. The existing programs to be integrated are both web-based and run in a web browser. The analysis program is called Insikt and is developed by the company Trimma; the map program is called GIMS and is WM-data's own product. It should be possible to analyse data in Insikt and create a report. The report is then sent to GIMS, where the information is drawn on the map at the location to which each item belongs. It should also be possible to select one or more areas in the map and send them to Insikt in order to analyse information from only the selected areas. A prototype with the desired functionality was produced during the work, but some work remains before it is a sellable product. The prototype has been shown to a number of interested parties, who found it interesting and believe it could be used widely in many fields.
Abstract:
In this second article, statistical ideas are extended to the problem of testing whether there is a true difference between two samples of measurements. First, it will be shown that the difference between the means of two samples comes from a population of such differences which is normally distributed. Second, the 't' distribution, one of the most important in statistics, will be applied to a test of the difference between two means using a simple data set drawn from a clinical experiment in optometry. Third, in making a t-test, a statistical judgement is made as to whether there is a significant difference between the means of two samples. Before the widespread use of statistical software, this judgement was made with reference to a statistical table. Even if such tables are not used, it is useful to understand their logical structure and how to use them. Finally, the analysis of data that are known to depart significantly from the normal distribution will be described.
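The test described in this abstract can be illustrated with a minimal sketch, assuming the equal-variance (pooled) form of the two-sample t-test. The measurements below are invented for illustration and are not the optometry data set from the article; the critical value is the standard two-tailed 5% table entry for 8 degrees of freedom.

```python
import math
from statistics import mean, variance


def pooled_t_statistic(sample_a, sample_b):
    """Return (t, degrees_of_freedom) for an equal-variance two-sample t-test."""
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2
    # Pooled variance: the two sample variances weighted by their degrees of freedom.
    pooled_var = ((n_a - 1) * variance(sample_a) + (n_b - 1) * variance(sample_b)) / df
    # Standard error of the difference between the two means.
    std_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(sample_a) - mean(sample_b)) / std_error, df


# Hypothetical measurements, invented for illustration only.
group_a = [5.1, 4.9, 5.6, 5.8, 6.0]
group_b = [4.2, 4.5, 4.8, 4.1, 4.9]

t_value, df = pooled_t_statistic(group_a, group_b)

# Two-tailed critical value of t for df = 8 at the 5% level, as read from a t table.
T_CRITICAL_DF8 = 2.306
significant = abs(t_value) > T_CRITICAL_DF8
```

Comparing the absolute value of t against the tabulated critical value mirrors the table-based judgement the abstract mentions; statistical software would report a p-value instead.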
Abstract:
Multiple regression analysis is a complex statistical method with many potential uses. It has also become one of the most abused of all statistical procedures since anyone with a database and suitable software can carry it out. An investigator should always have a clear hypothesis in mind before carrying out such a procedure and knowledge of the limitations of each aspect of the analysis. In addition, multiple regression is probably best used in an exploratory context, identifying variables that might profitably be examined by more detailed studies. Where there are many variables potentially influencing Y, they are likely to be intercorrelated and to account for relatively small amounts of the variance. Any analysis in which R squared is less than 50% should be suspect as probably not indicating the presence of significant variables. A further problem relates to sample size. It is often stated that the number of subjects or patients must be at least 5-10 times the number of variables included in the study [5]. This advice should be taken only as a rough guide but it does indicate that the variables included should be selected with great care as inclusion of an obviously unimportant variable may have a significant impact on the sample size required.
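The two rules of thumb in this abstract can be made concrete with a small sketch. It assumes R squared is computed in the usual way as one minus the ratio of the residual to the total sum of squares; the function names and the example values are hypothetical and not taken from the article.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


def minimum_subjects(n_variables, per_variable=(5, 10)):
    """Rough lower and upper guide for sample size: 5 to 10 subjects per variable."""
    low, high = per_variable
    return n_variables * low, n_variables * high


# Hypothetical observed values of Y and the values predicted by a fitted model.
observed = [2.0, 4.0, 6.0, 8.0]
predicted = [2.5, 3.5, 6.5, 7.5]

fit = r_squared(observed, predicted)
suspect = fit < 0.5                    # the abstract's 50% rule of thumb
size_guide = minimum_subjects(6)       # guide for a study with six variables
```

The second function shows why each added variable matters: under this guide, one extra variable raises the suggested sample size by 5 to 10 subjects.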
Abstract:
With the latest developments in computer science, multivariate data analysis methods have become increasingly popular among economists. Pattern recognition in complex economic data and empirical model construction can be more straightforward with proper application of modern software. However, despite the appealing simplicity of some popular software packages, the interpretation of data analysis results requires strong theoretical knowledge. This book aims at combining the development of both theoretical and application-related data analysis knowledge. The text is designed for advanced level studies and assumes acquaintance with elementary statistical terms. After a brief introduction to selected mathematical concepts, the highlighting of selected model features is followed by a practice-oriented introduction to the interpretation of SPSS outputs for the described data analysis methods. Learning data analysis is usually time-consuming and requires effort, but with tenacity the learning process can bring about a significant improvement of individual data analysis skills.
Abstract:
A substantial amount of information on the Internet is present in the form of text. The value of this semi-structured and unstructured data has been widely acknowledged, with consequent scientific and commercial exploitation. The ever-increasing data production, however, pushes data analytic platforms to their limit. This thesis proposes techniques for more efficient textual big data analysis suitable for the Hadoop analytic platform. This research explores the direct processing of compressed textual data. The focus is on developing novel compression methods with a number of desirable properties to support text-based big data analysis in distributed environments. The novel contributions of this work include the following. Firstly, a Content-aware Partial Compression (CaPC) scheme is developed. CaPC makes a distinction between informational and functional content in which only the informational content is compressed. Thus, the compressed data is made transparent to existing software libraries which often rely on functional content to work. Secondly, a context-free bit-oriented compression scheme (Approximated Huffman Compression) based on the Huffman algorithm is developed. This uses a hybrid data structure that allows pattern searching in compressed data in linear time. Thirdly, several modern compression schemes have been extended so that the compressed data can be safely split with respect to logical data records in distributed file systems. Furthermore, an innovative two layer compression architecture is used, in which each compression layer is appropriate for the corresponding stage of data processing. Peripheral libraries are developed that seamlessly link the proposed compression schemes to existing analytic platforms and computational frameworks, and also make the use of the compressed data transparent to developers. The compression schemes have been evaluated for a number of standard MapReduce analysis tasks using a collection of real-world datasets. 
In comparison with existing solutions, they have shown substantial improvement in performance and significant reduction in system resource requirements.
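For readers unfamiliar with the algorithm that the thesis builds on, the following is a minimal sketch of classical Huffman coding. It is not the Approximated Huffman Compression scheme or CaPC described in the abstract, whose details are not given here; all function names are hypothetical, and bit strings are used instead of packed bits to keep the example short.

```python
import heapq
from collections import Counter


def build_codes(text):
    """Return a dict mapping each symbol in text to its Huffman bit string."""
    freq = Counter(text)
    if not freq:
        return {}
    if len(freq) == 1:
        # A single distinct symbol still needs a one-bit code.
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, tie_breaker, {symbol: code_so_far}).
    # The tie_breaker keeps the dicts from ever being compared.
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        # Merging two subtrees prefixes their codes with 0 and 1 respectively.
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]


def encode(text, codes):
    """Concatenate the code of every symbol in text."""
    return "".join(codes[ch] for ch in text)


def decode(bits, codes):
    """Decode a bit string; works because Huffman codes are prefix-free."""
    inverse = {c: s for s, c in codes.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:
            out.append(inverse[current])
            current = ""
    return "".join(out)


sample = "abracadabra"
codes = build_codes(sample)
bits = encode(sample, codes)
restored = decode(bits, codes)
```

Note that decoding here walks the bit stream from the start, which is exactly the limitation that searchable and splittable compression schemes such as those in the thesis aim to work around.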
Abstract:
This project, as part of a broader Sustainable Sub-divisions research agenda, addresses the role of natural ventilation in reducing the use of energy required to cool dwellings
Abstract:
This report provides an introduction to our analyses of secondary data with respect to violent acts and incidents relating to males living in rural settings in Australia. It clarifies important aspects of our overall approach primarily by concentrating on three elements that required early scoping and resolution. Firstly, a wide and inclusive view of violence which encompasses measures of violent acts and incidents and also data identifying risk-taking behaviour and the consequences of violence is outlined and justified. Secondly, the classification used to make comparisons between the city and the bush together with associated caveats is outlined. The third element discussed is in relation to national injury data. Additional commentary resulting from exploration, examination and analyses of secondary data is published online in five subsequent reports in this series.
Abstract:
This paper explores a method of comparative analysis and classification of data through perceived design affordances. Included is discussion about the musical potential of data forms that are derived through eco-structural analysis of musical features inherent in audio recordings of natural sounds. A system of classification of these forms is proposed based on their structural contours. The classifications include four primitive types: steady, iterative, unstable and impulse. The classification extends previous taxonomies used to describe the gestural morphology of sound. The methods presented are used to provide compositional support for eco-structuralism.
Abstract:
Road agencies require comprehensive, relevant and quality data describing their road assets to support their investment decisions. An investment decision support system for road maintenance and rehabilitation mainly comprises three important supporting elements, namely: road asset data, decision support tools and criteria for decision-making. Probability-based methods have played a crucial role in helping decision makers understand the relationship among road related data, asset performance and uncertainties in estimating budgets/costs for road management investment. This paper presents applications of the probability-based method for road asset management.
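The abstract does not say which probability-based method the paper applies. As one possible illustration of how cost uncertainty can be expressed probabilistically, the sketch below runs a simple Monte Carlo simulation. The road segments, the unit costs and the choice of a triangular distribution are all assumptions made for this example.

```python
import random


def simulate_budget(segments, n_trials=10_000, seed=0):
    """Return sorted simulated total costs.

    Each segment is (length_km, low, mode, high), where the last three values
    describe an assumed triangular distribution of cost per kilometre.
    """
    rng = random.Random(seed)
    totals = []
    for _ in range(n_trials):
        total = 0.0
        for length_km, low, mode, high in segments:
            # random.triangular takes its arguments in the order (low, high, mode).
            total += length_km * rng.triangular(low, high, mode)
        totals.append(total)
    totals.sort()
    return totals


def percentile(sorted_values, p):
    """Value at fraction p (0 to 1) of an already sorted list."""
    index = min(len(sorted_values) - 1, int(p * len(sorted_values)))
    return sorted_values[index]


# Hypothetical network: two segments with uncertain unit costs.
segments = [(10, 20, 25, 40), (5, 30, 35, 60)]
totals = simulate_budget(segments)
median_cost = percentile(totals, 0.50)
p95_cost = percentile(totals, 0.95)
```

Reporting a high percentile alongside the median gives a decision maker a budget figure that is unlikely to be exceeded, rather than a single point estimate.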
Abstract:
Now in its second edition, this book describes tools that are commonly used in transportation data analysis. The first part of the text provides statistical fundamentals while the second part presents continuous dependent variable models. With a focus on count and discrete dependent variable models, the third part features new chapters on mixed logit models, logistic regression, and ordered probability models. The last section provides additional coverage of Bayesian statistical modeling, including Bayesian inference and Markov chain Monte Carlo methods. Data sets are available online to use with the modeling techniques discussed.
Abstract:
BACKGROUND: In Bangladesh, poor infant and young child feeding practices are contributing to the burden of infectious diseases and malnutrition. OBJECTIVE: To estimate the determinants of selected feeding practices and key indicators of breastfeeding and complementary feeding in Bangladesh. METHODS: The sample included 2482 children aged 0 to 23 months from the Bangladesh Demographic and Health Survey of 2004. The World Health Organization (WHO)-recommended infant and young child feeding indicators were estimated, and selected feeding indicators were examined against a set of individual-, household-, and community-level variables using univariate and multivariate analyses. RESULTS: Only 27.5% of mothers initiated breastfeeding within the first hour after birth, 99.9% had ever breastfed their infants, 97.3% were currently breastfeeding, and 22.4% were currently bottle-feeding. Among infants under 6 months of age, 42.5% were exclusively breastfed, and among those aged 6 to 9 months, 62.3% received complementary foods in addition to breastmilk. Among the risk factors for an infant not being exclusively breastfed were higher socioeconomic status, higher maternal education, and living in the Dhaka region. Higher birth order and female sex were associated with increased rates of exclusive breastfeeding of infants under 6 months of age. The risk factors for bottle-feeding were similar and included having a partner with a higher educational level (OR = 2.17), older maternal age (OR for age ≥ 35 years = 2.32), and being in the upper wealth quintiles (OR for the richest = 3.43). Urban mothers were at higher risk for not initiating breastfeeding within the first hour after birth (OR = 1.61). Those who made three to six visits to the antenatal clinic were at lower risk for not initiating breastfeeding within the first hour (OR = 0.61). The rate of initiating breastfeeding within the first hour was higher in mothers from richer households (OR = 0.37).
CONCLUSIONS: Most breastfeeding indicators in Bangladesh were below acceptable levels. Breastfeeding promotion programs in Bangladesh need nationwide application because of the low rates of appropriate infant feeding indicators, but they should also target women who have the main risk factors, i.e., working mothers living in urban areas (particularly in Dhaka).
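The odds ratios quoted in this abstract come from the study's own univariate and multivariate analyses. As a simplified illustration of what an odds ratio measures, the sketch below computes an unadjusted odds ratio and an approximate 95% confidence interval (Woolf's log method) from a 2x2 table. The counts are invented for illustration and are not survey data.

```python
import math


def odds_ratio(exposed_cases, exposed_noncases, unexposed_cases, unexposed_noncases):
    """Return (odds_ratio, lower_95, upper_95) for a 2x2 table.

    The interval uses the normal approximation on the log odds ratio,
    so every cell count must be greater than zero.
    """
    a, b, c, d = exposed_cases, exposed_noncases, unexposed_cases, unexposed_noncases
    ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(ratio) - 1.96 * se_log)
    upper = math.exp(math.log(ratio) + 1.96 * se_log)
    return ratio, lower, upper


# Hypothetical counts: bottle-feeding (cases) by urban (exposed) versus rural residence.
ratio, lower, upper = odds_ratio(40, 60, 20, 80)
```

An odds ratio above 1 with a confidence interval that excludes 1 suggests the outcome is more likely in the exposed group; a multivariate model, as used in the study, additionally adjusts for other variables.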
Abstract:
Background: Poor feeding practices in early childhood contribute to the burden of childhood malnutrition and morbidity. Objective: To estimate the key indicators of breastfeeding and complementary feeding and the determinants of selected feeding practices in Sri Lanka. Methods: The sample consisted of 1,127 children aged 0 to 23 months from the Sri Lanka Demographic and Health Survey 2000. The key infant feeding indicators were estimated and selected indicators were examined against a set of individual-, household-, and community-level variables using univariate and multivariate analyses. Results: Breastfeeding was initiated within the first hour after birth in 56.3% of infants, 99.7% had ever been breastfed, 85.0% were currently being breastfed, and 27.2% were being bottle-fed. Of infants under 6 months of age, 60.6% were fully breastfed, and of those aged 6 to 9 months, 93.4% received complementary foods. The likelihood of not initiating breastfeeding within the first hour after birth was higher for mothers who underwent cesarean delivery (OR = 3.23) and those who were not visited by a Public Health Midwife at home during pregnancy (OR = 1.81). The rate of full breastfeeding was significantly lower among mothers who did not receive postnatal home visits by a Public Health Midwife. Bottle-feeding rates were higher among infants whose mothers had ever been employed (OR = 1.86), lived in a metropolitan area (OR = 3.99), or lived in the South-Central Hill country (OR = 3.11) and were lower among infants of mothers with secondary education (OR = 0.27). Infants from the urban (OR = 8.06) and tea estate (OR = 12.63) sectors were less likely to receive timely complementary feeding than rural infants. Conclusions: Antenatal and postnatal contacts with Public Health Midwives were associated with improved breastfeeding practices. Breastfeeding promotion strategies should specifically focus on the estate and urban or metropolitan communities.