954 results for stars: statistics
Abstract:
The European Space Agency's Gaia mission will create the largest and most precise three-dimensional chart of our galaxy (the Milky Way) by providing unprecedented position, parallax, proper motion, and radial velocity measurements for about one billion stars. The resulting catalogue will be made available to the scientific community and will be analyzed in many different ways, including the production of a variety of statistics. The latter will often entail the generation of multidimensional histograms and hypercubes, either as part of the precomputed statistics for each data release or for scientific analysis involving the final data products or the raw data coming from the satellite instruments. In this paper we present and analyze a generic framework that allows hypercube generation to be done easily within a MapReduce infrastructure, providing all the advantages of the new Big Data analysis paradigm but without dealing with any specific interface to the lower-level distributed system implementation (Hadoop). Furthermore, we show how executing the framework with different data storage model configurations (i.e. row- or column-oriented) and compression techniques can considerably improve the response time of this type of workload for the currently available simulated data of the mission. In addition, we put forward the advantages and shortcomings of deploying the framework on a public cloud provider, benchmark it against other popular solutions available (which are not always the best for such ad-hoc applications), and describe some user experiences with the framework, which was employed in a number of dedicated workshops on astronomical data analysis techniques.
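As a rough illustration of the idea of generating a hypercube with a map step and a reduce step, the following minimal Python sketch bins records into cells and merges per-partition counts. It is not the framework described in the abstract and does not use Hadoop; the field names (`parallax`, `pm_ra`), the bin widths, and all function names are assumptions made up for this example.

```python
import math
from collections import Counter
from functools import reduce

# Hypothetical bin width per dimension (illustrative values, not from the paper).
BIN_WIDTHS = {"parallax": 1.0, "pm_ra": 5.0}


def cell_of(star):
    """Return the hypercube cell of one star as a tuple of bin indices."""
    return tuple(math.floor(star[dim] / width) for dim, width in BIN_WIDTHS.items())


def map_partition(stars):
    """Map step: count the stars of one data partition per hypercube cell."""
    return Counter(cell_of(star) for star in stars)


def merge_counts(left, right):
    """Reduce step: merge two partial hypercubes by adding their cell counts."""
    merged = Counter(left)
    merged.update(right)
    return merged


def build_hypercube(partitions):
    """Run the map step on every partition, then reduce the partial results."""
    return reduce(merge_counts, map(map_partition, partitions), Counter())


# Two small partitions standing in for chunks of a distributed catalogue.
partitions = [
    [{"parallax": 0.4, "pm_ra": 2.0}, {"parallax": 0.9, "pm_ra": 4.9}],
    [{"parallax": 1.2, "pm_ra": -3.0}],
]
cube = build_hypercube(partitions)
```

Because the reduce step only adds counts, partial hypercubes can be merged in any order, which is what makes this kind of aggregation a natural fit for a MapReduce-style system.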
Abstract:
This is the statistical portion of the annual survey results of the State Library of Iowa for 1974.
Abstract:
This publication is a historical record of the most requested statistics on vital events and a source of information that can be used in further analysis.
Abstract:
The objective of this paper is to introduce a fourth-order cost function of the displaced frame difference (DFD) capable of estimating motion even for small regions or blocks. Using higher than second-order statistics is appropriate when the image sequence is severely corrupted by additive Gaussian noise. Some results are presented and compared to those obtained from the mean kurtosis and the mean square error of the DFD.
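For orientation, the quantities named in this abstract can be written out with standard textbook definitions. The abstract does not give the paper's actual cost function, so the notation below ($I_t$ for the frame at time $t$, $\mathbf{v}$ for the candidate displacement, $B$ for the block) is an assumption, and the fourth-order cumulant is shown only to indicate why higher-order statistics suit Gaussian noise.

```latex
% Displaced frame difference for a candidate displacement v (assumed notation)
d(\mathbf{x}; \mathbf{v}) = I_t(\mathbf{x}) - I_{t-1}(\mathbf{x} - \mathbf{v})

% Mean square error of the DFD over a block B (second-order criterion)
\mathrm{MSE}(\mathbf{v}) = \frac{1}{|B|} \sum_{\mathbf{x} \in B} d(\mathbf{x}; \mathbf{v})^{2}

% Fourth-order cumulant of a zero-mean random variable d
c_4(d) = \mathrm{E}\!\left[d^{4}\right] - 3\left(\mathrm{E}\!\left[d^{2}\right]\right)^{2}
```

For a zero-mean Gaussian variable $c_4$ is exactly zero, so additive Gaussian noise contributes to a second-order criterion such as the MSE but, in expectation, not to the fourth-order cumulant.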
Abstract:
The topical classification of web portals can be used to identify a user's interests by collecting statistics on his or her browsing habits across categories. This Master's thesis examines the areas of web applications in which the collected statistics can be used for personalization. The general principles of content personalization, Internet advertising, and information retrieval are explained using mathematical models. In addition, the thesis describes the general characteristics of web portals and the issues involved in collecting statistics.
Abstract:
Statistics has become an indispensable tool in biomedical research. Thanks, in particular, to computer science, the researcher has easy access to elementary "classical" procedures. These are often of a "confirmatory" nature: their aim is to test hypotheses (for example the efficacy of a treatment) prior to experimentation. However, doctors often use them in situations more complex than foreseen, to discover interesting data structures and formulate hypotheses. This inverse process may lead to misuse which increases the number of "statistically proven" results in medical publications. The help of a professional statistician thus becomes necessary. Moreover, good, simple "exploratory" techniques are now available. In addition, medical data contain quite a high percentage of outliers (data that deviate from the majority). With classical methods it is often very difficult (even for a statistician!) to detect them and the reliability of results becomes questionable. New, reliable ("robust") procedures have been the subject of research for the past two decades. Their practical introduction is one of the activities of the Statistics and Data Processing Department of the University of Social and Preventive Medicine, Lausanne.
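The point about outliers and classical methods can be illustrated with a small Python sketch. It compares a classical rule (mean and standard deviation) with one common robust rule (median and median absolute deviation, MAD). The data values, the threshold of 3, and the function names are assumptions chosen for illustration; the abstract does not say which robust procedures the department uses.

```python
import statistics


def classical_outliers(values, threshold=3.0):
    """Flag values more than `threshold` standard deviations from the mean."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [v for v in values if abs(v - mean) > threshold * sd]


def robust_outliers(values, threshold=3.0):
    """Flag values more than `threshold` scaled MADs from the median."""
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    # 1.4826 makes the MAD comparable to a standard deviation for normal data.
    scale = 1.4826 * mad
    return [v for v in values if abs(v - med) > threshold * scale]


# Invented measurements with one gross error (25.0).
data = [4.8, 5.1, 5.0, 4.9, 5.2, 5.0, 25.0]

flagged_classical = classical_outliers(data)
flagged_robust = robust_outliers(data)
```

In this sample the single gross error inflates both the mean and the standard deviation enough that the classical rule flags nothing, while the median-based rule isolates the value 25.0. This masking effect is one reason outliers are hard to detect with classical methods.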