991 resultados para data refinement
Resumo:
By using the Rasch model, much detailed diagnostic information is available to developers of survey and assessment instruments and to the researchers who use them. We outline an approach to the analysis of data obtained from the administration of survey instruments that can enable researchers to recognise and diagnose difficulties with those instruments and then to suggest remedial actions that can improve the measurement properties of the scales included in questionnaires. We illustrate the approach using examples drawn from recent research and demonstrate how the approach can be used to generate figures that make the results of Rasch analyses accessible to non-specialists.
Resumo:
The paper analyses the expected value of OD volumes from probe with fixed error, error that is proportional to zone size and inversely proportional to zone size. To add realism to the analysis, real trip ODs in the Tokyo Metropolitan Region are synthesised. The results show that for small zone coding with average radius of 1.1km, and fixed measurement error of 100m, an accuracy of 70% can be expected. The equivalent accuracy for medium zone coding with average radius of 5km would translate into a fixed error of approximately 300m. As expected small zone coding is more sensitive than medium zone coding as the chances of the probe error envelope falling into adjacent zones are higher. For the same error radii, error proportional to zone size would deliver higher level of accuracy. As over half (54.8%) of the trip ends start or end at zone with equivalent radius of ≤ 1.2 km and only 13% of trips ends occurred at zones with equivalent radius ≥2.5km, measurement error that is proportional to zone size such as mobile phone would deliver higher level of accuracy. The synthesis of real OD with different probe error characteristics have shown that expected value of >85% is difficult to achieve for small zone coding with average radius of 1.1km. For most transport applications, OD matrix at medium zone coding is sufficient for transport management. From this study it can be drawn that GPS with error range between 2 and 5m, and at medium zone coding (average radius of 5km) would provide OD estimates greater than 90% of the expected value. However, for a typical mobile phone operating error range at medium zone coding the expected value would be lower than 85%. This paper assumes transmission of one origin and one destination positions from the probe. However, if multiple positions within the origin and destination zones are transmitted, map matching to transport network could be performed and it would greatly improve the accuracy of the probe data.
Resumo:
This paper presents a model to estimate travel time using cumulative plots. Three different cases considered are i) case-Det, for only detector data; ii) case-DetSig, for detector data and signal controller data and iii) case-DetSigSFR: for detector data, signal controller data and saturation flow rate. The performance of the model for different detection intervals is evaluated. It is observed that detection interval is not critical if signal timings are available. Comparable accuracy can be obtained from larger detection interval with signal timings or from shorter detection interval without signal timings. The performance for case-DetSig and for case-DetSigSFR is consistent with accuracy generally more than 95% whereas, case-Det is highly sensitive to the signal phases in the detection interval and its performance is uncertain if detection interval is integral multiple of signal cycles.
Resumo:
Light Detection and Ranging (LIDAR) has great potential to assist vegetation management in power line corridors by providing more accurate geometric information of the power line assets and vegetation along the corridors. However, the development of algorithms for the automatic processing of LIDAR point cloud data, in particular for feature extraction and classification of raw point cloud data, is in still in its infancy. In this paper, we take advantage of LIDAR intensity and try to classify ground and non-ground points by statistically analyzing the skewness and kurtosis of the intensity data. Moreover, the Hough transform is employed to detected power lines from the filtered object points. The experimental results show the effectiveness of our methods and indicate that better results were obtained by using LIDAR intensity data than elevation data.
Resumo:
A few studies examined interactive effects between air pollution and temperature on health outcomes. This study is to examine if temperature modified effects of ozone and cardiovascular mortality in 95 large US cities. A nonparametric and a parametric regression models were separately used to explore interactive effects of temperature and ozone on cardiovascular mortality during May and October, 1987-2000. A Bayesian meta-analysis was used to pool estimates. Both models illustrate that temperature enhanced the ozone effects on mortality in the northern region, but obviously in the southern region. A 10-ppb increment in ozone was associated with 0.41 % (95% posterior interval (PI): -0.19 %, 0.93 %), 0.27 % (95% PI: -0.44 %, 0.87 %) and 1.68 % (95% PI: 0.07 %, 3.26 %) increases in daily cardiovascular mortality corresponding to low, moderate and high levels of temperature, respectively. We concluded that temperature modified effects of ozone, particularly in the northern region.
Resumo:
Total deposition of petrol, diesel and environmental tobacco smoke (ETS) aerosols in the human respiratory tract for nasal breathing conditions was computed for 14 nonsmoking volunteers, considering the specific anatomical and respiratory parameters of each volunteer and the specific size distribution for each inhalation experiment. Theoretical predictions were 34.6% for petrol, 24.0% for diesel, and 18.5% for ETS particles. Compared to the experimental results, predicted deposition values were consistently smaller than the measured data (41.4% for petrol, 29.6% for diesel, and 36.2% for ETS particles). The apparent discrepancy between experimental data on total deposition and modeling results may be reconciled by considering the non-spherical shape of the test aerosols by diameter-dependent dynamic shape factors to account for differences between mobility-equivalent and volume-equivalent or thermodynamic diameters. While the application of dynamic shape factors is able to explain the observed differences for petrol and diesel particles, additional mechanisms may be required for ETS particle deposition, such as the size reduction upon inspiration by evaporation of volatile compounds and/or condensation-induced restructuring, and, possibly, electrical charge effects.
Resumo:
The wide range of contributing factors and circumstances surrounding crashes on road curves suggest that no single intervention can prevent these crashes. This paper presents a novel methodology, based on data mining techniques, to identify contributing factors and the relationship between them. It identifies contributing factors that influence the risk of a crash. Incident records, described using free text, from a large insurance company were analysed with rough set theory. Rough set theory was used to discover dependencies among data, and reasons using the vague, uncertain and imprecise information that characterised the insurance dataset. The results show that male drivers, who are between 50 and 59 years old, driving during evening peak hours are involved with a collision, had a lowest crash risk. Drivers between 25 and 29 years old, driving from around midnight to 6 am and in a new car has the highest risk. The analysis of the most significant contributing factors on curves suggests that drivers with driving experience of 25 to 42 years, who are driving a new vehicle have the highest crash cost risk, characterised by the vehicle running off the road and hitting a tree. This research complements existing statistically based tools approach to analyse road crashes. Our data mining approach is supported with proven theory and will allow road safety practitioners to effectively understand the dependencies between contributing factors and the crash type with the view to designing tailored countermeasures.
Resumo:
The problem of impostor dataset selection for GMM-based speaker verification is addressed through the recently proposed data-driven background dataset refinement technique. The SVM-based refinement technique selects from a candidate impostor dataset those examples that are most frequently selected as support vectors when training a set of SVMs on a development corpus. This study demonstrates the versatility of dataset refinement in the task of selecting suitable impostor datasets for use in GMM-based speaker verification. The use of refined Z- and T-norm datasets provided performance gains of 15% in EER in the NIST 2006 SRE over the use of heuristically selected datasets. The refined datasets were shown to generalise well to the unseen data of the NIST 2008 SRE.
Resumo:
1. Ecological data sets often use clustered measurements or use repeated sampling in a longitudinal design. Choosing the correct covariance structure is an important step in the analysis of such data, as the covariance describes the degree of similarity among the repeated observations. 2. Three methods for choosing the covariance are: the Akaike information criterion (AIC), the quasi-information criterion (QIC), and the deviance information criterion (DIC). We compared the methods using a simulation study and using a data set that explored effects of forest fragmentation on avian species richness over 15 years. 3. The overall success was 80.6% for the AIC, 29.4% for the QIC and 81.6% for the DIC. For the forest fragmentation study the AIC and DIC selected the unstructured covariance, whereas the QIC selected the simpler autoregressive covariance. Graphical diagnostics suggested that the unstructured covariance was probably correct. 4. We recommend using DIC for selecting the correct covariance structure.
Resumo:
This report provides an introduction to our analyses of secondary data with respect to violent acts and incidents relating to males living in rural settings in Australia. It clarifies important aspects of our overall approach primarily by concentrating on three elements that required early scoping and resolution. Firstly, a wide and inclusive view of violence which encompasses measures of violent acts and incidents and also data identifying risk taking behaviour and the consequences of violence is outlined and justified. Secondly, the classification used to make comparisons between the city and the bush together with associated caveats is outlined. The third element discussed is in relation to national injury data. Additional commentary resulting from exploration, examination and analyses of secondary data is published online in five subsequent reports in this series.
Resumo:
This report focuses on our examination of extant data which have been sourced with respect to self-harm and suicide in Australia. Moreover, specific areas of concern regarding elevated rates of suicide for rural males and data anomalies which emerged during our examination of these data are discussed. Additional commentary resulting from exploration, examination and analyses of secondary data is published online in complementary reports in this series.
Resumo:
This report focuses on our examination of extant data which have been sourced with respect to intentional violence perpetrated or experienced by males in regional and remote Australia. The nature of intentional violent acts can be physical, sexual or psychological or involve deprivation or neglect. We have presented under the headings of: self-harm including suicide; homicide; assault, sexual assault and the threat of assault; child abuse; other family and intimate partner violence; harassment, stalking and bullying; alcohol related social violence; and animal abuse. State variations in interpersonal violence are also presented. Additional commentary resulting from exploration, examination and analyses of secondary data is published online in complementary reports in this series.
Resumo:
This report focuses on our examination of extant data which have been sourced with respect to unintentional serious and violent injuries to males living in regional and remote Australia. Such injuries typically might be caused by, for example, transport accidents, occupational exposures and hazards, burns and so on. Thus unintentional violent incidents cause physical trauma the consequences of which can sometimes lead to chronic conditions including psychological harm or substance abuse. Additional commentary resulting from exploration, examination and analyses of secondary data is published online in complementary reports in this series.
Resumo:
This report focuses on our examination of extant data which have been sourced with respect to personally and socially risky behaviour associated with males living in regional and remote Australia . The AIHW (2008: PHE 97:89) defines personally risky behaviour, on the one hand, as working, swimming, boating, driving or operating hazardous machinery while intoxicated with alcohol or an illicit drug. Socially risky behaviour, on the other hand, is defined as creating a public disturbance, damaging property, stealing or verbally or physically abusing someone while intoxicated with alcohol or an illicit drug. Additional commentary resulting from exploration, examination and analyses of secondary data is published online in complementary reports in this series.