970 results for Datasets


Relevance:

10.00%

Publisher:

Abstract:

Modern mobile computing devices are versatile but impose the burden of constantly adjusting settings to the current environmental conditions. While this task has so far fallen to the human user, the variety of sensors typically deployed in such a handset provides enough data for autonomous self-configuration by a learning, adaptive system. However, this data is not always fully available, and may contain false values. Handling potentially incomplete sensor data to detect context changes without a semantic layer is the scientific challenge we address. We present a novel machine learning technique, the Missing-Values-SOM, which solves this problem by predicting setting adjustments based on context information. Our method is centred on a self-organizing map, extended to handle missing values. We demonstrate the performance of our approach on mobile context snapshots as well as on classical machine learning datasets.
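The core idea, a self-organizing map whose best-matching-unit search and weight updates simply skip missing dimensions, can be sketched as follows. This is a minimal illustration on synthetic data, not the authors' Missing-Values-SOM implementation; the imputation step and all parameters are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "context snapshot" data: 200 samples, 4 features, ~20% values missing.
X = rng.normal(size=(200, 4))
X[rng.random(X.shape) < 0.2] = np.nan

# A small 5x5 SOM; each unit holds a 4-dimensional prototype vector.
grid = np.array([(i, j) for i in range(5) for j in range(5)], dtype=float)
W = rng.normal(size=(25, 4))

def bmu(x, W):
    """Best-matching unit using only the observed (non-NaN) dimensions of x."""
    obs = ~np.isnan(x)
    return int(np.argmin(np.sum((W[:, obs] - x[obs]) ** 2, axis=1)))

for epoch in range(30):
    lr = 0.5 * (1 - epoch / 30)           # decaying learning rate
    sigma = 2.0 * (1 - epoch / 30) + 0.5  # decaying neighbourhood radius
    for x in X:
        b = bmu(x, W)
        h = np.exp(-np.sum((grid - grid[b]) ** 2, axis=1) / (2 * sigma ** 2))
        obs = ~np.isnan(x)
        # Update only the observed dimensions, weighted by grid neighbourhood.
        W[:, obs] += lr * h[:, None] * (x[obs] - W[:, obs])

def impute(x, W):
    """Fill the missing entries of x from its best-matching prototype."""
    x = x.copy()
    miss = np.isnan(x)
    x[miss] = W[bmu(x, W)][miss]
    return x

x_new = np.array([0.1, np.nan, -0.3, np.nan])
print(impute(x_new, W))  # missing entries replaced by BMU prototype values
```

The same BMU-over-observed-dimensions trick drives both training and prediction, which is what lets a single map serve as both classifier and imputer.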


The geothermal industry in Australia and Queensland is in its infancy; for hot dry rock (HDR) geothermal energy, it is very much in the target identification and resource definition stages. To assist the geothermal industry and HDR exploration in Queensland, we are developing a new, comprehensive integrated geochemical and geochronological database of igneous rocks. To date, around 18,000 igneous rocks across Queensland have been analysed for chemical and/or age information. However, these data currently reside in a number of disparate datasets (e.g. Ozchron; Champion et al., 2007; the Geological Survey of Queensland; journal publications; and unpublished university theses). The goal of this project is to collate and integrate these data on Queensland igneous rocks to improve our understanding of high-heat-producing granites in Queensland, in terms of their distribution (particularly in the subsurface), dimensions, ages, and the factors controlling their genesis.


Background: Predicting protein subnuclear localization is a challenging problem. Previous work based on non-sequence information, including Gene Ontology annotations and kernel fusion, has respective limitations. The aim of this work is twofold: to propose a novel individual feature extraction method, and to develop an ensemble method that improves prediction performance using the comprehensive information in a high-dimensional feature vector obtained from 11 feature extraction methods.

Methodology/Principal Findings: A novel two-stage multiclass support vector machine is proposed to predict protein subnuclear localizations. It considers only those feature extraction methods based on amino acid classifications and physicochemical properties. To speed up the system, an automatic search method for the kernel parameter is used. The prediction performance of our method is evaluated on four datasets: the Lei dataset, a multi-localization dataset, the SNL9 dataset, and a new independent dataset. In leave-one-out cross-validation, the overall accuracy is 75.2% for 6 localizations on the Lei dataset, 72.1% for 9 localizations on the SNL9 dataset, 71.7% for the multi-localization dataset, and 69.8% for the new independent dataset. Comparisons with existing methods show that our method performs better for both single-localization and multi-localization proteins, and achieves more balanced sensitivities and specificities on large and small subcellular localizations. The overall accuracy improvements are 4.0% and 4.7% for single-localization proteins and 6.5% for multi-localization proteins. The reliability and stability of our classification model are further confirmed by permutation analysis.

Conclusions: Our method is effective and valuable for predicting protein subnuclear localizations. A web server implementing the proposed method is freely available at http://bioinformatics.awowshop.com/snlpred_page.php.
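The automatic kernel-parameter search can be illustrated with a hedged sketch: a simple RBF-kernel classifier whose gamma parameter is selected by leave-one-out accuracy over a logarithmic grid. The synthetic data and the Parzen-style voting classifier are stand-ins, not the paper's two-stage SVM or its protein features; only the search procedure is the point.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in data: three well-separated classes of 2-D points.
X = np.vstack([rng.normal(m, 0.6, size=(30, 2)) for m in ([0, 0], [3, 0], [0, 3])])
y = np.repeat([0, 1, 2], 30)

def loo_accuracy(X, y, gamma):
    """Leave-one-out accuracy of a simple RBF-kernel voting classifier."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    K = np.exp(-gamma * d2)
    np.fill_diagonal(K, 0.0)  # leave each point out of its own vote
    votes = np.stack([K[:, y == c].sum(axis=1) for c in range(3)], axis=1)
    return float(np.mean(votes.argmax(axis=1) == y))

# Automatic search: pick the kernel parameter with the best LOO accuracy.
grid = [2.0 ** k for k in range(-8, 5)]
scores = {g: loo_accuracy(X, y, g) for g in grid}
best = max(scores, key=scores.get)
print(f"best gamma = {best}, LOO accuracy = {scores[best]:.2f}")
```

The same grid-plus-cross-validation pattern is the standard way to automate kernel-parameter selection for any kernel method, SVMs included.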


Travel time is an important transport performance indicator. Different modes of transport (buses and cars) have different mechanical and operational characteristics, resulting in significantly different travel behaviours and complicating multimodal travel time estimation on urban networks. This paper explores the relationship between bus and car travel times on urban networks using empirical Bluetooth and Bus Vehicle Identification data from Brisbane. The technologies and issues behind the two datasets are examined. After cleaning the data to remove outliers, the relationships between not-in-service bus and car travel times, and between in-service bus and car travel times, are discussed. The travel time estimation models reveal that not-in-service bus travel times are similar to car travel times, and that in-service bus travel times can be used to estimate car travel times during off-peak hours.
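The kind of relationship model described, estimating car travel time on a link from bus travel time, can be sketched with an ordinary least-squares fit. The numbers below are hypothetical, not the Brisbane data.

```python
import numpy as np

# Hypothetical link travel times (seconds); real inputs would come from the
# Bluetooth (car) and Bus Vehicle Identification datasets described above.
bus_times = np.array([120., 150., 135., 180., 160., 145., 170., 155.])
car_times = np.array([ 95., 118., 105., 140., 125., 112., 133., 120.])

# Ordinary least squares: car_time ≈ a * bus_time + b
a, b = np.polyfit(bus_times, car_times, deg=1)
pred = a * bus_times + b
rmse = np.sqrt(np.mean((car_times - pred) ** 2))
print(f"car ≈ {a:.2f} * bus + {b:.1f}, RMSE = {rmse:.1f} s")
```

A per-link, per-period fit like this is the simplest form such an estimation model can take; segmenting by peak/off-peak would mirror the paper's finding that the relationship only holds off-peak.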


Compression ignition (CI) engine design is subject to many constraints, presenting a multi-criteria optimisation problem that the engine researcher must solve. In particular, the modern CI engine must not only be efficient, but must also deliver low gaseous, particulate and life cycle greenhouse gas emissions so that its impact on urban air quality, human health and global warming is minimised. Consequently, this study undertakes a multi-criteria analysis to identify alternative fuels, injection technologies and combustion strategies that could satisfy these CI engine design constraints. Three datasets are analysed with the Preference Ranking Organization Method for Enrichment Evaluations and Geometrical Analysis for Interactive Aid (PROMETHEE-GAIA) algorithm to explore the impact on the emissions and efficiency profiles of the test engines of (1) an ethanol fumigation system; (2) alternative fuels (20% biodiesel and synthetic diesel) and alternative injection technologies (mechanical direct injection and common rail injection); and (3) various biodiesel fuels made from three feedstocks (soy, tallow and canola) tested at several blend percentages (20-100%). The results show that moderate ethanol substitutions (~20% by energy) at moderate load, high-percentage soy blends (60-100%), and alternative fuels (biodiesel and synthetic diesel) provide efficiency and emissions profiles that yield the most "preferred" solutions to this multi-criteria engine design problem. Further research is, however, required to reduce Reactive Oxygen Species (ROS) emissions with alternative fuels, and to deliver technologies that do not significantly reduce the median diameter of particle emissions.
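The PROMETHEE ranking step itself is straightforward to sketch. The decision matrix, criteria weights and the "usual" preference function below are illustrative assumptions, not the study's data; the sketch computes PROMETHEE II net outranking flows and ranks the alternatives.

```python
import numpy as np

# Hypothetical decision matrix: rows are alternatives, columns are criteria.
alts = ["diesel", "B20", "synthetic", "E20 fumigation"]
#                 efficiency  NOx (g/kWh)  PM (g/kWh)
perf = np.array([[0.38,       8.0,         0.30],
                 [0.37,       7.5,         0.22],
                 [0.39,       7.0,         0.25],
                 [0.36,       6.5,         0.18]])
maximise = np.array([True, False, False])   # efficiency up, emissions down
weights = np.array([0.4, 0.3, 0.3])

# Pairwise preference matrix with the "usual" preference function:
# alternative i earns a criterion's full weight whenever it beats j on it.
n = len(alts)
P = np.zeros((n, n))
for i in range(n):
    for j in range(n):
        if i != j:
            d = np.where(maximise, perf[i] - perf[j], perf[j] - perf[i])
            P[i, j] = weights @ (d > 0)

# PROMETHEE II net flow: positive (outgoing) minus negative (incoming) flow.
phi_net = (P.sum(axis=1) - P.sum(axis=0)) / (n - 1)
order = np.argsort(-phi_net)
for k in order:
    print(f"{alts[k]:15s} net flow {phi_net[k]:+.3f}")
```

With these made-up numbers the low-emission alternative fuels outrank neat diesel, which mirrors the direction of the study's conclusion; GAIA adds a principal-components visualisation on top of these flows.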


Background: Efficient, effective child product safety (PS) responses require data on hazards, injury severity and injury probability. PS responses in Australia rely largely on reports from manufacturers/retailers, other jurisdictions/regulators, or consumers. The extent to which these reactive responses reflect actual child injury priorities is unknown.

Aims/Objectives/Purpose: This research compared child PS issues identified from PS regulatory data with those identified from health data sources in Queensland, Australia.

Methods: PS regulatory documents describing issues affecting children in Queensland in 2008–2009 were compiled and analysed to identify frequent products and hazards. Three health data sources (ED, injury surveillance and hospital data) were analysed in the same way.

Results/Outcomes: Projectile and squeeze toys were the priority products for PS regulators, as these toys can release small parts that present choking hazards. However, across all health datasets, falls were the most common mechanism of injury, and several of the products identified were not subject to a PS system response. While some incidents may not require a response, a manual review of injury description text identified child poisonings and burns as common mechanisms of injury in the health data with substantial documentation of product involvement, yet only 10% of PS system responses focused on these two mechanisms combined.

Significance/Contribution to the field: Regulatory data focused on products that fail compliance checks and have the "potential" to cause harm, whereas health data identified actual harm, resulting in different prioritisation of products and mechanisms. Work is needed to better integrate health data into PS responses in Australia.


To recognize faces in video, face appearance has been widely modelled with piece-wise local linear models that linearly approximate the smooth yet non-linear low-dimensional face appearance manifold. The choice of representation for the local models is crucial. Most existing methods learn each local model individually, meaning they only anticipate variations within each class. In this work, we propose representing local models as Gaussian distributions learned simultaneously using heteroscedastic probabilistic linear discriminant analysis (PLDA). Each gallery video is then represented as a collection of such distributions. With the PLDA, not only are the within-class variations estimated during training, but the separability between classes is also maximized, leading to improved discrimination. The heteroscedastic PLDA itself is adapted from the standard PLDA to approximate face appearance manifolds more accurately: instead of assuming a single global within-class covariance, it learns a different within-class covariance for each local model. In the recognition phase, a probe video is matched against gallery samples through the fusion of point-to-model distances. Experiments on the Honda and MoBo datasets show the merit of the proposed method, which outperforms the state-of-the-art technique.
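The point-to-model matching step can be sketched with Gaussian local models, each carrying its own within-class covariance (the heteroscedastic assumption). The two-dimensional models and probe below are toy stand-ins, not PLDA-trained face models.

```python
import numpy as np

rng = np.random.default_rng(1)

# Two hypothetical local models (Gaussians) on an appearance manifold, each
# with its own within-class covariance rather than one shared covariance.
mu = [np.array([0.0, 0.0]), np.array([3.0, 3.0])]
cov = [np.diag([1.0, 0.2]),   # model 0: variation concentrated along x
       np.diag([0.2, 1.0])]   # model 1: variation concentrated along y

def point_to_model(x, m, S):
    """Mahalanobis distance from a probe frame x to a Gaussian local model."""
    d = x - m
    return float(d @ np.linalg.inv(S) @ d)

# A probe "video" is a set of frames; frame-wise distances are fused (here
# simply averaged) into one score per gallery model.
probe = rng.normal(loc=[0.2, 0.1], scale=0.3, size=(10, 2))
scores = [np.mean([point_to_model(x, m, S) for x in probe])
          for m, S in zip(mu, cov)]
print("matched gallery model:", int(np.argmin(scores)))
```

Because each model has its own covariance, a frame that drifts along a model's dominant variation direction is penalised less than one that drifts across it, which is exactly what a single global covariance cannot express.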


The Queensland University of Technology (QUT) Library, like many other academic and research institution libraries in Australia, has been collaborating with a range of academic and service provider partners to develop research data management services and collections. Three main strategies are being employed, and an overview of process, infrastructure, usage and benefits is provided for each. The development of processes and infrastructure to facilitate the strategic identification and management of QUT-developed datasets has been a major focus. A number of Australian National Data Service (ANDS) sponsored projects, including Seeding the Commons, Metadata Hub/Store, Data Capture and Gold Standard Record Exemplars, have provided or will provide QUT with a data registry system, linkages to storage, processes for identifying and describing datasets, and a degree of academic awareness. QUT supports open access and has established a culture of making its research outputs available via the QUT ePrints institutional repository. Incorporating open access research datasets into the library collections is an equally important aspect of facilitating the adoption of data-centric eResearch methods. Some datasets are available commercially, and the library has collaborated with QUT researchers, particularly in the QUT Business School, to identify and procure a rapidly growing range of financial datasets to support research; the library undertakes the licensing and uses the Library Resource Allocation to pay for the subscriptions. This is a new area of collection development, with much still to be learned. The final strategy discussed is the library acting as a "data broker": QUT Library has been working with researchers to identify datasets and to undertake licensing, payment and access as a centrally supported service on behalf of researchers.


This project investigated the parameters of residual soil materials in South East Queensland (SEQ), as determined from a large number of historical site investigation records. This was undertaken to quantify material parameter variability and to assess the validity of commonly adopted correlations for estimating "typical" soil parameters in this region. A dataset of in situ and laboratory-derived residual soil parameters was constructed and analysed to identify potential correlations relating either to the entire area considered, or to specific residual soils derived from a common parent material. The variability of SEQ soil parameters was generally found to be greater than reported by equivalent studies that analysed datasets dominated by transported soils. Noteworthy differences in material properties also became evident when residual soils weathered from different parent materials were considered independently. Large variation was found between the correlations developed for specific soil types, highlighting both the heterogeneity of the studied materials and the unsuitability of generic correlations for the residual soils present in SEQ. Region- and parent-material-specific correlations for estimating shear strength from in situ penetration tests are proposed for the various residual soil types considered.
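Deriving parent-material-specific correlations can be sketched as a per-group least-squares fit of shear strength against penetration resistance. The soil groups and numbers below are hypothetical, invented for illustration, not the SEQ dataset.

```python
import numpy as np

# Hypothetical (SPT blow count N, undrained shear strength s_u) pairs for
# residual soils from two parent materials; the study derives such
# parent-material-specific correlations from SEQ site-investigation records.
spt_n = {"basalt-derived":    np.array([5, 10, 15, 20, 25]),
         "sandstone-derived": np.array([5, 10, 15, 20, 25])}
s_u   = {"basalt-derived":    np.array([30, 62, 88, 121, 150]),   # kPa
         "sandstone-derived": np.array([22, 41, 63, 80, 104])}    # kPa

# One least-squares correlation per parent material: s_u ≈ k·N + c.
slopes = {}
for soil in spt_n:
    k, c = np.polyfit(spt_n[soil], s_u[soil], deg=1)
    slopes[soil] = k
    print(f"{soil}: s_u ≈ {k:.1f} N + {c:.1f} kPa")
```

Fitting each parent-material group separately, rather than pooling all records, is what exposes the slope differences that make a single generic correlation unsuitable.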


The health impacts of exposure to ambient temperature have been drawing increasing attention from the environmental health research community, government, society, industry, and the public. Case-crossover and time series models are the designs most commonly used to examine the effects of ambient temperature on mortality, but some key methodological issues remain to be addressed. For example, few studies have used spatiotemporal models to assess the effects of spatial temperature on mortality, or used a case-crossover design to examine the delayed (distributed lag) and non-linear relationship between temperature and mortality. Little evidence is also available on the effects of temperature change on mortality, or on differences in heat-related mortality over time. This thesis therefore addressed four research questions: 1. How can the case-crossover design be combined with distributed lag non-linear models? 2. Is there any significant difference in effect estimates between time series and spatiotemporal models? 3. How can the effects on mortality of temperature change between neighbouring days be assessed? 4. Do temperature effects on mortality change over time? To combine the case-crossover design and the distributed lag non-linear model, datasets of deaths, weather conditions (minimum, mean and maximum temperature, and relative humidity) and air pollution were acquired from Tianjin, China, for 2005 to 2007. I demonstrated how to combine the case-crossover design with a distributed lag non-linear model, allowing the case-crossover design to estimate the non-linear and delayed effects of temperature while controlling for seasonality. There was a consistent U-shaped relationship between temperature and mortality. Cold effects were delayed by 3 days and persisted for 10 days.
Hot effects were acute, lasted for three days, and were followed by mortality displacement for non-accidental, cardiopulmonary, and cardiovascular deaths. Mean temperature was a better predictor of mortality (based on model fit) than maximum or minimum temperature. It is still unclear whether spatiotemporal models using spatial temperature exposure produce better estimates of mortality risk than time series models that use a single site's temperature or temperature averaged across a network of sites. Daily mortality data were obtained from 163 locations across Brisbane, Australia, from 2000 to 2004. Ordinary kriging was used to interpolate spatial temperatures across the city from 19 monitoring sites. A spatiotemporal model was used to examine the impact of spatial temperature on mortality, and a time series model to assess the effects of a single site's temperature and of temperature averaged from 3 monitoring sites. Squared Pearson scaled residuals were used to check model fit. Although the spatiotemporal models fitted better than the time series models, the two approaches gave similar effect estimates: time series analyses using temperature recorded at a single monitoring site, or averaged across multiple sites, estimated the association between temperature and mortality as well as the spatiotemporal model did. A time series Poisson regression model was then used to estimate the association between temperature change and mortality in summer in Brisbane, Australia during 1996–2004 and Los Angeles, United States during 1987–2000, with temperature change calculated as the current day's mean temperature minus the previous day's mean.
In Brisbane, a drop of more than 3 °C in temperature between days was associated with relative risks (RRs) of 1.16 (95% confidence interval (CI): 1.02, 1.31) for non-external mortality (NEM), 1.19 (95% CI: 1.00, 1.41) for NEM in females, and 1.44 (95% CI: 1.10, 1.89) for NEM at ages 65–74 years. An increase of more than 3 °C was associated with RRs of 1.35 (95% CI: 1.03, 1.77) for cardiovascular mortality and 1.67 (95% CI: 1.15, 2.43) for people aged < 65 years. In Los Angeles, only a drop of more than 3 °C was significantly associated with mortality: RRs of 1.13 (95% CI: 1.05, 1.22) for total NEM, 1.25 (95% CI: 1.13, 1.39) for cardiovascular mortality, and 1.25 (95% CI: 1.14, 1.39) for people aged ≥ 75 years. In both cities, there were joint effects of temperature change and mean temperature on NEM. A change in temperature of more than 3 °C, whether positive or negative, thus has an adverse impact on mortality even after controlling for mean temperature. I also examined the variation in the effects of high temperatures on elderly mortality (ages ≥ 75 years) by year, city and region for 83 large US cities between 1987 and 2000. High temperature days were defined as two or more consecutive days with temperatures above the 90th percentile for each city during each warm season (1 May to 30 September). The mortality risk for high temperatures was decomposed into a "main effect" due to high temperatures, using a distributed lag non-linear function, and an "added effect" due to consecutive high temperature days. I pooled yearly effects across regions, and overall effects at both regional and national levels. The effects of high temperature (both main and added) on elderly mortality varied greatly by year, city and region, and years with higher heat-related mortality were often followed by years with relatively lower mortality. Understanding this variability is important for the development of heat-warning systems.
In conclusion, this thesis makes contributions in several respects. The case-crossover design was combined with a distributed lag non-linear model to assess the effects of temperature on mortality in Tianjin, allowing the case-crossover design to flexibly estimate the non-linear and delayed effects of temperature. Both extreme cold and high temperatures increased the risk of mortality in Tianjin. Time series models using a single site's temperature, or temperature averaged across several sites, can be used to examine the effects of temperature on mortality. Temperature change between neighbouring days, whether a large drop or a large rise, increases the risk of mortality. Finally, the effect of high temperature on mortality is highly variable from year to year.
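The temperature-change exposure is simple to construct. The sketch below builds it on synthetic daily data (not the Brisbane or Los Angeles series) and computes only a crude rate ratio; the thesis instead fits a time series Poisson regression controlling for mean temperature, season and trend.

```python
import numpy as np
import pandas as pd

# Hypothetical daily summer series; deaths are independent of temperature
# here by construction, so the crude ratio should hover near 1.
rng = np.random.default_rng(2)
days = pd.date_range("2000-12-01", periods=90, freq="D")
temp = 25 + 3 * np.sin(np.arange(90) / 7) + rng.normal(0, 2, 90)
deaths = rng.poisson(30, 90)
df = pd.DataFrame({"date": days, "temp": temp, "deaths": deaths})

# Temperature change = current day's mean temperature minus the previous day's.
df["temp_change"] = df["temp"].diff()
df["big_drop"] = df["temp_change"] < -3   # > 3 °C drop between days
df["big_rise"] = df["temp_change"] > 3    # > 3 °C rise between days

# Crude death-rate ratio for large-drop days versus all other days.
rr = (df.loc[df["big_drop"], "deaths"].mean()
      / df.loc[~df["big_drop"], "deaths"].mean())
print(f"{int(df['big_drop'].sum())} large-drop days, crude rate ratio = {rr:.2f}")
```

The first day has no previous-day comparison, so its `temp_change` is missing and it never qualifies as a large-drop or large-rise day.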


Using Monte Carlo simulation for radiotherapy dose calculation can provide more accurate results than the analytical methods usually found in modern treatment planning systems, especially in regions with a high degree of inhomogeneity. These more accurate results, however, often require orders of magnitude more calculation time to attain high precision, reducing their utility within the clinical environment. This work aims to improve the clinical utility of Monte Carlo simulation by developing techniques that enable faster simulation of radiotherapy geometries, principally through the use of new high-performance computing environments and of simpler alternative, yet equivalent, representations of complex geometries. First, the use of cloud computing technology and its application to radiotherapy dose calculation is demonstrated. As with other supercomputer-like environments, the time to complete a simulation decreases as 1/n as the number n of cloud-based computers performing the calculation in parallel increases. Unlike traditional supercomputer infrastructure, however, there is no initial capital outlay, only modest ongoing usage fees; the simulations described in the following are performed using this cloud computing technology. The definition of geometry within the chosen Monte Carlo simulation environment, Geometry & Tracking 4 (GEANT4) in this case, is also addressed in this work. At the simulation implementation level, a new computer aided design interface is presented for use with GEANT4, enabling direct coupling between manufactured parts and their equivalents in the simulation environment, which is of particular importance when defining linear accelerator treatment head geometry.
Further, a new technique for navigating tessellated or meshed geometries is described, allowing up to three orders of magnitude performance improvement by using tetrahedral meshes in place of complex triangular surface meshes. The technique applies to the definition of both mechanical parts and patient geometry. Static patient CT datasets like those found in typical radiotherapy treatment plans are often very large and impose a significant performance penalty on a Monte Carlo simulation. By extracting the regions of interest in a radiotherapy treatment plan and representing them in a mesh-based form similar to that used in computer aided design, the above optimisation techniques can be applied to reduce the time required to navigate the patient geometry in the simulation environment. Results presented in this work show that these equivalent yet much simplified patient geometry representations enable significant performance improvements over simulations that consider raw CT datasets alone. Furthermore, the mesh-based representation allows direct manipulation of the geometry, for example enabling motion augmentation for time-dependent dose calculation. Finally, an experimental dosimetry technique is described which allows the validation of time-dependent Monte Carlo simulations, such as those made possible by the aforementioned patient geometry definition: a bespoke organic plastic scintillator dose rate meter is embedded in a gel dosimeter, enabling simultaneous 3D dose distribution and dose rate measurement. This work demonstrates the effectiveness of applying alternative, equivalent geometry definitions to complex geometries for the purpose of improving Monte Carlo simulation performance; additionally, these alternative definitions allow manipulations to be performed on otherwise static and rigid geometry.
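The embarrassingly parallel structure behind the 1/n cloud scaling can be sketched by splitting particle histories into independently seeded chunks and summing the partial tallies. This is toy physics only; in the work described, each chunk would be a GEANT4 run on its own cloud instance.

```python
import numpy as np

def simulate(n_histories, seed):
    """Toy stand-in for one node's Monte Carlo run: each particle history
    deposits its energy in one of 10 depth bins (no real physics)."""
    rng = np.random.default_rng(seed)
    bins = rng.integers(0, 10, size=n_histories)
    return np.bincount(bins, minlength=10)

def run(total_histories, n_nodes):
    # Histories are split into independent chunks with distinct seeds; since
    # the chunks share no state, each could run on its own cloud instance,
    # making wall time scale roughly as 1/n while the merged tally is
    # statistically equivalent to a single long run.
    per_node = total_histories // n_nodes
    tallies = [simulate(per_node, seed) for seed in range(n_nodes)]
    return np.sum(tallies, axis=0)

dose_1 = run(100_000, n_nodes=1)
dose_4 = run(100_000, n_nodes=4)
print(dose_1.sum(), dose_4.sum())  # both tally 100000 histories
```

Distinct per-chunk seeds are the one subtle requirement: reusing a seed across nodes would correlate the chunks and silently understate the simulation's statistical precision.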


There are no population studies of the prevalence or incidence of child maltreatment in Australia. Child protection data gives some understanding but is restricted by system capacity and by definitional issues across jurisdictions. Child protection data currently suggests that the number of reports is increasing yearly; the child protection system then becomes focused on investigating all reports, diluting the resources available for those children most in need of intervention. A public health response across multiple agencies enables responses to child safety across the entire population. All families are targeted at the primary level; examples include ensuring all parents know the dangers of shaking a baby, or teaching children to say no if a situation makes them uncomfortable. The secondary level of prevention targets families with a number of risk factors, for example subsidised child care so children are not left unsupervised after school when both parents have to be at work, or home visiting for drug-addicted parents to ensure children are cared for. The tertiary response is then the responsibility of the child protection system and is reserved for those children where abuse and neglect are identified. This model requires that child safety be seen in a broader context than just the child protection system, and health professionals are increasingly identified as an important component of the public health framework. If all injury is viewed as preventable and considered along a continuum from 'accidental' to 'inflicted', it becomes possible to conceptualise child maltreatment in an injury context. Parental intent may not be to cause harm to the child, but through lack of insight or concern about risk, the potential for injury is high.
The mechanisms for unintentional and intentional injury overlap, and some suggest that by segregating child abuse (with the possible exception of sexual abuse) from unintentional injury, child abuse is excluded from the broader injury prevention initiative that is gaining momentum in the community. This research uses a public health perspective, specifically that of injury prevention, to consider the problem of child abuse. The study employed a mixed method design incorporating secondary data analysis, data linkage and structured interviews with different professional groups. Datasets from the Queensland Injury Surveillance Unit (QISU) and the Department of Child Safety (DCS) were evaluated. Coded injury data were grouped by intent: records with a code indicating that the ED presentation was due to child abuse, records with a code indicating that the injury was possibly due to abuse, and records whose intent code indicated that the injury was unintentional and not due to abuse. Primary data collection from ED records was undertaken and information recoded to assess reliability and completeness. Emergency department (QISU) data were linked to Department of Child Safety data to examine concordance and data quality, and factors influencing the collection and collation of these data were identified through structured interviews and analysed using qualitative methods. Secondary analysis of QISU data indicated that records lacking specific information on the injury event were more likely to carry an intent code indicating abuse than records with specific information on the event. Codes for abuse appeared in only 1.2% of the 84,765 records analysed; unintentional injury was the most commonly coded intent (95.3%).
In the group assigned a definite abuse code at triage, 83% linked to a DCS record, and cases with documented police involvement were significantly more likely to link to a DCS record than those without such documentation. Of those coded with an unintentional injury code, 22% linked to a DCS record; cases assigned an urgent triage category were more likely to link than those with a resuscitation triage category, and children presenting to regional or remote hospitals were more likely to link than those presenting to urban hospitals. Twenty-nine per cent of cases with a code indicating possible abuse linked to a DCS record; documented police involvement, a code for unspecified activity (compared with a sporting activity code), and age under 12 months (compared with 13-17 years) were all significantly associated with linkage to a DCS record. Only 13% of records contained documentation indicating that child abuse and neglect were considered in the diagnosis of the injury, despite almost half of the sample having a code for abuse or possible abuse. Doctors and nurses were confident in their knowledge of the process for reporting child maltreatment, but less confident about identifying child abuse and neglect and about what should be reported. Many were concerned about the implications of reporting, for the child and family and for themselves; a number were concerned about the implications of not reporting, mostly for the wellbeing of the child and, for a few, in terms of their legal obligations as mandatory reporters. The outcomes of this research will help improve knowledge of the barriers to effective surveillance of child abuse in emergency departments.
This will, in turn, support better identification and reporting practices, more reliable official statistical collections, and the flagging of high-risk cases to ensure adequate departmental responses are initiated.
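The linkage analysis can be sketched as a left join between ED records and child-safety records, with linkage rates computed per intent code. The miniature tables below are invented for illustration, not QISU or DCS data.

```python
import pandas as pd

# Hypothetical miniature versions of the two datasets being linked.
qisu = pd.DataFrame({
    "child_id": [1, 2, 3, 4, 5, 6],
    "intent":   ["abuse", "possible", "unintentional",
                 "unintentional", "abuse", "possible"],
})
dcs = pd.DataFrame({"child_id": [1, 3, 5]})  # children known to child safety

# Left join keeps every ED record; the merge indicator marks which of them
# found a matching child-safety record.
linked = qisu.merge(dcs, on="child_id", how="left", indicator=True)
linked["has_dcs_record"] = linked["_merge"] == "both"

# Proportion of ED records linking to a DCS record, by intent code.
rates = linked.groupby("intent")["has_dcs_record"].mean()
print(rates)
```

In practice, linkage would use several identifying fields (name, date of birth, event date) rather than a single key, but the join-then-stratify pattern is the same.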


Summary: More than ever before, contemporary societies are characterised by the huge amounts of data being transferred. Authorities, companies, academia and other stakeholders refer to Big Data when discussing the importance of large and complex datasets and developing possible solutions for their use. Big Data promises to be the next frontier of innovation for institutions and individuals, yet it also offers the possibility of predicting and influencing human behaviour with ever-greater precision.


In this paper we demonstrate passive vision-based localization in environments more than two orders of magnitude darker than the current benchmark, using a $100 webcam and a $500 camera. Our approach uses the camera's maximum exposure duration and sensor gain to achieve appropriately exposed images even in unlit night-time environments, albeit with extreme levels of motion blur. Using the SeqSLAM algorithm, we first evaluate the effect of variable motion blur, caused by simulated exposures of 132 ms to 10,000 ms duration, on localization performance. We then use actual long-exposure camera datasets to demonstrate day-night localization in two different environments. Finally, we perform a statistical analysis comparing the baseline performance of matching unprocessed greyscale images against patch normalization and local neighbourhood normalization, the two key SeqSLAM components. Our results and analysis show for the first time why the SeqSLAM algorithm is effective, and demonstrate the potential for cheap camera-based localization systems that function across extreme perceptual change.
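Patch normalization, one of the two SeqSLAM components analysed, can be sketched directly: each image patch is standardised to zero mean and unit variance, which discounts the global brightness and contrast change between day and night imagery. The patch size and the synthetic day/night pair below are assumptions for illustration.

```python
import numpy as np

def patch_normalise(img, patch=8):
    """Standardise each non-overlapping patch to zero mean, unit variance."""
    h, w = img.shape
    out = np.zeros_like(img, dtype=float)
    for y in range(0, h, patch):
        for x in range(0, w, patch):
            p = img[y:y + patch, x:x + patch].astype(float)
            std = p.std()
            out[y:y + patch, x:x + patch] = (p - p.mean()) / (std if std > 0 else 1)
    return out

rng = np.random.default_rng(3)
day = rng.random((32, 32))
night = 0.1 * day + 0.02   # same scene, much darker and lower contrast
# After patch normalisation the two images become (near-)identical, so a
# sum-of-absolute-differences matcher sees past the illumination change.
diff = np.abs(patch_normalise(day) - patch_normalise(night)).mean()
print(f"mean difference after normalisation: {diff:.6f}")
```

Because standardisation is invariant to any positive affine change in pixel values, a patch that merely got darker matches exactly; only genuine structural change survives into the difference.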


This paper investigates engaging experienced birders, as volunteer citizen scientists, in analyzing large recorded audio datasets gathered through environmental acoustic monitoring. Although audio data is straightforward to gather, automated analysis remains a challenging task; the existing expertise, local knowledge and motivation of the birder community can complement computational approaches and provide distinct benefits. We explored both the culture and practice of birders and paradigms for interacting with recorded audio data, and tested a variety of candidate design elements with birders. This study contributes an understanding of how virtual interactions and practices can be developed to complement the existing practices of experienced birders in the physical world. In so doing, it contributes a new approach to engagement in e-science: whereas most citizen science projects task lay participants with discrete real-world or artificial activities, sometimes using extrinsic motivators, this approach builds on existing, intrinsically satisfying practices.