Abstract:
A rule-based approach for classifying previously identified medical concepts in clinical free text into an assertion category is presented. The task defines six assertion categories: Present, Absent, Possible, Conditional, Hypothetical and Not associated with the patient. The assertion classification algorithms were largely based on extending the popular NegEx and ConText algorithms. In addition, the clinical healthcare terminology SNOMED CT and other publicly available dictionaries were used to classify assertions that did not fit the NegEx/ConText model. The data for this task includes discharge summaries from Partners HealthCare and from Beth Israel Deaconess Medical Centre, as well as discharge summaries and progress notes from University of Pittsburgh Medical Centre. The set consists of 349 discharge reports, each with pairs of ground truth concept and assertion files for system development, and 477 reports for evaluation. The system’s performance on the evaluation data set was 0.83, 0.83 and 0.83 for recall, precision and F1-measure, respectively. Although the rule-based system shows promise, further improvements can be made by incorporating machine learning approaches.
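A minimal sketch of the kind of trigger-window rule that NegEx/ConText-style assertion classifiers apply is given below; the trigger phrases, window size and default category are illustrative assumptions, not the rules used in the paper.

```python
# Illustrative sketch (not the paper's system): assign an assertion category
# by scanning a small window of text preceding an identified medical concept.
import re

RULES = [                                   # trigger pattern -> assertion category (assumed examples)
    (r"\b(no|denies|without)\b", "Absent"),
    (r"\b(possible|probable|may have)\b", "Possible"),
    (r"\bif\b", "Conditional"),
    (r"\b(risk of|watch for)\b", "Hypothetical"),
    (r"\b(family history|mother|father)\b", "Not associated with the patient"),
]

def classify_assertion(sentence: str, concept: str) -> str:
    # Look only at the text immediately before the concept mention.
    window = sentence.lower().split(concept.lower())[0][-60:]
    for pattern, category in RULES:
        if re.search(pattern, window):
            return category
    return "Present"                        # default when no trigger fires
```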
Abstract:
This poster concerns projects funded by the Australian National Data Service (ANDS). The specific projects funded were: a) the Greenhouse Gas Emissions Project (N2O) with Prof. Peter Grace from QUT’s Institute of Sustainable Resources; b) the Q150 Project for the management of multimedia data collected at festival events, with Prof. Phil Graham from QUT’s Institute of Creative Industries; and c) biodiversity environmental sensing with Prof. Paul Roe from the QUT Microsoft eResearch Centre. For these projects, the Eclipse Rich Client Platform (Eclipse RCP) was chosen as the software development framework within which to develop the respective software. The poster presents a brief overview of the requirements of the projects and of the project team's experiences in using Eclipse RCP, reports on the advantages and disadvantages of using Eclipse, and gives the team's perspective on Eclipse as an integrated tool for supporting future data management requirements.
Abstract:
Acquiring correct user profiles for personalized text classification is a significant challenge, since users may be uncertain when specifying their interests. Traditional approaches to user profiling adopt machine learning (ML) to automatically discover classification knowledge from explicit user feedback describing personal interests. However, the accuracy of ML-based methods often cannot be significantly improved because of the term independence assumption and the associated uncertainties. This paper presents a novel relevance feedback approach for personalized text classification. It applies data mining to discover knowledge from relevant and non-relevant text, and constrains the specific knowledge with reasoning rules to eliminate conflicting information. We also developed a Dempster-Shafer (DS) approach to utilise the specific knowledge to build high-quality data models for classification. Experimental results on Reuters Corpus Volume 1 and TREC topics show that the proposed technique achieves encouraging performance compared with state-of-the-art relevance feedback models.
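The following is a minimal sketch of Dempster's rule of combination over a two-hypothesis frame ({relevant, non-relevant}); the mass values and evidence sources are illustrative only and do not come from the paper.

```python
# Sketch of Dempster's rule: two mass functions over the same frame are
# combined by pooling agreeing focal elements and normalising out conflict.
from itertools import product

def combine(m1: dict, m2: dict) -> dict:
    combined, conflict = {}, 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b                        # intersection of focal elements
        if inter:
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            conflict += wa * wb              # mass assigned to the empty set
    return {k: v / (1.0 - conflict) for k, v in combined.items()}

R, N = frozenset({"relevant"}), frozenset({"non-relevant"})
theta = R | N                                # the whole frame (ignorance)
m_terms = {R: 0.6, N: 0.1, theta: 0.3}       # hypothetical evidence from mined term patterns
m_rules = {R: 0.5, N: 0.2, theta: 0.3}       # hypothetical evidence from reasoning rules
print(combine(m_terms, m_rules))
```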
Abstract:
This paper studies the missing covariate problem, which is often encountered in survival analysis. Three covariate imputation methods are employed, and the effectiveness of each is evaluated within a hazard prediction framework. Data from a typical engineering asset are used in the case study, with covariate values at some time steps deliberately discarded to generate an incomplete covariate set. It is found that although mean imputation is the simplest of the methods, its results can differ greatly from the true values of the missing covariates. The study also shows that, in general, results from the regression method are more accurate than those of mean imputation, but at a higher computational cost. The Gaussian Mixture Model (GMM) method is found to be the most effective of the three in terms of both computational efficiency and prediction accuracy.
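As a rough illustration of the first two strategies, the sketch below fills a missing covariate column by the column mean and by a linear regression on the observed covariates; the column names are hypothetical, and a GMM-based variant would follow the same pattern using a fitted mixture model.

```python
# Sketch only: mean and regression imputation of a missing covariate column.
import pandas as pd
from sklearn.linear_model import LinearRegression

def impute(df: pd.DataFrame, target: str, predictors: list[str]) -> pd.DataFrame:
    out = df.copy()
    missing = out[target].isna()
    # Mean imputation: replace every gap with the column average.
    out[target + "_mean"] = out[target].fillna(out[target].mean())
    # Regression imputation: predict gaps from the observed covariates.
    model = LinearRegression().fit(out.loc[~missing, predictors], out.loc[~missing, target])
    out[target + "_reg"] = out[target]
    out.loc[missing, target + "_reg"] = model.predict(out.loc[missing, predictors])
    return out
```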
Abstract:
Cities accumulate and distribute vast sets of digital information. Many decision-making and planning processes in councils, local governments and organisations are based on both real-time and historical data. Until recently, only a small, carefully selected subset of this information had been released to the public, usually for specific purposes (e.g. train timetables or the release of planning applications through websites). This situation is, however, changing rapidly. Regulatory frameworks, such as the Freedom of Information legislation in the US, the UK, the European Union and many other countries, guarantee public access to data held by the state. One of the results of this legislation, and of changing attitudes towards open data, has been the widespread release of public information as part of recent Government 2.0 initiatives. This includes the creation of public data catalogues such as data.gov (U.S.), data.gov.uk (U.K.) and data.gov.au (Australia) at federal government level, and datasf.org (San Francisco) and data.london.gov.uk (London) at municipal level. The release of this data has opened up the possibility of a wide range of future applications and services which are now the subject of intensified research efforts. Previous research endeavours have explored the creation of specialised tools to aid decision-making by urban citizens, councils and other stakeholders (Calabrese, Kloeckl & Ratti, 2008; Paulos, Honicky & Hooker, 2009). While these initiatives represent an important step towards open data, they too often result in mere collections of data repositories. Proprietary database formats and the lack of an open application programming interface (API) limit the full potential achievable by allowing these data sets to be cross-queried. Our research, presented in this paper, looks beyond the pure release of data. It is concerned with three essential questions: First, how can data from different sources be integrated into a consistent framework and made accessible? Second, how can ordinary citizens be supported in easily composing data from different sources in order to address their specific problems? Third, what interfaces make it easy for citizens to interact with data in an urban environment, and how can data be accessed and collected?
Abstract:
Accurate and detailed road models play an important role in a number of geospatial applications, such as infrastructure planning, traffic monitoring, and driver assistance systems. In this thesis, an integrated approach for the automatic extraction of precise road features from high resolution aerial images and LiDAR point clouds is presented. A framework of road information modeling has been proposed, for rural and urban scenarios respectively, and an integrated system has been developed to deal with road feature extraction using image and LiDAR analysis. For road extraction in rural regions, a hierarchical image analysis is first performed to maximize the exploitation of road characteristics at different resolutions. The rough locations and directions of roads are provided by the road centerlines detected in low resolution images, both of which can be further employed to facilitate road information generation in high resolution images. The histogram thresholding method is then chosen to classify road details in high resolution images, where color space transformation is used for data preparation. After the road surface detection, anisotropic Gaussian and Gabor filters are employed to enhance road pavement markings while suppressing other ground objects, such as vegetation and houses. Afterwards, pavement markings are obtained from the filtered image using Otsu's thresholding method. The final road model is generated by superimposing the lane markings on the road surfaces, where the digital terrain model (DTM) produced by LiDAR data can also be combined to obtain the 3D road model. As the extraction of roads in urban areas is greatly affected by buildings, shadows, vehicles, and parking lots, we combine high resolution aerial images and dense LiDAR data to fully exploit the precise spectral and horizontal spatial resolution of aerial images and the accurate vertical information provided by airborne LiDAR. Object-oriented image analysis methods are employed for feature classification and road detection in aerial images. In this process, we first utilize an adaptive mean shift (MS) segmentation algorithm to segment the original images into meaningful object-oriented clusters. Then the support vector machine (SVM) algorithm is further applied to the MS-segmented image to extract road objects. The road surface detected in LiDAR intensity images is used as a mask to remove the effects of shadows and trees. In addition, the normalized digital surface model (nDSM) obtained from LiDAR is employed to filter out other above-ground objects, such as buildings and vehicles. The proposed road extraction approaches are tested using rural and urban datasets respectively. The rural road extraction method is performed using pan-sharpened aerial images of the Bruce Highway, Gympie, Queensland. The road extraction algorithm for urban regions is tested using the datasets of Bundaberg, which combine aerial imagery and LiDAR data. Quantitative evaluation of the extracted road information for both datasets has been carried out. The experiments and the evaluation results using Gympie datasets show that more than 96% of the road surfaces and over 90% of the lane markings are accurately reconstructed, and the false alarm rates for road surfaces and lane markings are below 3% and 2% respectively. For the urban test sites of Bundaberg, more than 93% of the road surface is correctly reconstructed, and the mis-detection rate is below 10%.
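As an illustrative sketch of one step of the rural pipeline (colour space transformation followed by Otsu thresholding), the fragment below assumes scikit-image is available; it is not the thesis implementation, and the choice of the value channel as a proxy for bright pavement markings is an assumption.

```python
# Minimal sketch: threshold a brightness channel with Otsu's method to obtain
# a binary mask of bright pixels (candidate pavement markings).
import numpy as np
from skimage import color, filters

def extract_markings(aerial_rgb: np.ndarray) -> np.ndarray:
    hsv = color.rgb2hsv(aerial_rgb)          # colour space transformation
    value = hsv[..., 2]                      # brightness channel
    t = filters.threshold_otsu(value)        # Otsu's method picks the split point
    return value > t                         # binary mask of bright pixels
```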
Abstract:
Typical reference year (TRY) weather data is often used to represent the long-term weather pattern for building simulation and design. Through the analysis of ten years of historical hourly weather data for seven major Australian capital cities using the frequencies procedure of descriptive statistics analysis (in SPSS software), this paper investigates:
• the closeness of TRY weather data in representing the long-term weather pattern;
• the variations and common features that may exist between relatively hot and cold years.
It is found that, for the given set of input data and in comparison with the other weather elements, the discrepancy between the TRY and multiple years is much smaller for dry bulb temperature, relative humidity and global solar irradiance. The overall distribution patterns of key weather elements are also generally similar between the hot and cold years, but with some shift and/or small distortion. There is little common tendency of change between the hot and the cold years for different weather variables at different study locations.
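A rough equivalent of the frequencies comparison in pandas rather than SPSS might look like the sketch below; the file and column names are hypothetical.

```python
# Sketch: compare the binned frequency distribution of dry bulb temperature in
# a TRY file against ten years of hourly records (assumed file/column names).
import pandas as pd

try_df = pd.read_csv("try_hourly.csv")
hist_df = pd.read_csv("ten_year_hourly.csv")

bins = range(-10, 51, 2)                       # 2 degC temperature bins
try_freq = pd.cut(try_df["dry_bulb_C"], bins).value_counts(normalize=True).sort_index()
hist_freq = pd.cut(hist_df["dry_bulb_C"], bins).value_counts(normalize=True).sort_index()

# Largest discrepancy between the TRY and long-term relative frequencies.
print((try_freq - hist_freq).abs().max())
```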
Abstract:
Concerns regarding groundwater contamination with nitrate and the long-term sustainability of groundwater resources have prompted the development of a multi-layered three-dimensional (3D) geological model to characterise the aquifer geometry of the Wairau Plain, Marlborough District, New Zealand. The 3D geological model, which consists of eight litho-stratigraphic units, has subsequently been used to synthesise hydrogeological and hydrogeochemical data for different aquifers in an approach that aims to demonstrate how integration of water chemistry data within the physical framework of a 3D geological model can help to better understand and conceptualise groundwater systems in complex geological settings. Multivariate statistical techniques (e.g. Principal Component Analysis and Hierarchical Cluster Analysis) were applied to groundwater chemistry data to identify hydrochemical facies which are characteristic of distinct evolutionary pathways and a common hydrologic history of groundwaters. Principal Component Analysis on hydrochemical data demonstrated that natural water-rock interactions, redox potential and human agricultural impact are the key controls on groundwater quality in the Wairau Plain. Hierarchical Cluster Analysis revealed distinct hydrochemical water quality groups in the Wairau Plain groundwater system. Visualisation of the results of the multivariate statistical analyses and of the distribution of groundwater nitrate concentrations in the context of aquifer lithology highlighted the link between groundwater chemistry and the lithology of host aquifers. The methodology followed in this study can be applied in a variety of hydrogeological settings to synthesise geological, hydrogeological and hydrochemical data and present them in a format readily understood by a wide range of stakeholders. This enables more efficient communication of the results of scientific studies to the wider community.
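A minimal sketch of a PCA plus Ward-linkage hierarchical clustering workflow, using scikit-learn and SciPy with illustrative ion columns, a hypothetical input file and an assumed number of facies groups, is shown below.

```python
# Sketch only: standardise major-ion chemistry, extract principal components,
# and cut a Ward dendrogram into a fixed number of hydrochemical groups.
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from scipy.cluster.hierarchy import linkage, fcluster

chem = pd.read_csv("groundwater_chemistry.csv")                  # hypothetical file
ions = chem[["Ca", "Mg", "Na", "K", "Cl", "HCO3", "SO4", "NO3"]]  # illustrative columns
X = StandardScaler().fit_transform(ions)

scores = PCA(n_components=2).fit_transform(X)                     # dominant controls on water quality
chem["facies"] = fcluster(linkage(X, method="ward"), t=4, criterion="maxclust")  # assumed 4 groups
```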
Abstract:
During several natural disasters in recent years, Twitter has been found to play an important role as an additional medium for many-to-many crisis communication. Emergency services are successfully using Twitter to inform the public about current developments, and are increasingly also attempting to source first-hand situational information from Twitter feeds (such as relevant hashtags). However, further study of the uses of Twitter during natural disasters relies on the development of flexible and reliable research infrastructure for tracking and analysing Twitter feeds at scale and in close to real time. This article outlines two approaches to the development of such infrastructure: one builds on the readily available open-source platform yourTwapperkeeper to provide a low-cost, simple, and basic solution; the other establishes a more powerful and flexible framework by drawing on highly scalable, state-of-the-art technology.
Abstract:
This thesis provides a query model suitable for context sensitive access to a wide range of distributed linked datasets which are available to scientists using the Internet. The model is designed based on scientific research standards which require scientists to provide replicable methods in their publications. Although there are query models available that provide limited replicability, they do not contextualise the process whereby different scientists select dataset locations based on their trust and physical location. In different contexts, scientists need to perform different data cleaning actions, independent of the overall query, and the model was designed to accommodate this function. The query model was implemented as a prototype web application and its features were verified through its use as the engine behind a major scientific data access site, Bio2RDF.org. The prototype showed that it was possible to have context sensitive behaviour for each of the three mirrors of Bio2RDF.org using a single set of configuration settings. The prototype provided executable query provenance that could be attached to scientific publications to fulfil replicability requirements. The model was designed to make it simple to independently interpret and execute the query provenance documents using context specific profiles, without modifying the original provenance documents. Experiments using the prototype as the data access tool in workflow management systems confirmed that the design of the model made it possible to replicate results in different contexts with minimal additions, and no deletions, to query provenance documents.
Abstract:
In recent times, light gauge steel framed (LSF) structures, such as cold-formed steel wall systems, are increasingly used, but without a full understanding of their fire performance. Traditionally the fire resistance rating of these load-bearing LSF wall systems is based on approximate prescriptive methods developed based on limited fire tests. Very often they are limited to standard wall configurations used by the industry. Increased fire rating is provided simply by adding more plasterboards to these walls. This is not an acceptable situation as it not only inhibits innovation and structural and cost efficiencies but also casts doubt over the fire safety of these wall systems. Hence a detailed fire research study into the performance of LSF wall systems was undertaken using full scale fire tests and extensive numerical studies. A new composite wall panel developed at QUT was also considered in this study, where the insulation was used externally between the plasterboards on both sides of the steel wall frame instead of locating it in the cavity. Three full scale fire tests of LSF wall systems built using the new composite panel system were undertaken at a higher load ratio using a gas furnace designed to deliver heat in accordance with the standard time temperature curve in AS 1530.4 (SA, 2005). Fire tests included the measurements of load-deformation characteristics of LSF walls until failure as well as associated time-temperature measurements across the thickness and along the length of all the specimens. Tests of LSF walls under axial compression load have shown the improvement to their fire performance and fire resistance rating when the new composite panel was used. Hence this research recommends the use of the new composite panel system for cold-formed LSF walls. The numerical study was undertaken using a finite element program ABAQUS. The finite element analyses were conducted under both steady state and transient state conditions using the measured hot and cold flange temperature distributions from the fire tests. The elevated temperature reduction factors for mechanical properties were based on the equations proposed by Dolamune Kankanamge and Mahendran (2011). These finite element models were first validated by comparing their results with experimental test results from this study and Kolarkar (2010). The developed finite element models were able to predict the failure times within 5 minutes. The validated model was then used in a detailed numerical study into the strength of cold-formed thin-walled steel channels used in both the conventional and the new composite panel systems to increase the understanding of their behaviour under nonuniform elevated temperature conditions and to develop fire design rules. The measured time-temperature distributions obtained from the fire tests were used. Since the fire tests showed that the plasterboards provided sufficient lateral restraint until the failure of LSF wall panels, this assumption was also used in the analyses and was further validated by comparison with experimental results. Hence in this study of LSF wall studs, only the flexural buckling about the major axis and local buckling were considered. A new fire design method was proposed using AS/NZS 4600 (SA, 2005), NAS (AISI, 2007) and Eurocode 3 Part 1.3 (ECS, 2006). The importance of considering thermal bowing, magnified thermal bowing and neutral axis shift in the fire design was also investigated. 
A spreadsheet-based design tool was developed based on the above design codes to predict the failure load ratio versus time and temperature for varying LSF wall configurations including insulation. Idealised time-temperature profiles were developed based on the measured temperature values of the studs. These were used in a detailed numerical study to fully understand the structural behaviour of LSF wall panels. Appropriate equations were proposed to find the critical temperatures for different composite panels, varying in steel thickness, steel grade and screw spacing, for any load ratio. Hence useful and simple design rules were proposed based on the current cold-formed steel structures and fire design standards, and their accuracy and advantages were discussed. The results were also used to validate the fire design rules developed based on AS/NZS 4600 (SA, 2005) and Eurocode 3 Part 1.3 (ECS, 2006). This demonstrated the significant improvements to the design method when compared to the currently used prescriptive design methods for LSF wall systems under fire conditions. In summary, this research has developed comprehensive experimental and numerical thermal and structural performance data for both the conventional and the proposed new load-bearing LSF wall systems under standard fire conditions. Finite element models were developed to predict the failure times of LSF walls accurately. Idealised hot flange temperature profiles were developed for non-insulated, cavity insulated and externally insulated load-bearing wall systems. Suitable fire design rules and spreadsheet-based design tools were developed based on the existing standards to predict the ultimate failure load, failure times and failure temperatures of LSF wall studs. Simplified equations were proposed to find the critical temperatures for varying wall panel configurations and load ratios. The results from this research are useful to both structural and fire engineers and researchers. Most importantly, this research has significantly improved the knowledge and understanding of cold-formed LSF load-bearing walls under standard fire conditions.
Abstract:
The rapid growth in the number of social network users, and the amount of information that a social network requires about its users, make traditional matching systems insufficiently adept at matching users within social networks. This paper introduces the use of clustering to form communities of users and then uses these communities to generate matches. Forming communities within a social network reduces the number of users that the matching system needs to consider, and helps to overcome other problems from which social networks suffer, such as the absence of activity information about new users. The proposed system has been evaluated on a dataset obtained from an online dating website. Empirical analysis shows that the accuracy of the matching process increases when community information is used.
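A minimal sketch of the general idea (not the paper's system) is shown below: users are clustered on profile feature vectors, and candidate matches are then scored only within the requesting user's community; the feature representation, similarity measure and cluster count are assumptions.

```python
# Sketch: k-means communities, then cosine-similarity ranking within a community.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics.pairwise import cosine_similarity

def match_within_community(profiles: np.ndarray, user_idx: int, k: int = 20, top_n: int = 5):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(profiles)
    community = np.flatnonzero(labels == labels[user_idx])        # candidates in the same cluster
    sims = cosine_similarity(profiles[user_idx:user_idx + 1], profiles[community])[0]
    ranked = community[np.argsort(-sims)]                         # most similar first
    return [i for i in ranked if i != user_idx][:top_n]
```

Restricting the similarity computation to one community is what keeps the candidate set small as the number of users grows.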
Abstract:
Recent increases in cycling have led to many media articles highlighting concerns about interactions between cyclists and pedestrians on footpaths and off-road paths. Under the Australian Road Rules, adults are not allowed to ride on footpaths unless accompanying a child 12 years of age or younger. However, this rule does not apply in Queensland. This paper reviews international studies that examine the safety of footpath cycling for both cyclists and pedestrians, and relevant Australian crash and injury data. The results of a survey of more than 2,500 Queensland adult cyclists are presented in terms of the frequency of footpath cycling, the characteristics of those cyclists and the characteristics of self-reported footpath crashes. A third of the respondents reported riding on the footpath and, of those, about two-thirds did so reluctantly. Riding on the footpath was more common for utilitarian trips and for new riders, although the average distance ridden on footpaths was greater for experienced riders. About 5% of distance ridden and a similar percentage of self-reported crashes occurred on footpaths. These data are discussed in terms of the Safe Systems principle of separating road users with vastly different levels of kinetic energy. The paper concludes that footpaths are important facilities for both inexperienced and experienced riders and for utilitarian riding, especially in locations that riders consider do not provide a safe system for cycling.
Abstract:
A 4-cylinder Ford 2701C test engine was used in this study to explore the impact of ethanol fumigation on gaseous and particle emission concentrations. The fumigation technique delivered vaporised ethanol into the intake manifold of the engine, using an injector, a pump and pressure regulator, a heat exchanger for vaporising ethanol, and a separate fuel tank and lines. Gaseous (nitric oxide (NO), carbon monoxide (CO) and hydrocarbon (HC)) and particulate (particle mass (PM2.5) and particle number) emissions testing was conducted at intermediate speed (1700 rpm) using four load settings with ethanol substitution percentages ranging from 10% to 40% (by energy). With ethanol fumigation, NO and PM2.5 emissions were reduced, whereas CO and HC emissions increased considerably and particle number emissions increased at most test settings. It was found that ethanol fumigation reduced the excess air factor for the engine, and this led to increased emissions of CO and HC but decreased emissions of NO. PM2.5 emissions were reduced with ethanol fumigation, as ethanol has a very low “sooting” tendency. This is due to the higher hydrogen-to-carbon ratio of this fuel, and also because ethanol does not contain aromatics, which are known soot precursors. The use of a diesel oxidation catalyst (as an after-treatment device) is recommended to achieve a reduction in the four pollutants that are currently regulated for compression ignition engines. The increase in particle number emissions with ethanol fumigation was due to the formation of volatile (organic) particles; consequently, using a diesel oxidation catalyst will also assist in reducing particle number emissions.
Abstract:
Nineteen studies met the inclusion criteria. A skin temperature reduction of 5–15 °C, in accordance with the recent PRICE (Protection, Rest, Ice, Compression and Elevation) guidelines, was achieved using cold air, ice massage, crushed ice, cryotherapy cuffs, ice packs, and cold water immersion. There is evidence supporting the use and effectiveness of thermal imaging to assess skin temperature following the application of cryotherapy. Thermal imaging is a safe and non-invasive method of collecting skin temperature data. Although further research is required to establish specific guidelines and protocols, thermal imaging appears to be an accurate and reliable method of collecting skin temperature data following cryotherapy. There is currently ambiguity regarding the optimal skin temperature reduction in medical and sporting settings. However, this review highlights the ability of several different modalities of cryotherapy to reduce skin temperature.