750 results for zero-inflated data


Abstract:

The rapid growth in the number of social network users, and the amount of information a social network requires about them, leaves traditional matching systems poorly equipped to match users within social networks. This paper introduces the use of clustering to form communities of users and then uses these communities to generate matches. Forming communities within a social network reduces the number of users that the matching system needs to consider and helps to overcome other problems from which social networks suffer, such as the absence of activity information for a new user. The proposed system has been evaluated on a dataset obtained from an online dating website. Empirical analysis shows that the accuracy of the matching process increases when the community information is used.

Abstract:

Recent increases in cycling have led to many media articles highlighting concerns about interactions between cyclists and pedestrians on footpaths and off-road paths. Under the Australian Road Rules, adults are not allowed to ride on footpaths unless accompanying a child 12 years of age or younger. However, this rule does not apply in Queensland. This paper reviews international studies that examine the safety of footpath cycling for both cyclists and pedestrians, and relevant Australian crash and injury data. The results of a survey of more than 2,500 Queensland adult cyclists are presented in terms of the frequency of footpath cycling, the characteristics of those cyclists and the characteristics of self-reported footpath crashes. A third of the respondents reported riding on the footpath and, of those, about two-thirds did so reluctantly. Riding on the footpath was more common for utilitarian trips and for new riders, although the average distance ridden on footpaths was greater for experienced riders. About 5% of distance ridden and a similar percentage of self-reported crashes occurred on footpaths. These data are discussed in terms of the Safe Systems principle of separating road users with vastly different levels of kinetic energy. The paper concludes that footpaths are important facilities for both inexperienced and experienced riders and for utilitarian riding, especially in locations riders consider do not provide a safe system for cycling.

Abstract:

Wavelet transforms (WTs) are a powerful tool for extracting localized variations in non-stationary signals and have been applied in traffic engineering; however, these applications lack some important theoretical fundamentals. In particular, there is little guidance on selecting an appropriate WT across potential transport applications. The research described in this paper contributes uniquely to the literature by first describing a numerical experiment that demonstrates the shortcomings of commonly used data processing techniques in traffic engineering (i.e., averaging, moving averaging, second-order difference, oblique cumulative curve, and short-time Fourier transform). It then mathematically describes the WT's ability to detect singularities in traffic data. Next, the selection of a suitable WT for a particular research topic in traffic engineering is discussed in detail by objectively and quantitatively comparing candidate wavelets' performance in a numerical experiment. Finally, based on several case studies using both loop detector data and vehicle trajectories, it is shown that selecting a suitable wavelet largely depends on the specific research topic, and that the Mexican hat wavelet generally gives satisfactory performance in detecting singularities in traffic and vehicular data.
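
As a rough illustration of how a Mexican hat wavelet can localize a singularity, the sketch below applies a single-scale wavelet transform to a synthetic speed series containing one abrupt drop. The signal, the scale of 4 samples, the edge handling, and the function names are assumptions made for illustration; none of them is taken from the paper.

```python
import math

def mexican_hat(t):
    """Mexican hat (Ricker) mother wavelet, normalized to unit L2 norm."""
    norm = 2.0 / (math.sqrt(3.0) * math.pi ** 0.25)
    return norm * (1.0 - t * t) * math.exp(-t * t / 2.0)

def wavelet_coefficients(signal, scale):
    """Wavelet coefficients of `signal` at a single scale (in samples)."""
    half = int(5 * scale)  # the wavelet is negligible beyond 5 scales
    n = len(signal)
    coeffs = []
    for b in range(n):
        total = 0.0
        for k in range(-half, half + 1):
            # Hold the edge values to limit boundary artifacts (an assumption).
            idx = min(max(b + k, 0), n - 1)
            total += signal[idx] * mexican_hat(k / scale)
        coeffs.append(total / math.sqrt(scale))
    return coeffs

# Hypothetical speed series (km/h): free flow, then an abrupt drop at sample 100.
speeds = [100.0] * 100 + [40.0] * 100
coeffs = wavelet_coefficients(speeds, scale=4)

# Wavelet energy is near zero where the signal is constant and large near the jump.
energy = [c * c for c in coeffs]
peak = max(range(len(energy)), key=energy.__getitem__)
print("largest wavelet energy at sample", peak)
```

For an ideal step, the Mexican hat response crosses zero at the jump and reaches its extremes roughly one scale on either side, so the energy peak lands within a few samples of the transition rather than exactly on it.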

Abstract:

Encryption is a well-established technology for protecting sensitive data. However, once encrypted, the data can no longer be easily queried. The performance of the database depends on how the sensitive data is encrypted. In this paper we review conventional encryption methods that can be partially queried and propose an encryption method for numerical data that can be queried effectively. The proposed system includes the design of the service scenario and the metadata.

Abstract:

The National Road Safety Strategy 2011-2020 outlines plans to reduce the burden of road trauma via improvements and interventions relating to safe roads, safe speeds, safe vehicles, and safe people. It also highlights that a key aspect in achieving these goals is the availability of comprehensive data on the issue. The use of data is essential so that more in-depth epidemiologic studies of risk can be conducted as well as to allow effective evaluation of road safety interventions and programs. Before utilising data to evaluate the efficacy of prevention programs it is important for a systematic evaluation of the quality of underlying data sources to be undertaken to ensure any trends which are identified reflect true estimates rather than spurious data effects. However, there has been little scientific work specifically focused on establishing core data quality characteristics pertinent to the road safety field and limited work undertaken to develop methods for evaluating data sources according to these core characteristics. There are a variety of data sources in which traffic-related incidents and resulting injuries are recorded, which are collected for a variety of defined purposes. These include police reports, transport safety databases, emergency department data, hospital morbidity data and mortality data to name a few. However, as these data are collected for specific purposes, each of these data sources suffers from some limitations when seeking to gain a complete picture of the problem. Limitations of current data sources include: delays in data being available, lack of accurate and/or specific location information, and an underreporting of crashes involving particular road user groups such as cyclists. This paper proposes core data quality characteristics that could be used to systematically assess road crash data sources to provide a standardised approach for evaluating data quality in the road safety field. 
The potential for data linkage to qualitatively and quantitatively improve the quality and comprehensiveness of road crash data is also discussed.

Abstract:

- International recognition of the need for a public health response to child maltreatment
- Need for early intervention at the health system level
- Important role of health professionals in identifying, reporting and documenting suspicion of maltreatment
- Up to 10% of all children presenting at EDs are victims; without identification, 35% are reinjured and 5% die
- In Qld, mandatory reporting requirement for doctors and nurses for suspected abuse or neglect

Abstract:

We report three developments toward resolving the challenge of the apparent basal polytomy of neoavian birds. First, we describe improved conditional down-weighting techniques to reduce noise relative to signal for deeper divergences and find increased agreement between data sets. Second, we present formulae for calculating the probabilities of finding predefined groupings in the optimal tree. Finally, we report a significant increase in data: nine new mitochondrial (mt) genomes (the dollarbird, New Zealand kingfisher, great potoo, Australian owlet-nightjar, white-tailed trogon, barn owl, a roadrunner [a ground cuckoo], New Zealand long-tailed cuckoo, and the peach-faced lovebird), which together provide data for each of the six main groups of Neoaves proposed by Cracraft J (2001). We use his six main groups of modern birds as priors for evaluation of results. These include passerines, cuckoos, parrots, and three other groups termed “WoodKing” (woodpeckers/rollers/kingfishers), “SCA” (owls/potoos/owlet-nightjars/hummingbirds/swifts), and “Conglomerati.” In general, the support is highly significant, with just two exceptions: the owls move from the “SCA” group to the raptors, particularly accipitrids (buzzards/eagles) and the osprey, and the shorebirds may be an independent group from the rest of the “Conglomerati”. Molecular dating mt genomes support a major diversification of at least 12 neoavian lineages in the Late Cretaceous. Our results form a basis for further testing with both nuclear-coding sequences and rare genomic changes.

Abstract:

Mandatory data breach notification laws have been a significant legislative reform in response to unauthorized disclosures of personal information by public and private sector organizations. These laws originated in the state-based legislatures of the United States during the last decade and have subsequently garnered worldwide legislative interest. We contend that there are conceptual and practical concerns regarding mandatory data breach notification laws which limit the scope of their applicability, particularly in relation to existing information privacy law regimes. We outline these concerns here, in the light of recent European Union and Australian legal developments in this area.

Abstract:

In the medical and healthcare arena, patients' data is not just their own personal history but also a valuable large dataset for finding solutions for diseases. While electronic medical records are becoming popular and are used in healthcare workplaces such as hospitals, by insurance companies, and by major stakeholders such as physicians and their patients, access to such information should be handled in a way that preserves privacy and security. Thus, finding the best way to keep the data secure has become an important issue in the area of database security. Sensitive medical data should be encrypted in databases. There are many encryption/decryption techniques and algorithms for preserving privacy and security. Their performance is currently an important factor when medical data is managed in databases. Another important factor is that stakeholders should choose cost-effective ways to reduce the total cost of ownership. As an alternative, DAS (Data as Service) is a popular outsourcing model that satisfies the cost-effectiveness requirement, but it requires that the encryption/decryption modules be handled by trustworthy stakeholders. This research project focuses on query response times in a DAS model (AES-DAS) and compares the outsourcing model with the in-house model, which incorporates the Microsoft built-in encryption scheme in SQL Server. The project includes building a prototype of medical database schemas. Two types of simulations are carried out. The first stage includes 6 databases in order to measure the performance of plain text, Microsoft built-in encryption and AES-DAS. In particular, AES-DAS incorporates implementations of symmetric key encryption such as AES (Advanced Encryption Standard) and a Bucket indexing processor using a Bloom filter. 
The results are categorised by character type, numeric type, range queries, range queries using the Bucket index, and aggregate queries. The second stage is a scalability test from 5K to 2560K records. The main result of these simulations is that, as an outsourcing model, AES-DAS using the Bucket index is around 3.32 times faster than normal AES-DAS with 70 partitions and 10K-record databases. Retrieving numeric data takes less time than retrieving character data in AES-DAS. The aggregation query response time in AES-DAS is not as consistent as that in the MS built-in encryption scheme. The scalability test shows that once the DBMS reaches a certain threshold, the query response time rapidly becomes slower. However, there is more to investigate in order to bring about other outcomes and to construct a secured EMR (Electronic Medical Record) more efficiently from these simulations.
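
The general idea behind a bucket index for range queries over encrypted numeric data can be sketched as follows. This is not the AES-DAS implementation described in the abstract: the equal-width partitioning, the function names, and the sample data are assumptions, the Bloom filter component is omitted, and `toy_encrypt` is a reversible encoding standing in for AES so that the example runs with the standard library only.

```python
import base64

def toy_encrypt(value):
    """Stand-in for AES: a reversible encoding only, NOT real encryption."""
    return base64.b64encode(str(value).encode()).decode()

def toy_decrypt(token):
    """Inverse of toy_encrypt."""
    return int(base64.b64decode(token).decode())

def bucket_id(value, low, high, partitions):
    """Map a value in [low, high] to one of `partitions` equal-width buckets."""
    width = (high - low) / partitions
    return min(int((value - low) / width), partitions - 1)

def store(values, low, high, partitions):
    """Rows held by the service provider: (bucket id, ciphertext) pairs."""
    return [(bucket_id(v, low, high, partitions), toy_encrypt(v)) for v in values]

def range_query(rows, lo, hi, low, high, partitions):
    """Fetch candidate buckets on the server, then decrypt and filter on the client."""
    first = bucket_id(lo, low, high, partitions)
    last = bucket_id(hi, low, high, partitions)
    candidates = [cipher for b, cipher in rows if first <= b <= last]
    matches = [v for v in map(toy_decrypt, candidates) if lo <= v <= hi]
    return matches, len(candidates)

# Hypothetical data: values in [0, 120] split into 12 buckets of width 10.
rows = store([3, 17, 25, 31, 38, 64, 90], low=0, high=120, partitions=12)
matches, fetched = range_query(rows, 20, 35, low=0, high=120, partitions=12)
print(matches, "matched out of", fetched, "rows fetched")
```

The server sees only bucket ids, so it can discard most rows without decrypting anything; the client pays for decrypting a few false positives per query. More partitions reduce the false positives but reveal more about the distribution of the plaintext.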

Abstract:

The National Morbidity, Mortality, and Air Pollution Study (NMMAPS) was designed to examine the health effects of air pollution in the United States. The primary question was whether particulate matter was responsible for the associations between air pollution and daily mortality. Secondary questions concerned measurement error in air pollution and mortality displacement [1]. Since then, NMMAPS has been used to answer many important questions in environmental epidemiology...

Abstract:

The effects of tumour motion during radiation therapy delivery have been widely investigated. Motion effects have become increasingly important with the introduction of dynamic radiotherapy delivery modalities such as enhanced dynamic wedges (EDWs) and intensity modulated radiation therapy (IMRT), where a dynamically collimated radiation beam is delivered to the moving target, resulting in dose blurring and interplay effects that are a consequence of the combined tumor and beam motion. Prior to this work, reported studies on EDW-based interplay effects have been restricted to the use of experimental methods for assessing single-field non-fractionated treatments. In this work, the interplay effects have been investigated for EDW treatments. Single and multiple field treatments have been studied using experimental and Monte Carlo (MC) methods. Initially this work experimentally studies interplay effects for single-field non-fractionated EDW treatments, using radiation dosimetry systems placed on a sinusoidally moving platform. A number of wedge angles (60°, 45° and 15°), field sizes (20 × 20, 10 × 10 and 5 × 5 cm²), amplitudes (10-40 mm in steps of 10 mm) and periods (2 s, 3 s, 4.5 s and 6 s) of tumor motion are analysed (using gamma analysis) for parallel and perpendicular motions (where the tumor and jaw motions are either parallel or perpendicular to each other). For parallel motion it was found that both the amplitude and period of tumor motion affect the interplay; this becomes more prominent where the collimator and tumor speeds become identical. For perpendicular motion the amplitude of tumor motion is the dominant factor, whereas varying the period of tumor motion has no observable effect on the dose distribution. The wedge angle results suggest that the use of a large wedge angle generates greater dose variation for both parallel and perpendicular motions. 
The use of a small field size with a large tumor motion results in the loss of the wedged dose distribution for both parallel and perpendicular motion. From these single field measurements a motion amplitude and period have been identified which show the poorest agreement between the target motion and dynamic delivery, and these are used as the 'worst case motion parameters'. The experimental work is then extended to multiple-field fractionated treatments. Here a number of pre-existing, multiple-field, wedged lung plans are delivered to the radiation dosimetry systems, employing the worst case motion parameters. Moreover, a four field EDW lung plan (using a 4D CT data set) is delivered to the IMRT quality control phantom with dummy tumor insert over four fractions using the worst case parameters, i.e. 40 mm amplitude and 6 s period. The analysis of the film doses using gamma analysis at 3%/3 mm indicates that the interplay effects do not average out for this particular study, with a gamma pass rate of 49%. To enable Monte Carlo modelling of the problem, the DYNJAWS component module (CM) of the BEAMnrc user code is validated and automated. DYNJAWS has been recently introduced to model the dynamic wedges. DYNJAWS is therefore commissioned for 6 MV and 10 MV photon energies. It is shown that this CM can accurately model the EDWs for a number of wedge angles and field sizes. The dynamic and step and shoot modes of the CM are compared for their accuracy in modelling the EDW. It is shown that dynamic mode is more accurate. An automation of the DYNJAWS specific input file has been carried out. This file specifies the probability of selection of a subfield and the respective jaw coordinates. This automation simplifies the generation of the BEAMnrc input files for DYNJAWS. The DYNJAWS commissioned model is then used to study multiple field EDW treatments using MC methods. 
The 4D CT data of an IMRT phantom with the dummy tumor is used to produce a set of Monte Carlo simulation phantoms, onto which the delivery of single field and multiple field EDW treatments is simulated. A number of static and motion multiple field EDW plans have been simulated. The comparison of dose volume histograms (DVHs) and gamma volume histograms (GVHs) for four field EDW treatments (where the collimator and patient motion is in the same direction) using small (15°) and large (60°) wedge angles indicates a greater mismatch between the static and motion cases for the large wedge angle. Finally, to use gel dosimetry as a validation tool, a new technique called the 'zero-scan method' is developed for reading the gel dosimeters with x-ray computed tomography (CT). It has been shown that multiple scans of a gel dosimeter (in this case 360 scans) can be used to reconstruct a zero scan image. This zero scan image has a similar precision to an image obtained by averaging the CT images, without the additional dose delivered by the CT scans. In this investigation the interplay effects have been studied for single and multiple field fractionated EDW treatments using experimental and Monte Carlo methods. To use the Monte Carlo methods, the DYNJAWS component module of the BEAMnrc code has been validated and automated and further used to study the interplay for multiple field EDW treatments. The zero-scan method, a new gel dosimetry readout technique, has been developed for reading the gel images using x-ray CT without losing precision and accuracy.
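
The gamma analysis referred to in this abstract combines a dose-difference criterion with a distance-to-agreement criterion. A minimal one-dimensional sketch of the standard gamma index is given below, using the 3%/3 mm criteria quoted in the abstract. The 1D grid, the global normalization to the maximum reference dose, the synthetic ramp profiles, and the function name are assumptions for illustration; this is not the analysis software used in the thesis.

```python
import math

def gamma_pass_rate(reference, evaluated, spacing_mm, dose_tol=0.03, dta_mm=3.0):
    """Fraction of reference points with gamma <= 1 (1D, global normalization)."""
    dose_crit = dose_tol * max(reference)  # e.g. 3% of the maximum reference dose
    passed = 0
    for i, d_ref in enumerate(reference):
        best = math.inf
        for j, d_eval in enumerate(evaluated):
            dist = (j - i) * spacing_mm
            # Combined distance and dose deviation, each scaled by its criterion.
            gamma = math.hypot(dist / dta_mm, (d_eval - d_ref) / dose_crit)
            best = min(best, gamma)
        if best <= 1.0:
            passed += 1
    return passed / len(reference)

# Hypothetical profiles on a 1 mm grid: a linear dose ramp and shifted copies of it.
reference = [float(x) for x in range(0, 101, 2)]        # 51 points, dose 0 to 100
shifted_2mm = [max(d - 4.0, 0.0) for d in reference]    # ramp displaced by 2 mm
shifted_10mm = [max(d - 20.0, 0.0) for d in reference]  # ramp displaced by 10 mm

rate_same = gamma_pass_rate(reference, reference, spacing_mm=1.0)
rate_2mm = gamma_pass_rate(reference, shifted_2mm, spacing_mm=1.0)
rate_10mm = gamma_pass_rate(reference, shifted_10mm, spacing_mm=1.0)
print(rate_same, rate_2mm, rate_10mm)
```

A displacement smaller than the distance criterion still passes at almost every point, while a displacement well beyond it fails across the gradient region, which is how motion-induced mismatch reduces the pass rate.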

Abstract:

Monitoring environmental health is becoming increasingly important as human activity and climate change place greater pressure on global biodiversity. Acoustic sensors provide the ability to collect data passively, objectively and continuously across large areas for extended periods. While these factors make acoustic sensors attractive as autonomous data collectors, there are significant issues associated with large-scale data manipulation and analysis. We present our current research into techniques for analysing large volumes of acoustic data efficiently. We provide an overview of a novel online acoustic environmental workbench and discuss a number of approaches to scaling analysis of acoustic data: online collaboration, and manual, automatic and human-in-the-loop analysis.

Abstract:

Distraction whilst driving on an approach to a signalized intersection is particularly dangerous, as potential vehicular conflicts and resulting angle collisions tend to be severe. This study examines the decisions of distracted drivers during the onset of amber lights. Driving simulator data were obtained from a sample of 58 drivers under baseline and handheld mobile phone conditions at the University of Iowa - National Advanced Driving Simulator. Explanatory variables include age, gender, cell phone use, distance to stop-line, and speed. An iterative combination of decision tree and logistic regression analyses is employed to identify main effects, non-linearities, and interaction effects. Results show that novice (16-17 years) and younger (18-25 years) drivers had heightened amber light running risk while distracted by a cell phone, and speed and distance thresholds yielded significant interaction effects. Driver experience captured by age has a multiplicative effect with distraction, making the combined effect of being inexperienced and distracted particularly risky. Solutions are needed to combat the use of mobile phones whilst driving.

Abstract:

Most approaches to business process compliance are restricted to the analysis of the structure of processes. It has been argued that full regulatory compliance requires information not only on the structure of processes but also on what the tasks in a process do. To this end, Governatori and Sadiq [2007] proposed to extend business processes with semantic annotations. We propose a methodology to automatically extract one kind of such annotation: those related to the data schema and templates linked to the various tasks in a business process.

Abstract:

The quadrupole coupling constants (qcc) for ³⁹K and ²³Na ions in glycerol have been calculated from linewidths measured as a function of temperature (which in turn results in changes in solution viscosity). The qcc of ³⁹K in glycerol is found to be 1.7 MHz, and that of ²³Na is 1.6 MHz. The relaxation behavior of ³⁹K and ²³Na ions in glycerol shows magnetic field and temperature dependence consistent with the equations for transverse relaxation more commonly used to describe the reorientation of nuclei in a molecular framework with intramolecular field gradients. It is shown, however, that τc is not simply proportional to the ratio of viscosity to temperature (η/T). The ³⁹K qcc in glycerol and the value of 1.3 MHz estimated for this nucleus in aqueous solution are much greater than values of 0.075 to 0.12 MHz calculated from T2 measurements of ³⁹K in freshly excised rat tissues. This indicates that, in biological samples, processes such as exchange of potassium between intracellular compartments or diffusion of ions through locally ordered regions play a significant role in determining the effective quadrupole coupling constant and correlation time governing ³⁹K relaxation. T1 and T2 measurements of rat muscle at two magnetic fields also indicate that a more complex correlation function may be required to describe the relaxation of ³⁹K in tissue. Similar results and conclusions are found for ²³Na.