424 resultados para Blog datasets
Resumo:
This thesis investigates and develops techniques for accurately detecting Internet-based Distributed Denial-of-Service (DDoS) Attacks where an adversary harnesses the power of thousands of compromised machines to disrupt the normal operations of a Web-service provider, resulting in significant down-time and financial losses. This thesis also develops methods to differentiate these attacks from similar-looking benign surges in web-traffic known as Flash Events (FEs). This thesis also addresses an intrinsic challenge in research associated with DDoS attacks, namely, the extreme scarcity of public domain datasets (due to legal and privacy issues) by developing techniques to realistically emulate DDoS attack and FE traffic.
Resumo:
Queensland University of Technology (QUT) Library offers a range of resources and services to researchers as part of their research support portfolio. This poster will present key features of two of the data management services offered by research support staff at QUT Library. The first service is QUT Research Data Finder (RDF), a product of the Australian National Data Service (ANDS) funded Metadata Stores project. RDF is a data registry (metadata repository) that aims to publicise datasets that are research outputs arising from completed QUT research projects. The second is a software and code registry, which is currently under development with the sole purpose of improving discovery of source code and software as QUT research outputs. RESEARCH DATA FINDER As an integrated metadata repository, Research Data Finder aligns with institutional sources of truth, such as QUT’s research administration system, ResearchMaster, as well as QUT’s Academic Profiles system to provide high quality data descriptions that increase awareness of, and access to, shareable research data. The repository and its workflows are designed to foster better data management practices, enhance opportunities for collaboration and research, promote cross-disciplinary research and maximise the impact of existing research data sets. SOFTWARE AND CODE REGISTRY The QUT Library software and code registry project stems from concerns amongst researchers with regards to development activities, storage, accessibility, discoverability and impact, sharing, copyright and IP ownership of software and code. As a result, the Library is developing a registry for code and software research outputs, which will use existing Research Data Finder architecture. The underpinning software for both registries is VIVO, open source software developed by Cornell University. The registry will use the Research Data Finder service instance of VIVO and will include a searchable interface, links to code/software locations and metadata feeds to Research Data Australia. Key benefits of the project include:improving the discoverability and reuse of QUT researchers’ code and software amongst QUT and the QUT research community; increasing the profile of QUT research outputs on a national level by providing a metadata feed to Research Data Australia, and; improving the metrics for access and reuse of code and software in the repository.
Resumo:
This paper is a work in progress that examines current consumer engagement with eHealth information through Smartphones or tablets. We focus on three activity types: seeking, posting and ‘other’ engagement activity and compare two age groups, 25-40s and over 40-55s. Findings show that around 30% of the younger age group is engaging with Government and other Health providers’ websites, receiving eHealth emails, and reading other people’s comments about health related issues in online discussion groups/websites/blog. Approximately 20% engage with Government and other Health providers’ social media and watch or listen to audio or video podcasts. For the older age group, their most active engagement with eHealth information is in the seeking category through Government or other health websites (approximately 15%), and less than 10% for social media sites. Their posting activity is less than 5%. Other activities show that less than 15% of the older age group engages through receiving emails and reading blogs, less than 10% watch or listen to podcasts, and their online consulting activity is less than 7%. We note that scores are low for both groups in terms of engaging with eHealth information through Twitter.
Resumo:
Operational modal analysis (OMA) is prevalent in modal identifi cation of civil structures. It asks for response measurements of the underlying structure under ambient loads. A valid OMA method requires the excitation be white noise in time and space. Although there are numerous applications of OMA in the literature, few have investigated the statistical distribution of a measurement and the infl uence of such randomness to modal identifi cation. This research has attempted modifi ed kurtosis to evaluate the statistical distribution of raw measurement data. In addition, a windowing strategy employing this index has been proposed to select quality datasets. In order to demonstrate how the data selection strategy works, the ambient vibration measurements of a laboratory bridge model and a real cable-stayed bridge have been respectively considered. The analysis incorporated with frequency domain decomposition (FDD) as the target OMA approach for modal identifi cation. The modal identifi cation results using the data segments with different randomness have been compared. The discrepancy in FDD spectra of the results indicates that, in order to fulfi l the assumption of an OMA method, special care shall be taken in processing a long vibration measurement data. The proposed data selection strategy is easy-to-apply and verifi ed effective in modal analysis.
Resumo:
Currently there are ~3000 known species of Sarcophagidae (Diptera), which are classified into 173 genera in three subfamilies. Almost 25% of sarcophagids belong to the genus Sarcophaga (sensu lato) however little is known about the validity of, and relationships between the ~150 (or more) subgenera of Sarcophaga s.l. In this preliminary study, we evaluated the usefulness of three sources of data for resolving relationships between 35 species from 14 Sarcophaga s.l. subgenera: the mitochondrial COI barcode region, ~800. bp of the nuclear gene CAD, and 110 morphological characters. Bayesian, maximum likelihood (ML) and maximum parsimony (MP) analyses were performed on the combined dataset. Much of the tree was only supported by the Bayesian and ML analyses, with the MP tree poorly resolved. The genus Sarcophaga s.l. was resolved as monophyletic in both the Bayesian and ML analyses and strong support was obtained at the species-level. Notably, the only subgenus consistently resolved as monophyletic was Liopygia. The monophyly of and relationships between the remaining Sarcophaga s.l. subgenera sampled remain questionable. We suggest that future phylogenetic studies on the genus Sarcophaga s.l. use combined datasets for analyses. We also advocate the use of additional data and a range of inference strategies to assist with resolving relationships within Sarcophaga s.l.
Resumo:
When it comes to discussing art and complex cultural issues, does the emphasis on provocation merely reduce issues into straightforward oppositions, at the cost of developed argument, consistency and any nuanced engagement? A response to Camille Paglia's "how capitalism can save art?"
Resumo:
We conducted an association study across the human leukocyte antigen (HLA) complex to identify loci associated with multiple sclerosis (MS). Comparing 1927 SNPs in 1618 MS cases and 3413 controls of European ancestry, we identified seven SNPs that were independently associated with MS conditional on the others (each ). All associations were significant in an independent replication cohort of 2212 cases and 2251 controls () and were highly significant in the combined dataset (). The associated SNPs included proxies for HLA-DRB1*15:01 and HLA-DRB1*03:01, and SNPs in moderate linkage disequilibrium (LD) with HLA-A*02:01, HLA-DRB1*04:01 and HLA-DRB1*13:03. We also found a strong association with rs9277535 in the class II gene HLA-DPB1 (discovery set , replication set , combined ). HLA-DPB1 is located centromeric of the more commonly typed class II genes HLA-DRB1, -DQA1 and -DQB1. It is separated from these genes by a recombination hotspot, and the association is not affected by conditioning on genotypes at DRB1, DQA1 and DQB1. Hence rs9277535 represents an independent MS-susceptibility locus of genome-wide significance. It is correlated with the HLA-DPB1*03:01 allele, which has been implicated previously in MS in smaller studies. Further genotyping in large datasets is required to confirm and resolve this association.
Resumo:
Tag recommendation is a specific recommendation task for recommending metadata (tag) for a web resource (item) during user annotation process. In this context, sparsity problem refers to situation where tags need to be produced for items with few annotations or for user who tags few items. Most of the state of the art approaches in tag recommendation are rarely evaluated or perform poorly under this situation. This paper presents a combined method for mitigating sparsity problem in tag recommendation by mainly expanding and ranking candidate tags based on similar items’ tags and existing tag ontology. We evaluated the approach on two public social bookmarking datasets. The experiment results show better accuracy for recommendation in sparsity situation over several state of the art methods.
Resumo:
The Bluetooth technology is being increasingly used to track vehicles throughout their trips, within urban networks and across freeway stretches. One important opportunity offered by this type of data is the measurement of Origin-Destination patterns, emerging from the aggregation and clustering of individual trips. In order to obtain accurate estimations, however, a number of issues need to be addressed, through data filtering and correction techniques. These issues mainly stem from the use of the Bluetooth technology amongst drivers, and the physical properties of the Bluetooth sensors themselves. First, not all cars are equipped with discoverable Bluetooth devices and the Bluetooth-enabled vehicles may belong to some small socio-economic groups of users. Second, the Bluetooth datasets include data from various transport modes; such as pedestrian, bicycles, cars, taxi driver, buses and trains. Third, the Bluetooth sensors may fail to detect all of the nearby Bluetooth-enabled vehicles. As a consequence, the exact journey for some vehicles may become a latent pattern that will need to be extracted from the data. Finally, sensors that are in close proximity to each other may have overlapping detection areas, thus making the task of retrieving the correct travelled path even more challenging. The aim of this paper is twofold. We first give a comprehensive overview of the aforementioned issues. Further, we propose a methodology that can be followed, in order to cleanse, correct and aggregate Bluetooth data. We postulate that the methods introduced by this paper are the first crucial steps that need to be followed in order to compute accurate Origin-Destination matrices in urban road networks.
Resumo:
Collisions between pedestrians and vehicles continue to be a major problem throughout the world. Pedestrians trying to cross roads and railway tracks without any caution are often highly susceptible to collisions with vehicles and trains. Continuous financial, human and other losses have prompted transport related organizations to come up with various solutions addressing this issue. However, the quest for new and significant improvements in this area is still ongoing. This work addresses this issue by building a general framework using computer vision techniques to automatically monitor pedestrian movements in such high-risk areas to enable better analysis of activity, and the creation of future alerting strategies. As a result of rapid development in the electronics and semi-conductor industry there is extensive deployment of CCTV cameras in public places to capture video footage. This footage can then be used to analyse crowd activities in those particular places. This work seeks to identify the abnormal behaviour of individuals in video footage. In this work we propose using a Semi-2D Hidden Markov Model (HMM), Full-2D HMM and Spatial HMM to model the normal activities of people. The outliers of the model (i.e. those observations with insufficient likelihood) are identified as abnormal activities. Location features, flow features and optical flow textures are used as the features for the model. The proposed approaches are evaluated using the publicly available UCSD datasets, and we demonstrate improved performance using a Semi-2D Hidden Markov Model compared to other state of the art methods. Further we illustrate how our proposed methods can be applied to detect anomalous events at rail level crossings.
Resumo:
In this paper, we explore the effectiveness of patch-based gradient feature extraction methods when applied to appearance-based gait recognition. Extending existing popular feature extraction methods such as HOG and LDP, we propose a novel technique which we term the Histogram of Weighted Local Directions (HWLD). These 3 methods are applied to gait recognition using the GEI feature, with classification performed using SRC. Evaluations on the CASIA and OULP datasets show significant improvements using these patch-based methods over existing implementations, with the proposed method achieving the highest recognition rate for the respective datasets. In addition, the HWLD can easily be extended to 3D, which we demonstrate using the GEV feature on the DGD dataset, observing improvements in performance.
Resumo:
Automated crowd counting has become an active field of computer vision research in recent years. Existing approaches are scene-specific, as they are designed to operate in the single camera viewpoint that was used to train the system. Real world camera networks often span multiple viewpoints within a facility, including many regions of overlap. This paper proposes a novel scene invariant crowd counting algorithm that is designed to operate across multiple cameras. The approach uses camera calibration to normalise features between viewpoints and to compensate for regions of overlap. This compensation is performed by constructing an 'overlap map' which provides a measure of how much an object at one location is visible within other viewpoints. An investigation into the suitability of various feature types and regression models for scene invariant crowd counting is also conducted. The features investigated include object size, shape, edges and keypoints. The regression models evaluated include neural networks, K-nearest neighbours, linear and Gaussian process regresion. Our experiments demonstrate that accurate crowd counting was achieved across seven benchmark datasets, with optimal performance observed when all features were used and when Gaussian process regression was used. The combination of scene invariance and multi camera crowd counting is evaluated by training the system on footage obtained from the QUT camera network and testing it on three cameras from the PETS 2009 database. Highly accurate crowd counting was observed with a mean relative error of less than 10%. Our approach enables a pre-trained system to be deployed on a new environment without any additional training, bringing the field one step closer toward a 'plug and play' system.
Resumo:
Endometrial cancer is one of the most common female diseases in developed nations and is the most commonly diagnosed gynaecological cancer in Australia. The disease is commonly classified by histology: endometrioid or non-endometrioid endometrial cancer. While non-endometrioid endometrial cancers are accepted to be high-grade, aggressive cancers, endometrioid cancers (comprising 80% of all endometrial cancers diagnosed) generally carry a favourable patient prognosis. However, endometrioid endometrial cancer patients endure significant morbidity due to surgery and radiotherapy used for disease treatment, and patients with recurrent disease have a 5-year survival rate of less than 50%. Genetic analysis of women with endometrial cancer could uncover novel markers associated with disease risk and/or prognosis, which could then be used to identify women at high risk and for the use of specialised treatments. Proteases are widely accepted to play an important role in the development and progression of cancer. This PhD project hypothesised that SNPs from two protease gene families, the matrix metalloproteases (MMPs, including their tissue inhibitors, TIMPs) and the tissue kallikrein-related peptidases (KLKs) would be associated with endometrial cancer susceptibility and/or prognosis. In the first part of this study, optimisation of the genotyping techniques was performed. Results from previously published endometrial cancer genetic association studies were attempted to be validated in a large, multicentre replication set (maximum cases n = 2,888, controls n = 4,483, 3 studies). The rs11224561 progesterone receptor SNP (PGR, A/G) was observed to be associated with increased endometrial cancer risk (per A allele OR 1.31, 95% CI 1.12-1.53; p-trend = 0.001), a result which was initially reported among a Chinese sample set. Previously reported associations for the remaining 8 SNPs investigated for this section of the PhD study were not confirmed, thereby reinforcing the importance of validation of genetic association studies. To examine the effect of SNPs from the MMP and KLK families on endometrial cancer risk, we selected the most significantly associated MMP and KLK SNPs from genome-wide association study analysis (GWAS) to be genotyped in the GWAS replication set (cases n = 4,725, controls n = 9,803, 13 studies). The significance of the MMP24 rs932562 SNP was unchanged after incorporation of the stage 2 samples (Stage 1 per allele OR 1.18, p = 0.002; Combined Stage 1 and 2 OR 1.09, p = 0.002). The rs10426 SNP, located 3' to KLK10 was predicted by bioinformatic analysis to effect miRNA binding. This SNP was observed in the GWAS stage 1 result to exhibit a recessive effect on endometrial cancer risk, a result which was not validated in the stage 2 sample set (Stage 1 OR 1.44, p = 0.007; Combined Stage 1 and 2 OR 1.14, p = 0.08). Investigation of the regions imputed surrounding the MMP, TIMP and KLK genes did not reveal any significant targets for further analysis. Analysis of the case data from the endometrial cancer GWAS to identify genetic variation associated with cancer grade did not reveal SNPs from the MMP, TIMP or KLK genes to be statistically significant. However, the representation of SNPs from the MMP, TIMP and KLK families by the GWAS genotyping platform used in this PhD project was examined and observed to be very low, with the genetic variation of four genes (MMP23A, MMP23B, MMP28 and TIMP1) not captured at all by this technique. This suggests that comprehensive candidate gene association studies will be required to assess the role of SNPs from these genes with endometrial cancer risk and prognosis. Meta-analysis of gene expression microarray datasets curated as part of this PhD study identified a number of MMP, TIMP and KLK genes to display differential expression by endometrial cancer status (MMP2, MMP10, MMP11, MMP13, MMP19, MMP25 and KLK1) and histology (MMP2, MMP11, MMP12, MMP26, MMP28, TIMP2, TIMP3, KLK6, KLK7, KLK11 and KLK12). In light of these findings these genes should be prioritised for future targeted genetic association studies. Two SNPs located 43.5 Mb apart on chromosome 15 were observed from the GWAS analysis to be associated with increased endometrial cancer grade, results that were validated in silico in two independent datasets. One of these SNPs, rs8035725 is located in the 5' untranslated region of a MYC promoter binding protein DENND4A (Stage 1 OR 1.15, p = 9.85 x 10P -5 P, combined Stage 1 and in silico validation OR 1.13, p = 5.24 x 10P -6 P). This SNP has previously been reported to alter the expression of PTPLAD1, a gene involved in the synthesis of very long fatty acid chains and in the Rac1 signaling pathway. Meta-analysis of gene expression microarray data found PTPLAD1 to display increased expression in the aggressive non-endometrioid histology compared with endometrioid endometrial cancer, suggesting that the causal SNP underlying the observed genetic association may influence expression of this gene. Neither rs8035725 nor significant SNPs identified by imputation were predicted bioinformatically to affect transcription factor binding sites, indicating that further studies are required to assess their potential effect on other regulatory elements. The other grade- associated SNP, rs6606792, is located upstream of an inferred pseudogene, ELMO2P1 (Stage 1 OR 1.12, p = 5 x 10P -5 P; combined Stage 1 and in silico validation OR 1.09, p = 3.56 x 10P -5 P). Imputation of the ±1 Mb region surrounding this SNP revealed a cluster of significantly associated variants which are predicted to abolish various transcription factor binding sites, and would be expected to decrease gene expression. ELMO2P1 was not included on the microarray platforms collected for this PhD, and so its expression could not be investigated. However, the high sequence homology of ELMO2P1 with ELMO2, a gene important to cell motility, indicates that ELMO2 could be the parent gene for ELMO2P1 and as such, ELMO2P1 could function to regulate the expression of ELMO2. Increased expression of ELMO2 was seen to be associated with increasing endometrial cancer grade, as well as with aggressive endometrial cancer histological subtypes by microarray meta-analysis. Thus, it is hypothesised that SNPs in linkage disequilibrium with rs6606792 decrease the transcription of ELMO2P1, reducing the regulatory effect of ELMO2P1 on ELMO2 expression. Consequently, ELMO2 expression is increased, cell motility is enhanced leading to an aggressive endometrial cancer phenotype. In summary, these findings have identified several areas of research for further study. The results presented in this thesis provide evidence that a SNP in PGR is associated with risk of developing endometrial cancer. This PhD study also reports two independent loci on chromosome 15 to be associated with increased endometrial cancer grade, and furthermore, genes associated with these SNPs to be differentially expressed according in aggressive subtypes and/or by grade. The studies reported in this thesis support the need for comprehensive SNP association studies on prioritised MMP, TIMP and KLK genes in large sample sets. Until these studies are performed, the role of MMP, TIMP and KLK genetic variation remains unclear. Overall, this PhD study has contributed to the understanding of genetic variation involvement in endometrial cancer susceptibility and prognosis. Importantly, the genetic regions highlighted in this study could lead to the identification of novel gene targets to better understand the biology of endometrial cancer and also aid in the development of therapeutics directed at treating this disease.
Resumo:
Background Single nucleotide polymorphisms (SNPs) rs429358 (ε4) and rs7412 (ε2), both invoking changes in the amino-acid sequence of the apolipoprotein E (APOE) gene, have previously been tested for association with multiple sclerosis (MS) risk. However, none of these studies was sufficiently powered to detect modest effect sizes at acceptable type-I error rates. As both SNPs are only imperfectly captured on commonly used microarray genotyping platforms, their evaluation in the context of genome-wide association studies has been hindered until recently. Methods We genotyped 12 740 subjects hitherto not studied for their APOE status, imputed raw genotype data from 8739 subjects from five independent genome wide association studies datasets using the most recent high-resolution reference panels, and extracted genotype data for 8265 subjects from previous candidate gene assessments. Results Despite sufficient power to detect associations at genome-wide significance thresholds across a range of ORs, our analyses did not support a role of rs429358 or rs7412 on MS susceptibility. This included meta-analyses of the combined data across 13 913 MS cases and 15 831 controls (OR=0.95, p=0.259, and OR 1.07, p=0.0569, for rs429358 and rs7412, respectively). Conclusion Given the large sample size of our analyses, it is unlikely that the two APOE missense SNPs studied here exert any relevant effects on MS susceptibility.
Resumo:
Objectives This study introduces and assesses the precision of a standardized protocol for anthropometric measurement of the juvenile cranium using three-dimensional surface rendered models, for implementation in forensic investigation or paleodemographic research. Materials and methods A subset of multi-slice computed tomography (MSCT) DICOM datasets (n=10) of modern Australian subadults (birth—10 years) was accessed from the “Skeletal Biology and Forensic Anthropology Virtual Osteological Database” (n>1200), obtained from retrospective clinical scans taken at Brisbane children hospitals (2009–2013). The capabilities of Geomagic Design X™ form the basis of this study; introducing standardized protocols using triangle surface mesh models to (i) ascertain linear dimensions using reference plane networks and (ii) calculate the area of complex regions of interest on the cranium. Results The protocols described in this paper demonstrate high levels of repeatability between five observers of varying anatomical expertise and software experience. Intra- and inter-observer error was indiscernible with total technical error of measurement (TEM) values ≤0.56 mm, constituting <0.33% relative error (rTEM) for linear measurements; and a TEM value of ≤12.89 mm2, equating to <1.18% (rTEM) of the total area of the anterior fontanelle and contiguous sutures. Conclusions Exploiting the advances of MSCT in routine clinical assessment, this paper assesses the application of this virtual approach to acquire highly reproducible morphometric data in a non-invasive manner for human identification and population studies in growth and development. The protocols and precision testing presented are imperative for the advancement of “virtual anthropology” into routine Australian medico-legal death investigation.