957 resultados para Imbalanced datasets


Relevância:

10.00% 10.00%

Publicador:

Resumo:

Summary: More than ever before contemporary societies are characterised by the huge amounts of data being transferred. Authorities, companies, academia and other stakeholders refer to Big Data when discussing the importance of large and complex datasets and developing possible solutions for their use. Big Data promises to be the next frontier of innovation for institutions and individuals, yet it also offers possibilities to predict and influence human behaviour with ever-greater precision

Relevância:

10.00% 10.00%

Publicador:

Resumo:

In this paper we demonstrate passive vision-based localization in environments more than two orders of magnitude darker than the current benchmark using a 100 webcam and a 500 camera. Our approach uses the camera’s maximum exposure duration and sensor gain to achieve appropriately exposed images even in unlit night-time environments, albeit with extreme levels of motion blur. Using the SeqSLAM algorithm, we first evaluate the effect of variable motion blur caused by simulated exposures of 132 ms to 10000 ms duration on localization performance. We then use actual long exposure camera datasets to demonstrate day-night localization in two different environments. Finally we perform a statistical analysis that compares the baseline performance of matching unprocessed greyscale images to using patch normalization and local neighbourhood normalization – the two key SeqSLAM components. Our results and analysis show for the first time why the SeqSLAM algorithm is effective, and demonstrate the potential for cheap camera-based localization systems that function across extreme perceptual change.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

This paper investigates engaging experienced birders, as volunteer citizen scientists, to analyze large recorded audio datasets gathered through environmental acoustic monitoring. Although audio data is straightforward to gather, automated analysis remains a challenging task; the existing expertise, local knowledge and motivation of the birder community can complement computational approaches and provide distinct benefits. We explored both the culture and practice of birders, and paradigms for interacting with recorded audio data. A variety of candidate design elements were tested with birders. This study contributes an understanding of how virtual interactions and practices can be developed to complement existing practices of experienced birders in the physical world. In so doing this study contributes a new approach to engagement in e-science. Whereas most citizen science projects task lay participants with discrete real world or artificial activities, sometimes using extrinsic motivators, this approach builds on existing intrinsically satisfying practices.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

The Council of Australian Governments (COAG) in 2003 gave in-principle approval to a best-practice report recommending a holistic approach to managing natural disasters in Australia incorporating a move from a traditional response-centric approach to a greater focus on mitigation, recovery and resilience with community well-being at the core. Since that time, there have been a range of complementary developments that have supported the COAG recommended approach. Developments have been administrative, legislative and technological, both, in reaction to the COAG initiative and resulting from regular natural disasters. This paper reviews the characteristics of the spatial data that is becoming increasingly available at Federal, state and regional jurisdictions with respect to their being fit for the purpose for disaster planning and mitigation and strengthening community resilience. In particular, Queensland foundation spatial data, which is increasingly accessible by the public under the provisions of the Right to Information Act 2009, Information Privacy Act 2009, and recent open data reform initiatives are evaluated. The Fitzroy River catchment and floodplain is used as a case study for the review undertaken. The catchment covers an area of 142,545 km2, the largest river catchment flowing to the eastern coast of Australia. The Fitzroy River basin experienced extensive flooding during the 2010–2011 Queensland floods. The basin is an area of important economic, environmental and heritage values and contains significant infrastructure critical for the mining and agricultural sectors, the two most important economic sectors for Queensland State. Consequently, the spatial datasets for this area play a critical role in disaster management and for protecting critical infrastructure essential for economic and community well-being. The foundation spatial datasets are assessed for disaster planning and mitigation purposes using data quality indicators such as resolution, accuracy, integrity, validity and audit trail.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

This paper presents two algorithms to automate the detection of marine species in aerial imagery. An algorithm from an initial pilot study is presented in which morphology operations and colour analysis formed the basis of its working principle. A second approach is presented in which saturation channel and histogram-based shape profiling were used. We report on performance for both algorithms using datasets collected from an unmanned aerial system at an altitude of 1000 ft. Early results have demonstrated recall values of 48.57% and 51.4%, and precision values of 4.01% and 4.97%.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

In this paper we use the algorithm SeqSLAM to address the question, how little and what quality of visual information is needed to localize along a familiar route? We conduct a comprehensive investigation of place recognition performance on seven datasets while varying image resolution (primarily 1 to 512 pixel images), pixel bit depth, field of view, motion blur, image compression and matching sequence length. Results confirm that place recognition using single images or short image sequences is poor, but improves to match or exceed current benchmarks as the matching sequence length increases. We then present place recognition results from two experiments where low-quality imagery is directly caused by sensor limitations; in one, place recognition is achieved along an unlit mountain road by using noisy, long-exposure blurred images, and in the other, two single pixel light sensors are used to localize in an indoor environment. We also show failure modes caused by pose variance and sequence aliasing, and discuss ways in which they may be overcome. By showing how place recognition along a route is feasible even with severely degraded image sequences, we hope to provoke a re-examination of how we develop and test future localization and mapping systems.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

During the last several decades, the quality of natural resources and their services have been exposed to significant degradation from increased urban populations combined with the sprawl of settlements, development of transportation networks and industrial activities (Dorsey, 2003; Pauleit et al., 2005). As a result of this environmental degradation, a sustainable framework for urban development is required to provide the resilience of natural resources and ecosystems. Sustainable urban development refers to the management of cities with adequate infrastructure to support the needs of its population for the present and future generations as well as maintain the sustainability of its ecosystems (UNEP/IETC, 2002; Yigitcanlar, 2010). One of the important strategic approaches for planning sustainable cities is „ecological planning‟. Ecological planning is a multi-dimensional concept that aims to preserve biodiversity richness and ecosystem productivity through the sustainable management of natural resources (Barnes et al., 2005). As stated by Baldwin (1985, p.4), ecological planning is the initiation and operation of activities to direct and control the acquisition, transformation, disruption and disposal of resources in a manner capable of sustaining human activities with a minimum disruption of ecosystem processes. Therefore, ecological planning is a powerful method for creating sustainable urban ecosystems. In order to explore the city as an ecosystem and investigate the interaction between the urban ecosystem and human activities, a holistic urban ecosystem sustainability assessment approach is required. Urban ecosystem sustainability assessment serves as a tool that helps policy and decision-makers in improving their actions towards sustainable urban development. There are several methods used in urban ecosystem sustainability assessment among which sustainability indicators and composite indices are the most commonly used tools for assessing the progress towards sustainable land use and urban management. Currently, a variety of composite indices are available to measure the sustainability at the local, national and international levels. However, the main conclusion drawn from the literature review is that they are too broad to be applied to assess local and micro level sustainability and no benchmark value for most of the indicators exists due to limited data availability and non-comparable data across countries. Mayer (2008, p. 280) advocates that by stating "as different as the indices may seem, many of them incorporate the same underlying data because of the small number of available sustainability datasets". Mori and Christodoulou (2011) also argue that this relative evaluation and comparison brings along biased assessments, as data only exists for some entities, which also means excluding many nations from evaluation and comparison. Thus, there is a need for developing an accurate and comprehensive micro-level urban ecosystem sustainability assessment method. In order to develop such a model, it is practical to adopt an approach that uses a method to utilise indicators for collecting data, designate certain threshold values or ranges, perform a comparative sustainability assessment via indices at the micro-level, and aggregate these assessment findings to the local level. Hereby, through this approach and model, it is possible to produce sufficient and reliable data to enable comparison at the local level, and provide useful results to inform the local planning, conservation and development decision-making process to secure sustainable ecosystems and urban futures. To advance research in this area, this study investigated the environmental impacts of an existing urban context by using a composite index with an aim to identify the interaction between urban ecosystems and human activities in the context of environmental sustainability. In this respect, this study developed a new comprehensive urban ecosystem sustainability assessment tool entitled the „Micro-level Urban-ecosystem Sustainability IndeX‟ (MUSIX). The MUSIX model is an indicator-based indexing model that investigates the factors affecting urban sustainability in a local context. The model outputs provide local and micro-level sustainability reporting guidance to help policy-making concerning environmental issues. A multi-method research approach, which is based on both quantitative analysis and qualitative analysis, was employed in the construction of the MUSIX model. First, a qualitative research was conducted through an interpretive and critical literature review in developing a theoretical framework and indicator selection. Afterwards, a quantitative research was conducted through statistical and spatial analyses in data collection, processing and model application. The MUSIX model was tested in four pilot study sites selected from the Gold Coast City, Queensland, Australia. The model results detected the sustainability performance of current urban settings referring to six main issues of urban development: (1) hydrology, (2) ecology, (3) pollution, (4) location, (5) design, and; (6) efficiency. For each category, a set of core indicators was assigned which are intended to: (1) benchmark the current situation, strengths and weaknesses, (2) evaluate the efficiency of implemented plans, and; (3) measure the progress towards sustainable development. While the indicator set of the model provided specific information about the environmental impacts in the area at the parcel scale, the composite index score provided general information about the sustainability of the area at the neighbourhood scale. Finally, in light of the model findings, integrated ecological planning strategies were developed to guide the preparation and assessment of development and local area plans in conjunction with the Gold Coast Planning Scheme, which establishes regulatory provisions to achieve ecological sustainability through the formulation of place codes, development codes, constraint codes and other assessment criteria that provide guidance for best practice development solutions. These relevant strategies can be summarised as follows: • Establishing hydrological conservation through sustainable stormwater management in order to preserve the Earth’s water cycle and aquatic ecosystems; • Providing ecological conservation through sustainable ecosystem management in order to protect biological diversity and maintain the integrity of natural ecosystems; • Improving environmental quality through developing pollution prevention regulations and policies in order to promote high quality water resources, clean air and enhanced ecosystem health; • Creating sustainable mobility and accessibility through designing better local services and walkable neighbourhoods in order to promote safe environments and healthy communities; • Sustainable design of urban environment through climate responsive design in order to increase the efficient use of solar energy to provide thermal comfort, and; • Use of renewable resources through creating efficient communities in order to provide long-term management of natural resources for the sustainability of future generations.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Distal radius fractures stabilized by open reduction internal fixation (ORIF) have become increasingly common. There is currently no consensus on the optimal time to commence range of motion (ROM) exercises post-ORIF. A retrospective cohort review was conducted over a five-year period to compare wrist and forearm range of motion outcomes and number of therapy sessions between patients who commenced active ROM exercises within the first seven days and from day eight onward following ORIF of distal radius fractures. One hundred and twenty-one patient cases were identified. Clinical data, active ROM at initial and discharge therapy assessments, fracture type, surgical approaches, and number of therapy sessions attended were recorded. One hundred and seven (88.4%) cases had complete datasets. The early active ROM group (n = 37) commenced ROM a mean (SD) of 4.27 (1.8) days post-ORIF. The comparator group (n = 70) commenced ROM exercises 24.3 (13.6) days post-ORIF. No significant differences were identified between groups in ROM at initial or discharge assessments, or therapy sessions attended. The results from this study indicate that patients who commenced active ROM exercises an average of 24 days after surgery achieved comparable ROM outcomes with similar number of therapy sessions to those who commenced ROM exercises within the first week.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

This thesis investigates and develops techniques for accurately detecting Internet-based Distributed Denial-of-Service (DDoS) Attacks where an adversary harnesses the power of thousands of compromised machines to disrupt the normal operations of a Web-service provider, resulting in significant down-time and financial losses. This thesis also develops methods to differentiate these attacks from similar-looking benign surges in web-traffic known as Flash Events (FEs). This thesis also addresses an intrinsic challenge in research associated with DDoS attacks, namely, the extreme scarcity of public domain datasets (due to legal and privacy issues) by developing techniques to realistically emulate DDoS attack and FE traffic.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Queensland University of Technology (QUT) Library offers a range of resources and services to researchers as part of their research support portfolio. This poster will present key features of two of the data management services offered by research support staff at QUT Library. The first service is QUT Research Data Finder (RDF), a product of the Australian National Data Service (ANDS) funded Metadata Stores project. RDF is a data registry (metadata repository) that aims to publicise datasets that are research outputs arising from completed QUT research projects. The second is a software and code registry, which is currently under development with the sole purpose of improving discovery of source code and software as QUT research outputs. RESEARCH DATA FINDER As an integrated metadata repository, Research Data Finder aligns with institutional sources of truth, such as QUT’s research administration system, ResearchMaster, as well as QUT’s Academic Profiles system to provide high quality data descriptions that increase awareness of, and access to, shareable research data. The repository and its workflows are designed to foster better data management practices, enhance opportunities for collaboration and research, promote cross-disciplinary research and maximise the impact of existing research data sets. SOFTWARE AND CODE REGISTRY The QUT Library software and code registry project stems from concerns amongst researchers with regards to development activities, storage, accessibility, discoverability and impact, sharing, copyright and IP ownership of software and code. As a result, the Library is developing a registry for code and software research outputs, which will use existing Research Data Finder architecture. The underpinning software for both registries is VIVO, open source software developed by Cornell University. The registry will use the Research Data Finder service instance of VIVO and will include a searchable interface, links to code/software locations and metadata feeds to Research Data Australia. Key benefits of the project include:improving the discoverability and reuse of QUT researchers’ code and software amongst QUT and the QUT research community; increasing the profile of QUT research outputs on a national level by providing a metadata feed to Research Data Australia, and; improving the metrics for access and reuse of code and software in the repository.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Operational modal analysis (OMA) is prevalent in modal identifi cation of civil structures. It asks for response measurements of the underlying structure under ambient loads. A valid OMA method requires the excitation be white noise in time and space. Although there are numerous applications of OMA in the literature, few have investigated the statistical distribution of a measurement and the infl uence of such randomness to modal identifi cation. This research has attempted modifi ed kurtosis to evaluate the statistical distribution of raw measurement data. In addition, a windowing strategy employing this index has been proposed to select quality datasets. In order to demonstrate how the data selection strategy works, the ambient vibration measurements of a laboratory bridge model and a real cable-stayed bridge have been respectively considered. The analysis incorporated with frequency domain decomposition (FDD) as the target OMA approach for modal identifi cation. The modal identifi cation results using the data segments with different randomness have been compared. The discrepancy in FDD spectra of the results indicates that, in order to fulfi l the assumption of an OMA method, special care shall be taken in processing a long vibration measurement data. The proposed data selection strategy is easy-to-apply and verifi ed effective in modal analysis.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Currently there are ~3000 known species of Sarcophagidae (Diptera), which are classified into 173 genera in three subfamilies. Almost 25% of sarcophagids belong to the genus Sarcophaga (sensu lato) however little is known about the validity of, and relationships between the ~150 (or more) subgenera of Sarcophaga s.l. In this preliminary study, we evaluated the usefulness of three sources of data for resolving relationships between 35 species from 14 Sarcophaga s.l. subgenera: the mitochondrial COI barcode region, ~800. bp of the nuclear gene CAD, and 110 morphological characters. Bayesian, maximum likelihood (ML) and maximum parsimony (MP) analyses were performed on the combined dataset. Much of the tree was only supported by the Bayesian and ML analyses, with the MP tree poorly resolved. The genus Sarcophaga s.l. was resolved as monophyletic in both the Bayesian and ML analyses and strong support was obtained at the species-level. Notably, the only subgenus consistently resolved as monophyletic was Liopygia. The monophyly of and relationships between the remaining Sarcophaga s.l. subgenera sampled remain questionable. We suggest that future phylogenetic studies on the genus Sarcophaga s.l. use combined datasets for analyses. We also advocate the use of additional data and a range of inference strategies to assist with resolving relationships within Sarcophaga s.l.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

We conducted an association study across the human leukocyte antigen (HLA) complex to identify loci associated with multiple sclerosis (MS). Comparing 1927 SNPs in 1618 MS cases and 3413 controls of European ancestry, we identified seven SNPs that were independently associated with MS conditional on the others (each ). All associations were significant in an independent replication cohort of 2212 cases and 2251 controls () and were highly significant in the combined dataset (). The associated SNPs included proxies for HLA-DRB1*15:01 and HLA-DRB1*03:01, and SNPs in moderate linkage disequilibrium (LD) with HLA-A*02:01, HLA-DRB1*04:01 and HLA-DRB1*13:03. We also found a strong association with rs9277535 in the class II gene HLA-DPB1 (discovery set , replication set , combined ). HLA-DPB1 is located centromeric of the more commonly typed class II genes HLA-DRB1, -DQA1 and -DQB1. It is separated from these genes by a recombination hotspot, and the association is not affected by conditioning on genotypes at DRB1, DQA1 and DQB1. Hence rs9277535 represents an independent MS-susceptibility locus of genome-wide significance. It is correlated with the HLA-DPB1*03:01 allele, which has been implicated previously in MS in smaller studies. Further genotyping in large datasets is required to confirm and resolve this association.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Tag recommendation is a specific recommendation task for recommending metadata (tag) for a web resource (item) during user annotation process. In this context, sparsity problem refers to situation where tags need to be produced for items with few annotations or for user who tags few items. Most of the state of the art approaches in tag recommendation are rarely evaluated or perform poorly under this situation. This paper presents a combined method for mitigating sparsity problem in tag recommendation by mainly expanding and ranking candidate tags based on similar items’ tags and existing tag ontology. We evaluated the approach on two public social bookmarking datasets. The experiment results show better accuracy for recommendation in sparsity situation over several state of the art methods.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

The Bluetooth technology is being increasingly used to track vehicles throughout their trips, within urban networks and across freeway stretches. One important opportunity offered by this type of data is the measurement of Origin-Destination patterns, emerging from the aggregation and clustering of individual trips. In order to obtain accurate estimations, however, a number of issues need to be addressed, through data filtering and correction techniques. These issues mainly stem from the use of the Bluetooth technology amongst drivers, and the physical properties of the Bluetooth sensors themselves. First, not all cars are equipped with discoverable Bluetooth devices and the Bluetooth-enabled vehicles may belong to some small socio-economic groups of users. Second, the Bluetooth datasets include data from various transport modes; such as pedestrian, bicycles, cars, taxi driver, buses and trains. Third, the Bluetooth sensors may fail to detect all of the nearby Bluetooth-enabled vehicles. As a consequence, the exact journey for some vehicles may become a latent pattern that will need to be extracted from the data. Finally, sensors that are in close proximity to each other may have overlapping detection areas, thus making the task of retrieving the correct travelled path even more challenging. The aim of this paper is twofold. We first give a comprehensive overview of the aforementioned issues. Further, we propose a methodology that can be followed, in order to cleanse, correct and aggregate Bluetooth data. We postulate that the methods introduced by this paper are the first crucial steps that need to be followed in order to compute accurate Origin-Destination matrices in urban road networks.