851 results for Large scale graph processing
Abstract:
Wireless networked control systems (WNCSs) have been widely used in manufacturing and industrial processing over the last few years. They provide real-time control with a unique characteristic: periodic traffic. These systems have a time-critical requirement. Due to current wireless mechanisms, WNCS performance suffers from long time-varying delays, packet dropout, and inefficient channel utilization. Current wireless networked applications such as WNCSs are designed on the basis of a layered architecture, whose features constrain the performance of these demanding applications. Numerous efforts have attempted to use cross-layer design (CLD) approaches to improve the performance of various networked applications. However, the existing research rarely considers large-scale networks and congested network conditions in WNCSs, and there is little discussion of how to apply CLD approaches in WNCSs. This thesis proposes a cross-layer design methodology to address the timeliness of periodic traffic and to improve the efficiency of channel utilization in WNCSs. The design of the proposed CLD centres on the measurement of the underlying network condition, the classification of the network state, and the adjustment of the sampling period between sensors and controllers. This period adjustment maintains the minimum allowable sampling period while also maximizing control performance. Extensive simulations are conducted using the network simulator NS-2 to evaluate the performance of the proposed CLD. The comparative studies examine communications with and without the proposed CLD. The results show that the proposed CLD fulfils the timeliness requirement under congested network conditions, and also improves the channel utilization efficiency and the proportion of effective data in WNCSs.
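As an illustration of the kind of network-state-driven adjustment described above, the following Python sketch classifies a measured network condition and scales the sensor sampling period accordingly. The thresholds, state labels, scaling factors and period bounds are illustrative assumptions, not parameters taken from the thesis.

```python
# Hypothetical sketch of network-state-driven sampling-period adaptation.
# Thresholds, state labels and period bounds are illustrative assumptions,
# not parameters from the thesis.

def classify_network_state(delay_ms, loss_rate):
    """Map measured delay and packet loss to a coarse congestion state."""
    if delay_ms < 20 and loss_rate < 0.01:
        return "idle"
    if delay_ms < 100 and loss_rate < 0.05:
        return "moderate"
    return "congested"

def adjust_sampling_period(current_ms, state, min_ms=10, max_ms=200):
    """Shorten the period when the channel is free, lengthen it under congestion,
    while never violating the minimum allowable sampling period."""
    if state == "idle":
        new_ms = current_ms * 0.8      # sample faster for better control performance
    elif state == "moderate":
        new_ms = current_ms            # hold the current period
    else:
        new_ms = current_ms * 1.5      # back off to relieve congestion
    return min(max(new_ms, min_ms), max_ms)

period = 50.0
for delay, loss in [(10, 0.0), (150, 0.08), (60, 0.02)]:
    period = adjust_sampling_period(period, classify_network_state(delay, loss))
    print(f"delay={delay} ms, loss={loss:.2f} -> period={period:.1f} ms")
```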
Abstract:
Background: Illumina's Infinium SNP BeadChips are extensively used in both small- and large-scale genetic studies. A fundamental step in any analysis is the processing of raw allele A and allele B intensities from each SNP into genotype calls (AA, AB, BB). Various algorithms which make use of different statistical models are available for this task. We compare four methods (GenCall, Illuminus, GenoSNP and CRLMM) on data where the true genotypes are known in advance and on data from a recently published genome-wide association study.
Results: In general, differences in accuracy are relatively small between the methods evaluated, although CRLMM and GenoSNP were found to consistently outperform GenCall. The performance of Illuminus is heavily dependent on sample size, with lower no-call rates and improved accuracy as the number of samples available increases. For X chromosome SNPs, methods with sex-dependent models (Illuminus, CRLMM) perform better than methods which ignore gender information (GenCall, GenoSNP). We observe that CRLMM and GenoSNP are more accurate at calling SNPs with low minor allele frequency than GenCall or Illuminus. The sample quality metrics from each of the four methods were found to have a high level of agreement at flagging samples with unusual signal characteristics.
Conclusions: CRLMM, GenoSNP and GenCall can be applied with confidence in studies of any size, as their performance was shown to be invariant to the number of samples available. Illuminus, on the other hand, requires a larger number of samples to achieve comparable levels of accuracy, and its use in smaller studies (50 or fewer individuals) is not recommended.
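None of the four callers compared above is reproduced here; the toy Python sketch below only illustrates the shared underlying idea of turning allele A and allele B intensities into AA/AB/BB calls by clustering, using simulated intensities and a generic Gaussian mixture model.

```python
# Toy illustration of intensity-based genotype calling: cluster allele A/B
# intensities into three groups (AA, AB, BB). This is not GenCall, Illuminus,
# GenoSNP or CRLMM -- just the shared underlying idea, on simulated data.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Simulated (allele A, allele B) intensities for one SNP across 300 samples.
aa = rng.normal([1.0, 0.1], 0.05, size=(100, 2))
ab = rng.normal([0.55, 0.55], 0.05, size=(100, 2))
bb = rng.normal([0.1, 1.0], 0.05, size=(100, 2))
intensities = np.vstack([aa, ab, bb])

gm = GaussianMixture(n_components=3, random_state=0).fit(intensities)
labels = gm.predict(intensities)

# Name clusters by their mean B-allele fraction: low -> AA, mid -> AB, high -> BB.
baf = gm.means_[:, 1] / gm.means_.sum(axis=1)
names = {k: g for k, g in zip(np.argsort(baf), ["AA", "AB", "BB"])}
calls = [names[l] for l in labels]
print(calls[:5], calls[100:105], calls[200:205])
```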
Abstract:
This thesis presents a novel approach to mobile robot navigation using visual information towards the goal of long-term autonomy. A novel concept of a continuous appearance-based trajectory is proposed in order to solve the limitations of previous robot navigation systems, and two new algorithms for mobile robots, CAT-SLAM and CAT-Graph, are presented and evaluated. These algorithms yield performance exceeding state-of-the-art methods on public benchmark datasets and large-scale real-world environments, and will help enable widespread use of mobile robots in everyday applications.
Abstract:
MapReduce frameworks such as Hadoop are well suited to handling large sets of data that can be processed separately and independently, with canonical applications in information retrieval and sales record analysis. Rapid advances in sequencing technology have ensured an explosion in the availability of genomic data, with a consequent rise in the importance of large-scale comparative genomics, often involving operations and data relationships which deviate from the classical MapReduce structure. This work examines the application of Hadoop to patterns of this nature, using as our focus a well-established workflow for identifying promoters (binding sites for regulatory proteins) across multiple gene regions and organisms, coupled with the unifying step of assembling these results into a consensus sequence. Our approach demonstrates the utility of Hadoop for problems of this nature, showing how the tyranny of the "dominant decomposition" can be at least partially overcome. It also demonstrates how load balance and the granularity of parallelism can be optimized by pre-processing that splits and reorganizes input files, allowing a wide range of related problems to be brought under the same computational umbrella.
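The promoter-identification workflow itself is not reproduced here; the Python sketch below only illustrates the Hadoop Streaming-style pattern of keying intermediate records by gene region so that a reducer can assemble a per-region consensus sequence. The record format and the simple majority-vote consensus are assumptions made for the example.

```python
#!/usr/bin/env python
# Minimal Hadoop Streaming-style mapper and reducer illustrating the pattern of
# keying candidate sites by gene region so a reducer can build a per-region
# consensus. Record format (region_id TAB organism TAB sequence) is assumed.
import sys
from collections import Counter

def mapper(lines):
    for line in lines:
        region_id, organism, sequence = line.rstrip("\n").split("\t")
        # Emit the region as key so all candidate sites for it meet in one reducer.
        print(f"{region_id}\t{sequence}")

def reducer(lines):
    def consensus(seqs):
        # Majority base at each position across equal-length candidate sites.
        return "".join(Counter(col).most_common(1)[0][0] for col in zip(*seqs))
    current, seqs = None, []
    for line in lines:
        region_id, sequence = line.rstrip("\n").split("\t")
        if region_id != current and seqs:
            print(f"{current}\t{consensus(seqs)}")
            seqs = []
        current = region_id
        seqs.append(sequence)
    if seqs:
        print(f"{current}\t{consensus(seqs)}")

if __name__ == "__main__":
    role = sys.argv[1] if len(sys.argv) > 1 else "mapper"
    (mapper if role == "mapper" else reducer)(sys.stdin)
```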
Abstract:
The terrorist attacks in the United States on September 11, 2001 appeared to be a harbinger of increased terrorism and violence in the 21st century, bringing terrorism and political violence to the forefront of public discussion. Questions about these events abound, and “Estimating the Historical and Future Probabilities of Large Scale Terrorist Events” [Clauset and Woodard (2013)] asks specifically, “how rare are large scale terrorist events?” and, in general, encourages discussion on the role of quantitative methods in terrorism research, policy, and decision-making. Answering the primary question raises two challenges. The first is identifying terrorist events. The second is finding a simple yet robust model for rare events that has good explanatory and predictive capabilities. The challenge of identifying terrorist events is acknowledged and addressed by reviewing and using data from two well-known and reputable sources: the Memorial Institute for the Prevention of Terrorism-RAND database (MIPT-RAND) [Memorial Institute for the Prevention of Terrorism] and the Global Terrorism Database (GTD) [National Consortium for the Study of Terrorism and Responses to Terrorism (START) (2012), LaFree and Dugan (2007)]. Clauset and Woodard (2013) provide a detailed discussion of the limitations of the data and the models used, in the context of the larger issues surrounding terrorism and policy.
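Clauset and Woodard's actual model ensemble is not reproduced here; the sketch below, on synthetic data, only conveys the flavour of such rare-event estimates: fit a heavy (power-law) tail to event severities above a threshold and convert the fitted tail probability into the chance of seeing at least one catastrophic event among a given number of events. All numbers are invented.

```python
# Illustrative sketch of the flavour of calculation behind such estimates:
# fit a power-law tail to event severities above a threshold x_min and turn the
# tail probability into the chance of observing at least one event of
# catastrophic size among n events. Data are synthetic; this is not
# Clauset and Woodard's actual model ensemble.
import numpy as np

rng = np.random.default_rng(1)
severities = (rng.pareto(1.4, size=5000) + 1) * 10   # synthetic event severities

x_min = 10.0
tail = severities[severities >= x_min]
# Maximum-likelihood exponent for a continuous power law above x_min.
alpha = 1 + len(tail) / np.sum(np.log(tail / x_min))

def p_exceed(x):
    """P(severity >= x | severity >= x_min) under the fitted power law."""
    return (x / x_min) ** (1 - alpha)

catastrophic = 3000.0          # roughly a 9/11-scale severity, for illustration
n_events = len(tail)
p_at_least_one = 1 - (1 - p_exceed(catastrophic)) ** n_events
print(f"alpha = {alpha:.2f}, P(at least one >= {catastrophic:.0f}) = {p_at_least_one:.3f}")
```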
Abstract:
Real-time image analysis and classification onboard robotic marine vehicles, such as AUVs, is a key step in the realisation of adaptive mission planning for large-scale habitat mapping in previously unexplored environments. This paper describes a novel technique to train, process, and classify images collected onboard an AUV operated in relatively shallow waters with poor visibility and non-uniform lighting. The approach utilises Förstner feature detectors and Laws texture energy masks for image characterisation, and a bag-of-words approach for feature recognition. To improve classification performance we propose a usefulness gain that learns the importance of each histogram component for each class. Experimental results illustrate the performance of the system in characterising a variety of marine habitats and its ability to run on an AUV's main processor, making it suitable for real-time mission planning.
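The paper's usefulness gain is summarised above only as learning the importance of each histogram component per class; the sketch below shows one plausible reading of that idea, scoring a query bag-of-words histogram against class models with per-class component weights. All histograms, weights and class names are invented for illustration and are not the paper's exact formulation.

```python
# One plausible reading of class-specific weighting of bag-of-words histogram
# components: score each class with its own per-component weights. Weights,
# histograms and class names are invented for illustration.
import numpy as np

class_histograms = {                       # mean visual-word histogram per habitat class
    "sand":  np.array([0.40, 0.25, 0.10, 0.10, 0.10, 0.05]),
    "coral": np.array([0.05, 0.10, 0.35, 0.30, 0.10, 0.10]),
}
class_weights = {                          # learned per-component importance per class
    "sand":  np.array([1.5, 1.2, 0.6, 0.6, 1.0, 1.0]),
    "coral": np.array([0.5, 0.8, 1.6, 1.4, 1.0, 1.0]),
}

def classify(histogram):
    scores = {}
    for cls, mean_hist in class_histograms.items():
        w = class_weights[cls]
        # Weighted similarity: components important for the class count more.
        scores[cls] = -np.sum(w * (histogram - mean_hist) ** 2)
    return max(scores, key=scores.get), scores

query = np.array([0.08, 0.12, 0.33, 0.27, 0.12, 0.08])   # histogram from an AUV image
label, scores = classify(query)
print(label, scores)
```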
Abstract:
The growing public concern about the complexity, cost and uncertain efficacy of the statutory environmental impact assessment process applying to large-scale projects in Queensland is reviewed. This review is based on field data gathered over the past six years at large-scale marina developments that access major environmental reserves along the coast. An ecological design proposal to broaden the process, consistent with both government aspirations and regional ecological parameters (termed Regional Landscape Strategies), would allow the existing Environmental Impact Assessment to be modified along potentially more practicable and effective lines.
Abstract:
This thesis investigates the fusion of 3D visual information with 2D image cues to provide 3D semantic maps of the large-scale environments that a robot traverses, for use in robotic applications. A major theme of this thesis is to exploit the availability of 3D information acquired from robot sensors to improve upon 2D object classification alone. The proposed methods have been evaluated on several indoor and outdoor datasets collected from mobile robotic platforms, including a quadcopter and a ground vehicle, covering several kilometres of urban roads.
Abstract:
Molecular biology is a scientific discipline which has changed fundamentally in character over the past decade to rely on large-scale datasets, both public and locally generated, and on their computational analysis and annotation. Undergraduate education of biologists must increasingly couple this domain context with a data-driven computational scientific method. Yet modern programming and scripting languages and rich computational environments such as R and MATLAB present significant barriers to those with limited exposure to computer science, and may require substantial tutorial assistance over an extended period if progress is to be made. In this paper we report our experience of undergraduate bioinformatics education using the familiar, ubiquitous spreadsheet environment of Microsoft Excel. We describe a configurable extension called QUT.Bio.Excel, a custom ribbon supporting a rich set of data sources, external tools and interactive processing within the spreadsheet, and a range of problems that demonstrate its utility and success in addressing the needs of students over their studies.
Abstract:
In elite sports, nearly all performances are captured on video. Despite the massive amounts of video that have been captured in this domain over the last 10-15 years, most of it remains in an 'unstructured' or 'raw' form, meaning it can only be viewed or manually annotated/tagged with higher-level event labels, which is time-consuming and subjective. As such, depending on the detail or depth of annotation, the value of the collected repositories of archived data is minimal, as it does not lend itself to large-scale analysis and retrieval. One such example is swimming, where each race of a swimmer is captured on a camcorder and, in addition to the split times (i.e., the time taken for each lap), stroke rates and stroke lengths are manually annotated. In this paper, we propose a vision-based system which effectively 'digitizes' a large collection of archived swimming races by estimating the location of the swimmer in each frame, as well as detecting the stroke rate. As the videos are captured from moving hand-held cameras located at different positions and angles, we show that our hierarchical approach to tracking the swimmer and their body parts is robust to these issues and allows us to accurately estimate the swimmer's location and stroke rate.
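The tracking pipeline itself is not shown here; the sketch below only illustrates how a stroke rate could be read off a periodic tracking signal (for example, the vertical image position of a tracked arm) from the dominant peak of its spectrum. The frame rate, stroke frequency and noise level are synthetic assumptions.

```python
# Estimating a stroke rate from a periodic tracking signal (e.g. the vertical
# position of a tracked arm) via its dominant spectral peak. Frame rate,
# stroke frequency and noise level are synthetic; this only illustrates the idea.
import numpy as np

fps = 50.0                                    # assumed video frame rate
t = np.arange(0, 20.0, 1.0 / fps)             # 20 s of tracking output
true_stroke_hz = 0.75                         # about 45 strokes per minute
signal = np.sin(2 * np.pi * true_stroke_hz * t) \
         + 0.3 * np.random.default_rng(0).normal(size=t.size)

spectrum = np.abs(np.fft.rfft(signal - signal.mean()))
freqs = np.fft.rfftfreq(t.size, d=1.0 / fps)
dominant_hz = freqs[np.argmax(spectrum[1:]) + 1]   # skip the DC bin
print(f"estimated stroke rate: {dominant_hz * 60:.1f} strokes/min")
```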
Abstract:
Accurate three-dimensional representations of cultural heritage sites are highly valuable for scientific study, conservation, and educational purposes. In addition to their use for archival purposes, 3D models enable efficient and precise measurement of relevant natural and architectural features. Many cultural heritage sites are large and complex, consisting of multiple structures spatially distributed over tens of thousands of square metres. The process of effectively digitising such geometrically complex locations requires measurements to be acquired from a variety of viewpoints. While several technologies exist for capturing the 3D structure of objects and environments, none are ideally suited to complex, large-scale sites, mainly due to their limited coverage or acquisition efficiency. We explore the use of a recently developed handheld mobile mapping system called Zebedee in cultural heritage applications. The Zebedee system is capable of efficiently mapping an environment in three dimensions by continually acquiring data as an operator holding the device traverses through the site. The system was deployed at the former Peel Island Lazaret, a culturally significant site in Queensland, Australia, consisting of dozens of buildings of various sizes spread across an area of approximately 400 × 250 m. With the Zebedee system, the site was scanned in half a day, and a detailed 3D point cloud model (with over 520 million points) was generated from the 3.6 hours of acquired data in 2.6 hours. We present results demonstrating that Zebedee was able to accurately capture both site context and building detail comparable in accuracy to manual measurement techniques, and at a greatly increased level of efficiency and scope. The scan allowed us to record derelict buildings that previously could not be measured because of the scale and complexity of the site. The resulting 3D model captures both interior and exterior features of buildings, including structure, materials, and the contents of rooms.
Abstract:
1. Marine ecosystems provide critically important goods and services to society, and hence their accelerated degradation underpins an urgent need to take rapid, ambitious and informed decisions regarding their conservation and management.
2. The capacity, however, to generate the detailed field data required to inform conservation planning at appropriate scales is limited by time- and resource-consuming methods for collecting and analysing field data at the large scales required.
3. The ‘Catlin Seaview Survey’, described here, introduces a novel framework for large-scale monitoring of coral reefs using high-definition underwater imagery collected with customized underwater vehicles in combination with computer vision and machine learning. This enables quantitative and geo-referenced outputs of coral reef features such as habitat types, benthic composition, and structural complexity (rugosity) to be generated across multiple kilometre-scale transects with a spatial resolution ranging from 2 to 6 m².
4. The novel application of technology described here has enormous potential to contribute to our understanding of coral reefs and associated impacts by underpinning management decisions with kilometre-scale measurements of reef health.
5. Imagery datasets from an initial survey of 500 km of seascape are freely available through an online tool called the Catlin Global Reef Record. Outputs from the image analysis using the technologies described here will be updated on the online repository as work progresses on each dataset.
6. Case studies illustrate the utility of the outputs as well as their potential to link to information from remote sensing. The potential implications of these innovative technologies for marine resource management and conservation are also discussed, along with the accuracy and efficiency of the methodologies deployed.
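The survey's computer-vision pipeline is not reproduced here; the sketch below only illustrates how per-point benthic labels might be aggregated into percent-cover estimates along a geo-referenced transect. The classes, label distribution and segment length are invented for the example.

```python
# Illustrative aggregation of per-point benthic labels into percent cover along
# a geo-referenced transect. Labels, segment length and classes are invented;
# the survey's actual computer-vision pipeline is not reproduced here.
import numpy as np

rng = np.random.default_rng(2)
classes = np.array(["hard_coral", "soft_coral", "algae", "sand"])
n_points = 1000
positions_m = np.sort(rng.uniform(0, 2000, n_points))        # position along a 2 km transect
labels = rng.choice(classes, n_points, p=[0.3, 0.1, 0.25, 0.35])

segment_m = 50.0                                              # report cover per 50 m segment
segments = (positions_m // segment_m).astype(int)
for seg in np.unique(segments)[:3]:                           # show the first few segments
    in_seg = labels[segments == seg]
    cover = {c: np.mean(in_seg == c) for c in classes}
    print(f"{seg * segment_m:.0f}-{(seg + 1) * segment_m:.0f} m:",
          {c: f"{v:.0%}" for c, v in cover.items()})
```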
Abstract:
Automated remote ultrasound detectors allow large amounts of data on bat presence and activity to be collected. Processing of such data involves identifying bat species from their echolocation calls. Automated species identification has the potential to provide more consistent, predictable, and potentially higher levels of accuracy than identification by humans. In contrast, identification by humans permits flexibility and intelligence in identification, as well as the incorporation of features and patterns that may be difficult to quantify. We compared humans with artificial neural networks (ANNs) in their ability to classify short recordings of bat echolocation calls with variable signal-to-noise ratios; these sequences are typical of those obtained from remote automated recording systems that are often used in large-scale ecological studies. We presented 45 recordings (1–4 calls) produced by known species of bats to ANNs and to 26 human participants with 1 month to 23 years of experience in acoustic identification of bats. Humans correctly classified 86% of recordings to genus and 56% to species; ANNs correctly identified 92% and 62%, respectively. There was no significant difference between the performance of ANNs and that of humans, but ANNs performed better than about 75% of humans. There was little relationship between the experience of the human participants and their classification rate; however, humans with less than 1 year of experience performed worse than others. Currently, identification of bat echolocation calls by humans is suitable for ecological research, after careful consideration of biases. However, improvements to ANNs and the data that they are trained on may in future increase their performance beyond that demonstrated by humans.
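The study's network architecture and call features are not specified above, so the sketch below just shows a generic small feed-forward classifier trained on synthetic call parameters (duration, peak frequency, bandwidth) for three illustrative genera, to make the ANN side of the comparison concrete. The parameter values are invented.

```python
# Generic small feed-forward network of the kind used for call classification,
# trained on synthetic call parameters (duration, peak frequency, bandwidth).
# Genera and parameter values are illustrative, not the study's data or model.
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(3)
genus_params = {                           # (mean duration ms, mean peak kHz, mean bandwidth kHz)
    "Pipistrellus": (5.0, 45.0, 20.0),
    "Myotis":       (3.0, 55.0, 40.0),
    "Nyctalus":     (10.0, 25.0, 8.0),
}
X, y = [], []
for name, means in genus_params.items():
    X.append(rng.normal(means, [1.0, 3.0, 4.0], size=(200, 3)))
    y += [name] * 200
X = np.vstack(X)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)
net = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0).fit(X_train, y_train)
print(f"genus-level accuracy on synthetic calls: {net.score(X_test, y_test):.2f}")
```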
Abstract:
The exothermicity associated with the construction of large-volume methacrylate monolithic columns has somewhat obstructed the realisation of large-scale rapid biomolecule purification, especially for plasmid-based products, which promise to herald future trends in biotechnology. A novel synthesis technique based on a heat expulsion mechanism was employed to prepare a 40 mL methacrylate monolith with a homogeneous radial pore structure along its thickness. The radial temperature gradient was recorded to be only 1.8 °C. The maximum radial temperature, recorded at the centre of the monolith, was 62.3 °C, only 2.3 °C higher than the actual polymerisation temperature. Pore characterisation of the monolithic polymer showed unimodal pore size distributions at different radial positions with an identical modal pore size of 400 nm. Chromatographic characterisation of the polymer after functionalisation with amino groups showed a persistent dynamic binding capacity of 15.5 mg of plasmid DNA/mL. The maximum pressure drop recorded was only 0.12 MPa at a flow rate of 10 mL/min. The polymer demonstrated rapid separation ability by fractionating Escherichia coli DH5α-pUC19 clarified lysate in only 3 min after loading. The plasmid sample collected after the fast purification process was confirmed by DNA electrophoresis and restriction analysis to be homogeneous supercoiled plasmid.
Abstract:
The creation of a commercially viable, large-scale purification process for plasmid DNA (pDNA) production requires a whole-systems continuous or semi-continuous purification strategy employing optimised stationary adsorption phase(s) without the use of expensive and toxic chemicals, avian/bovine-derived enzymes or several built-in unit processes, all of which affect overall plasmid recovery, processing time and economics. Continuous stationary phases are known to offer fast separation due to their large pore diameter, which makes the large pDNA molecule easily accessible with limited mass transfer resistance even at high flow rates. A monolithic stationary sorbent was synthesised via free radical liquid porogenic polymerisation of ethylene glycol dimethacrylate (EDMA) and glycidyl methacrylate (GMA), with surface and pore characteristics tailored specifically for plasmid binding, retention and elution. The polymer was functionalised with an amine active group for anion-exchange purification of pDNA from cleared lysate obtained from E. coli DH5α-pUC19 pellets in an RNase/protease-free process. Characterisation of the resin showed a unique porous material with 70% of the pore sizes above 300 nm. The final product, isolated by anion-exchange purification in only 5 min, was pure and homogeneous supercoiled pDNA with no gDNA, RNA or protein contamination, as confirmed by DNA electrophoresis, restriction analysis and SDS-PAGE. The resin showed a maximum binding capacity of 15.2 mg/mL, and this capacity persisted after several applications of the resin. The technique is cGMP compatible and commercially viable for rapid isolation of pDNA.