920 results for Data representation
Abstract:
This article presents a new neural pattern recognition architecture on multichannel data representation. The architecture employs generalized ART modules as building blocks to construct a supervised learning system that generates recognition codes on channels dynamically selected in context, using serial and parallel match tracking led by inter-ART vigilance signals.
Abstract:
This paper presents a framework for a telecommunications interface which allows data from sensors embedded in Smart Grid applications to be reliably archived in an appropriate time-series database. The challenge in doing so is twofold: first, the variety of formats in which sensor data is represented; second, the problems of telecoms reliability. A prototype of the authors' framework is detailed which showcases the main features of the framework in a case study featuring Phasor Measurement Units (PMU) as the application. Useful analysis of PMU data is achieved whenever data from multiple locations can be compared on a common time axis. The prototype developed highlights its reliability, extensibility and adoptability; features which are largely deferred from industry standards for data representation to proprietary database solutions. The open source framework presented provides link reliability for any type of Smart Grid sensor and is interoperable with both existing proprietary database systems and open database systems. The features of the authors' framework allow researchers and developers to focus on the core of their real-time or historical analysis applications, rather than having to spend time interfacing with complex protocols.
Abstract:
Contains abstract
Abstract:
The current state of health and biomedicine includes an enormity of heterogeneous data ‘silos’, collected for different purposes and represented differently, that are presently impossible to share or analyze in toto. The greatest challenge for large-scale and meaningful analyses of health-related data is to achieve a uniform data representation for data extracted from heterogeneous source representations. Based upon an analysis and categorization of heterogeneities, a process for achieving comparable data content by using a uniform terminological representation is developed. This process addresses the types of representational heterogeneities that commonly arise in healthcare data integration problems. Specifically, this process uses a reference terminology, and associated "maps" to transform heterogeneous data to a standard representation for comparability and secondary use. The capture of quality and precision of the “maps” between local terms and reference terminology concepts enhances the meaning of the aggregated data, empowering end users with better-informed queries for subsequent analyses. A data integration case study in the domain of pediatric asthma illustrates the development and use of a reference terminology for creating comparable data from heterogeneous source representations. The contribution of this research is a generalized process for the integration of data from heterogeneous source representations, and this process can be applied and extended to other problems where heterogeneous data needs to be merged.
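The mapping step described in this abstract can be illustrated with a minimal sketch. Everything below is hypothetical and chosen only for illustration: the source names, local terms, `REF:` concept identifiers, and the quality labels ("exact", "broader") are assumptions, not the terminology or map categories used by the authors.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TermMap:
    """One map from a local term to a reference terminology concept."""
    reference_concept: str
    quality: str  # hypothetical labels: "exact" or "broader"


# Hypothetical maps keyed by (source system, local term).
MAPS = {
    ("clinic_a", "wheezy"): TermMap("REF:Wheezing", "exact"),
    ("clinic_b", "SOB"): TermMap("REF:Dyspnea", "exact"),
    ("clinic_b", "breathing trouble"): TermMap("REF:Dyspnea", "broader"),
}


def to_reference(source, local_term):
    """Translate one local record, keeping the map quality alongside it."""
    term_map = MAPS.get((source, local_term))
    if term_map is None:
        return None  # unmapped terms are left out rather than guessed
    return {
        "source": source,
        "local_term": local_term,
        "concept": term_map.reference_concept,
        "map_quality": term_map.quality,
    }


def aggregate(records, accepted=("exact",)):
    """Count records per reference concept, filtered by map quality."""
    counts = {}
    for source, local_term in records:
        row = to_reference(source, local_term)
        if row is None or row["map_quality"] not in accepted:
            continue
        counts[row["concept"]] = counts.get(row["concept"], 0) + 1
    return counts
```

Keeping the map quality on every translated record is what would let an end user choose between a strict query (exact maps only) and a more inclusive one.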
Abstract:
Information extraction or knowledge discovery from large data sets should be linked to a data aggregation process. Data aggregation can result in a new data representation with a decreased number of objects in a given set. A deterministic approach to separable data aggregation yields fewer objects without mixing objects from different categories. A statistical approach is less restrictive and allows for almost separable data aggregation with a low level of mixing of objects from different categories. Layers of formal neurons can be designed for the purpose of data aggregation in both the deterministic and the statistical approach. The proposed design method is based on minimization of convex and piecewise linear (CPL) criterion functions.
Abstract:
With the proliferation of multimedia data and ever-growing requests for multimedia applications, there is an increasing need for efficient and effective indexing, storage and retrieval of multimedia data, such as graphics, images, animation, video, audio and text. Due to the special characteristics of multimedia data, Multimedia Database Management Systems (MMDBMSs) have emerged and attracted great research attention in recent years. Though much research effort has been devoted to this area, it is still far from maturity and many open issues remain. In this dissertation, with the focus on addressing three of the essential challenges in developing an MMDBMS, namely, semantic gap, perception subjectivity and data organization, a systematic and integrated framework is proposed with video database and image database serving as the testbed. In particular, the framework addresses these challenges separately yet coherently from three main aspects of an MMDBMS: multimedia data representation, indexing and retrieval. In terms of multimedia data representation, the key to addressing the semantic gap issue is to intelligently and automatically model the mid-level representation and/or semi-semantic descriptors in addition to extracting the low-level media features. The data organization challenge is mainly addressed through media indexing, where various levels of indexing are required to support the diverse query requirements. In particular, the focus of this study is to facilitate high-level video indexing by proposing a multimodal event mining framework associated with temporal knowledge discovery approaches. With respect to the perception subjectivity issue, advanced techniques are proposed to support users' interaction and to effectively model users' perception from the feedback at both the image level and the object level.
Abstract:
Thanks to advanced technologies and social networks that allow data to be widely shared across the Internet, there is an explosion of pervasive multimedia data, generating high demand for multimedia services and applications in various areas so that people can easily access and manage multimedia data. To meet such demands, multimedia big data analysis has become an emerging hot topic in both industry and academia, which ranges from basic infrastructure, management, search, and mining to security, privacy, and applications. Within the scope of this dissertation, a multimedia big data analysis framework is proposed for semantic information management and retrieval with a focus on rare event detection in videos. The proposed framework is able to explore hidden semantic feature groups in multimedia data and incorporate temporal semantics, especially for video event detection. First, a hierarchical semantic data representation is presented to alleviate the semantic gap issue, and the Hidden Coherent Feature Group (HCFG) analysis method is proposed to capture the correlation between features and separate the original feature set into semantic groups, seamlessly integrating multimedia data in multiple modalities. Next, an Importance Factor based Temporal Multiple Correspondence Analysis (i.e., IF-TMCA) approach is presented for effective event detection. Specifically, the HCFG algorithm is integrated with the Hierarchical Information Gain Analysis (HIGA) method to generate the Importance Factor (IF) for producing the initial detection results. Then, the TMCA algorithm is proposed to efficiently incorporate temporal semantics for re-ranking and improving the final performance. Finally, a sampling-based ensemble learning mechanism is applied to further accommodate the imbalanced datasets. In addition to the multimedia semantic representation and class imbalance problems, lack of organization is another critical issue for multimedia big data analysis.
In this framework, an affinity propagation-based summarization method is also proposed to transform the unorganized data into a better structure with clean and well-organized information. The whole framework has been thoroughly evaluated across multiple domains, such as soccer goal event detection and disaster information management.
Abstract:
Executive Summary
The objective of this report was to use the Sydney Opera House as a case study of the application of Building Information Modelling (BIM). The Sydney Opera House is a large, complex building with a very irregular configuration, which makes it a challenging test. A number of key concerns are evident at SOH:
• the building structure is complex, and building service systems - already the major cost of ongoing maintenance - are undergoing technology change, with new computer based services becoming increasingly important.
• the current “documentation” of the facility comprises several independent systems, some overlapping, and is inadequate to service current and future services required
• the building has reached a milestone age in terms of the condition and maintainability of key public areas and service systems, functionality of spaces and longer term strategic management.
• many business functions such as space or event management require up-to-date information about the facility that is currently inadequately delivered, and expensive and time consuming to update and deliver to customers.
• major building upgrades are being planned that will put considerable strain on existing Facilities Portfolio services, and their capacity to manage them effectively
While some of these concerns are unique to the House, many will be common to larger commercial and institutional portfolios. The work described here supported a complementary task which sought to identify if a building information model – an integrated building database – could be created, that would support asset & facility management functions (see Sydney Opera House – FM Exemplar Project, Report Number: 2005-001-C-4 Building Information Modelling for FM at Sydney Opera House), a business strategy that has been well demonstrated. The development of the BIMSS - Open Specification for BIM has been surprisingly straightforward.
The lack of technical difficulties in converting the House’s existing conventions and standards to the new model based environment can be related to three key factors:
• SOH Facilities Portfolio – the internal group responsible for asset and facility management - have already well established building and documentation policies in place. The setting and adherence to well thought out operational standards has been based on the need to create an environment that is understood by all users and that addresses the major business needs of the House.
• The second factor is the nature of the IFC Model Specification used to define the BIM protocol. The IFC standard is based on building practice and nomenclature, widely used in the construction industries across the globe. For example the nomenclature of building parts – e.g. ifcWall, corresponds to our normal terminology, but extends the traditional drawing environment currently used for design and documentation. This demonstrates that the international IFC model accurately represents local practice for building data representation and management.
• a BIM environment sets up opportunities for innovative processes that can exploit the rich data in the model and improve services and functions for the House: for example several high-level processes have been identified that could benefit from standardized Building Information Models such as maintenance processes using engineering data, business processes using scheduling, venue access, security data and benchmarking processes using building performance data. The new technology matches business needs for current and new services. The adoption of IFC compliant applications opens the way forward for shared building model collaboration and new processes, a significant new focus of the BIM standards.
In summary, SOH current building standards have been successfully drafted for a BIM environment and are confidently expected to be fully developed when BIM is adopted operationally by SOH. These BIM standards and their application to the Opera House are intended as a template for other organisations to adopt for their own procurement and facility management activities. Appendices provide an overview of the IFC Integrated Object Model and an understanding of IFC Model Data.
Abstract:
“SOH see significant benefit in digitising its drawings and operation and maintenance manuals. Since SOH do not currently have digital models of the Opera House structure or other components, there is an opportunity for this national case study to promote the application of Digital Facility Modelling using standardized Building Information Models (BIM)”. The digital modelling element of this project examined the potential of building information models for Facility Management focusing on the following areas:
• The re-usability of building information for FM purposes
• BIM as an Integrated information model for facility management
• Extendibility of the BIM to cope with business specific requirements
• Commercial facility management software using standardised building information models
• The ability to add (organisation specific) intelligence to the model
• A roadmap for SOH to adopt BIM for FM
The project has established that BIM – building information modelling - is an appropriate and potentially beneficial technology for the storage of integrated building, maintenance and management data for SOH. Based on the attributes of a BIM, several advantages can be envisioned: consistency in the data, intelligence in the model, multiple representations, source of information for intelligent programs and intelligent queries. The IFC – open building exchange standard – specification provides comprehensive support for asset and facility management functions, and offers new management, collaboration and procurement relationships based on sharing of intelligent building data.
The major advantages of using an open standard are: information can be read and manipulated by any compliant software, reduced user “lock in” to proprietary solutions, third party software can be the “best of breed” to suit the process and scope at hand, standardised BIM solutions consider the wider implications of information exchange outside the scope of any particular vendor, information can be archived as ASCII files for archival purposes, and data quality can be enhanced as the now single source of users’ information has improved accuracy, correctness, currency, completeness and relevance. SOH current building standards have been successfully drafted for a BIM environment and are confidently expected to be fully developed when BIM is adopted operationally by SOH. There have been remarkably few technical difficulties in converting the House’s existing conventions and standards to the new model based environment. This demonstrates that the IFC model represents world practice for building data representation and management (see Sydney Opera House – FM Exemplar Project Report Number 2005-001-C-3, Open Specification for BIM: Sydney Opera House Case Study). Availability of FM applications based on BIM is in its infancy but focussed systems are already in operation internationally and show excellent prospects for implementation systems at SOH. In addition to the generic benefits of standardised BIM described above, the following FM specific advantages can be expected from this new integrated facilities management environment: faster and more effective processes, controlled whole life costs and environmental data, better customer service, common operational picture for current and strategic planning, visual decision-making and a total ownership cost model. 
Tests with partial BIM data – provided by several of SOH’s current consultants – show that the creation of a SOH complete model is realistic, but subject to resolution of compliance and detailed functional support by participating software applications. The showcase has demonstrated successfully that IFC based exchange is possible with several common BIM based applications through the creation of a new partial model of the building. Data exchanged has been geometrically accurate (the SOH building structure represents some of the most complex building elements) and supports rich information describing the types of objects, with their properties and relationships.
Abstract:
For interactive systems, recognition, reproduction, and generalization of observed motion data are crucial for successful interaction. In this paper, we present a novel method for analysis of motion data that we refer to as K-OMM-trees. K-OMM-trees combine Ordered Means Models (OMMs), a model-based machine learning approach for time series, with a hierarchical analysis technique for very large data sets, the K-tree algorithm. The proposed K-OMM-trees enable unsupervised prototype extraction of motion time series data with hierarchical data representation. After introducing the algorithmic details, we apply the proposed method to a gesture data set that includes substantial inter-class variations. Results from our studies show that K-OMM-trees are able to substantially increase the recognition performance and to learn an inherent data hierarchy with meaningful gesture abstractions.
Abstract:
Creative Statement: “There are those who see Planet Earth as a gigantic living being, one that feeds and nurtures humanity and myriad other species – an entity that must be cared for. Then there are those who see it as a rock full of riches to be pilfered heedlessly in a short-term quest for over-abundance. This ‘cradle to grave’ mentality, it would seem, is taking its toll (unless you’re a virulent disbeliever in climate change). Why not, ask artists Priscilla Bracks and Gavin Sade, take a different approach? To this end they have set out on a near impossible task: to visualise the staggering quantity of carbon produced by Australia every year. Their eerie, glowing plastic cube resembles something straight out of Dr Who or The X Files. And, like the best science fiction, it has technical realities at its heart. Every One, Every Day tangibly illustrates our greenhouse gas output – its 27 m³ volume is approximately the amount of greenhouse gas emitted per capita, daily. Every One, Every Day is lit by an array of LEDs displaying light patterns representing energy use generated by data from the Australian Energy Market. Every One, Every Day was formed from recycled polyethylene – used milk bottles – ‘lent’ to the artists by a Visy recycling facility. At the end of the Vivid Festival this plastic will be returned to Visy, where it will re-enter the stream of ‘technical nutrients.’ Could we make another world? One that emulates the continuing cycles of nature? One that uses our ‘technical nutrients’ such as plastic and steel in continual cycles, just like a deciduous tree dropping leaves to compost itself and keep its roots warm and moist?” (Ashleigh Crawford. Melbourne – April, 2013)
Artistic Research Statement: The research focus of this work is on exploring how to represent complex statistics and data at a human scale, and how to produce a work where a large percentage of the materials could be recycled.
The surface of Every One, Every Day is clad in tiles made from polyethylene, from primarily recycled milk bottles, ‘lent’ to the artists by the Visy recycling facility in Sydney. The tiles will be returned to Visy for recycling. As such the work can be viewed as an intervention in the industrial ecology of polyethylene, and in the process demonstrates how to sustain cycles of technical materials – by taking the output of a recycling facility back to a manufacturer to produce usable materials. In terms of data visualisation, Every One, Every Day takes the form of a cube with a volume of 27 cubic meters. The annual per capita emissions figures for Australia are cited as ranging from 18 to 25 tons. Assuming the lower figure, 18 tons per capita annually, the 27 cubic meters represents approximately one day per capita of CO2 emissions – where CO2 is a gas at 15 °C and 1 atmosphere of pressure. The work also explores real time data visualisation by using an array of 600 controllable LEDs inside the cube. Illumination patterns are derived from real time data from the Australian Energy Market, using the dispatch interval price and demand graph for New South Wales. The two variables of demand and price are mapped to properties of the illumination - hue, brightness, movement, frequency etc. The research underpinning the project spanned industrial ecology to data visualization and public art practices. The result is that Every One, Every Day is one of the first public artworks that successfully bring together materials, physical form, and real time data representation in a unified whole.
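The volume figure quoted in the statement can be checked with a short calculation. The sketch below is not taken from the artists' research; it assumes metric tonnes, a 365-day year, and ideal-gas behaviour for CO2 at 15 °C and 1 atmosphere, and the function name is hypothetical.

```python
R = 8.314462618           # molar gas constant, J/(mol K)
M_CO2 = 44.01             # molar mass of CO2, g/mol
ATMOSPHERE_PA = 101325.0  # 1 standard atmosphere in pascals


def daily_co2_volume_m3(tonnes_per_year, temp_c=15.0,
                        pressure_pa=ATMOSPHERE_PA, days=365):
    """Volume of one day's per-capita CO2, treated as an ideal gas."""
    grams_per_day = tonnes_per_year * 1_000_000 / days
    moles = grams_per_day / M_CO2
    molar_volume = R * (temp_c + 273.15) / pressure_pa  # m^3 per mol
    return moles * molar_volume


volume = daily_co2_volume_m3(18)
side = volume ** (1 / 3)
print(f"{volume:.1f} m^3 per person per day, cube side about {side:.2f} m")
```

Under these assumptions the lower figure of 18 tons per year gives a little over 26 cubic meters per day, which is consistent with a cube roughly 3 meters on a side.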
Abstract:
This chapter addresses opportunities for problem posing in developing young children’s statistical literacy, with a focus on student-directed investigations. Although the notion of problem posing has broadened in recent years, there nevertheless remains limited research on how problem posing can be integrated within the regular mathematics curriculum, especially in the areas of statistics and probability. The chapter first reviews briefly aspects of problem posing that have featured in the literature over the years. Consideration is next given to the importance of developing children’s statistical literacy in which problem posing is an inherent feature. Some findings from a school playground investigation conducted in four, fourth-grade classes illustrate the different ways in which children posed investigative questions, how they made predictions about their outcomes and compared these with their findings, and the ways in which they chose to represent their findings.
Abstract:
In recent years, XML has been accepted as the format of messages for several applications. Prominent examples include SOAP for Web services, XMPP for instant messaging, and RSS and Atom for content syndication. This XML usage is understandable, as the format itself is a well-accepted standard for structured data, and it has excellent support in many popular programming languages, so inventing an application-specific format no longer seems worth the effort. Simultaneously with XML's rise to prominence there has been an upsurge in the number and capabilities of various mobile devices. These devices are connected through various wireless technologies to larger networks, and a goal of current research is to integrate them seamlessly into these networks. These two developments seem to be at odds with each other. XML, as a fully text-based format, takes up more processing power and network bandwidth than binary formats would, whereas the battery-powered nature of mobile devices dictates that energy, both in processing and transmitting, be utilized efficiently. This thesis presents the work we have performed to reconcile these two worlds. We present a message transfer service that we have developed to address what we have identified as the three key issues: XML processing at the application level, a more efficient XML serialization format, and the protocol used to transfer messages. Our presentation includes both a high-level architectural view of the whole message transfer service and detailed descriptions of the three new components. These components consist of an API, and an associated data model, for XML processing designed for messaging applications, a binary serialization format for the data model of the API, and a message transfer protocol providing two-way messaging capability with support for client mobility. We also present relevant performance measurements for the service and its components.
As a result of this work, we do not consider XML to be inherently incompatible with mobile devices. As the fixed networking world moves toward XML for interoperable data representation, so should the wireless world, in order to provide a better-integrated networking infrastructure. However, the problems raised by XML adoption touch all of the higher layers of application programming, so instead of concentrating simply on the serialization format, we conclude that improvements need to be made in an integrated fashion across all of these layers.
Abstract:
Biological motion has successfully been used for analysis of a person's mood and other psychological traits. Efforts are being made to use human gait as a non-invasive biometric. In this work, we study the effectiveness of biological gait motion as a cue for biometric person recognition. The data is 3D in nature and hence carries more information than the cues obtained from video-based gait patterns. The high accuracies of person recognition achieved using a simple linear model of data representation and simple neighborhood-based classifiers suggest that the nature of the data is more important than the recognition scheme employed.
Abstract:
Storage systems are widely used and have played a crucial role in both consumer and industrial products, for example, personal computers, data centers, and embedded systems. However, such systems suffer from issues of cost, restricted lifetime, and reliability with the emergence of new systems and devices, such as distributed storage and flash memory, respectively. Information theory, on the other hand, provides fundamental bounds and solutions to fully utilize resources such as data density, information I/O and network bandwidth. This thesis bridges these two topics, and proposes to solve challenges in data storage using a variety of coding techniques, so that storage becomes faster, more affordable, and more reliable.
We consider the system level and study the integration of RAID schemes and distributed storage. Erasure-correcting codes are the basis of the ubiquitous RAID schemes for storage systems, where disks correspond to symbols in the code and are located in a (distributed) network. Specifically, RAID schemes are based on MDS (maximum distance separable) array codes that enable optimal storage and efficient encoding and decoding algorithms. With r redundancy symbols, an MDS code can sustain r erasures. For example, consider an MDS code that can correct two erasures. It is clear that when two symbols are erased, one needs to access and transmit all the remaining information to rebuild the erasures. However, an interesting and practical question is: What is the smallest fraction of information that one needs to access and transmit in order to correct a single erasure? In Part I we will show that the lower bound of 1/2 is achievable and that the result can be generalized to codes with an arbitrary number of parities and optimal rebuilding.
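As a small illustration of the statement that an MDS code with r redundancy symbols can sustain r erasures, the sketch below uses the simplest case, a single XOR parity block (r = 1), as found in parity-based RAID levels. It is not the construction from the thesis, and the function names are hypothetical. Note that this rebuild reads every surviving block in full; the fraction-of-access question in the abstract concerns codes with two or more parities, which this sketch does not implement.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def encode(data_blocks):
    """Append one parity block, giving a (k + 1, k) single-parity stripe."""
    return list(data_blocks) + [xor_blocks(data_blocks)]


def rebuild(stripe, lost_index):
    """Recover the block at lost_index from all surviving blocks."""
    survivors = [block for i, block in enumerate(stripe) if i != lost_index]
    return xor_blocks(survivors)


stripe = encode([b"abcd", b"efgh", b"ijkl"])
# Any one of the four blocks (data or parity) can be reconstructed.
recovered = rebuild(stripe, 1)
```

Because the XOR of all blocks in the stripe is zero, any single missing block equals the XOR of the others, which is why one parity block tolerates exactly one erasure.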
We consider the device level and study coding and modulation techniques for emerging non-volatile memories such as flash memory. In particular, rank modulation is a novel data representation scheme proposed by Jiang et al. for multi-level flash memory cells, in which a set of n cells stores information in the permutation induced by the different charge levels of the individual cells. It eliminates the need for discrete cell levels, as well as overshoot errors, when programming cells. In order to decrease the decoding complexity, we propose two variations of this scheme in Part II: bounded rank modulation, where only small sliding windows of cells are sorted to generate permutations, and partial rank modulation, where only part of the n cells are used to represent data. We study limits on the capacity of bounded rank modulation and propose encoding and decoding algorithms. We show that overlaps between windows will increase capacity. We present Gray codes spanning all possible partial-rank states and using only "push-to-the-top" operations. These Gray codes turn out to solve an open combinatorial problem called the universal cycle, which is a sequence of integers generating all possible partial permutations.
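The basic idea of rank modulation, reading data from the ordering of charge levels rather than from their absolute values, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the encoding or decoding algorithms of the thesis: charge levels are modelled as distinct floating-point numbers, and the function names and the `margin` parameter are hypothetical.

```python
def induced_permutation(charges):
    """Cell indices ordered from highest to lowest charge."""
    return tuple(sorted(range(len(charges)),
                        key=lambda i: charges[i], reverse=True))


def push_to_top(charges, cell, margin=0.1):
    """Raise one cell above every other cell; charge is only ever added."""
    updated = list(charges)
    updated[cell] = max(charges) + margin
    return updated


charges = [0.3, 0.9, 0.5]
state = induced_permutation(charges)        # cell 1 highest, then 2, then 0
charges = push_to_top(charges, 0)
new_state = induced_permutation(charges)    # cell 0 is now ranked first
```

Since only the relative order matters, the stored state does not depend on hitting exact target levels, and a push-to-the-top step cannot overshoot: any charge above the current maximum yields the same permutation.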