847 resultados para Web Semantico semantic open data geoSPARQL
Resumo:
This paper presents an overview of the experiments conducted using Hybrid Clustering of XML documents using Constraints (HCXC) method for the clustering task in the INEX 2009 XML Mining track. This technique utilises frequent subtrees generated from the structure to extract the content for clustering the XML documents. It also presents the experimental study using several data representations such as the structure-only, content-only and using both the structure and the content of XML documents for the purpose of clustering them. Unlike previous years, this year the XML documents were marked up using the Wiki tags and contains categories derived by using the YAGO ontology. This paper also presents the results of studying the effect of these tags on XML clustering using the HCXC method.
Resumo:
know personally. They also communicate with other members of the network who are the friends of their friends and may be friends of their friend’s network. They share their experiences and opinions within the social network about an item which may be a product or service. The user faces the problem of evaluating trust in a service or service provider before making a choice. Opinions, reputations and ecommendations will influence users' choice and usage of online resources. Recommendations may be received through a chain of friends of friends, so the problem for the user is to be able to evaluate various types of trust recommendations and reputations. This opinion or ecommendation has a great influence to choose to use or enjoy the item by the other user of the community. Users share information on the level of trust they explicitly assign to other users. This trust can be used to determine while taking decision based on any recommendation. In case of the absence of direct connection of the recommender user, propagated trust could be useful.
Resumo:
This paper investigates in how to utilize ICT and Web 2.0 technologies and e-democracy software for policy decision-making. It introduces a cutting edge decision-making system that integrates the practice of e-petitions, e-consultation, e-rulemaking, e-voting, and proxy voting. The paper demonstrates how under precondition of direct democracy through the use this system the collective intelligence (CI) of a population would be gathered and used throughout the policy process.
Resumo:
Current knowledge about the relationship between transport disadvantage and activity space size is limited to urban areas, and as a result, very little is known to date about this link in a rural context. In addition, although research has identified transport disadvantaged groups based on their size of activity spaces, these studies have, however, not empirically explained such differences and the result is often a poor identification of the problems facing disadvantaged groups. Research has shown that transport disadvantage varies over time. The static nature of analysis using the activity space concept in previous research studies has lacked the ability to identify transport disadvantage in time. Activity space is a dynamic concept; and therefore possesses a great potential in capturing temporal variations in behaviour and access opportunities. This research derives measures of the size and fullness of activity spaces for 157 individuals for weekdays, weekends, and for a week using weekly activity-travel diary data from three case study areas located in rural Northern Ireland. Four focus groups were also conducted in order to triangulate the quantitative findings and to explain the differences between different socio-spatial groups. The findings of this research show that despite having a smaller sized activity space, individuals were not disadvantaged because they were able to access their required activities locally. Car-ownership was found to be an important life line in rural areas. Temporal disaggregation of the data reveals that this is true only on weekends due to a lack of public transport services. In addition, despite activity spaces being at a similar size, the fullness of activity spaces of low-income individuals was found to be significantly lower compared to their high-income counterparts. Focus group data shows that financial constraint, poor connections both between public transport services and between transport routes and opportunities forced individuals to participate in activities located along the main transport corridors.
Resumo:
In vector space based approaches to natural language processing, similarity is commonly measured by taking the angle between two vectors representing words or documents in a semantic space. This is natural from a mathematical point of view, as the angle between unit vectors is, up to constant scaling, the only unitarily invariant metric on the unit sphere. However, similarity judgement tasks reveal that human subjects fail to produce data which satisfies the symmetry and triangle inequality requirements for a metric space. A possible conclusion, reached in particular by Tversky et al., is that some of the most basic assumptions of geometric models are unwarranted in the case of psychological similarity, a result which would impose strong limits on the validity and applicability vector space based (and hence also quantum inspired) approaches to the modelling of cognitive processes. This paper proposes a resolution to this fundamental criticism of of the applicability of vector space models of cognition. We argue that pairs of words imply a context which in turn induces a point of view, allowing a subject to estimate semantic similarity. Context is here introduced as a point of view vector (POVV) and the expected similarity is derived as a measure over the POVV's. Different pairs of words will invoke different contexts and different POVV's. Hence the triangle inequality ceases to be a valid constraint on the angles. We test the proposal on a few triples of words and outline further research.
Resumo:
Information overload has become a serious issue for web users. Personalisation can provide effective solutions to overcome this problem. Recommender systems are one popular personalisation tool to help users deal with this issue. As the base of personalisation, the accuracy and efficiency of web user profiling affects the performances of recommender systems and other personalisation systems greatly. In Web 2.0, the emerging user information provides new possible solutions to profile users. Folksonomy or tag information is a kind of typical Web 2.0 information. Folksonomy implies the users‘ topic interests and opinion information. It becomes another source of important user information to profile users and to make recommendations. However, since tags are arbitrary words given by users, folksonomy contains a lot of noise such as tag synonyms, semantic ambiguities and personal tags. Such noise makes it difficult to profile users accurately or to make quality recommendations. This thesis investigates the distinctive features and multiple relationships of folksonomy and explores novel approaches to solve the tag quality problem and profile users accurately. Harvesting the wisdom of crowds and experts, three new user profiling approaches are proposed: folksonomy based user profiling approach, taxonomy based user profiling approach, hybrid user profiling approach based on folksonomy and taxonomy. The proposed user profiling approaches are applied to recommender systems to improve their performances. Based on the generated user profiles, the user and item based collaborative filtering approaches, combined with the content filtering methods, are proposed to make recommendations. The proposed new user profiling and recommendation approaches have been evaluated through extensive experiments. The effectiveness evaluation experiments were conducted on two real world datasets collected from Amazon.com and CiteULike websites. The experimental results demonstrate that the proposed user profiling and recommendation approaches outperform those related state-of-the-art approaches. In addition, this thesis proposes a parallel, scalable user profiling implementation approach based on advanced cloud computing techniques such as Hadoop, MapReduce and Cascading. The scalability evaluation experiments were conducted on a large scaled dataset collected from Del.icio.us website. This thesis contributes to effectively use the wisdom of crowds and expert to help users solve information overload issues through providing more accurate, effective and efficient user profiling and recommendation approaches. It also contributes to better usages of taxonomy information given by experts and folksonomy information contributed by users in Web 2.0.
Resumo:
Social tags in web 2.0 are becoming another important information source to describe the content of items as well as to profile users’ topic preferences. However, as arbitrary words given by users, tags contains a lot of noise such as tag synonym and semantic ambiguity a large number personal tags that only used by one user, which brings challenges to effectively use tags to make item recommendations. To solve these problems, this paper proposes to use a set of related tags along with their weights to represent semantic meaning of each tag for each user individually. A hybrid recommendation generation approaches that based on the weighted tags are proposed. We have conducted experiments using the real world dataset obtained from Amazon.com. The experimental results show that the proposed approaches outperform the other state of the art approaches.
Resumo:
There has been an increasing interest by governments worldwide in the potential benefits of open access to public sector information (PSI). However, an important question remains: can a government incur tortious liability for incorrect information released online under an open content licence? This paper argues that the release of PSI online for free under an open content licence, specifically a Creative Commons licence, is within the bounds of an acceptable level of risk to government, especially where users are informed of the limitations of the data and appropriate information management policies and principles are in place to ensure accountability for data quality and accuracy.
Resumo:
Performance comparisons between File Signatures and Inverted Files for text retrieval have previously shown several significant shortcomings of file signatures relative to inverted files. The inverted file approach underpins most state-of-the-art search engine algorithms, such as Language and Probabilistic models. It has been widely accepted that traditional file signatures are inferior alternatives to inverted files. This paper describes TopSig, a new approach to the construction of file signatures. Many advances in semantic hashing and dimensionality reduction have been made in recent times, but these were not so far linked to general purpose, signature file based, search engines. This paper introduces a different signature file approach that builds upon and extends these recent advances. We are able to demonstrate significant improvements in the performance of signature file based indexing and retrieval, performance that is comparable to that of state of the art inverted file based systems, including Language models and BM25. These findings suggest that file signatures offer a viable alternative to inverted files in suitable settings and positions the file signatures model in the class of Vector Space retrieval models.
Resumo:
The interoperable and loosely-coupled web services architecture, while beneficial, can be resource-intensive, and is thus susceptible to denial of service (DoS) attacks in which an attacker can use a relatively insignificant amount of resources to exhaust the computational resources of a web service. We investigate the effectiveness of defending web services from DoS attacks using client puzzles, a cryptographic countermeasure which provides a form of gradual authentication by requiring the client to solve some computationally difficult problems before access is granted. In particular, we describe a mechanism for integrating a hash-based puzzle into existing web services frameworks and analyze the effectiveness of the countermeasure using a variety of scenarios on a network testbed. Client puzzles are an effective defence against flooding attacks. They can also mitigate certain types of semantic-based attacks, although they may not be the optimal solution.
Resumo:
Increasingly scientists are using collections of software tools in their research. These tools are typically used in concert, often necessitating laborious and error-prone manual data reformatting and transfer. We present an intuitive workflow environment to support scientists with their research. The workflow, GPFlow, wraps legacy tools, presenting a high level, interactive web-based front end to scientists. The workflow backend is realized by a commercial grade workflow engine (Windows Workflow Foundation). The workflow model is inspired by spreadsheets and is novel in its support for an intuitive method of interaction enabling experimentation as required by many scientists, e.g. bioinformaticians. We apply GPFlow to two bioinformatics experiments and demonstrate its flexibility and simplicity.
Resumo:
In this paper, we propose a search-based approach to join two tables in the absence of clean join attributes. Non-structured documents from the web are used to express the correlations between a given query and a reference list. To implement this approach, a major challenge we meet is how to efficiently determine the number of times and the locations of each clean reference from the reference list that is approximately mentioned in the retrieved documents. We formalize the Approximate Membership Localization (AML) problem and propose an efficient partial pruning algorithm to solve it. A study using real-word data sets demonstrates the effectiveness of our search-based approach, and the efficiency of our AML algorithm.
Resumo:
It is important to promote a sustainable development approach to ensure that economic, environmental and social developments are maintained in balance. Sustainable development and its implications are not just a global concern, it also affects Australia. In particular, rural Australian communities are facing various economic, environmental and social challenges. Thus, the need for sustainable development in rural regions is becoming increasingly important. To promote sustainable development, proper frameworks along with the associated tools optimised for the specific regions, need to be developed. This will ensure that the decisions made for sustainable development are evidence based, instead of subjective opinions. To address these issues, Queensland University of Technology (QUT), through an Australian Research Council (ARC) linkage grant, has initiated research into the development of a Rural Statistical Sustainability Framework (RSSF) to aid sustainable decision making in rural Queensland. This particular branch of the research developed a decision support tool that will become the integrating component of the RSSF. This tool is developed on the web-based platform to allow easy dissemination, quick maintenance and to minimise compatibility issues. The tool is developed based on MapGuide Open Source and it follows the three-tier architecture: Client tier, Web tier and the Server tier. The developed tool is interactive and behaves similar to a familiar desktop-based application. It has the capability to handle and display vector-based spatial data and can give further visual outputs using charts and tables. The data used in this tool is obtained from the QUT research team. Overall the tool implements four tasks to help in the decision-making process. These are the Locality Classification, Trend Display, Impact Assessment and Data Entry and Update. The developed tool utilises open source and freely available software and accounts for easy extensibility and long-term sustainability.
Resumo:
Open-source software systems have become a viable alternative to proprietary systems. We collected data on the usage of an open-source workflow management system developed by a university research group, and examined this data with a focus on how three different user cohorts – students, academics and industry professionals – develop behavioral intentions to use the system. Building upon a framework of motivational components, we examined the group differences in extrinsic versus intrinsic motivations on continued usage intentions. Our study provides a detailed understanding of the use of open-source workflow management systems in different user communities. Moreover, it discusses implications for the provision of workflow management systems, the user-specific management of open-source systems and the development of services in the wider user community.
Resumo:
Researchers are increasingly involved in data-intensive research projects that cut across geographic and disciplinary borders. Quality research now often involves virtual communities of researchers participating in large-scale web-based collaborations, opening their earlystage research to the research community in order to encourage broader participation and accelerate discoveries. The result of such large-scale collaborations has been the production of ever-increasing amounts of data. In short, we are in the midst of a data deluge. Accompanying these developments has been a growing recognition that if the benefits of enhanced access to research are to be realised, it will be necessary to develop the systems and services that enable data to be managed and secured. It has also become apparent that to achieve seamless access to data it is necessary not only to adopt appropriate technical standards, practices and architecture, but also to develop legal frameworks that facilitate access to and use of research data. This chapter provides an overview of the current research landscape in Australia as it relates to the collection, management and sharing of research data. The chapter then explains the Australian legal regimes relevant to data, including copyright, patent, privacy, confidentiality and contract law. Finally, this chapter proposes the infrastructure elements that are required for the proper management of legal interests, ownership rights and rights to access and use data collected or generated by research projects.