903 resultados para linked open data
Resumo:
We present a methodology for legacy language resource adaptation that generates domain-specific sentiment lexicons organized around domain entities described with lexical information and sentiment words described in the context of these entities. We explain the steps of the methodology and we give a working example of our initial results. The resulting lexicons are modelled as Linked Data resources by use of established formats for Linguistic Linked Data (lemon, NIF) and for linked sentiment expressions (Marl), thereby contributing and linking to existing Language Resources in the Linguistic Linked Open Data cloud.
Resumo:
The application of Linked Data technology to the publication of linguistic data promises to facilitate interoperability of these data and has lead to the emergence of the so called Linguistic Linked Data Cloud (LLD) in which linguistic data is published following the Linked Data principles. Three essential issues need to be addressed for such data to be easily exploitable by language technologies: i) appropriate machine-readable licensing information is needed for each dataset, ii) minimum quality standards for Linguistic Linked Data need to be defined, and iii) appropriate vocabularies for publishing Linguistic Linked Data resources are needed. We propose the notion of Licensed Linguistic Linked Data (3LD) in which different licensing models might co-exist, from totally open to more restrictive licenses through to completely closed datasets.
Resumo:
Within the European Union, member states are setting up official data catalogues as entry points to access PSI (Public Sector Information). In this context, it is important to describe the metadata of these data portals, i.e., of data catalogs, and allow for interoperability among them. To tackle these issues, the Government Linked Data Working Group developed DCAT (Data Catalog Vocabulary), an RDF vocabulary for describing the metadata of data catalogs. This topic report analyzes the current use of the DCAT vocabulary in several European data catalogs and proposes some recommendations to deal with an inconsistent use of the metadata across countries. The enrichment of such metadata vocabularies with multilingual descriptions, as well as an account for cultural divergences, is seen as a necessary step to guarantee interoperability and ensure wider adoption.
Resumo:
Recently, experts and practitioners in language resources have started recognizing the benefits of the linked data (LD) paradigm for the representation and exploitation of linguistic data on the Web. The adoption of the LD principles is leading to an emerging ecosystem of multilingual open resources that conform to the Linguistic Linked Open Data Cloud, in which datasets of linguistic data are interconnected and represented following common vocabularies, which facilitates linguistic information discovery, integration and access. In order to contribute to this initiative, this paper summarizes several key aspects of the representation of linguistic information as linked data from a practical perspective. The main goal of this document is to provide the basic ideas and tools for migrating language resources (lexicons, corpora, etc.) as LD on the Web and to develop some useful NLP tasks with them (e.g., word sense disambiguation). Such material was the basis of a tutorial imparted at the EKAW’14 conference, which is also reported in the paper.
Resumo:
We describe a domain ontology development approach that extracts domain terms from folksonomies and enrich them with data and vocabularies from the Linked Open Data cloud. As a result, we obtain lightweight domain ontologies that combine the emergent knowledge of social tagging systems with formal knowledge from Ontologies. In order to illustrate the feasibility of our approach, we have produced an ontology in the financial domain from tags available in Delicious, using DBpedia, OpenCyc and UMBEL as additional knowledge sources.
Open business intelligence: on the importance of data quality awareness in user-friendly data mining
Resumo:
Citizens demand more and more data for making decisions in their daily life. Therefore, mechanisms that allow citizens to understand and analyze linked open data (LOD) in a user-friendly manner are highly required. To this aim, the concept of Open Business Intelligence (OpenBI) is introduced in this position paper. OpenBI facilitates non-expert users to (i) analyze and visualize LOD, thus generating actionable information by means of reporting, OLAP analysis, dashboards or data mining; and to (ii) share the new acquired information as LOD to be reused by anyone. One of the most challenging issues of OpenBI is related to data mining, since non-experts (as citizens) need guidance during preprocessing and application of mining algorithms due to the complexity of the mining process and the low quality of the data sources. This is even worst when dealing with LOD, not only because of the different kind of links among data, but also because of its high dimensionality. As a consequence, in this position paper we advocate that data mining for OpenBI requires data quality-aware mechanisms for guiding non-expert users in obtaining and sharing the most reliable knowledge from the available LOD.
Resumo:
This special issue of the Journal of Urban Technology brings together five articles that are based on presentations given at the Street Computing workshop held on 24 November 2009 in Melbourne in conjunction with the Australian Computer-Human Interaction conference (OZCHI 2009). Our own article introduces the Street Computing vision and explores the potential, challenges and foundations of this research vision. In order to do so, we first look at the currently available sources of information and discuss their link to existing research efforts. Section 2 then introduces the notion of Street Computing and our research approach in more detail. Section 3 looks beyond the core concept itself and summarises related work in this field of interest.
Resumo:
Cities accumulate and distribute vast sets of digital information. Many decision-making and planning processes in councils, local governments and organisations are based on both real-time and historical data. Until recently, only a small, carefully selected subset of this information has been released to the public – usually for specific purposes (e.g. train timetables, release of planning application through websites to name just a few). This situation is however changing rapidly. Regulatory frameworks, such as the Freedom of Information Legislation in the US, the UK, the European Union and many other countries guarantee public access to data held by the state. One of the results of this legislation and changing attitudes towards open data has been the widespread release of public information as part of recent Government 2.0 initiatives. This includes the creation of public data catalogues such as data.gov.au (U.S.), data.gov.uk (U.K.), data.gov.au (Australia) at federal government levels, and datasf.org (San Francisco) and data.london.gov.uk (London) at municipal levels. The release of this data has opened up the possibility of a wide range of future applications and services which are now the subject of intensified research efforts. Previous research endeavours have explored the creation of specialised tools to aid decision-making by urban citizens, councils and other stakeholders (Calabrese, Kloeckl & Ratti, 2008; Paulos, Honicky & Hooker, 2009). While these initiatives represent an important step towards open data, they too often result in mere collections of data repositories. Proprietary database formats and the lack of an open application programming interface (API) limit the full potential achievable by allowing these data sets to be cross-queried. Our research, presented in this paper, looks beyond the pure release of data. It is concerned with three essential questions: First, how can data from different sources be integrated into a consistent framework and made accessible? Second, how can ordinary citizens be supported in easily composing data from different sources in order to address their specific problems? Third, what are interfaces that make it easy for citizens to interact with data in an urban environment? How can data be accessed and collected?
Resumo:
This thesis provides a query model suitable for context sensitive access to a wide range of distributed linked datasets which are available to scientists using the Internet. The model is designed based on scientific research standards which require scientists to provide replicable methods in their publications. Although there are query models available that provide limited replicability, they do not contextualise the process whereby different scientists select dataset locations based on their trust and physical location. In different contexts, scientists need to perform different data cleaning actions, independent of the overall query, and the model was designed to accommodate this function. The query model was implemented as a prototype web application and its features were verified through its use as the engine behind a major scientific data access site, Bio2RDF.org. The prototype showed that it was possible to have context sensitive behaviour for each of the three mirrors of Bio2RDF.org using a single set of configuration settings. The prototype provided executable query provenance that could be attached to scientific publications to fulfil replicability requirements. The model was designed to make it simple to independently interpret and execute the query provenance documents using context specific profiles, without modifying the original provenance documents. Experiments using the prototype as the data access tool in workflow management systems confirmed that the design of the model made it possible to replicate results in different contexts with minimal additions, and no deletions, to query provenance documents.
Resumo:
This special issue of the Journal of Urban Technology brings together five articles that are based on presentations given at the Street Computing Workshop held on 24 November 2009 in Melbourne in conjunction with the Australian Computer- Human Interaction conference (OZCHI 2009). Our own article introduces the Street Computing vision and explores the potential, challenges, and foundations of this research trajectory. In order to do so, we first look at the currently available sources of information and discuss their link to existing research efforts. Section 2 then introduces the notion of Street Computing and our research approach in more detail. Section 3 looks beyond the core concept itself and summarizes related work in this field of interest. We conclude by introducing the papers that have been contributed to this special issue.
Resumo:
As support grows for greater access to information and data held by governments, so does awareness of the need for appropriate policy, technical and legal frameworks to achieve the desired economic and societal outcomes. Since the late 2000s numerous international organizations, inter-governmental bodies and governments have issued open government data policies, which set out key principles underpinning access to, and the release and reuse of data. These policies reiterate the value of government data and establish the default position that it should be openly accessible to the public under transparent and non-discriminatory conditions, which are conducive to innovative reuse of the data. A key principle stated in open government data policies is that legal rights in government information must be exercised in a manner that is consistent with and supports the open accessibility and reusability of the data. In particular, where government information and data is protected by copyright, access should be provided under licensing terms which clearly permit its reuse and dissemination. This principle has been further developed in the policies issued by Australian Governments into a specific requirement that Government agencies are to apply the Creative Commons Attribution licence (CC BY) as the default licensing position when releasing government information and data. A wide-ranging survey of the practices of Australian Government agencies in managing their information and data, commissioned by the Office of the Australian Information Commissioner in 2012, provides valuable insights into progress towards the achievement of open government policy objectives and the adoption of open licensing practices. The survey results indicate that Australian Government agencies are embracing open access and a proactive disclosure culture and that open licensing under Creative Commons licences is increasingly prevalent. However, the finding that ‘[t]he default position of open access licensing is not clearly or robustly stated, nor properly reflected in the practice of Government agencies’ points to the need to further develop the policy framework and the principles governing information access and reuse, and to provide practical guidance tools on open licensing if the broadest range of government information and data is to be made available for innovative reuse.
Resumo:
Enterprises, both public and private, have rapidly commenced using the benefits of enterprise resource planning (ERP) combined with business analytics and “open data sets” which are often outside the control of the enterprise to gain further efficiencies, build new service operations and increase business activity. In many cases, these business activities are based around relevant software systems hosted in a “cloud computing” environment. “Garbage in, garbage out”, or “GIGO”, is a term long used to describe problems in unqualified dependency on information systems, dating from the 1960s. However, a more pertinent variation arose sometime later, namely “garbage in, gospel out” signifying that with large scale information systems, such as ERP and usage of open datasets in a cloud environment, the ability to verify the authenticity of those data sets used may be almost impossible, resulting in dependence upon questionable results. Illicit data set “impersonation” becomes a reality. At the same time the ability to audit such results may be an important requirement, particularly in the public sector. This paper discusses the need for enhancement of identity, reliability, authenticity and audit services, including naming and addressing services, in this emerging environment and analyses some current technologies that are offered and which may be appropriate. However, severe limitations to addressing these requirements have been identified and the paper proposes further research work in the area.
Resumo:
Enterprise resource planning (ERP) systems are rapidly being combined with “big data” analytics processes and publicly available “open data sets”, which are usually outside the arena of the enterprise, to expand activity through better service to current clients as well as identifying new opportunities. Moreover, these activities are now largely based around relevant software systems hosted in a “cloud computing” environment. However, the over 50- year old phrase related to mistrust in computer systems, namely “garbage in, garbage out” or “GIGO”, is used to describe problems of unqualified and unquestioning dependency on information systems. However, a more relevant GIGO interpretation arose sometime later, namely “garbage in, gospel out” signifying that with large scale information systems based around ERP and open datasets as well as “big data” analytics, particularly in a cloud environment, the ability to verify the authenticity and integrity of the data sets used may be almost impossible. In turn, this may easily result in decision making based upon questionable results which are unverifiable. Illicit “impersonation” of and modifications to legitimate data sets may become a reality while at the same time the ability to audit any derived results of analysis may be an important requirement, particularly in the public sector. The pressing need for enhancement of identity, reliability, authenticity and audit services, including naming and addressing services, in this emerging environment is discussed in this paper. Some current and appropriate technologies currently being offered are also examined. However, severe limitations in addressing the problems identified are found and the paper proposes further necessary research work for the area. (Note: This paper is based on an earlier unpublished paper/presentation “Identity, Addressing, Authenticity and Audit Requirements for Trust in ERP, Analytics and Big/Open Data in a ‘Cloud’ Computing Environment: A Review and Proposal” presented to the Department of Accounting and IT, College of Management, National Chung Chen University, 20 November 2013.)
Resumo:
This paper addresses the development of trust in the use of Open Data through incorporation of appropriate authentication and integrity parameters for use by end user Open Data application developers in an architecture for trustworthy Open Data Services. The advantages of this architecture scheme is that it is far more scalable, not another certificate-based hierarchy that has problems with certificate revocation management. With the use of a Public File, if the key is compromised: it is a simple matter of the single responsible entity replacing the key pair with a new one and re-performing the data file signing process. Under this proposed architecture, the the Open Data environment does not interfere with the internal security schemes that might be employed by the entity. However, this architecture incorporates, when needed, parameters from the entity, e.g. person who authorized publishing as Open Data, at the time that datasets are created/added.
Resumo:
The member states of the European Union are faced with the challenges of handling “big data” as well as with a growing impact of the supranational level. Given that the success of efforts at European level strongly depends on corresponding national and local activities, i.e., the quality of implementation and the degree of consistency, this chapter centers upon the coherence of European strategies and national implementations concerning the reuse of public sector information. Taking the City of Vienna’s open data activities as an illustrative example, we seek an answer to the question whether and to what extent developments at European level and other factors have an effect on local efforts towards open data. We find that the European Commission’s ambitions are driven by a strong economic argumentation, while the efforts of the City of Vienna have only very little to do with the European orientation and are rather dominated by lifestyle and administrative reform arguments. Hence, we observe a decoupling of supranational strategies and national implementation activities. The very reluctant attitude at Austrian federal level might be one reason for this, nationally induced barriers—such as the administrative culture—might be another. In order to enhance the correspondence between the strategies of the supranational level and those of the implementers at national and regional levels, the strengthening of soft law measures could be promising.