967 results for Web Mining


Relevance:

20.00% 20.00%

Publisher:

Abstract:

This research investigates the prevalence of sports-related terms among the Web sites of the world’s leading companies, the Fortune Global 500. An automated process copied about four gigabytes of textual data, around 70 million words, from their sites. The subsequent analysis revealed regional and industry differences in the distribution of sports-related terms, the popularity of tennis stars and few references to sports stars, especially in Asia.
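
The kind of term counting described above can be sketched as follows. This is a minimal illustration only: the term list, the sample text and the function names are assumptions, since the abstract does not give the study's vocabulary or tooling.

```python
import re
from collections import Counter

# Hypothetical term list; the study's actual vocabulary is not given in the abstract.
SPORTS_TERMS = {"tennis", "golf", "football", "olympics"}

def count_sports_terms(text, terms=SPORTS_TERMS):
    """Count case-insensitive occurrences of each term in the text."""
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(w for w in words if w in terms)

def rate_per_million(counts, total_words):
    """Normalise raw counts to occurrences per million words."""
    return {term: n * 1_000_000 / total_words for term, n in counts.items()}

# Made-up sample text standing in for a crawled company page.
sample = "Tennis and golf. TENNIS sponsorship."
counts = count_sports_terms(sample)
rates = rate_per_million(counts, total_words=4)
```

Normalising by corpus size matters when sites of very different lengths are compared, as would be the case across regions or industries.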

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Search engines have forever changed the way people access and discover knowledge, allowing information about almost any subject to be retrieved within seconds. As more material becomes available electronically, the influence of search engines on our lives will continue to grow. This presents the problem of how to find what information is contained in each search engine, what bias a search engine may have, and how to select the best search engine for a particular information need. This research introduces a new method, search engine content analysis, to solve this problem. Search engine content analysis is a new development of the traditional information retrieval field called collection selection, which deals with general information repositories. Current research in collection selection relies on full access to the collection or on estimates of the size of the collections. Also, collection descriptions are often represented as term occurrence statistics. An automatic ontology learning method is developed for search engine content analysis, which trains an ontology with world knowledge of hundreds of different subjects in a multilevel taxonomy. This ontology is then mined to find important classification rules, and these rules are used to perform an extensive analysis of the content of the largest general-purpose Internet search engines in use today. Instead of representing collections as a set of terms, as is common in collection selection, they are represented as a set of subjects, leading to a more robust representation of information and a decrease in synonymy. The ontology-based method was compared with ReDDE (Relevant Document Distribution Estimation method for resource selection) using the standard R-value metric, with encouraging results. ReDDE is the current state-of-the-art collection selection method, which relies on collection size estimation.
The method was also used to analyse the content of the most popular search engines in use today, including Google and Yahoo. In addition, several specialist search engines, such as Pubmed and that of the U.S. Department of Agriculture, were analysed. In conclusion, this research shows that the ontology-based method mitigates the need for collection size estimation.
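
The abstract does not spell out the R-value metric. As it is commonly defined in the collection selection literature, it is the number of relevant documents in the top k collections of the method's ranking divided by the number in the top k of the best possible ranking. The sketch below assumes that definition; the collection names and counts are invented for illustration.

```python
def r_value(ranked_collections, relevant_counts, k):
    """R_k: relevant documents reached by the top-k of an estimated ranking,
    relative to the top-k of the ideal (relevance-ordered) ranking."""
    estimated = sum(relevant_counts.get(c, 0) for c in ranked_collections[:k])
    best = sum(sorted(relevant_counts.values(), reverse=True)[:k])
    return estimated / best if best else 0.0

# Hypothetical data: relevant documents per collection for one query.
relevant = {"A": 10, "B": 5, "C": 1}
ranking = ["B", "A", "C"]          # order produced by some selection method
r1 = r_value(ranking, relevant, 1)
r2 = r_value(ranking, relevant, 2)
```

A value of 1.0 at a given k means the method reached as many relevant documents as the ideal ranking would have at that cut-off.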

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Despite the increased offering of online communication channels to support web-based retail systems, there is limited marketing research that investigates how these channels act singly, or in combination with offline channels, to influence an individual's intention to purchase online. If the marketer's strategy is to encourage online transactions, this requires a focus on consumer acceptance of the web-based transaction technology, rather than the purchase of the products per se. The exploratory study reported in this paper examines normative influences from referent groups in an individual's online and offline social communication networks that might affect their intention to use online transaction facilities. The findings suggest that for non-adopters, there is no normative influence from referents in either network. For adopters, one online and one offline referent norm positively influenced this group's intentions to use online transaction facilities. The implications of these findings are discussed together with future research directions.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

We introduce K-tree in an information retrieval context. It is an efficient approximation of the k-means clustering algorithm. Unlike k-means, it forms a hierarchy of clusters. It has been extended to address issues with sparse representations. We compare its performance and quality to CLUTO using document collections. The K-tree has a low time complexity that is suitable for large document collections. Its tree structure allows for efficient disk-based implementations where space requirements exceed main memory.
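
The abstract does not describe how the hierarchy is built. One plausible building block for a tree of clusters is to split an overfull node in two with k-means (k = 2). The sketch below shows only such a split, not the K-tree itself; the deterministic seeding and all function names are assumptions made for illustration.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return tuple(sum(col) / n for col in zip(*vectors))

def two_means_split(vectors, max_iterations=20):
    """Split vectors into two clusters with k-means (k = 2)."""
    # Assumed seeding: the first vector and the vector furthest from it.
    c0 = vectors[0]
    c1 = max(vectors, key=lambda v: sq_dist(v, c0))
    centres = [tuple(c0), tuple(c1)]
    groups = (list(vectors), [])
    for _ in range(max_iterations):
        groups = ([], [])
        for v in vectors:
            nearest = 0 if sq_dist(v, centres[0]) <= sq_dist(v, centres[1]) else 1
            groups[nearest].append(v)
        if not groups[0] or not groups[1]:
            break  # degenerate case: nothing to split
        new_centres = [mean(groups[0]), mean(groups[1])]
        if new_centres == centres:
            break  # assignments are stable
        centres = new_centres
    return centres, groups

# Toy document vectors forming two obvious groups.
docs = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
centres, groups = two_means_split(docs)
```

In a tree, the two resulting centres would become entries in the parent node, which is how a hierarchy of clusters can emerge from repeated splits.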

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The construction industry has adopted information technology in its processes for computer-aided design and drafting, construction documentation and maintenance. The data generated within the construction industry has become increasingly overwhelming. Data mining is a sophisticated data search capability that uses classification algorithms to discover patterns and correlations within a large volume of data. This paper presents the selection and application of data mining techniques to maintenance data of buildings. The results of applying such techniques are presented, and the potential benefits of using them to identify useful patterns and correlations that support decision making in building life cycle management are discussed.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

This report demonstrates the development of: (a) an object-oriented representation to provide a 3D interactive environment using data provided by Woods Bagot; (b) a basis of agent technology for mining building maintenance data; and (c) 3D interaction in virtual environments using the object-oriented representation. The application of data mining to an industry maintenance database was demonstrated in the previous report.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

This report demonstrates:
• Development of software agents for data mining
• Linking data mining to the building model in virtual environments
• Linking knowledge development with the building model in virtual environments
• Demonstration of software agents for data mining
• Population with maintenance data

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The building life cycle process is complex and prone to fragmentation as it moves through its various stages. The number of participants, and the diversity, specialisation and isolation both in space and time of their activities, have dramatically increased over time. The data generated within the construction industry has become increasingly overwhelming. Most currently available computer tools for the building industry have offered productivity improvement in the transmission of graphical drawings and textual specifications, without addressing more fundamental changes in building life cycle management. Facility managers and building owners are primarily concerned with highlighting areas of existing or potential maintenance problems in order to improve building performance, satisfy occupants and minimise turnover, especially the operational cost of maintenance. In doing so, they collect large amounts of data that are stored in the building's maintenance database. The work described in this paper is targeted at adding value to the design and maintenance of buildings by turning maintenance data into information and knowledge. Data mining technology presents an opportunity to increase significantly the rate at which the volumes of data generated through the maintenance process can be turned into useful information. This can be done using classification algorithms to discover patterns and correlations within a large volume of data. This paper presents which data mining techniques can be applied to maintenance data of buildings, and how, to identify the impediments to better performance of building assets. It demonstrates what sorts of knowledge can be found in maintenance records. The benefits to the construction industry lie in turning passive data in databases into knowledge that can improve the efficiency of the maintenance process and of future designs that incorporate that maintenance knowledge.
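
The abstract does not name the classification algorithms used. As one simple illustration of how a classifier can surface patterns in maintenance records, the sketch below uses the OneR rule learner, which picks the single attribute whose values best predict a target. The record fields and values are hypothetical.

```python
from collections import Counter, defaultdict

def one_r(records, target):
    """Return (attribute, rules, errors) for the attribute whose
    value-to-majority-class rules misclassify the fewest records."""
    best = None
    attributes = [a for a in records[0] if a != target]
    for attr in attributes:
        by_value = defaultdict(Counter)
        for r in records:
            by_value[r[attr]][r[target]] += 1
        # Each attribute value predicts its most frequent target class.
        rules = {v: c.most_common(1)[0][0] for v, c in by_value.items()}
        errors = sum(sum(c.values()) - max(c.values()) for c in by_value.values())
        if best is None or errors < best[2]:
            best = (attr, rules, errors)
    return best

# Hypothetical maintenance records; field names are illustrative only.
records = [
    {"asset": "pump", "age": "old", "priority": "urgent"},
    {"asset": "pump", "age": "new", "priority": "urgent"},
    {"asset": "door", "age": "old", "priority": "routine"},
    {"asset": "door", "age": "new", "priority": "routine"},
]
attribute, rules, errors = one_r(records, target="priority")
```

On this toy data the asset type alone predicts the work-order priority, which is the sort of correlation a facility manager could act on.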

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The rate of water reform in Australia is gathering pace, with Federal and State initiatives promoting a more integrated approach to water management. This approach encompasses a more competitive environment and a greater role for the private sector. There is a growing recognition of the importance of water recycling in these initiatives and the need to provide opportunities for its development. In March 2008 the Productivity Commission published its discussion paper on urban water reform (Productivity Commission, 2008). The paper cited inadequate institutional arrangements for the management of Australian urban water resources and noted the benefits to be gained from a comprehensive public review of urban water management. This development can be supported through the promotion of a sewer mining industry. This industry offers flexible and innovative solutions to water recycling demands in a variety of situations and structures. In addition, it has the capability of satisfying government competition and private sector policy initiatives.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

This project is an extension of a previous CRC project (220-059-B) which developed a program for life prediction of gutters in Queensland schools. A number of sources of information on service life of metallic building components were formed into databases linked to a Case-Based Reasoning Engine which extracted relevant cases from each source. In the initial software, no attempt was made to choose between the results offered or construct a case for retention in the casebase. In this phase of the project, alternative data mining techniques will be explored and evaluated. A process for selecting a unique service life prediction for each query will also be investigated. This report summarises the initial evaluation of several data mining techniques.
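
The abstract leaves open how a single service life prediction would be selected from the retrieved cases. One possible approach, shown here purely as an assumption, is to take the similarity-weighted mean of the most similar cases. The attribute names, the similarity measure and the case values are all hypothetical.

```python
def similarity(query, case, attributes):
    """Fraction of attributes on which the case matches the query."""
    return sum(query[a] == case[a] for a in attributes) / len(attributes)

def predict_service_life(query, casebase, attributes, k=2):
    """Similarity-weighted mean service life of the k most similar cases."""
    ranked = sorted(
        casebase,
        key=lambda c: similarity(query, c, attributes),
        reverse=True,
    )[:k]
    weights = [similarity(query, c, attributes) for c in ranked]
    if sum(weights) == 0:
        return None  # no case resembles the query at all
    return sum(w * c["life_years"] for w, c in zip(weights, ranked)) / sum(weights)

# Hypothetical casebase; values are invented for illustration.
casebase = [
    {"material": "steel", "climate": "coastal", "life_years": 20},
    {"material": "steel", "climate": "inland", "life_years": 30},
    {"material": "aluminium", "climate": "inland", "life_years": 10},
]
query = {"material": "steel", "climate": "coastal"}
prediction = predict_service_life(query, casebase, ["material", "climate"])
```

Weighting by similarity lets an exact match dominate while still drawing on partially matching cases from other sources.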

Relevance:

20.00% 20.00%

Publisher:

Abstract:

This paper deals with the problem of using data mining models in a real-world situation where the user cannot provide all the inputs with which the predictive model was built. A learning system framework, Query Based Learning System (QBLS), is developed for improving the performance of predictive models in practice, where not all inputs are available when querying the system. An automatic feature selection algorithm called Query Based Feature Selection (QBFS) is developed for selecting features to obtain a balance between the relative minimum subset of features and the relative maximum classification accuracy. The performance of the QBLS system and the QBFS algorithm is successfully demonstrated with a real-world application.
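
The abstract does not describe how QBFS works internally. The trade-off it mentions, a small feature subset against high classification accuracy, can be illustrated with a generic greedy forward selection that penalises subset size. This is not the QBFS algorithm; the penalty value, the feature names and the toy accuracy function are assumptions.

```python
def greedy_select(features, accuracy, penalty=0.01):
    """Add features one at a time while accuracy minus a size penalty improves."""
    def objective(subset):
        return accuracy(subset) - penalty * len(subset)

    selected = []
    best = objective(selected)
    remaining = list(features)
    while remaining:
        candidate = max(remaining, key=lambda f: objective(selected + [f]))
        value = objective(selected + [candidate])
        if value <= best:
            break  # no remaining feature is worth its cost
        selected.append(candidate)
        remaining.remove(candidate)
        best = value
    return selected

def toy_accuracy(subset):
    """Stand-in for a cross-validated accuracy estimate (hypothetical numbers)."""
    gains = {"age": 0.3, "type": 0.1, "colour": 0.001}
    return 0.5 + sum(gains[f] for f in subset)

selected = greedy_select(["colour", "type", "age"], toy_accuracy)
```

In practice `accuracy` would train and evaluate a model on the candidate subset; a larger penalty favours smaller subsets, which suits settings where users can supply only a few inputs.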

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Maps have been published on the world wide web since its inception (Cartwright, 1999) and are still accessed and viewed by millions of users today (Peterson, 2003). While early web-based GIS products lacked a complete set of cartographic capabilities, the functionality within such systems has significantly increased over recent years. Functionalities once found only in desktop GIS products are now available in web-based GIS applications, for example, data entry, basic editing, and analysis. Applications based on web-GIS are becoming more widespread and the web-based GIS environment is replacing the traditional desktop GIS platforms in many organizations. Therefore, development of a new cartographic method for web-based GIS is vital. The broad aim of this project is to examine and discuss the challenges and opportunities of innovative cartography methods for web-based GIS platforms. The work introduces a recently developed cartographic methodology, which is based on a web-based GIS portal by the Survey of Israel (SOI). The work discusses the prospects and constraints of such methods in improving web-GIS interfaces and usability for the end user. The work also tables the preliminary findings of the initial implementation of the web-based GIS cartographic method within the portal of the Survey of Israel, as well as the applicability of those methods elsewhere.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Decision support systems (DSS) have evolved rapidly during the last decade from stand-alone or limited networked solutions to online participatory solutions. One of the major enablers of this change is one of the fastest growing areas of geographical information system (GIS) technology development: the use of the Internet as a means to access, display, and analyze geospatial data remotely. Worldwide, many federal, state, and particularly local governments are designing systems to facilitate data sharing using interactive Internet map servers. This new generation of DSS or planning support systems (PSS), the interactive Internet map server, is the solution for delivering dynamic maps and GIS data and services via the World Wide Web, and for providing public participatory GIS (PPGIS) opportunities to a wider community (Carver, 2001; Jankowski & Nyerges, 2001). It provides a highly scalable framework for GIS Web publishing, Web-based public participatory GIS (WPPGIS), which meets the needs of corporate intranets and the demands of worldwide Internet access (Craig, 2002). The establishment of WPPGIS provides spatial data access through a support centre or a GIS portal to facilitate efficient access to and sharing of related geospatial data (Yigitcanlar, Baum, & Stimson, 2003). As more and more public and private entities adopt WPPGIS technology, the importance and complexity of facilitating geospatial data sharing is growing rapidly (Carver, 2003). Therefore, this article focuses on the online public participation dimension of GIS technology. The article provides an overview of recent literature on GIS and WPPGIS, and includes a discussion of the potential use of these technologies in providing a democratic platform for the public in decision-making.