791 resultados para Web Data Mining


Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper proposes a novel application of fuzzy logic to web data mining for two basic problems of a website: popularity and satisfaction. Popularity means that people will visit the website while satisfaction refers to the usefulness of the site. We will illustrate that the popularity of a website is a fuzzy logic problem. It is an important characteristic of a website in order to survive in Internet commerce. The satisfaction of a website is also a fuzzy logic problem that represents the degree of success in the application of information technology to the business. We propose a framework of fuzzy logic for the representation of these two problems based on web data mining techniques to fuzzify the attributes of a website.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

We present CORDER (COmmunity Relation Discovery by named Entity Recognition) an un-supervised machine learning algorithm that exploits named entity recognition and co-occurrence data to associate individuals in an organization with their expertise and associates. We discuss the problems associated with evaluating unsupervised learners and report our initial evaluation experiments.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Melanoma is a highly aggressive and therapy resistant tumor for which the identification of specific markers and therapeutic targets is highly desirable. We describe here the development and use of a bioinformatic pipeline tool, made publicly available under the name of EST2TSE, for the in silico detection of candidate genes with tissue-specific expression. Using this tool we mined the human EST (Expressed Sequence Tag) database for sequences derived exclusively from melanoma. We found 29 UniGene clusters of multiple ESTs with the potential to predict novel genes with melanoma-specific expression. Using a diverse panel of human tissues and cell lines, we validated the expression of a subset of three previously uncharacterized genes (clusters Hs.295012, Hs.518391, and Hs.559350) to be highly restricted to melanoma/melanocytes and named them RMEL1, 2 and 3, respectively. Expression analysis in nevi, primary melanomas, and metastatic melanomas revealed RMEL1 as a novel melanocytic lineage-specific gene up-regulated during melanoma development. RMEL2 expression was restricted to melanoma tissues and glioblastoma. RMEL3 showed strong up-regulation in nevi and was lost in metastatic tumors. Interestingly, we found correlations of RMEL2 and RMEL3 expression with improved patient outcome, suggesting tumor and/or metastasis suppressor functions for these genes. The three genes are composed of multiple exons and map to 2q12.2, 1q25.3, and 5q11.2, respectively. They are well conserved throughout primates, but not other genomes, and were predicted as having no coding potential, although primate-conserved and human-specific short ORFs could be found. Hairpin RNA secondary structures were also predicted. Concluding, this work offers new melanoma-specific genes for future validation as prognostic markers or as targets for the development of therapeutic strategies to treat melanoma.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

O aumento de tecnologias disponíveis na Web favoreceu o aparecimento de diversas formas de informação, recursos e serviços. Este aumento aliado à constante necessidade de formação e evolução das pessoas, quer a nível pessoal como profissional, incentivou o desenvolvimento área de sistemas de hipermédia adaptativa educacional - SHAE. Estes sistemas têm a capacidade de adaptar o ensino consoante o modelo do aluno, características pessoais, necessidades, entre outros aspetos. Os SHAE permitiram introduzir mudanças relativamente à forma de ensino, passando do ensino tradicional que se restringia apenas ao uso de livros escolares até à utilização de ferramentas informáticas que através do acesso à internet disponibilizam material didático, privilegiando o ensino individualizado. Os SHAE geram grande volume de dados, informação contida no modelo do aluno e todos os dados relativos ao processo de aprendizagem de cada aluno. Facilmente estes dados são ignorados e não se procede a uma análise cuidada que permita melhorar o conhecimento do comportamento dos alunos durante o processo de ensino, alterando a forma de aprendizagem de acordo com o aluno e favorecendo a melhoria dos resultados obtidos. O objetivo deste trabalho foi selecionar e aplicar algumas técnicas de Data Mining a um SHAE, PCMAT - Mathematics Collaborative Educational System. A aplicação destas técnicas deram origem a modelos de dados que transformaram os dados em informações úteis e compreensíveis, essenciais para a geração de novos perfis de alunos, padrões de comportamento de alunos, regras de adaptação e pedagógicas. Neste trabalho foram criados alguns modelos de dados recorrendo à técnica de Data Mining de classificação, abordando diferentes algoritmos. Os resultados obtidos permitirão definir novas regras de adaptação e padrões de comportamento dos alunos, poderá melhorar o processo de aprendizagem disponível num SHAE.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In this project a research both in finding predictors via clustering techniques and in reviewing the Data Mining free software is achieved. The research is based in a case of study, from where additionally to the KDD free software used by the scientific community; a new free tool for pre-processing the data is presented. The predictors are intended for the e-learning domain as the data from where these predictors have to be inferred are student qualifications from different e-learning environments. Through our case of study not only clustering algorithms are tested but also additional goals are proposed.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Development of methods to explore data from educational settings, to understand better the learning process.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Semantic Web Mining aims at combining the two fast-developing research areas Semantic Web and Web Mining. This survey analyzes the convergence of trends from both areas: Growing numbers of researchers work on improving the results of Web Mining by exploiting semantic structures in the Web, and they use Web Mining techniques for building the Semantic Web. Last but not least, these techniques can be used for mining the Semantic Web itself. The second aim of this paper is to use these concepts to circumscribe what Web space is, what it represents and how it can be represented and analyzed. This is used to sketch the role that Semantic Web Mining and the software agents and human agents involved in it can play in the evolution of Web space.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Semantic Web Mining aims at combining the two fast-developing research areas Semantic Web and Web Mining. This survey analyzes the convergence of trends from both areas: an increasing number of researchers is working on improving the results of Web Mining by exploiting semantic structures in the Web, and they make use of Web Mining techniques for building the Semantic Web. Last but not least, these techniques can be used for mining the Semantic Web itself. The Semantic Web is the second-generation WWW, enriched by machine-processable information which supports the user in his tasks. Given the enormous size even of today’s Web, it is impossible to manually enrich all of these resources. Therefore, automated schemes for learning the relevant information are increasingly being used. Web Mining aims at discovering insights about the meaning of Web resources and their usage. Given the primarily syntactical nature of the data being mined, the discovery of meaning is impossible based on these data only. Therefore, formalizations of the semantics of Web sites and navigation behavior are becoming more and more common. Furthermore, mining the Semantic Web itself is another upcoming application. We argue that the two areas Web Mining and Semantic Web need each other to fulfill their goals, but that the full potential of this convergence is not yet realized. This paper gives an overview of where the two areas meet today, and sketches ways of how a closer integration could be profitable.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Social network has gained remarkable attention in the last decade. Accessing social network sites such as Twitter, Facebook LinkedIn and Google+ through the internet and the web 2.0 technologies has become more affordable. People are becoming more interested in and relying on social network for information, news and opinion of other users on diverse subject matters. The heavy reliance on social network sites causes them to generate massive data characterised by three computational issues namely; size, noise and dynamism. These issues often make social network data very complex to analyse manually, resulting in the pertinent use of computational means of analysing them. Data mining provides a wide range of techniques for detecting useful knowledge from massive datasets like trends, patterns and rules [44]. Data mining techniques are used for information retrieval, statistical modelling and machine learning. These techniques employ data pre-processing, data analysis, and data interpretation processes in the course of data analysis. This survey discusses different data mining techniques used in mining diverse aspects of the social network over decades going from the historical techniques to the up-to-date models, including our novel technique named TRCM. All the techniques covered in this survey are listed in the Table.1 including the tools employed as well as names of their authors.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In this position paper, we claim that the need for time consuming data preparation and result interpretation tasks in knowledge discovery, as well as for costly expert consultation and consensus building activities required for ontology building can be reduced through exploiting the interplay of data mining and ontology engineering. The aim is to obtain in a semi-automatic way new knowledge from distributed data sources that can be used for inference and reasoning, as well as to guide the extraction of further knowledge from these data sources. The proposed approach is based on the creation of a novel knowledge discovery method relying on the combination, through an iterative ?feedbackloop?, of (a) data mining techniques to make emerge implicit models from data and (b) pattern-based ontology engineering to capture these models in reusable, conceptual and inferable artefacts.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

En esta memoria se presenta el diseño y desarrollo de una aplicación en la nube destinada a la compartición de objetos y servicios. El desarrollo de esta aplicación surge dentro del proyecto de I+D+i, SITAC: Social Internet of Things – Apps by and for the Crowd ITEA 2 11020, que trata de crear una arquitectura integradora y un “ecosistema” que incluya plataformas, herramientas y metodologías para facilitar la conexión y cooperación de entidades de distinto tipo conectadas a la red bien sean sistemas, máquinas, dispositivos o personas con dispositivos móviles personales como tabletas o teléfonos móviles. El proyecto innovará mediante la utilización de un modelo inspirado en las redes sociales para facilitar y unificar las interacciones tanto entre personas como entre personas y dispositivos. En este contexto surge la necesidad de desarrollar una aplicación destinada a la compartición de recursos en la nube que pueden ser tanto lógicos como físicos, y que esté orientada al big data. Ésta será la aplicación presentada en este trabajo, el “Resource Sharing Center”, que ofrece un servicio web para el intercambio y compartición de contenido, y un motor de recomendaciones basado en las preferencias de los usuarios. Con este objetivo, se han usado tecnologías de despliegue en la nube, como Elastic Beanstalk (el PaaS de Amazon Web Services), S3 (el sistema de almacenamiento de Amazon Web Services), SimpleDB (base de datos NoSQL) y HTML5 con JavaScript y Twitter Bootstrap para el desarrollo del front-end, siendo Python y Node.js las tecnologías usadas en el back end, y habiendo contribuido a la mejora de herramientas de clustering sobre big data. Por último, y de cara a realizar el estudio sobre las pruebas de carga de la aplicación se ha usado la herramienta ApacheJMeter.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Citizens demand more and more data for making decisions in their daily life. Therefore, mechanisms that allow citizens to understand and analyze linked open data (LOD) in a user-friendly manner are highly required. To this aim, the concept of Open Business Intelligence (OpenBI) is introduced in this position paper. OpenBI facilitates non-expert users to (i) analyze and visualize LOD, thus generating actionable information by means of reporting, OLAP analysis, dashboards or data mining; and to (ii) share the new acquired information as LOD to be reused by anyone. One of the most challenging issues of OpenBI is related to data mining, since non-experts (as citizens) need guidance during preprocessing and application of mining algorithms due to the complexity of the mining process and the low quality of the data sources. This is even worst when dealing with LOD, not only because of the different kind of links among data, but also because of its high dimensionality. As a consequence, in this position paper we advocate that data mining for OpenBI requires data quality-aware mechanisms for guiding non-expert users in obtaining and sharing the most reliable knowledge from the available LOD.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Comunicación presentada en las XVI Jornadas de Ingeniería del Software y Bases de Datos, JISBD 2011, A Coruña, 5-7 septiembre 2011.