941 results for web content
Abstract:
A study of the degree of accessibility of the corporate websites of Spanish universities, based on their compliance with the Web Content Accessibility Guidelines (WCAG) and other indicators.
Abstract:
Current-day web search engines (e.g., Google) do not crawl and index a significant portion of the Web and, hence, web users relying on search engines only are unable to discover and access a large amount of information from the non-indexable part of the Web. Specifically, dynamic pages generated based on parameters provided by a user via web search forms (or search interfaces) are not indexed by search engines and cannot be found in searchers’ results. Such search interfaces provide web users with online access to myriads of databases on the Web. In order to obtain some information from a web database of interest, a user issues his/her query by specifying query terms in a search form and receives the query results, a set of dynamic pages that embed the required information from a database. At the same time, issuing a query via an arbitrary search interface is an extremely complex task for any kind of automatic agent, including web crawlers, which, at least up to the present day, do not even attempt to pass through web forms on a large scale. In this thesis, our primary and key object of study is a huge portion of the Web (hereafter referred to as the deep Web) hidden behind web search interfaces. We concentrate on three classes of problems around the deep Web: characterization of the deep Web, finding and classifying deep web resources, and querying web databases. Characterizing the deep Web: Though the term deep Web was coined in 2000, which is sufficiently long ago for any web-related concept/technology, we still do not know many important characteristics of the deep Web. Another matter of concern is that the surveys of the deep Web existing so far are predominantly based on the study of deep web sites in English. One can then expect that findings from these surveys may be biased, especially owing to a steady increase in non-English web content.
In this way, surveying national segments of the deep Web is of interest not only to national communities but to the whole web community as well. In this thesis, we propose two new methods for estimating the main parameters of the deep Web. We use the suggested methods to estimate the scale of one specific national segment of the Web and report our findings. We also build and make publicly available a dataset describing more than 200 web databases from the national segment of the Web. Finding deep web resources: The deep Web has been growing at a very fast pace. It has been estimated that there are hundreds of thousands of deep web sites. Due to the huge volume of information in the deep Web, there has been significant interest in approaches that allow users and computer applications to leverage this information. Most approaches assume that search interfaces to web databases of interest have already been discovered and are known to query systems. However, such assumptions mostly do not hold because of the large scale of the deep Web; indeed, for any given domain of interest there are too many web databases with relevant content. Thus, the ability to locate search interfaces to web databases becomes a key requirement for any application accessing the deep Web. In this thesis, we describe the architecture of the I-Crawler, a system for finding and classifying search interfaces. Specifically, the I-Crawler is intentionally designed to be used in deep Web characterization studies and for constructing directories of deep web resources. Unlike almost all other approaches to the deep Web existing so far, the I-Crawler is able to recognize and analyze JavaScript-rich and non-HTML searchable forms. Querying web databases: Retrieving information by filling out web search forms is a typical task for a web user. This is all the more so as the interfaces of conventional search engines are also web forms.
At present, a user needs to manually provide input values to search interfaces and then extract the required data from the result pages. Filling out forms manually is cumbersome and often infeasible for complex queries, yet such queries are essential for many web searches, especially in the area of e-commerce. The automation of querying and retrieving data behind search interfaces is therefore desirable and essential for tasks such as building domain-independent deep web crawlers and automated web agents, searching for domain-specific information (vertical search engines), and extracting and integrating information from various deep web resources. We present a data model for representing search interfaces and discuss techniques for extracting field labels, client-side scripts and structured data from HTML pages. We also describe a representation of result pages and discuss how to extract and store the results of form queries. In addition, we present a user-friendly and expressive form query language that allows one to retrieve information behind search interfaces and extract useful data from the result pages based on specified conditions. We implement a prototype system for querying web databases and describe its architecture and the design of its components.
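The abstract above mentions a data model for search interfaces and the extraction of field labels from HTML. The following is only a minimal sketch of that general idea in Python, not the thesis's actual model: the names `FormField`, `SearchInterface`, `FormExtractor`, `extract_interface` and `build_query_url` are hypothetical, and the sketch assumes labels are attached with `<label for="...">` and that the form is submitted with GET.

```python
from dataclasses import dataclass, field
from html.parser import HTMLParser
from urllib.parse import urlencode, urljoin


@dataclass
class FormField:
    name: str
    kind: str        # "text", "select", "checkbox", ...
    label: str = ""  # human-readable label, if one was found


@dataclass
class SearchInterface:
    action: str
    method: str
    fields: list = field(default_factory=list)


class FormExtractor(HTMLParser):
    """Collects the first <form> on a page and the text of <label for=...> tags."""

    def __init__(self):
        super().__init__()
        self.interface = None
        self.labels = {}          # element id -> label text
        self.ids = {}             # field name -> element id
        self._current_for = None  # id targeted by the <label> being read

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "form" and self.interface is None:
            method = (a.get("method") or "get").lower()
            self.interface = SearchInterface(a.get("action") or "", method)
        elif tag in ("input", "select", "textarea"):
            if self.interface is not None and a.get("name"):
                kind = (a.get("type") or "text") if tag == "input" else tag
                self.interface.fields.append(FormField(a["name"], kind))
                self.ids[a["name"]] = a.get("id")
        elif tag == "label":
            self._current_for = a.get("for")

    def handle_data(self, data):
        if self._current_for is not None:
            previous = self.labels.get(self._current_for, "")
            self.labels[self._current_for] = previous + data

    def handle_endtag(self, tag):
        if tag == "label":
            self._current_for = None


def extract_interface(html):
    """Parse a page and return its first search form with labels attached."""
    parser = FormExtractor()
    parser.feed(html)
    parser.close()
    iface = parser.interface
    if iface is not None:
        for f in iface.fields:
            f.label = parser.labels.get(parser.ids.get(f.name), "").strip()
    return iface


def build_query_url(iface, page_url, values):
    """Build a GET query URL, keeping only values for fields the form declares."""
    known = {f.name for f in iface.fields}
    query = {k: v for k, v in values.items() if k in known}
    return urljoin(page_url, iface.action) + "?" + urlencode(query)
```

A real system would also have to handle forms driven by client-side scripts, which this sketch ignores entirely.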
Abstract:
The purpose of this thesis is to study and compare the usability of several open source content management systems (CMS). The research is divided into two complementary parts: a theoretical part and an analytical part. The theoretical part mainly describes open source web content management systems, usability, and the evaluation methods. The analytical part compares and analyzes the results of the empirical research. The heuristic evaluation method was used to identify usability problems in the interfaces. The study is fairly limited in scope; six tasks were designed and carried out in each interface to discover defects. Usability problems were rated according to their level of severity. The time taken by each task, the severity level of each problem, and the type of heuristic violated were recorded, analyzed, and compared. The results of this study indicate that the compared systems provide usable interfaces, and WordPress is recognized as the most usable system.
Abstract:
Technological innovations, the development of the internet, and globalization have increased the number and complexity of web applications. As a result, keeping web user interfaces understandable and usable (in terms of ease-of-use, effectiveness, and satisfaction) is a challenge. As part of this, designing user-intuitive interface signs (i.e., the small elements of a web user interface, e.g., navigational links, command buttons, icons, small images, thumbnails, etc.) is an issue for designers. Interface signs are key elements of web user interfaces because they act as communication artefacts that convey web content and system functionality, and because users interact with systems by means of interface signs. In light of the above, applying semiotic (i.e., the study of signs) concepts to web interface signs will help to discover new and important perspectives on web user interface design and evaluation. The thesis mainly focuses on web interface signs and uses the theory of semiotics as a background theory. The underlying aim of this thesis is to provide valuable insights for designing and evaluating web user interfaces from a semiotic perspective in order to improve overall web usability. The fundamental research question is formulated as: What do practitioners and researchers need to be aware of from a semiotic perspective when designing or evaluating web user interfaces to improve web usability? From a methodological perspective, the thesis follows a design science research (DSR) approach. A systematic literature review and six empirical studies are carried out in this thesis. The empirical studies are carried out with a total of 74 participants in Finland. The steps of a design science research process were followed while the studies were designed and conducted; these include (a) problem identification and motivation, (b) definition of objectives of a solution, (c) design and development, (d) demonstration, (e) evaluation, and (f) communication.
The data is collected using observations in a usability testing lab, by analytical (expert) inspection, with questionnaires, and in structured and semi-structured interviews. User behaviour analysis, qualitative analysis and statistics are used to analyze the study data. The results have led to the following contributions. Firstly, the results present the current status of semiotic research in UI design and evaluation and highlight the importance of considering semiotic concepts in UI design and evaluation. Secondly, the thesis explores interface sign ontologies (i.e., sets of concepts and skills that a user should know to interpret the meaning of interface signs) by providing a set of ontologies used to interpret the meaning of interface signs, and by providing a set of features related to ontology mapping in interpreting the meaning of interface signs. Thirdly, the thesis explores the value of integrating semiotic concepts in usability testing. Fourthly, the thesis proposes a semiotic framework (Semiotic Interface sign Design and Evaluation – SIDE) for interface sign design and evaluation in order to make interface signs intuitive for end users and to improve web usability. The SIDE framework includes a set of determinants and attributes of user-intuitive interface signs, and a set of semiotic heuristics to design and evaluate interface signs. Finally, the thesis assesses (a) the quality of the SIDE framework in terms of performance metrics (e.g., thoroughness, validity, effectiveness, reliability, etc.) and (b) the contributions of the SIDE framework from the evaluators’ perspective.
Abstract:
Search engines exploit the Web's hyperlink structure to help infer information content. The new phenomenon of personal Web logs, or 'blogs', encourages more extensive annotation of Web content. If the resulting link structures bias the Web crawling applications that search engines depend upon, there are implications for another form of annotation rapidly on the rise, the Semantic Web. We conducted a Web crawl of 160 000 pages in which the link structure of the Web is compared with that of several thousand blogs. Results show that the two link structures are significantly different. We analyse the differences and infer the likely effect upon the performance of existing and future Web agents. The Semantic Web offers new opportunities to navigate the Web, but Web agents should be designed to take advantage of the emerging link structures, or their effectiveness will diminish.
Abstract:
This work addresses the organization and content perspectives of the Enterprise Content Management (ECM) framework. The case study at the Federal University of Rio Grande do Norte (UFRN) used the ECM model to analyse the information management provided by the three main administrative systems: the Integrated Management of Academic Activities (SIGAA), the Integrated System of Inheritance, and Contracts Administration (SIPAC), and the Integrated System for Administration and Human Resources (SIGRH). A case study protocol was designed to make the research process more reliable. Four propositions were examined in order to reach the specific objectives of identifying and evaluating ECM components from the UFRN perspective. The preliminary phase provided the guidelines for the data collection. In total, 75 individuals were interviewed. Interviews with four managers directly involved in the design of the systems were recorded (average duration of 90 minutes). The 70 remaining individuals, including teachers, administrative-technical employees and students, were approached at random in UFRN's units. The results showed the presence of many ECM elements in the management of UFRN administrative information. The technological component with the strongest presence was "management of web content / collaboration". Initiatives in other components (e.g. email and document management) were also found and are being continuously improved. The assessment used eQual 4.0 to examine the effectiveness of the applications under three factors: usability, quality of information and service offered. In general, the quality offered by the systems was very good and goes hand in hand with the benefits obtained from adopting an ECM strategy across the whole institution.
Abstract:
In the context of Software Engineering, web accessibility is gaining ground and establishing itself as an important quality attribute. This is due to the initiatives of institutions such as the W3C (World Wide Web Consortium) and the introduction of norms and laws such as Section 508 that underline the importance of developing accessible Web sites and applications. Despite these improvements, the lack of web accessibility is still a persistent problem, and could be related to the moment or phase in which this requirement is addressed within the development process. Web accessibility is generally regarded as a programming problem, or is dealt with only once the application has been entirely developed. Considering accessibility already during analysis and requirements specification is therefore a strategy that facilitates project progress, avoiding rework in advanced phases of software development caused by errors or omissions in the elicitation. The objective of this research is to develop a method and a tool to support the elicitation of web accessibility requirements. The elicitation strategy of this method is grounded in the goal-oriented NFR Framework and in the use of NFR catalogs created from the guidelines contained in WCAG 2.0 (Web Content Accessibility Guidelines) proposed by the W3C.
Abstract:
Web content hosting, in which a Web server stores and provides Web access to documents for different customers, is becoming increasingly common. For example, a web server can host webpages for several different companies and individuals. Traditionally, Web Service Providers (WSPs) provide all customers with the same level of performance (best-effort service). Most service differentiation has been in the pricing structure (individual vs. business rates) or the connectivity type (dial-up access vs. leased line, etc.). This report presents DiffServer, a program that implements two simple, server-side, application-level mechanisms (server-centric and client-centric) to provide different levels of web service. The results of the experiments show that, under light load conditions, introducing this additional layer of abstraction between the client and the Apache web server adds little overhead. Also, the average waiting time for high priority requests decreases significantly once they are assigned priorities, as compared to a FIFO approach.
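The comparison between priority-based and FIFO handling of requests can be illustrated with a toy single-server simulation. This is only a sketch under stated assumptions, not DiffServer's actual mechanism: the function name `simulate`, the non-preemptive scheduling policy, and the request tuples are all hypothetical.

```python
import heapq


def simulate(requests, use_priority):
    """Simulate one non-preemptive server.

    requests: list of (arrival_time, service_time, priority) tuples,
              where a lower priority number means a more important request.
    Returns a dict mapping each priority to its average waiting time.
    """
    pending = sorted(enumerate(requests), key=lambda r: r[1][0])  # by arrival
    queue = []    # heap of waiting requests
    waits = {}    # priority -> list of waiting times
    clock = 0.0
    i = 0
    while i < len(pending) or queue:
        # Admit every request that has arrived by the current time.
        while i < len(pending) and pending[i][1][0] <= clock:
            idx, (arrival, service, prio) = pending[i]
            # FIFO orders by arrival only; priority mode orders by class first.
            key = (prio, arrival, idx) if use_priority else (arrival, idx)
            heapq.heappush(queue, (key, arrival, service, prio))
            i += 1
        if not queue:
            clock = pending[i][1][0]  # server idles until the next arrival
            continue
        _, arrival, service, prio = heapq.heappop(queue)
        waits.setdefault(prio, []).append(clock - arrival)
        clock += service
    return {p: sum(w) / len(w) for p, w in waits.items()}
```

With three low-priority requests queued ahead of one late high-priority request, the priority mode lets the high-priority request overtake the queue, at a modest cost to the low-priority class.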
Abstract:
Traditionally, ontologies describe knowledge representation in a denotational, formalized, and deductive way. In this paper, we propose in addition a semiotic, inductive, and approximate approach to ontology creation. We define a conceptual framework, a semantics extraction algorithm, and a first proof of concept applying the algorithm to a small set of Wikipedia documents. Intended as an extension to the prevailing top-down ontologies, we introduce an inductive fuzzy grassroots ontology, which organizes itself organically from existing natural language Web content. Using inductive and approximate reasoning to reflect the natural way in which knowledge is processed, the ontology’s bottom-up build process creates emergent semantics learned from the Web. By this means, the ontology acts as a hub for computing with words described in natural language. For Web users, the structural semantics are visualized as inductive fuzzy cognitive maps, allowing an initial form of intelligence amplification. Finally, we present an implementation of our inductive fuzzy grassroots ontology. Thus, this paper contributes an algorithm for the extraction of fuzzy grassroots ontologies from Web data by inductive fuzzy classification.
Abstract:
In his influential article about the evolution of the Web, Berners-Lee [1] envisions a Semantic Web in which humans and computers alike are capable of understanding and processing information. This vision is yet to materialize. The main obstacle for the Semantic Web vision is that in today's Web meaning is most often rooted not in formal semantics but in natural language and, in the sense of semiology, emerges only upon interpretation and processing. Yet an automated form of interpretation and processing can be tackled by precisiating raw natural language. To do that, Web agents extract fuzzy grassroots ontologies through induction from existing Web content. Inductive fuzzy grassroots ontologies thus constitute organically evolved knowledge bases that resemble automated gradual thesauri, which allow precisiating natural language [2]. The Web agents' underlying dynamic, self-organizing, and best-effort induction enables a sub-syntactical bottom-up learning of semiotic associations. Thus, knowledge is induced from the users' natural use of language in mutual Web interactions and stored in a gradual, thesauri-like lexical-world knowledge database as a top-level ontology, eventually allowing a form of computing with words [3]. Since, when computing with words, the objects of computation are words, phrases and propositions drawn from natural languages, it proves to be a practical notion for yielding emergent semantics for the Semantic Web. In the end, an improved understanding by computers should, on the one hand, upgrade human-computer interaction on the Web and, on the other hand, allow an initial version of human-intelligence amplification through the Web.
Abstract:
For the main part, electronic government (or e-government for short) aims to put digital public services at the disposal of citizens, companies, and organizations. To that end, e-government comprises, in particular, the application of Information and Communications Technology (ICT) to support government operations and provide better governmental services (Fraga, 2002) than is possible with traditional means. Accordingly, e-government services go further than traditional governmental services and aim to fundamentally alter the processes in which public services are generated and delivered, in this manner transforming the entire spectrum of relationships of public bodies with their citizens, businesses and other government agencies (Leitner, 2003). To implement this transformation, one of the most important points is to inform citizens, businesses, and/or other government agencies faithfully and in an accessible way. This allows all participants in governmental affairs to make the transition from passive information access to active participation (Palvia and Sharma, 2007). In addition, by handling the participants' data accordingly, a personalization towards these participants may even be accomplished. For instance, by creating meaningful user profiles as a kind of knowledge structure tailored to participants, a better-quality governmental service may be provided (i.e., expressed by individualized governmental services). To create such knowledge structures, known information (e.g., a social security number) can be enriched by vague information that may be accurate to a certain degree only. Hence, fuzzy knowledge structures can be generated, which help improve the relationship between government and participants.
The Web KnowARR framework (Portmann and Thiessen, 2013; Portmann and Pedrycz, 2014; Portmann and Kaltenrieder, 2014), which I introduce in my presentation, allows all these participants to be automatically informed about changes of Web content regarding a respective governmental action. The name Web KnowARR thereby stands for a self-acting entity (i.e. instantiated from the conceptual framework) that knows or apprehends the Web. In this talk, the framework's three main components from artificial intelligence research (i.e. knowledge aggregation, representation, and reasoning), as well as its specific use in electronic government, will be briefly introduced and discussed.
Abstract:
This final-year project (Trabajo de Fin de Grado) is set within the web system of the course Procesadores de Lenguajes, which belongs to the Department of Computer Languages and Systems and Software Engineering (Lenguajes y Sistemas Informáticos e Ingeniería de Software) of the Escuela Técnica Superior de Ingenieros Informáticos at the Universidad Politécnica de Madrid. The project comprises several lines of development, all arising from the need to improve the system so that it is accessible to every kind of user while staying up to date with the most recent technologies. First, the project studies the accessibility of the course website against the second version of the Web Content Accessibility Guidelines (WCAG 2.0). To this end, a detailed report was produced that records the results of the study against the WCAG acceptance criteria, and the changes needed to correct the failed criteria were then implemented. This ensures that the website is accessible to people with different kinds of disability. Likewise, in keeping with the aim of a more accessible website, the system was adapted to more recent technologies. At the start of the project, the web system consisted of a set of static pages (XHTML 1.1 + CSS 2.1) and a set of dynamic pages (XHTML 1.1 + CSS 2.1 + PHP + MySQL). These pages have been updated to HTML 5 and CSS 3. The website also includes a system for creating lab groups that makes them easier to manage for both teachers and students and simplifies the registration of students in the course. In addition, the system has an administration module through which the teaching staff can manage it. A battery of tests was run on the web system currently deployed to guarantee that it works correctly, and all errors detected in the process were corrected. At the same time, new functionality requested since the system was created has been implemented. Finally, an RSS alert system was developed that lets students of the course keep up to date with the notices and news published on the website's noticeboard. This RSS alert system will also serve other websites of the School that use the multipurpose noticeboard, and can be displayed in both English and Spanish.
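As a rough illustration of what an RSS alert feed for a noticeboard involves, the sketch below builds a minimal RSS 2.0 document with Python's standard library. It is not the project's implementation (which, according to the abstract, runs on PHP and MySQL); the function name `build_rss`, the dictionary keys of each notice, and the example URLs are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import format_datetime


def build_rss(title, link, description, notices):
    """Return an RSS 2.0 document as a string.

    notices: iterable of dicts with the (hypothetical) keys
             "title", "link" and "published" (a timezone-aware datetime).
    """
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    # RSS 2.0 requires title, link and description on the channel.
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = description
    for notice in notices:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = notice["title"]
        ET.SubElement(item, "link").text = notice["link"]
        # pubDate uses the RFC 822 date format, which format_datetime emits.
        ET.SubElement(item, "pubDate").text = format_datetime(notice["published"])
    return ET.tostring(rss, encoding="unicode")


if __name__ == "__main__":
    feed = build_rss(
        "Noticeboard",
        "http://example.org/",
        "Course notices",
        [{
            "title": "Lab groups open",
            "link": "http://example.org/notices/1",
            "published": datetime(2015, 3, 2, 9, 0, tzinfo=timezone.utc),
        }],
    )
    print(feed)
```

A bilingual feed, as the abstract describes, could be served as two channels built from the same notices with translated titles.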
Abstract:
This project deals mainly with web scraping of HTML documents on Android. As a result, a methodology is proposed for performing web scraping in applications implemented for this operating system, and an application based on that methodology is developed that is useful to the students of the school. Web scraping can be defined as a technique based on a series of content search algorithms whose aim is to obtain specific information from web pages, discarding whatever is not relevant. As a central part of the work, considerable time was devoted to studying web browsers and servers, the HTML language present in almost all web pages today, and the mechanisms used for communication between client and server, since these are the pillars on which the technique rests. The necessary techniques and tools were studied, all the required theoretical concepts are provided, and a possible methodology for implementation is proposed. Finally, the UPMdroid application was coded, both to exemplify the implementation of the proposed methodology and to give ETSIST students mobile support on Android that facilitates access to, and display of, the most important data of the academic year, such as the class timetable and the grades for the subjects in which they are enrolled. Besides implementing the proposed methodology, the application is a very useful tool for students, since it gives them simple and intuitive access to a large number of the school's services, thereby solving the problems of viewing web content on their devices.
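The definition of web scraping given above (obtain specific information from a page and discard the rest) can be sketched in a few lines. The example below is written in Python rather than for Android and does not reproduce UPMdroid's code; the class `TableScraper`, the function `scrape_grades`, and the two-column subject/grade table layout are hypothetical.

```python
from html.parser import HTMLParser


class TableScraper(HTMLParser):
    """Collects the text of every table cell, row by row, ignoring other markup."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the <tr> currently being read
        self._cell = None  # text fragments of the current <td>/<th>

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def scrape_grades(html):
    """Return {subject: grade} from a two-column table, skipping non-numeric rows."""
    scraper = TableScraper()
    scraper.feed(html)
    scraper.close()
    grades = {}
    for row in scraper.rows:
        if len(row) != 2:
            continue
        try:
            grades[row[0]] = float(row[1])
        except ValueError:
            continue  # header row or malformed grade
    return grades
```

Text outside table cells, such as menus and footers, is never stored, which is the "discarding whatever is not relevant" part of the definition.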
Abstract:
The web has evolved towards participation in content creation both by expert developers and by end users without much knowledge in this area. Although the products of both are equally valid and functional, the difference in quality between them can be considerable. This is seen most clearly when web components are analysed. The work consists of developing an environment capable of collecting quality metrics for components, based on users' interaction with them. From the metrics obtained, the quality of each component is determined so that it can be improved with respect to the characteristics assessed. The metrics are selected through a study of the characteristics that define a component and that can be analysed. To build the portal, a prototype was specified that provides a system allowing components to exchange information with one another. This interconnection model was integrated into the components under evaluation in order to obtain new metrics on this characteristic. A dashboard was developed that lets users interact with the components without restrictions and gives them a way to connect components using the interconnection system. The main conclusion of the work is the need to integrate web components in a real environment in order to determine their quality. Because quality is determined by the users who consume the components, their opinion must be taken into account when quantifying it.
Abstract:
Web transaction data between Web visitors and Web functionalities usually convey users' task-oriented behavior patterns. Mining this type of click-stream data makes it possible to capture usage pattern information. Web usage mining has become one of the most widely used methods for Web recommendation, which customizes Web content to a user's preferred style. Traditional Web usage mining techniques, such as Web user session or Web page clustering, association rule mining, and frequent navigational path mining, can only discover usage patterns explicitly. They cannot, however, reveal the underlying navigational activities or identify the latent relationships associated with the patterns among Web users and Web pages. In this work, we propose a Web recommendation framework that incorporates a Web usage mining technique based on the Probabilistic Latent Semantic Analysis (PLSA) model. The main advantage of this method is that it not only discovers usage-based access patterns but also reveals the underlying latent factors. With the discovered user access patterns, we then present users with content that is likely to interest them via collaborative recommendation. To validate the effectiveness of the proposed approach, we conduct experiments on real-world datasets and compare it with some existing traditional techniques. The preliminary experimental results demonstrate the usability of the proposed approach.
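The abstract does not state how the PLSA model is parametrized. For orientation, the standard symmetric PLSA formulation is sketched below, under the assumption that the observations are co-occurrence counts n(s_i, p_j) of user sessions s_i and pages p_j, with latent factors z_k; the symbols are chosen here for illustration and may differ from the paper's notation.

```latex
% Joint probability of a session-page pair, mixed over latent factors z_k
P(s_i, p_j) = \sum_{k} P(z_k)\, P(s_i \mid z_k)\, P(p_j \mid z_k)

% Log-likelihood maximized by the EM algorithm
\mathcal{L} = \sum_{i} \sum_{j} n(s_i, p_j)\, \log P(s_i, p_j)

% E-step: posterior of each latent factor given an observed pair
P(z_k \mid s_i, p_j) =
  \frac{P(z_k)\, P(s_i \mid z_k)\, P(p_j \mid z_k)}
       {\sum_{l} P(z_l)\, P(s_i \mid z_l)\, P(p_j \mid z_l)}

% M-step: re-estimate the parameters (each normalized to sum to one)
P(p_j \mid z_k) \propto \sum_{i} n(s_i, p_j)\, P(z_k \mid s_i, p_j)
P(s_i \mid z_k) \propto \sum_{j} n(s_i, p_j)\, P(z_k \mid s_i, p_j)
P(z_k) \propto \sum_{i} \sum_{j} n(s_i, p_j)\, P(z_k \mid s_i, p_j)
```

In this reading, pages with high P(p_j | z_k) for the factors that dominate a user's sessions would be natural candidates for recommendation.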