984 results for: Search based on sketch
Abstract:
Current-day web search engines (e.g., Google) do not crawl and index a significant portion of the Web, and hence web users who rely on search engines alone are unable to discover and access a large amount of information in the non-indexable part of the Web. Specifically, dynamic pages generated from parameters that a user provides via web search forms (or search interfaces) are not indexed by search engines and cannot be found in search results. Such search interfaces give web users online access to myriad databases on the Web. To obtain information from a web database of interest, a user issues a query by specifying query terms in a search form and receives the query results: a set of dynamic pages that embed the required information from the database. At the same time, issuing a query via an arbitrary search interface is an extremely complex task for any kind of automatic agent, including web crawlers, which, at least up to the present day, do not even attempt to pass through web forms on a large scale. In this thesis, our primary object of study is the huge portion of the Web (hereafter referred to as the deep Web) hidden behind web search interfaces. We concentrate on three classes of problems around the deep Web: characterization of the deep Web, finding and classifying deep web resources, and querying web databases.
Characterizing the deep Web: Though the term deep Web was coined in 2000, long ago for any web-related concept or technology, we still do not know many important characteristics of the deep Web. Another concern is that existing surveys of the deep Web are predominantly based on studies of deep web sites in English. One can therefore expect their findings to be biased, especially given the steady increase in non-English web content. Surveying national segments of the deep Web is thus of interest not only to national communities but to the web community as a whole. In this thesis, we propose two new methods for estimating the main parameters of the deep Web. We use the suggested methods to estimate the scale of one specific national segment of the Web and report our findings. We also build and make publicly available a dataset describing more than 200 web databases from the national segment of the Web.
Finding deep web resources: The deep Web has been growing at a very fast pace, and it has been estimated that there are hundreds of thousands of deep web sites. Owing to the huge volume of information in the deep Web, there has been significant interest in approaches that allow users and computer applications to leverage this information. Most approaches assume that search interfaces to the web databases of interest have already been discovered and are known to query systems. However, such assumptions rarely hold, mainly because of the large scale of the deep Web: for any given domain of interest there are simply too many web databases with relevant content. The ability to locate search interfaces to web databases therefore becomes a key requirement for any application accessing the deep Web. In this thesis, we describe the architecture of the I-Crawler, a system for finding and classifying search interfaces. The I-Crawler is intentionally designed to be used both in deep Web characterization studies and for constructing directories of deep web resources. Unlike almost all existing approaches to the deep Web, the I-Crawler is able to recognize and analyze JavaScript-rich and non-HTML searchable forms.
Querying web databases: Retrieving information by filling out web search forms is a typical task for a web user, all the more so since the interfaces of conventional search engines are themselves web forms. At present, a user has to provide input values to search interfaces manually and then extract the required data from the result pages. Manually filling out forms is cumbersome and often infeasible for complex queries, yet such queries are essential for many web searches, especially in e-commerce. Automating the querying of search interfaces and the retrieval of data behind them is therefore desirable and essential for tasks such as building domain-independent deep web crawlers and automated web agents, building domain-specific (vertical) search engines, and extracting and integrating information from various deep web resources. We present a data model for representing search interfaces and discuss techniques for extracting field labels, client-side scripts, and structured data from HTML pages. We also describe a representation of result pages and discuss how to extract and store the results of form queries. In addition, we present a user-friendly and expressive form query language that allows one to retrieve information behind search interfaces and extract useful data from the result pages based on specified conditions. We implement a prototype system for querying web databases and describe its architecture and component design.
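As a rough illustration of the querying task the thesis automates, the sketch below submits a query through a hypothetical web search form and scrapes the rows embedded in the result page. It is not the thesis prototype or its query language: the URL, form parameters, and result-page markup are assumptions, and the requests and BeautifulSoup libraries are used only for brevity.

# Minimal sketch (not the thesis prototype): programmatically submitting a
# web search form and pulling structured rows out of the result page.
# The URL, form field names, and CSS selector below are hypothetical.
import requests
from bs4 import BeautifulSoup

def query_web_database(form_url, params):
    """Submit a query through a search form and return rows from the result page."""
    response = requests.get(form_url, params=params, timeout=30)
    response.raise_for_status()
    soup = BeautifulSoup(response.text, "html.parser")
    rows = []
    for row in soup.select("table.results tr"):          # hypothetical result markup
        cells = [td.get_text(strip=True) for td in row.find_all("td")]
        if cells:
            rows.append(cells)
    return rows

if __name__ == "__main__":
    # Hypothetical book-search interface with fields "title" and "author".
    results = query_web_database("https://example.org/books/search",
                                 {"title": "deep web", "author": ""})
    for record in results:
        print(record)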
Abstract:
OBJECTIVE: To perform a critical review focusing on the applicability in daily clinical practice of data from three randomized controlled trials (RCTs): SWOG 8794, EORTC 22911, and ARO/AUO 96-02. METHODS AND MATERIALS: An analytical framework based on the identified population, interventions, comparators, and outcomes (PICO) was used to refine the search of the evidence from the three large randomized trials on the use of radiation therapy after prostatectomy as adjuvant therapy (ART). RESULTS: With regard to the inclusion criteria: (1) POPULATION: given the time at which they were designed, two of the three trials (SWOG 8794 and EORTC 22911) included patients with a detectable PSA at the time of randomization, so that de facto a substantial proportion of patients received salvage RT (SRT) at non-normalised PSA levels rather than ART. (2) INTERVENTIONS: although all the trials showed a benefit of postoperative ART compared with a wait-and-see approach, the doses employed would now be considered inadequate. (3) COMPARATORS: the comparison arm in all three RCTs was an uncontrolled observation arm, in which patients who subsequently developed biochemical failure were treated in various ways, with up to half of them receiving SRT at PSA levels well above 1 ng/mL, which would now be deemed inappropriate. (4) OUTCOMES: only one trial (SWOG 8794) found that ART significantly improved overall survival compared with observation, with a ten-year overall survival rate of 74% vs. 66%, although this might partly reflect imbalanced risk factors due to competing event risk stratification. CONCLUSIONS: ART has a high level of evidence, with three RCTs with at least 10-year follow-up recording a benefit in biochemical PFS, but its uptake in present-day clinical practice should be reconsidered. While evidence on the benefit of ART versus SRT is eagerly awaited from ongoing randomized trials, a dynamic risk-stratified approach should drive the decision-making process.
Abstract:
OBJECTIVE: The aim of this study is to review highly cited articles that focus on the non-publication of studies and to develop a consistent and comprehensive approach to defining the (non-)dissemination of research findings. SETTING: We performed a scoping review of definitions of the term 'publication bias' in highly cited publications. PARTICIPANTS: The ideas and experiences of a core group of authors were collected in a draft document, which was complemented by the findings from our literature search. INTERVENTIONS: The draft document, including the findings from the literature search, was circulated to an international group of experts and revised until no additional ideas emerged and consensus was reached. PRIMARY OUTCOMES: We propose a new approach to the comprehensive conceptualisation of the (non-)dissemination of research. SECONDARY OUTCOMES: Our 'What, Who and Why?' approach covers the issues that need to be considered when disseminating research findings (What?), the different players who should assume responsibility during the various stages of conducting a clinical trial and disseminating clinical trial documents (Who?), and the motivations that might lead the various players to disseminate findings selectively, thereby introducing bias into the dissemination process (Why?). CONCLUSIONS: Our comprehensive framework of the (non-)dissemination of research findings, based on the results of a scoping literature search and expert consensus, will facilitate the development of future policies and guidelines regarding the multifaceted issue of selective publication, historically referred to as 'publication bias'.
Abstract:
Peer-reviewed
Abstract:
Presentation at Open Repositories 2014, Helsinki, Finland, June 9-13, 2014
Abstract:
This research project is a contribution to the broad field of information retrieval, specifically the development of tools that enable information access within digital documents. We recognize the need to provide the user with flexible access to the contents of large, potentially complex digital documents, with means other than a search function or a handful of metadata elements. The goal is to produce a text browsing tool offering a maximum of information based on a fairly superficial linguistic analysis. We are concerned with a type of extensive single-document indexing, not indexing by a set of keywords (see Klement, 2002, for a clear distinction between the two). The desired browsing tool would not only show at a glance the main topics discussed in the document, but would also present the relationships between these topics. It would also give direct access to the text (via hypertext links to specific passages). After reviewing previous research on this and similar topics, the paper discusses the methodology and the main characteristics of a prototype we have devised. Experimental results are presented, along with an analysis of the remaining hurdles and potential applications.
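As a toy illustration of this kind of shallow single-document indexing (not the authors' prototype), the sketch below lists recurring content words together with the character offsets at which they occur, which is the minimum a browsing tool needs in order to link a topic to specific passages. The stopword list, frequency threshold, and sample text are illustrative assumptions.

# Minimal sketch: a shallow single-document index mapping frequent content
# words to the character offsets where they occur, so a browsing UI could
# link each topic to specific passages.
import re
from collections import defaultdict

STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "for", "on", "with", "that", "each"}

def build_topic_index(text, min_occurrences=3):
    """Map each frequent content word to the offsets of its occurrences."""
    index = defaultdict(list)
    for match in re.finditer(r"[A-Za-z][A-Za-z-]+", text):
        word = match.group(0).lower()
        if word not in STOPWORDS:
            index[word].append(match.start())
    return {w: offs for w, offs in index.items() if len(offs) >= min_occurrences}

if __name__ == "__main__":
    sample = ("Information retrieval tools give flexible access to digital documents. "
              "A browsing tool lists topics and links each topic to passages in the documents. ") * 3
    for topic, offsets in sorted(build_topic_index(sample).items(),
                                 key=lambda kv: -len(kv[1]))[:10]:
        print(f"{topic:12s} occurs {len(offsets)} times, first at offset {offsets[0]}")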
Abstract:
Sharing information with those in need of it has always been an idealistic goal of networked environments. With the proliferation of computer networks, information is so widely distributed among systems that it is imperative to have well-organized schemes for retrieval and also for discovery. This thesis investigates the problems associated with such schemes and proposes a software architecture aimed at achieving meaningful discovery. The use of information elements as a modelling base for efficient information discovery in distributed systems is demonstrated with the aid of a novel conceptual entity called the infotron.
The investigation focuses on distributed systems and their associated problems. The study was directed towards identifying a suitable software architecture and applying it in an environment where information growth is phenomenal and a proper mechanism for information discovery becomes feasible. An empirical study, undertaken with the aid of an election database of geographically distributed constituencies, provided the insights required. This is manifested in the Election Counting and Reporting Software (ECRS) system, an essentially distributed software system designed to prepare reports for district administrators about the election counting process and to generate other miscellaneous statutory reports. Most distributed systems of the ECRS kind have a "fragile architecture" that makes them prone to collapse when minor faults occur. This is resolved with the proposed penta-tier architecture, which places five different technologies at the different tiers of the architecture. The results of the experiments conducted, and their analysis, show that such an architecture helps keep the different software components insulated from internal and external faults.
The architecture thus evolved needed a mechanism to support information processing and discovery, which necessitated the introduction of the novel concept of the infotron. Further, when a computing machine has to perform any meaningful extraction of information, it is guided by what is termed an infotron dictionary. A second empirical study investigated which of the two prominent markup languages, HTML and XML, is better suited for incorporating infotrons; a comparative study of 200 documents in HTML and XML came out in favour of XML.
The concepts of the infotron and the infotron dictionary were then applied to implement an Information Discovery System (IDS). IDS is essentially a system that starts from the infotron(s) supplied as clue(s) and brews the information required to satisfy the need of the information discoverer, using the documents available at its disposal (its information space). The various components of the system and their interactions follow the penta-tier architectural model and can therefore be considered fault-tolerant. IDS is generic in nature, and its characteristics and specifications were drawn up accordingly. Many subsystems interact with the multiple infotron dictionaries maintained in the system. In order to demonstrate the working of the IDS, and to discover information without modifying a typical Library Information System (LIS), an Information Discovery in Library Information System (IDLIS) application was developed. IDLIS is essentially a wrapper around the LIS, which maintains all the databases of the library. The purpose was to demonstrate that the functionality of a legacy system can be enhanced by augmenting it with IDS, leading to an information discovery service. IDLIS demonstrates IDS in action and shows that any legacy system can be effectively augmented with IDS to provide the additional functionality of an information discovery service. Possible applications of IDS and the scope for further research in the field are covered.
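The thesis favours XML for encoding infotrons but the dictionary format is not given here, so the following sketch is purely illustrative: it assumes an infotron can be modelled as a term with related clue terms, and shows only how such an XML-encoded dictionary could be loaded and queried.

# Purely illustrative sketch: the XML layout and element names below are
# assumptions, used only to show how an XML infotron dictionary (the format
# the study favoured over HTML) could be loaded and queried.
import xml.etree.ElementTree as ET

DICTIONARY_XML = """
<infotron-dictionary>
  <infotron term="election">
    <related>constituency</related>
    <related>counting</related>
  </infotron>
  <infotron term="report">
    <related>statutory</related>
  </infotron>
</infotron-dictionary>
"""

def lookup(term):
    """Return the related clue terms recorded for a term, if any."""
    root = ET.fromstring(DICTIONARY_XML)
    for entry in root.findall("infotron"):
        if entry.get("term") == term:
            return [rel.text for rel in entry.findall("related")]
    return []

print(lookup("election"))   # ['constituency', 'counting']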
Abstract:
The wealth of information freely available on the web and in medical image databases poses a major problem for end users: how to find the information needed? Content-Based Image Retrieval is the obvious solution. A standard called MPEG-7 evolved to address the interoperability issues of content-based search. The work presented in this thesis mainly concentrates on developing new shape descriptors and a framework for content-based retrieval of scoliosis images. New region-based and contour-based shape descriptors are developed based on orthogonal Legendre polynomials. A novel system for indexing and retrieval of digital spine radiographs with scoliosis is presented.
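As a hedged illustration of this kind of orthogonal-polynomial shape descriptor (not necessarily the thesis formulation), the sketch below computes discrete Legendre moments of an image region using a standard discrete normalization.

# Minimal sketch: discrete Legendre moments of a 2-D image, one common way to
# build an orthogonal-polynomial shape descriptor. The normalization is a
# standard discrete approximation, not necessarily the thesis's exact formula.
import numpy as np
from numpy.polynomial import legendre

def legendre_moments(image, max_order=4):
    """Return the matrix of Legendre moments lambda_pq for p, q <= max_order."""
    img = np.asarray(image, dtype=float)
    n_rows, n_cols = img.shape
    # Map pixel indices onto the orthogonality interval [-1, 1].
    x = np.linspace(-1.0, 1.0, n_cols)
    y = np.linspace(-1.0, 1.0, n_rows)
    moments = np.zeros((max_order + 1, max_order + 1))
    for p in range(max_order + 1):
        Pp = legendre.legval(y, np.eye(max_order + 1)[p])      # P_p over rows
        for q in range(max_order + 1):
            Pq = legendre.legval(x, np.eye(max_order + 1)[q])  # P_q over columns
            norm = (2 * p + 1) * (2 * q + 1) / (n_rows * n_cols)
            moments[p, q] = norm * np.sum(np.outer(Pp, Pq) * img)
    return moments

if __name__ == "__main__":
    shape = np.zeros((64, 64))
    shape[16:48, 24:40] = 1.0              # a simple binary silhouette
    print(np.round(legendre_moments(shape, max_order=2), 4))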
Abstract:
The search for new materials, especially those possessing special properties, continues at a great pace because of the ever-growing demands of modern life. The focus on the use of intrinsically conductive polymers in organic electronic devices has led to the development of a totally new class of smart materials. Polypyrrole (PPy) is one of the most stable known conducting polymers and also one of the easiest to synthesize. In addition, its high conductivity, good redox reversibility, and excellent microwave-absorbing characteristics have led to wide and diversified applications for PPy. However, like any conjugated conducting polymer, PPy lacks the processability, flexibility, and strength that industrial requirements demand. Among the various approaches to making tractable materials based on PPy, incorporating PPy within an electrically insulating polymer appears to be a promising method, and this has triggered the development of blends and composites. Conductive elastomeric composites of polypyrrole are important because they are composite materials suitable for devices where flexibility is an important parameter; moreover, these composites can be moulded into complex shapes. In this work an attempt has been made to prepare conducting elastomeric composites by incorporating PPy and PPy-coated short Nylon-6 fiber into insulating elastomer matrices: natural rubber and acrylonitrile butadiene rubber. It is well established that the mechanical properties of rubber composites can be greatly improved by adding short fibers. Short-fiber-reinforced rubber composites are popular in industry because of their processing advantages, low cost, and greatly improved technical properties such as strength, stiffness, modulus, and damping. In the present work, the PPy-coated fiber is expected to improve the mechanical properties of the elastomer-PPy composites while increasing their conductivity. In addition to the determination of DC conductivity and the evaluation of mechanical properties, the work studies the thermal stability, dielectric properties, and electromagnetic interference shielding effectiveness of the composites. The thesis consists of ten chapters.
Abstract:
In the search for new sensor systems and new methods for underwater vehicle positioning based on visual observation, this paper presents a computer vision system based on coded light projection. 3D information is extracted from an underwater scene and used to test obstacle avoidance behaviour. In addition, the main ideas for achieving stabilisation of the vehicle in front of an object are presented.
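As a generic illustration of how coded light yields 3D information (not the paper's actual system), the sketch below intersects a camera ray with the calibrated light plane selected by a decoded stripe. The intrinsics and plane coefficients are made-up example values; a real setup obtains them from calibration.

# Minimal sketch: recovering a 3-D point from coded light by intersecting a
# camera ray with the light plane identified by the decoded stripe. The
# intrinsics and plane equation below are example values only.
import numpy as np

def backproject_ray(u, v, K):
    """Direction of the camera ray through pixel (u, v), camera at the origin."""
    d = np.linalg.inv(K) @ np.array([u, v, 1.0])
    return d / np.linalg.norm(d)

def intersect_ray_with_plane(ray_dir, plane):
    """Intersect the ray t * ray_dir with the plane n.x + d = 0 and return the 3-D point."""
    n, d = plane[:3], plane[3]
    t = -d / (n @ ray_dir)
    return t * ray_dir

K = np.array([[800.0, 0.0, 320.0],                  # example camera intrinsics (pixels)
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
stripe_plane = np.array([0.9, 0.0, -0.44, 0.35])    # example calibrated plane for one stripe code

point = intersect_ray_with_plane(backproject_ray(350, 260, K), stripe_plane)
print("Reconstructed 3-D point (metres):", np.round(point, 3))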
Abstract:
The work developed in this thesis presents an in-depth study and provides innovative solutions in the field of recommender systems. The methods these systems use to produce recommendations, such as Content-Based Filtering (CBF), Collaborative Filtering (CF), and Knowledge-Based Filtering (KBF), require information about users in order to predict their preferences for certain products. This information may be demographic (gender, age, address, etc.), ratings given to products bought in the past, or information about the users' interests. There are two ways to obtain this information: users provide it explicitly, or the system acquires the implicit information available in users' transactions or search history. For example, the movie recommender system MovieLens (http://movielens.umn.edu/login) asks users to rate at least 15 movies on a scale from * to ***** (awful, ..., must be seen) and generates recommendations based on these ratings. When users are not registered in the system and no information about them is available, some systems make recommendations based on browsing history; Amazon.com (http://www.amazon.com) makes recommendations based on the searches a user has performed, or recommends the best-selling product. Nevertheless, these systems suffer from a certain lack of information. This problem is generally solved by acquiring additional information: users are asked about their interests, or the information is sought in additional sources. The solution proposed in this thesis is to look for this information in several sources, specifically those containing implicit information about users' preferences. These sources may be structured, such as databases with purchase information, or unstructured, such as web pages where users leave their opinion about a product they have bought or own. We identify three fundamental problems in achieving this goal: (1) identifying sources with information suitable for recommender systems; (2) defining criteria that allow the comparison and selection of the most suitable sources; (3) retrieving information from unstructured sources. To this end, the thesis develops: (1) a methodology for identifying and selecting the most suitable sources, using criteria based on the characteristics of the sources and a trust measure; (2) a mechanism for retrieving the unstructured user information available on the web, using text mining techniques and ontologies to extract the information and structure it appropriately for use by recommenders. The contributions of the work developed in this doctoral thesis are: (1) the definition of a set of characteristics for classifying sources relevant to recommender systems; (2) a relevance measure for sources, computed from the defined characteristics; (3) the application of a trust measure to obtain the most reliable sources, where trust is defined from the perspective of improving recommendations (a reliable source is one that improves the recommendations); (4) an algorithm for selecting, from a set of candidate sources, the most relevant and reliable ones, using the measures mentioned in the previous points; (5) the definition of an ontology for structuring the information about user preferences available on the Internet; (6) a mapping process that automatically extracts information about user preferences available on the web and places it into the ontology. These contributions achieve two important objectives: (1) improving recommendations by using alternative information sources that are relevant and reliable; (2) obtaining implicit user information available on the Internet.
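The thesis defines its own source characteristics, relevance measure, and trust measure; the sketch below is only an illustration of the general selection scheme, with assumed feature names, weights, and a simple feedback-driven trust update.

# Illustrative sketch only: feature names, weights, and the trust update are
# assumptions used to show a generic "relevance times trust" source selection.
from dataclasses import dataclass

@dataclass
class Source:
    name: str
    features: dict      # characteristic -> score in [0, 1]
    trust: float = 0.5  # prior reliability, updated from recommendation feedback

WEIGHTS = {"coverage": 0.4, "structure": 0.2, "freshness": 0.2, "opinion_density": 0.2}

def relevance(source):
    """Weighted sum of the source's characteristic scores."""
    return sum(WEIGHTS[f] * source.features.get(f, 0.0) for f in WEIGHTS)

def update_trust(source, recommendation_improved, rate=0.1):
    """Nudge trust up or down depending on whether using the source helped."""
    target = 1.0 if recommendation_improved else 0.0
    source.trust += rate * (target - source.trust)

def select_sources(sources, k=2):
    """Pick the k sources with the best combined relevance and trust."""
    return sorted(sources, key=lambda s: relevance(s) * s.trust, reverse=True)[:k]

catalog = [
    Source("purchase_db", {"coverage": 0.9, "structure": 1.0, "freshness": 0.6, "opinion_density": 0.2}),
    Source("review_site", {"coverage": 0.7, "structure": 0.3, "freshness": 0.9, "opinion_density": 0.9}),
    Source("forum", {"coverage": 0.4, "structure": 0.1, "freshness": 0.8, "opinion_density": 0.7}),
]
update_trust(catalog[1], recommendation_improved=True)
print([s.name for s in select_sources(catalog)])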
Abstract:
This paper presents a novel two-pass algorithm constituted by the Linear Hashtable Motion Estimation Algorithm (LHMEA) and Hexagonal Search (HEXBS) for block-based motion compensation. Building on previous algorithms, in particular the hexagonal search (HEXBS) motion estimation algorithm, we propose the LHMEA and the Two-Pass Algorithm (TPA), introducing hashtables into video compression. We employ the LHMEA for the first-pass search over all macroblocks (MBs) in the picture. Motion vectors (MVs) generated by the first pass are then used as predictors for the second-pass HEXBS motion estimation, which searches only a small number of MBs. The evaluation of the algorithm considers three important metrics: time, compression rate, and PSNR. The performance is evaluated on standard video sequences and compared with current algorithms. Experimental results show that the proposed algorithm can offer the same compression rate as full search. LHMEA with the TPA improves significantly on HEXBS and points to a way of improving other fast motion estimation algorithms, such as diamond search.
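For orientation, the sketch below implements a plain (unoptimized) hexagonal search for a single macroblock, the kind of second-pass refinement the paper builds on. The LHMEA first pass is not reproduced; the search simply starts from a supplied predictor vector, and the test pattern is an assumption.

# Minimal sketch: hexagonal search (HEXBS) for one macroblock, starting from a
# supplied predicted motion vector. Frames are 2-D numpy arrays.
import numpy as np

LARGE_HEX = [(0, 0), (2, 0), (-2, 0), (1, 2), (1, -2), (-1, 2), (-1, -2)]
SMALL_PATTERN = [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)]

def sad(cur, ref, y, x, dy, dx, size):
    """Sum of absolute differences between the current block and a shifted reference block."""
    ry, rx = y + dy, x + dx
    if ry < 0 or rx < 0 or ry + size > ref.shape[0] or rx + size > ref.shape[1]:
        return np.inf
    return float(np.abs(cur[y:y+size, x:x+size] - ref[ry:ry+size, rx:rx+size]).sum())

def hexagon_search(cur, ref, y, x, size=16, predictor=(0, 0)):
    """Motion vector (dy, dx) found by large-hexagon steps plus a small-pattern refinement."""
    best = tuple(predictor)
    while True:
        candidates = [(best[0] + dy, best[1] + dx) for dy, dx in LARGE_HEX]
        scores = [sad(cur, ref, y, x, dy, dx, size) for dy, dx in candidates]
        new_best = candidates[int(np.argmin(scores))]
        if new_best == best:   # hexagon centre is already the best point
            break
        best = new_best        # re-evaluates all 7 points; a tuned version checks only the 3 new ones
    candidates = [(best[0] + dy, best[1] + dx) for dy, dx in SMALL_PATTERN]
    scores = [sad(cur, ref, y, x, dy, dx, size) for dy, dx in candidates]
    return candidates[int(np.argmin(scores))]

if __name__ == "__main__":
    i, j = np.meshgrid(np.arange(64), np.arange(64), indexing="ij")
    ref = 100.0 * np.sin(i / 5.0) * np.cos(j / 7.0)   # smooth synthetic test pattern
    cur = np.roll(ref, shift=(3, -2), axis=(0, 1))    # current frame = shifted reference
    print(hexagon_search(cur, ref, y=24, x=24))       # true displacement for this block is (-3, 2)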
Abstract:
In the search for a versatile building block that allows the preparation of heteroditopic tpy-pincer bridging ligands, the synthon 4'-[C6H3(CH2Br)2-3,5]-2,2':6',2''-terpyridine was synthesized. Facile introduction of diphenylphosphanyl groups into this synthon gave the ligand 4'-[C6H3(CH2PPh2)2-3,5]-2,2':6',2''-terpyridine ([tpyPC(H)P]). The asymmetric mononuclear complex [Fe(tpy){tpyPC(H)P}](PF6)2, prepared by selective coordination of [Fe(tpy)Cl3] to the tpy moiety of [tpyPC(H)P], was used for the synthesis of the heterodimetallic complex [Fe(tpy)(tpyPCP)Ru(tpy)](PF6)3, which applies the "complex as ligand" approach. Coordination of the ruthenium centre at the PC(H)P-pincer moiety of [Fe(tpy){tpyPC(H)P}](PF6)2 was achieved by applying a transcyclometallation procedure. The ground-state electronic properties of both complexes, investigated by cyclic and square-wave voltammetry and UV/Vis spectroscopy, are discussed and compared with those of [Fe(tpy)2](PF6)2 and [Ru(PCP)(tpy)]Cl, which represent the mononuclear components of the heterodinuclear species. An in situ UV/Vis spectroelectrochemical study was performed in order to localize the oxidation and reduction steps and to gain information about the Fe(II)-Ru(II) communication in the heterodimetallic system [Fe(tpy)(tpyPCP)Ru(tpy)](PF6)3 mediated by the bridging ligand [tpyPCP]. Both the voltammetric and spectroelectrochemical results point to only very limited electronic interaction between the metal centres in the ground state.
Abstract:
This paper examines the lead-lag relationship between the FTSE 100 index and index futures prices employing a number of time series models. Using 10-minute observations from June 1996 to 1997, it is found that lagged changes in the futures price can help to predict changes in the spot price. The best forecasting model is of the error correction type, allowing for the theoretical difference between spot and futures prices according to the cost-of-carry relationship. This predictive ability is in turn used to derive a trading strategy, which is tested under real-world conditions to search for systematically profitable trading opportunities. It is found that although the model forecasts produce significantly higher returns than a passive benchmark, the model is unable to outperform the benchmark once transaction costs are allowed for.
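As a rough illustration of the modelling approach (not the paper's exact specification), the sketch below fits an error correction model in which lagged spot and futures returns, plus the lagged log basis (the error correction term implied by the cost-of-carry relationship), explain spot returns. Column names, the lag count, and the synthetic data are assumptions; pandas and statsmodels are assumed available.

# Minimal sketch: an error correction model for the spot-futures lead-lag
# relationship, regressing spot returns on the lagged basis and lagged returns.
import numpy as np
import pandas as pd
import statsmodels.api as sm

def fit_ecm(spot, futures, lags=3):
    """Regress spot returns on the lagged log basis and lagged spot/futures returns."""
    df = pd.DataFrame({"ds": np.log(spot).diff(), "dfut": np.log(futures).diff()})
    df["basis_lag"] = (np.log(futures) - np.log(spot)).shift(1)   # error correction term
    for i in range(1, lags + 1):
        df[f"ds_lag{i}"] = df["ds"].shift(i)
        df[f"dfut_lag{i}"] = df["dfut"].shift(i)
    df = df.dropna()
    X = sm.add_constant(df.drop(columns=["ds", "dfut"]))
    return sm.OLS(df["ds"], X).fit()

if __name__ == "__main__":
    rng = np.random.default_rng(0)                     # synthetic cointegrated series
    fut = 100 * np.exp(np.cumsum(rng.normal(0, 0.001, 2000)))
    spot = fut * np.exp(rng.normal(0, 0.0005, 2000))   # spot tracks futures with noise
    print(fit_ecm(pd.Series(spot), pd.Series(fut), lags=2).params)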