923 results for Search Engine Optimization Methods
Abstract:
Identification of low-dimensional structures and main sources of variation from multivariate data are fundamental tasks in data analysis. Many methods aimed at these tasks involve solving an optimization problem. Thus, the objective of this thesis is to develop computationally efficient and theoretically justified methods for solving such problems. Most of the thesis is based on a statistical model in which ridges of the density estimated from the data are considered as relevant features. Finding ridges, which are generalized maxima, requires the development of advanced optimization methods. An efficient and convergent trust region Newton method for projecting a point onto a ridge of the underlying density is developed for this purpose. The method is utilized in a differential equation-based approach for tracing ridges and computing projection coordinates along them. The density estimation is done nonparametrically by using Gaussian kernels. This allows application of ridge-based methods with only mild assumptions on the underlying structure of the data. The statistical model and the ridge finding methods are adapted to two different applications. The first one is extraction of curvilinear structures from noisy data mixed with background clutter. The second one is a novel nonlinear generalization of principal component analysis (PCA) and its extension to time series data. The methods have a wide range of potential applications where most of the earlier approaches are inadequate. Examples include identification of faults from seismic data and identification of filaments from cosmological data. Applicability of the nonlinear PCA to climate analysis and reconstruction of periodic patterns from noisy time series data is also demonstrated. Other contributions of the thesis include development of an efficient semidefinite optimization method for embedding graphs into the Euclidean space.
The method produces structure-preserving embeddings that maximize interpoint distances. It is primarily developed for dimensionality reduction, but also has potential applications in graph theory and various areas of physics, chemistry and engineering. Asymptotic behaviour of ridges and maxima of Gaussian kernel densities is also investigated as the kernel bandwidth approaches infinity. The results are applied to the nonlinear PCA and to finding significant maxima of such densities, which is a typical problem in visual object tracking.
Abstract:
Given the significant growth of the Internet in recent years, marketers have been striving for new techniques and strategies to prosper in the online world. Statistically, search engines have been the most dominant channels of Internet marketing in recent years. However, the mechanics of advertising in such a marketplace have created a challenging environment for marketers to position their ads among their competitors. This study uses a unique cross-sectional dataset of the top 500 Internet retailers in North America and hierarchical multiple regression analysis to empirically investigate the effect of keyword competition on the relationship between ad position and its determinants in the sponsored search market. To this end, the study utilizes the literature in consumer search behavior, keyword auction mechanism design, and search advertising performance as the theoretical foundation. This study is the first of its kind to examine the sponsored search market characteristics in a cross-sectional setting where the level of keyword competition is explicitly captured in terms of the number of Internet retailers competing for similar keywords. Internet retailing provides an appropriate setting for this study given the high-stakes battle for market share and intense competition for keywords in the sponsored search marketplace. The findings of this study indicate that bid values and ad relevancy metrics as well as their interaction affect the position of ads on the search engine result pages (SERPs). These results confirm some of the findings from previous studies that examined sponsored search advertising performance at a keyword level. Furthermore, the study finds that the position of ads for web-only retailers is dependent on bid values and ad relevancy metrics, whereas multi-channel retailers are more reliant on their bid values.
This difference between web-only and multi-channel retailers is also observed in the moderating effect of keyword competition on the relationships between ad position and its key determinants. Specifically, this study finds that keyword competition has significant moderating effects only for multi-channel retailers.
Abstract:
Background: Numerous studies, using a variety of quality indicators, have shown that the quality of care for depression in primary care is suboptimal. Few of these studies have examined the factors associated with receiving adequate treatment, particularly while taking individual and organizational characteristics into account simultaneously. The association between adequate treatment for a major depressive episode (MDE) and improvement in depressive symptoms is not well established under non-experimental conditions. The objectives of this study were to: 1) conduct a systematic review of indicators measuring the quality of depression treatment in primary care; 2) estimate the proportion of patients with MDE who receive adequate treatment (according to clinical practice guidelines) in primary care; 3) examine the individual and organizational characteristics associated with adequacy of treatment for depression; 4) examine the association between minimally adequate treatment during the preceding 12 months and the course of depressive symptoms at 6 and 12 months. Methods: The literature on the quality of depression treatment was reviewed using a set of keywords ("depression", "depressive disorder", "quality", "treatment", "indicator", "adequacy", "adherence", "concordance", "clinical guideline" and "guideline") and "360search", a federated search engine. The data come from a cohort study of 915 adults consulting a general practitioner, regardless of the reason for the visit, who met DSM-IV criteria for MDE in the past year, nested within 65 primary care clinics in Quebec, Canada. Multilevel analyses were performed. Results: Although most were developed from clinical practice guidelines, a wide variety of indicators was observed in the systematic literature review.
Most of the included studies used rudimentary quality indicators, especially for psychotherapy. The methods used varied widely, limiting the comparability of results. However, whatever the method chosen, most studies found that a large proportion of people with depression did not receive minimally adequate treatment in primary care. In our sample, adequacy was high (> 75%) for one third of the quality indicators measured, but low (< 60%) for nearly half of the measures. Slightly more than half of the sample (52.2%) received at least one minimally adequate treatment for depression. At the individual level, young adults (18-24 years) and people over 65 were less likely to receive minimally adequate treatment. This likelihood was higher for those with a family physician, supplementary insurance, a comorbid anxiety disorder, and more severe depression. At the clinic level, on-site availability of psychotherapy, use of treatment algorithms, and a remuneration method perceived as adequate were associated with more adequate treatment. The results also showed that 1) receiving at least one minimally adequate treatment for depression was associated with greater improvement in depressive symptoms at 6 and 12 months; 2) adequate pharmacotherapy and adequate psychotherapy were both associated with greater improvements in depressive symptoms; and 3) the association between adequate treatment and improvement in depressive symptoms varied with symptom severity at cohort entry, with a higher symptom level being associated with greater improvement at 6 and 12 months.
Conclusions: Our results suggest that interventions are needed to improve the quality of depression treatment in primary care. These interventions should target specific populations (young adults and older people), improve access to psychotherapy and to a family physician, and support primary care physicians in their clinical practice with patients with depression in various ways, such as developing knowledge for treating depression and adapting the remuneration method. This study also shows that adequate treatment of depression in primary care is associated with improvement in depressive symptoms under non-experimental conditions.
Abstract:
Call centers are key components of almost any large organization. The workforce management problem has received considerable attention in the literature. A typical formulation is based on performance measures over an infinite horizon, and the agent staffing problem is usually solved by combining optimization and simulation methods. In this thesis, we consider an agent staffing problem for call centers subject to chance constraints. We introduce a formulation that requires the quality-of-service (QoS) constraints to be satisfied with high probability, and define a sample average approximation of this problem in a multi-skill setting. We establish the convergence of the solution of the approximate problem to that of the original problem as the sample size grows. For the special case where all agents have all skills (a single group of agents), we design three simulation-based optimization methods for the sample average problem. Given an initial staffing level, we increase the number of agents for periods in which the constraints are violated, and we decrease the number of agents for periods in which the constraints remain satisfied after the reduction. Numerical experiments are conducted on several call center models with low occupancy, in which the algorithms yield good solutions, i.e., most of the chance constraints are satisfied and staffing cannot be reduced in any given period without introducing constraint violations. An advantage of these algorithms over other methods is their ease of implementation.
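The increase-then-decrease idea described in this abstract can be illustrated with a minimal Python sketch. It is not one of the thesis's three methods: the function name `adjust_staffing`, the `violated` oracle, and the toy requirement vector are all assumptions made for illustration. In the thesis setting the oracle would be a simulation-based check of the sample-average chance constraints; here it is replaced by a trivial deterministic stand-in.

```python
from typing import Callable, List, Set


def adjust_staffing(
    staffing: List[int],
    violated: Callable[[List[int]], Set[int]],
    max_agents: int = 1000,
) -> List[int]:
    """Raise staffing where constraints fail, then trim where they still hold.

    `violated(staff)` is a hypothetical oracle returning the set of period
    indices whose service constraint is violated under `staff`.
    """
    staff = list(staffing)

    # Phase 1: add one agent at a time to every violated period.
    while True:
        bad = violated(staff)
        if not bad:
            break
        for period in bad:
            if staff[period] >= max_agents:
                raise ValueError(f"period {period} infeasible within max_agents")
            staff[period] += 1

    # Phase 2: remove agents period by period while nothing becomes violated.
    for period in range(len(staff)):
        while staff[period] > 0:
            staff[period] -= 1
            if violated(staff):
                staff[period] += 1  # undo the reduction that broke a constraint
                break
    return staff


# Toy stand-in for the simulation: each period needs a fixed minimum headcount.
REQUIRED = [3, 5, 2]


def toy_violated(staff: List[int]) -> Set[int]:
    return {p for p, (have, need) in enumerate(zip(staff, REQUIRED)) if have < need}


result = adjust_staffing([4, 1, 2], toy_violated)
```

With the toy oracle, the over-staffed first period is trimmed and the under-staffed second period is raised, so the result matches the assumed requirement vector.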
Abstract:
Search engines are part of our daily lives. Currently, more than one third of the world's population uses the Internet. Search engines allow these users to quickly find the information or products they want. Information retrieval (IR) is the foundation of modern search engines. Traditional IR approaches assume that index terms are independent. Yet terms that appear in the same context are often dependent. Failing to account for these dependencies is one cause of noise (non-relevant results) in the output. Some studies have proposed integrating certain types of dependency, such as proximity, co-occurrence, adjacency, and grammatical dependency. In most cases, the dependency models are built separately and then combined with the traditional word-based model using a constant weight. Consequently, they cannot properly capture variable dependency and dependency strength. For example, the dependency between the adjacent words "Black Friday" is stronger than that between the words "road constructions". In this thesis, we study different approaches to capturing term relations and their dependency strengths. We propose the following methods: ─ We re-examine the combination approach using different indexing units for monolingual IR in Chinese and cross-language IR between English and Chinese. In addition to words, we study the possibility of using bigrams and unigrams as translation units for Chinese. Several translation models are built to translate English words into Chinese unigrams, bigrams, and words using a parallel corpus. An English query is then translated in several ways, and a ranking score is produced with each translation.
The final ranking score combines all these types of translation. ─ We consider the dependency between terms using Dempster-Shafer evidence theory. An occurrence of a text fragment (of several words) in a document is considered to represent the set of all its constituent terms. Probability is assigned to such a set of terms rather than to each individual term. At query evaluation time, this probability is redistributed to the query terms if the latter are different. This approach allows us to integrate dependency relations between terms. ─ We propose a discriminative model to integrate the different types of dependency according to their strength and their usefulness for IR. In particular, we consider adjacency and co-occurrence dependencies at different distances, i.e., bigrams and term pairs within windows of 2, 4, 8, and 16 words. The weight of a bigram or a pair of dependent terms is determined from a set of features, using SVM regression. All the proposed methods are evaluated on several English and/or Chinese collections, and the experimental results show that these methods yield substantial improvements over the state of the art.
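To make the window-based dependencies concrete, here is a minimal Python sketch that counts unordered term pairs co-occurring within windows of 2, 4, 8, and 16 words. It only illustrates how such candidate pairs could be collected; the function name `cooccurrence_counts` and the sample sentence are assumptions, and the feature extraction and SVM regression weighting described in the abstract are not reproduced.

```python
from collections import Counter
from typing import Dict, Sequence, Tuple

Pair = Tuple[str, str]


def cooccurrence_counts(
    tokens: Sequence[str], windows: Sequence[int] = (2, 4, 8, 16)
) -> Dict[int, Counter]:
    """Count unordered term pairs that fall inside a window of w words."""
    counts: Dict[int, Counter] = {w: Counter() for w in windows}
    for i, left in enumerate(tokens):
        for w in windows:
            # A window of w words starting at i covers tokens i .. i + w - 1.
            for right in tokens[i + 1 : i + w]:
                pair: Pair = tuple(sorted((left, right)))
                counts[w][pair] += 1
    return counts


tokens = "black friday sale on black friday".split()
counts = cooccurrence_counts(tokens)
adjacent = counts[2][("black", "friday")]   # pairs of adjacent words
within_four = counts[4][("black", "friday")]  # pairs within 4-word windows
```

A window of 2 reduces to adjacent pairs, so it approximates the bigram case without word order; an ordered bigram count would need the `sorted` call removed.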
Abstract:
Distributed systems are one of the most vital components of the economy. The most prominent example is probably the internet, a constituent element of our knowledge society. In recent years, the number of novel network types has steadily increased. Amongst others, sensor networks, distributed systems composed of tiny computational devices with scarce resources, have emerged. The further development and heterogeneous connection of such systems imposes new requirements on the software development process. Mobile and wireless networks, for instance, have to organize themselves autonomously and must be able to react to changes in the environment and to failing nodes alike. Researching new approaches for the design of distributed algorithms may lead to methods with which these requirements can be met efficiently. In this thesis, one such method is developed, tested, and discussed with respect to its practical utility. Our new design approach for distributed algorithms is based on Genetic Programming, a member of the family of evolutionary algorithms. Evolutionary algorithms are metaheuristic optimization methods which copy principles from natural evolution. They use a population of solution candidates which they try to refine step by step in order to attain optimal values for predefined objective functions. The synthesis of an algorithm with our approach starts with an analysis step in which the desired global behavior of the distributed system is specified. From this specification, objective functions are derived which steer a Genetic Programming process where the solution candidates are distributed programs. The objective functions rate how closely these programs approximate the goal behavior in multiple randomized network simulations. The evolutionary process selects the most promising solution candidates step by step and modifies and combines them with mutation and crossover operators.
This way, a description of the global behavior of a distributed system is translated automatically to programs which, if executed locally on the nodes of the system, exhibit this behavior. In our work, we test six different ways of representing distributed programs, comprising adaptations and extensions of well-known Genetic Programming methods (SGP, eSGP, and LGP), one bio-inspired approach (Fraglets), and two new program representations called Rule-based Genetic Programming (RBGP, eRBGP) designed by us. We breed programs in these representations for three well-known example problems in distributed systems: election algorithms, distributed mutual exclusion at a critical section, and the distributed computation of the greatest common divisor of a set of numbers. Synthesizing distributed programs the evolutionary way does not necessarily lead to the envisaged results. In a detailed analysis, we discuss the problematic features which make this form of Genetic Programming particularly hard. The two Rule-based Genetic Programming approaches have been developed especially in order to mitigate these difficulties. In our experiments, at least one of them (eRBGP) turned out to be a very efficient approach and was, in most cases, superior to the other representations.
Abstract:
When publishing information on the web, one expects it to reach everyone who could be interested in it. This is mainly achieved with general-purpose indexing and search engines such as Google, the most widely used today. In the particular case of the geographic information (GI) domain, exposing content to mainstream search engines is a complex task that requires specific actions. On many occasions it is convenient to provide a web site with a specially tailored search engine. Such is the case for online dictionaries (Wikipedia, WordReference), stores (Amazon, eBay), and generally all sites holding thematic databases. Due to the proliferation of these engines, A9.com proposed a standard interface called OpenSearch, used by modern web browsers to manage custom search engines. Geographic information can also benefit from the use of specific search engines. We can distinguish between two main approaches in GI retrieval efforts: classical OGC standardization on the one hand (CSW, WFS filters), which is very complex for the mainstream user, and on the other hand the neogeographer's approach, usually in the form of specific APIs lacking a common query interface and standard geographic formats. A draft 'geo' extension for OpenSearch has been proposed. It adds geographic filtering for queries and recommends a set of simple standard geographic response formats, such as KML, Atom and GeoRSS. This proposal enables standardization while keeping simplicity, thus covering a wide range of use cases in both the OGC and the neogeography paradigms. In this article we analyze the OpenSearch geo extension and its use cases in detail, demonstrating its applicability to both the SDI and the geoweb. Open source implementations are presented as well.
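As a rough illustration of how a client might use such an interface, the Python sketch below fills an OpenSearch-style URL template with a search term and a bounding box. The endpoint `http://example.com/search`, the query parameter names, and the helper `fill_template` are hypothetical. The `{geo:box}` placeholder and its west, south, east, north ordering follow the draft geo extension as commonly described, and should be checked against the draft itself before relying on them.

```python
import re
from typing import Mapping
from urllib.parse import quote

# Hypothetical URL template; a trailing "?" marks an optional parameter.
TEMPLATE = "http://example.com/search?q={searchTerms}&bbox={geo:box?}&format=kml"


def fill_template(template: str, params: Mapping[str, str]) -> str:
    """Substitute {name} and {name?} placeholders with URL-encoded values."""

    def substitute(match: "re.Match[str]") -> str:
        name, optional = match.group(1), match.group(2)
        if name in params:
            # Keep commas readable so the bounding box stays "w,s,e,n".
            return quote(str(params[name]), safe=",")
        if optional:
            return ""  # optional parameters may be left empty
        raise KeyError(f"missing required parameter: {name}")

    return re.sub(r"\{([^{}?]+)(\?)?\}", substitute, template)


with_box = fill_template(
    TEMPLATE, {"searchTerms": "rivers", "geo:box": "-0.5,39.0,0.5,40.0"}
)
without_box = fill_template(TEMPLATE, {"searchTerms": "rivers"})
```

The response format is fixed to KML in this template only for brevity; a real description document would typically advertise one template per supported format.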
Abstract:
The ISO 9241 series of standards sets out criteria for the ergonomics of human-system interaction. In markets with a huge variety of offers and little possibility of differentiation, providers can gain a decisive competitive advantage through user-oriented interfaces. A precondition for this is that relevant information can be obtained for entrepreneurial decisions in this regard. To test how users of universal search result pages use those pages and pay attention to different elements, an eye tracking experiment with a mixed design has been developed. Twenty subjects were confronted with search engine result pages (SERPs) and were instructed to make a decision under the conditions "national vs. international city" and "with vs. without miniaturized Google map". Different parameters such as fixation count, duration and time to first fixation were computed from the eye tracking raw data and supplemented by click rate data as well as data from questionnaires. Results of this pilot study revealed some remarkable facts, such as a vampire effect on miniaturized Google maps. Furthermore, Google maps did not shorten the process of decision making, Google ads were not fixated, and visual attention on SERPs was influenced by the position of the elements on the SERP and by the users' familiarity with the search target. These results support the theory of Amount of Invested Mental Effort (AIME) and give providers empirical evidence to take users' expectations into account. Furthermore, the results indicated that the task-oriented goal mode of participants was a moderator for the attention spent on ads. Most importantly, SERPs with images attracted the viewers' attention much longer than those without images. This unique selling proposition may lead to a distortion of competition on markets.
Abstract:
This paper describes the implementation of a semantic web search engine on conversation-style transcripts. Our choice of data is Hansard, a publicly available conversation-style transcript of parliamentary debates. The current search engine implementation on Hansard is limited to running search queries based on keywords or phrases, and hence lacks the ability to make semantic inferences from user queries. By making use of knowledge such as the relationships between members of parliament, constituencies, and terms of office, as well as topics of debates, the search results can be improved in terms of both relevance and coverage. Our contribution is not algorithmic; instead, we describe how we exploit a collection of external data sources, ontologies, semantic web vocabularies and named entity extraction in the analysis of the underlying semantics of user queries, as well as in the semantic enrichment of the search index, thereby improving the quality of results.
Abstract:
The final contents of total and individual trans-fatty acids of sunflower oil produced during the deacidification step of physical refining were obtained using a computational simulation program that considered cis-trans isomerization reaction features for oleic, linoleic, and linolenic acids attached to the glycerol part of triacylglycerols. The impact of process variables, such as temperature and liquid flow rate, and of equipment configuration parameters, such as liquid height, diameter, and number of stages, that influence the retention time of the oil in the equipment was analyzed using the response-surface methodology (RSM). The computational simulation and the RSM results were used in two different optimization methods, aiming to minimize final levels of total and individual trans-fatty acids (trans-FA) while keeping neutral oil loss and final oil acidity at low values. The main goal of this work was to indicate that computational simulation, based on careful modeling of the reaction system and combined with optimization, could be an important tool for indicating better processing conditions in industrial physical refining plants of vegetable oils with regard to trans-FA formation.
Abstract:
Optimization methods that employ the classical Powell-Hestenes-Rockafellar augmented Lagrangian are useful tools for solving nonlinear programming problems. Their reputation has declined over the last 10 years due to the comparative success of interior-point Newtonian algorithms, which are asymptotically faster. In this research, a combination of both approaches is evaluated. The idea is to produce a competitive method that is more robust and efficient than its 'pure' counterparts for critical problems. Moreover, an additional hybrid algorithm is defined, in which the interior-point method is replaced by the Newtonian resolution of a Karush-Kuhn-Tucker (KKT) system identified by the augmented Lagrangian algorithm. The software used in this work is freely available through the Tango Project web page: http://www.ime.usp.br/~egbirgin/tango/.
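For reference, the Powell-Hestenes-Rockafellar (PHR) augmented Lagrangian is usually written as follows. The notation is an assumption introduced here and is not taken from the abstract: the problem is to minimize $f(x)$ subject to $h_i(x) = 0$ and $g_j(x) \le 0$, with penalty parameter $\rho > 0$ and multiplier estimates $\lambda_i$ and $\mu_j \ge 0$.

```latex
% PHR augmented Lagrangian for  min f(x)  s.t.  h(x) = 0,  g(x) <= 0
L_\rho(x, \lambda, \mu) = f(x)
  + \frac{\rho}{2} \left\{
      \sum_{i=1}^{m} \left[ h_i(x) + \frac{\lambda_i}{\rho} \right]^2
    + \sum_{j=1}^{p} \left[ \max\!\left( 0,\; g_j(x) + \frac{\mu_j}{\rho} \right) \right]^2
    \right\}

% First-order multiplier updates after approximately minimizing L_rho over x
\lambda_i \leftarrow \lambda_i + \rho\, h_i(x), \qquad
\mu_j \leftarrow \max\!\left( 0,\; \mu_j + \rho\, g_j(x) \right)
```

Each outer iteration approximately minimizes $L_\rho$ with respect to $x$ and then applies the multiplier updates. Safeguards on the multipliers and the rule for increasing $\rho$ vary between implementations and are omitted here.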
Abstract:
We analyze the impact on consumer prices of the size and bias of price comparison search engines. In the context of a model related to Burdett and Judd (1983) and Varian (1980), we develop and test experimentally several theoretical predictions. The experimental results confirm the model’s predictions regarding the impact of the number of firms, and the type of bias of the search engine, but reject the model’s predictions regarding changes in the size of the index. The explanatory power of an econometric model for the price distributions is significantly improved when variables accounting for risk attitudes are introduced.
Abstract:
Multi-objective combinatorial optimization problems have peculiar characteristics that require optimization methods to be adapted to this context. Since many of these problems are NP-hard, the use of metaheuristics has grown over recent years. In particular, many different approaches using Ant Colony Optimization (ACO) have been proposed. In this work, an ACO is proposed for the Multi-objective Shortest Path Problem and is compared to two other optimizers found in the literature. A set of 18 instances from two distinct types of graphs is used, as well as a specific multiobjective performance assessment methodology. Initial experiments showed that the proposed algorithm is able to generate better approximation sets than the other optimizers for all instances. In the second part of this work, an experimental analysis is conducted using several different multiobjective ACO proposals recently published and the same instances used in the first part. Results show that each type of instance benefits a particular algorithmic approach. A new metaphor for the development of multiobjective ACOs is then proposed. Usually, ants share the same characteristics, and only a few works address multi-species approaches. This work proposes an approach where multi-species ants compete for food resources. Each species has its own search strategy, and different species do not access each other's pheromone information. As in nature, the successful ant populations are allowed to grow, whereas unsuccessful ones shrink. The approach introduced here proves able to inherit the behavior of strategies that are successful for different types of problems. Results of computational experiments are reported and show that the proposed approach is able to produce significantly better approximation sets than other methods.
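The "approximation sets" compared in this abstract are sets of mutually non-dominated solutions. The Python sketch below shows the Pareto-dominance filter that underlies such sets, assuming every objective is minimized. The function names and the sample cost vectors are illustrative assumptions; the ACO itself and the performance assessment methodology are not reproduced.

```python
from typing import List, Sequence, Tuple

Cost = Tuple[float, ...]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b in every objective and better in one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def nondominated(costs: Sequence[Cost]) -> List[Cost]:
    """Keep only cost vectors that no other vector dominates."""
    unique = list(dict.fromkeys(costs))  # drop duplicates, preserve order
    return [c for c in unique if not any(dominates(o, c) for o in unique)]


# Hypothetical (distance, time) costs of candidate paths found by the ants.
paths = [(4, 9), (5, 5), (6, 6), (9, 2), (5, 5)]
front = nondominated(paths)
```

In this sample, the path with cost (6, 6) is dropped because (5, 5) is better in both objectives, while the remaining three paths are incomparable trade-offs.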
Abstract:
Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)
Abstract:
The capacitor placement (replacement) problem for radial distribution networks determines capacitor types, sizes, locations and control schemes. Optimal capacitor placement is a hard combinatorial problem that can be formulated as a mixed integer nonlinear program. Since this is an NP-complete problem (NP: nondeterministic polynomial time), the solution approach uses a combinatorial search algorithm. The paper proposes a hybrid method based on the Tabu Search approach, extended with features taken from other combinatorial approaches, such as genetic algorithms and simulated annealing, and from practical heuristic approaches. The proposed method has been tested on a range of networks available in the literature, with superior results regarding both quality and cost of solutions.