896 results for Web log analysis
Abstract:
As GIS (Geographic Information Systems) becomes increasingly common and user-friendly, WM-data has observed that customers would be interested in linking information from their operations to a map view, making it easier to grasp how that information is distributed geographically across an area, for example to arrange more efficient transports. WM-data, for which this work was carried out, intends to produce a prototype that can then be demonstrated to customers and other stakeholders to show that this is feasible by integrating already existing systems. In this work, the prototype was developed with a focus on the forest industry and its inventories. The existing programs to be integrated are both web-based and run in a browser. The analysis program to be used is called Insikt and is developed by the company Trimma; the map program is called GIMS and is WM-data's own product. It should be possible to analyze data in Insikt and create a report, which is then sent to GIMS, where the information is plotted on the map at the location each piece of information belongs to. It should also be possible to select one or more areas on the map and send them to Insikt in order to analyze information from only the selected areas. A prototype with the desired functionality was produced during the project, but some work remains before there is a sellable product. The prototype has been shown to a number of interested parties, who found it promising and believe it could be used extensively in many areas.
Abstract:
Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)
Abstract:
SQL Injection Attack (SQLIA) remains a technique used by computer network intruders to pilfer an organisation's confidential data. An intruder re-crafts web form inputs and the query strings used in web requests with the malicious intent of compromising the security of the organisation's confidential data stored in the back-end database. The database is the most valuable data source, and intruders are therefore unrelenting in evolving new techniques to bypass the signature-based solutions currently provided in Web Application Firewalls (WAF) to mitigate SQLIA. There is consequently a need for an automated, scalable methodology for pre-processing SQLIA features fit for a supervised learning model. However, obtaining a ready-made, scalable, feature-engineered dataset with numerical attributes for training Artificial Neural Network (ANN) and Machine Learning (ML) models is a known obstacle to applying artificial intelligence effectively against ever-evolving novel SQLIA signatures. The proposed approach applies a numerical attributes encoding ontology to encode features (both legitimate web requests and SQLIA) as numerical data items, so as to extract a scalable dataset for input to a supervised learning model, moving towards an ML SQLIA detection and prevention model. In the numerical encoding of features, the proposed model explores a hybrid of static and dynamic pattern matching by implementing a Non-Deterministic Finite Automaton (NFA), combined with a proxy and an SQL parser Application Programming Interface (API) to intercept and parse web requests in transit to the back-end database. In developing a solution to address SQLIA, this model allows web requests processed at the proxy and deemed to contain an injected query string to be blocked from reaching the target back-end database.
This paper evaluates the performance metrics of a dataset obtained by numerical encoding of features ontology in Microsoft Azure Machine Learning (MAML) Studio, using a Two-Class Support Vector Machine (TCSVM) binary classifier. This methodology then forms the subject of the empirical evaluation.
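The idea of encoding both legitimate web requests and SQLIA payloads as numerical attributes can be illustrated with a minimal sketch. The feature set below (quote counts, SQL comment tokens, keyword hits) is illustrative only; the paper's actual encoding ontology and NFA-based pattern matching are not reproduced here.

```python
import re

# Hypothetical numerical encoding of a web-request parameter value into a
# fixed-length feature vector, in the spirit of a numerical attributes
# encoding ontology. The chosen attributes are illustrative assumptions.
SQL_KEYWORDS = ("select", "union", "insert", "drop", "or", "and", "exec")

def encode_request(value: str) -> list[int]:
    """Map a query-string value to numeric attributes a classifier can consume."""
    lowered = value.lower()
    return [
        len(value),                                  # raw length
        value.count("'") + value.count('"'),         # quote characters
        lowered.count("--") + lowered.count("/*"),   # SQL comment tokens
        sum(lowered.count(k) for k in SQL_KEYWORDS), # SQL keyword hits
        len(re.findall(r"[=;()]", value)),           # structural metacharacters
    ]

legit = encode_request("alice")            # -> [5, 0, 0, 0, 0]
sqlia = encode_request("' OR '1'='1' --")  # quotes, comment, keyword all fire
```

Vectors like these, labelled legitimate or malicious, are the kind of scalable numeric dataset a binary classifier such as a TCSVM can be trained on.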
Abstract:
This dissertation research points out major challenging problems with current Knowledge Organization (KO) systems, such as subject gateways or web directories: (1) the current systems use traditional knowledge organization schemes based on controlled vocabulary, which is not well suited to web resources, and (2) information is organized by professionals, not by users, which means it does not reflect intuitively and instantaneously expressed current user needs. In order to explore users' needs, I examined social tags, which are user-generated uncontrolled vocabulary. As investment in professionally developed subject gateways and web directories diminishes (support for both BUBL and Intute, examined in this study, is being discontinued), understanding the characteristics of social tagging becomes even more critical. Several researchers have discussed social tagging behavior and its usefulness for classification or retrieval; however, further research is needed to qualitatively and quantitatively investigate social tagging in order to verify its quality and benefit. This research particularly examined the indexing consistency of social tagging in comparison to professional indexing to assess the quality and efficacy of tagging. The data analysis was divided into three phases: analysis of indexing consistency, analysis of tagging effectiveness, and analysis of tag attributes. Most indexing consistency studies have been conducted with a small number of professional indexers, and they tended to exclude users; furthermore, they mainly focused on physical library collections. This dissertation research bridged these gaps by (1) extending the scope of resources to various web documents indexed by users and (2) employing the Information Retrieval (IR) Vector Space Model (VSM)-based indexing consistency method, since it is suitable for dealing with a large number of indexers.
As a second phase, an analysis of tagging effectiveness in terms of tagging exhaustivity and tag specificity was conducted to ameliorate the drawbacks of consistency analysis based only on quantitative measures of vocabulary matching. Finally, to investigate tagging patterns and behaviors, a content analysis of tag attributes was conducted based on the FRBR model. The findings revealed greater consistency across all subjects among taggers than between the two groups of professionals. Examination of the exhaustivity and specificity of social tags provided insights into particular characteristics of tagging behavior and its variation across subjects. To further investigate the quality of tags, a Latent Semantic Analysis (LSA) was conducted to determine to what extent tags are conceptually related to professionals' keywords; it was found that tags of higher specificity tended to have higher semantic relatedness to professionals' keywords. This leads to the conclusion that a term's power as a differentiator is related to its semantic relatedness to documents. The findings on tag attributes identified important bibliographic attributes of tags beyond describing the subjects or topics of a document. The findings also showed that tags have essential attributes matching those defined in FRBR. Furthermore, in terms of specific subject areas, the findings identified that taggers exhibited different tagging behaviors, with distinctive features and tendencies on web documents characterizing heterogeneous digital media resources.
These results have led to the conclusion that there should be an increased awareness of diverse user needs by subject in order to improve metadata in practical applications. This dissertation research is the first necessary step towards utilizing social tagging in digital information organization by verifying the quality and efficacy of social tagging. It combined quantitative (statistics) and qualitative (content analysis using FRBR) approaches to the vocabulary analysis of tags, which provided a more complete examination of their quality. Through the detailed analysis of tag properties undertaken in this dissertation, we have a clearer understanding of the extent to which social tagging can be used to replace (and in some cases to improve upon) professional indexing.
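The VSM-based indexing consistency measure the dissertation builds on can be sketched briefly: each indexer's term set for a document becomes a term-frequency vector, and consistency is the cosine similarity between two indexers' vectors. The term-weighting details of the actual study are not reproduced here; this is a minimal illustration.

```python
import math
from collections import Counter

def cosine_consistency(terms_a: list[str], terms_b: list[str]) -> float:
    """Indexing consistency as cosine similarity of two term-frequency vectors."""
    va, vb = Counter(terms_a), Counter(terms_b)
    vocab = set(va) | set(vb)
    dot = sum(va[t] * vb[t] for t in vocab)
    norm = (math.sqrt(sum(c * c for c in va.values()))
            * math.sqrt(sum(c * c for c in vb.values())))
    return dot / norm if norm else 0.0

# Hypothetical tag sets for one document: two of three terms overlap.
tagger = ["python", "tutorial", "programming"]
professional = ["python", "programming", "education"]
score = cosine_consistency(tagger, professional)  # ~0.667
```

Unlike pairwise exact-match measures, this vector formulation scales naturally to the large numbers of indexers found in social tagging data.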
Abstract:
The South Carolina Department of Consumer Affairs publishes an annual mortgage log report as a requirement of the South Carolina Mortgage Lending Act, which became effective on January 1, 2010. The mortgage log report analyzes the following data, concerning all mortgage loan applications taken: the borrower’s credit score, term of the loan, annual percentage rate, type of rate, and appraised value of the property. The mortgage log report also analyzes data required by the Home Mortgage Disclosure Act, including the following information: the loan type, property type, purpose of the loan, owner/occupancy status, loan amount, action taken, reason for denial, property location, gross annual income, purchaser of the loan, rate spread, HOEPA status, and lien status as well as the applicant and co-applicant’s race, ethnicity, and gender.
Abstract:
Sequences of timestamped events are currently being generated across nearly every domain of data analytics, from e-commerce web logging to the electronic health records used by doctors and medical researchers. Every day, this data type is reviewed by humans who apply statistical tests, hoping to learn everything they can about how these processes work, why they break, and how they can be improved upon. To further uncover how these processes work the way they do, researchers often compare two groups, or cohorts, of event sequences to find the differences and similarities between outcomes and processes. With temporal event sequence data, this task is complex because of the variety of ways single events and sequences of events can differ between the two cohorts of records: the structure of the event sequences (e.g., event order, co-occurring events, or frequencies of events), the attributes of the events and records (e.g., gender of a patient), or metrics about the timestamps themselves (e.g., duration of an event). Running statistical tests to cover all these cases and determining which results are significant becomes cumbersome. Current tools for comparing groups of event sequences emphasize either a purely statistical or a purely visual approach. Visual analytics tools leverage humans' ability to easily see patterns and anomalies they were not expecting, but are limited by uncertainty in their findings. Statistical tools emphasize finding significant differences in the data, but often require researchers to have a concrete question and do not facilitate more general exploration of the data. Combining visual analytics tools with statistical methods leverages the benefits of both approaches for quicker and easier insight discovery.
Integrating statistics into a visualization tool presents many challenges on the frontend (e.g., displaying the results of many different metrics concisely) and in the backend (e.g., scalability challenges with running various metrics on multi-dimensional data at once). I begin by exploring the problem of comparing cohorts of event sequences and understanding the questions that analysts commonly ask in this task. From there, I demonstrate that combining automated statistics with an interactive user interface amplifies the benefits of both types of tools, thereby enabling analysts to conduct quicker and easier data exploration, hypothesis generation, and insight discovery. The direct contributions of this dissertation are: (1) a taxonomy of metrics for comparing cohorts of temporal event sequences, (2) a statistical framework for exploratory data analysis with a method I refer to as high-volume hypothesis testing (HVHT), (3) a family of visualizations and guidelines for interaction techniques that are useful for understanding and parsing the results, and (4) a user study, five long-term case studies, and five short-term case studies which demonstrate the utility and impact of these methods in various domains: four in the medical domain, one in web log analysis, two in education, and one each in social networks, sports analytics, and security. My dissertation contributes an understanding of how cohorts of temporal event sequences are commonly compared and the difficulties associated with applying and parsing the results of these metrics. It also contributes a set of visualizations, algorithms, and design guidelines for balancing automated statistics with user-driven analysis to guide users to significant, distinguishing features between cohorts. 
This work opens avenues for future research in comparing two or more groups of temporal event sequences, opening traditional machine learning and data mining techniques to user interaction, and extending the principles found in this dissertation to data types beyond temporal event sequences.
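The multiple-comparison problem behind high-volume hypothesis testing (HVHT) can be made concrete: when many metrics are tested across two cohorts at once, raw p-values must be corrected before flagging differences as significant. The sketch below shows Benjamini-Hochberg false-discovery-rate control as one standard correction; the dissertation's actual HVHT procedure is not reproduced here.

```python
def benjamini_hochberg(p_values: list[float], alpha: float = 0.05) -> list[bool]:
    """Return a reject/keep decision per hypothesis, controlling the FDR."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest k with p_(k) <= (k/m) * alpha, then reject the
    # hypotheses with the k smallest p-values.
    k = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank / m * alpha:
            k = rank
    reject = [False] * m
    for i in order[:k]:
        reject[i] = True
    return reject

# Hypothetical p-values from five cohort-comparison metrics.
decisions = benjamini_hochberg([0.001, 0.2, 0.03, 0.9, 0.004])
# -> [True, False, True, False, True]
```

A visual analytics frontend can then surface only the metrics that survive correction, guiding users to genuinely distinguishing features between cohorts.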
Abstract:
The main purpose of the Log2XML application is to transform text-format log files that use a field separator into a standardized XML format. To allow the application to work with logs from different systems or applications, it provides a template system (specifying the field order and the separator character) that defines the minimal structure needed to extract the information from any type of separator-based log. Finally, the application can process the extracted information to generate reports and statistics. The project also explores the Grails technology in depth.
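The template idea, a field order plus a separator character driving the conversion to XML, can be sketched as follows. The field and element names are illustrative assumptions, not Log2XML's actual schema, and Log2XML itself is built on Grails rather than Python.

```python
import xml.etree.ElementTree as ET

def log_to_xml(lines: list[str], fields: list[str], sep: str) -> str:
    """Convert separator-based log lines to XML using a field-order template."""
    root = ET.Element("log")
    for line in lines:
        entry = ET.SubElement(root, "entry")
        # The template: zip pairs each field name with its position in the line.
        for name, value in zip(fields, line.split(sep)):
            ET.SubElement(entry, name).text = value
    return ET.tostring(root, encoding="unicode")

doc = log_to_xml(
    ["2024-01-01;GET;/index.html", "2024-01-01;POST;/login"],
    fields=["date", "method", "path"],
    sep=";",
)
```

Because the field names and separator live in the template rather than the code, the same converter handles logs from any system that uses field separators.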
Abstract:
Websites of academic institutions are the prime source of information about the institution. Libraries, being the main providers of information for academics, need to be represented on the respective homepages with due importance. Keeping this in mind, this study attempts to understand and analyze the presence and presentation of the libraries of Engineering Colleges (EC) in Kerala on their respective websites. On the basis of the reviewed literature and an observation of the libraries of nationally important institutions imparting technical education in India, a set of criteria was developed for analyzing the websites/web pages. Based on this, an extensive survey of the websites of ECs was conducted. The collected data was then analyzed using Microsoft Excel, and the library websites were ranked on the basis of this analysis. It was observed that the majority of the websites of ECs in Kerala give minimal representation to their respective libraries. Another important observation is that even the highest-scoring libraries satisfy only half of the criteria listed for analysis.
Abstract:
This thesis examines visitor tracking methods and applies them in practice. The operation of web analytics software is explored, focusing mainly on Google Analytics. The goal is to determine the usage volumes of Lappeenranta's tourist information terminals and to break them down per device. A literature review of web analytics is carried out, and visitor tracking data from two different websites is analyzed and compared. In addition, the logs of the tourist terminals' website are examined through data mining, using a Python application developed for the purpose. Based on this work, it can be concluded that with the current implementation the usage volumes of the tourist terminals cannot be separated per device; the number of sessions and events can, however, be tracked. Several problems are identified in the visitor tracking of the terminals, such as the distorting effect of the terminals' automatic web page refresh on the results, the partial Google Analytics integration and, most importantly, the lack of an identifier uniquely distinguishing each terminal. The thesis proposes solutions that enable effective use of visitor tracking and per-device tracking. The results highlight the importance of systematic planning when implementing visitor tracking.
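The kind of log mining described, parsing a web server's access log with a purpose-built Python application and aggregating requests, can be sketched minimally. The log format assumed below is the common Apache/Nginx combined format; the actual terminal logs and the thesis's analysis are not reproduced here.

```python
import re
from collections import Counter

# Matches the client address and the date portion of a combined-format
# access log line, e.g. '10.0.0.1 - - [01/Jan/2024:12:00:00 +0000] "GET / ..."'.
LOG_LINE = re.compile(r'^(?P<ip>\S+) \S+ \S+ \[(?P<day>[^:]+)')

def hits_per_day(lines: list[str]) -> Counter:
    """Aggregate request counts by calendar day."""
    days = Counter()
    for line in lines:
        m = LOG_LINE.match(line)
        if m:
            days[m.group("day")] += 1
    return days

sample = [
    '10.0.0.1 - - [01/Jan/2024:12:00:00 +0000] "GET / HTTP/1.1" 200 512',
    '10.0.0.2 - - [01/Jan/2024:12:05:00 +0000] "GET /map HTTP/1.1" 200 1024',
]
daily = hits_per_day(sample)
```

Without a per-terminal identifier in the requests, as the thesis notes, aggregates like this cannot be separated per device, only totals, sessions, and events can be tracked.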
Abstract:
This case study analyzes the extent to which the signing of the Free Trade Agreement (FTA) between Colombia and South Korea responds to political strategies and/or an economic cost-benefit calculation on the part of the latter. The hypothesis of this paper is that the signing of the FTA between the two countries is due to the existence of shared interests. On the one hand, there are economic interests, since South Korea is a rational actor that always seeks to maximize its benefits by enlarging its markets; in this sense, Colombia serves as a platform for exporting Korean products through the trade agreements already in place. Likewise, there are political interests channelled through international cooperation, which can serve the South Korean state in its quest to legitimize its image within the international system vis-à-vis its relationship with North Korea. This work is descriptive and explanatory in nature. A qualitative methodology is used, delving into the specifics of the case to understand how this particular phenomenon came about. Interviews and the analysis of official documents of the Korean Embassy and speeches by Ambassador Choo Jong Youn are used as sources of information.
Abstract:
The purpose of this monograph is to evaluate the role of international NGOs in opening spaces of political participation for civil society in Egypt. To that end, it analyzes the country's local and transnational political opportunity structures, as well as the processes of articulation between local and international politics through the levels of integration among their actors. Through qualitative research based on the collective action theories developed by Sidney Tarrow, Charles Tilly, Robert Benford and David Snow, and the theories of transnational advocacy networks developed by Margaret Keck and Kathryn Sikkink, the study moves towards identifying the development of externalization processes as a means of strengthening local organizations as an alternative of political opposition.
Abstract:
The interest of this research is to analyze the changes in the migration policies of Italy and Libya following the Treaty of Friendship and Cooperation signed in 2008. Using Barry Buzan's concept of securitization, it explains the main motivations that led both states to harden their migration policies in order to confront irregular migration. The securitization of migration became the Italian government's main mechanism for justifying non-compliance with international agreements, relegating the protection of human rights to a secondary concern. This situation entails high humanitarian costs and reveals how Italy and Libya are treating new threats such as irregular migration in this region.
Abstract:
Political and economic relations between South Korea and Japan were at their best in the early years of the twenty-first century, when the territorial dispute over the Dokdo islands, a group of islets located in the Sea of Japan that for decades have symbolized the end of the Japanese occupation of Korean territory, caused new and significant tensions between the two countries. This phenomenon is suggested to be fundamental to understanding the new bilateral relations between the two actors, and it is the focus of analysis of this monograph. The document presents a descriptive analysis of the territorial dispute over the islands and of its effects on relations between the two countries in the political, social and economic spheres.
Abstract:
This research seeks to elucidate the role of the chaebol economic model in South Korea's participation in the Organisation for Economic Co-operation and Development (OECD). Research on the chaebol model and its privileges in South Korea has not focused directly on the international arena and the influence the model may have there. This research seeks to show that the success of the chaebol economic model served as an incentive for South Korea's entry into and active participation in the OECD, thereby achieving not only economic cooperation with its members but also prestige and recognition before the international community. For this qualitative research, secondary and tertiary sources are used to carry out a documentary analysis of relevant texts.