921 results for Web data
Abstract:
Heading into the 2020s, Physics and Astronomy are undergoing experimental revolutions that will reshape our picture of the fabric of the Universe. The Large Hadron Collider (LHC), the largest particle physics project in the world, produces 30 petabytes of data annually that need to be sifted through, analysed, and modelled. In astrophysics, the Large Synoptic Survey Telescope (LSST) will take a high-resolution image of the full sky every 3 days, leading to data rates of 30 terabytes per night over ten years. These experiments endeavour to answer the question of why 96% of the content of the universe currently eludes our physical understanding. Both the LHC and LSST share the 5-dimensional nature of their data, with position, energy and time being the fundamental axes. This talk will present an overview of the experiments and the data that are gathered, and outline the challenges in extracting information. The strategies commonly employed are very similar to those of industrial data science problems (e.g., data filtering, machine learning, statistical interpretation) and provide a seed for the exchange of knowledge between academia and industry. Speaker Biography: Professor Mark Sullivan. Mark Sullivan is a Professor of Astrophysics in the Department of Physics and Astronomy. Mark completed his PhD at Cambridge and, following postdoctoral study in Durham, Toronto and Oxford, now leads a research group at Southampton studying dark energy using exploding stars called "type Ia supernovae". Mark has many years' experience of research that involves repeatedly imaging the night sky to track the arrival of transient objects, involving significant challenges in data handling, processing, classification and analysis.
Abstract:
This paper describes the development and evaluation of web-based museum trails for university-level design students to access on handheld devices in the Victoria and Albert Museum (V&A) in London. The trails offered students a range of ways of exploring the museum environment and collections, some encouraging students to interpret objects and museum spaces in lateral and imaginative ways, others more straightforwardly providing context and extra information. In a three-stage qualitative evaluation programme, student feedback showed that overall the trails enhanced students’ knowledge of, interest in, and closeness to the objects. However, the trails were only partially successful from a technological standpoint due to device and network problems. Broader findings suggest that technology has a key role to play in helping to maintain the museum as a learning space which complements that of universities as well as schools. This research informed my other work in visitor-constructed learning trails in museums, specifically in the theoretical approach to data analysis used, in the research design, and in informing ways to structure visitor experiences in museums. It resulted in a conference presentation, and more broadly informed my subsequent teaching practice.
Abstract:
Background: The impact of cancer upon children, teenagers and young people can be profound. Research has been undertaken to explore these impacts, but little is known about how researchers can ‘best’ engage with this group to explore their experiences. This review paper provides an overview of the utility of the data collection methods employed when undertaking research with children, teenagers and young people. A systematic review of relevant databases was undertaken utilising the search terms ‘young people’, ‘young adult’, ‘adolescent’ and ‘data collection methods’. The full texts of the papers deemed eligible from the title and abstract were accessed and, following discussion within the research team, thirty papers were included. Findings: Owing to the heterogeneity in the scope of the papers identified, the following data collection methods were covered in the results section: three of the papers provided an overview of data collection methods utilised with this population, and the remaining twenty-seven papers covered digital technologies; arts-based research; comparisons of ‘paper and pencil’ research with web-based technologies; the use of games; the use of a specific communication tool; questionnaires and interviews; and focus groups and telephone interviews/questionnaires. The strengths and limitations of these data collection methods are discussed, drawing upon issues such as the appropriateness of particular methods for particular age groups, or the most appropriate method to employ when exploring a particularly sensitive topic area. Conclusions: A number of data collection methods are utilised to undertake research with children, teenagers and young adults. This review provides a summary of the currently available evidence and an overview of the strengths and limitations of the data collection methods employed.
Abstract:
Excess weight problems have been increasing over recent decades, particularly among young Quebecers. This increase is linked to eating habits that differ substantially from nutritional recommendations. In addition, the provincial government has introduced major changes to the Quebec Education Program (Programme de formation de l'école québécoise) to encourage the adoption of healthy lifestyle habits. To counter these problems of excess weight and poor eating habits, and to build on the school reform, the Web version of the Nutriathlon en équipe was developed. The program aims to lead each participant to improve the quality of their diet by increasing and diversifying their consumption of vegetables, fruit and dairy products. The objectives of the present study were (1) to evaluate the impact of the program on secondary school students' consumption of vegetables and fruit (VF) and dairy products (DP), and (2) to evaluate the factors influencing the program's success among these young people. The results showed that during the program, as well as immediately afterwards, the intervention group reported a significant increase in VF and DP consumption compared with the control group. However, no medium-term effect was observed. As factors facilitating the success of the Nutriathlon en équipe, the students mentioned the use of technology for recording portions, the formation of teams, the involvement of teachers and family, and the creation of strategies to help complete the program. The students also mentioned barriers to its success, such as a lack of diligence in entering their data outside class hours, malfunctioning user login codes, and the platform's incompatibility with certain devices such as tablets.
Abstract:
The Internet contains countless types of documents and is an influential source of information. Web content is designed to be interpreted by humans, not by machines, and traditional search systems are imprecise when retrieving information. Government bodies use and publish documents on the Web for citizens and their own organisational sectors, but lack tools that support the retrieval of those documents; one example is the Lattes Curriculum Platform managed by CNPq. The Semantic Web aims to optimise document retrieval by giving documents meaning, allowing both people and machines to understand the meaning of a piece of information. The lack of semantics in our documents results in ineffective searches that return divergent and ambiguous information; semantic annotation is the way to bring semantics to documents. The goal of this dissertation is to assemble a framework of Semantic Web concepts that makes it possible to automatically annotate the Lattes Curriculum using Linked Open Data (LOD) sources, which store the meaning of terms and expressions. The research problem is to determine which concepts associated with the Semantic Web can contribute to the automatic semantic annotation of the Lattes Curriculum using Linked Open Data (LOD). The systematic literature review presents concepts (manual, automatic, semi-automatic and intrusive annotation, among others), tools (entity extractors, among others) and technologies (RDF, RDFa, SPARQL, etc.) related to the topic. Applying these concepts led to the creation of the Semantic Web Lattes System. The system imports the XML curriculum from the Lattes Platform, automatically annotates the available data using open databases, and supports semantic queries. The system is validated by presenting annotated curricula and by running queries over external data belonging to the LOD cloud. Finally, the conclusions, the difficulties encountered and proposals for future work are presented.
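As a rough illustration of the linking step such a system performs, the sketch below matches a term extracted from a CV against a Linked Open Data source. The endpoint, query shape and function name are illustrative assumptions, not the dissertation's actual pipeline (which imports Lattes XML and produces semantic annotations).

```python
# Minimal sketch: link a term extracted from a CV to a DBpedia resource
# via SPARQL. Hypothetical example; the dissertation's actual annotation
# pipeline (XML import, entity extraction, annotated output) is not shown.
from SPARQLWrapper import SPARQLWrapper, JSON

def lookup_lod_resource(term: str):
    """Return DBpedia resources whose English label matches `term`."""
    sparql = SPARQLWrapper("https://dbpedia.org/sparql")
    sparql.setQuery("""
        SELECT ?resource WHERE {
            ?resource rdfs:label "%s"@en .
        } LIMIT 5
    """ % term)
    sparql.setReturnFormat(JSON)
    results = sparql.query().convert()
    return [b["resource"]["value"] for b in results["results"]["bindings"]]

# e.g. lookup_lod_resource("Semantic Web") ->
#   ['http://dbpedia.org/resource/Semantic_Web', ...]
```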
Abstract:
To improve efficiency in the management and execution of its responsibilities towards residents, the Municipal Council of Angra do Heroísmo (CMAH), located on Terceira Island (Autonomous Region of the Azores), distributes its functions across several departments and specialised staff. Despite this segmentation, there are circumstances in which they work together and cross-reference information, for example in licensing processes. However, this necessary exchange of data is deficient when it comes to scheduling events, whether or not they are organised by the institution, and this failure often results in overlapping events, something considered unsustainable in a relatively small community such as Angra do Heroísmo (35,109 inhabitants in 2013). The municipality intends to solve the problem by drawing on the capabilities of Web 2.0 platforms, which, among other things, allow user participation and the easy insertion and management of information by people without deep technical knowledge. This dissertation determines the specifications that should be present in a Web platform for scheduling and publicising the cultural programme at the service of the Municipality of Angra do Heroísmo, and conceptualises a functional prototype that validates the identified specifications and supports the construction of the final platform to be developed in the future. The aim of this research is to improve the process of scheduling and publicising events in the cultural programme of the municipality of Angra. This goal required an in-depth understanding of how the institution operates, identifying and distinguishing the roles of the various stakeholders and processes, so part of the research was carried out at the Municipal Council of Angra do Heroísmo. Among the challenges of this research were the collection and understanding of information about the process under study and the planning of an intuitive digital system that respects the municipality's decision-making structures and hierarchy and has the degree of rigour required of government organisations.
Abstract:
In 2005, the University of Maryland acquired over 70 digital videos spanning 35 years of Jim Henson’s groundbreaking work in television and film. To support in-house discovery and use, the collection was cataloged in detail using AACR2 and MARC21, and a web-based finding aid was also created. In the past year, I created an "r-ball" (a linked data set described using RDA) of these same resources. The presentation will compare and contrast these three ways of accessing the Jim Henson Works collection, with insights gleaned from providing resource discovery using RIMMF (RDA in Many Metadata Formats).
Abstract:
This paper discusses the advantages of database-backed websites and describes the model for a library website implemented at the University of Nottingham using open source software, PHP and MySQL. As websites continue to grow in size and complexity it becomes increasingly important to introduce automation to help manage them. It is suggested that a database-backed website offers many advantages over one built from static HTML pages. These include a consistency of style and content, the ability to present different views of the same data, devolved editing and enhanced security. The University of Nottingham Library Services website is described and issues surrounding its design, technological implementation and management are explored.
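To make the database-backed idea concrete, here is a minimal sketch of the pattern: page content is stored in a database and rendered through a single template, which is what yields the consistency of style and the ability to present different views of the same data. The original site used PHP and MySQL; this stand-in uses Python with sqlite3 purely for illustration.

```python
# Illustrative sketch of a database-backed website: content lives in a
# database and is rendered through one template. The described site used
# PHP and MySQL; Python's sqlite3 stands in here for brevity.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pages (slug TEXT, title TEXT, body TEXT)")
conn.execute("INSERT INTO pages VALUES ('opening-hours', 'Opening Hours', "
             "'<p>Mon-Fri 9-17</p>')")

TEMPLATE = ("<html><head><title>{title}</title></head>"
            "<body><h1>{title}</h1>{body}</body></html>")

def render(slug: str) -> str:
    # One template for every page => consistent style across the site.
    title, body = conn.execute(
        "SELECT title, body FROM pages WHERE slug = ?", (slug,)).fetchone()
    return TEMPLATE.format(title=title, body=body)

print(render("opening-hours"))
```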
Abstract:
This article describes the design and implementation of a computer-aided tool called Relational Algebra Translator (RAT) for the teaching of relational algebra in database courses. A problem arose when introducing the relational algebra topic in the course EIF 211 Design and Implementation of Databases, which belongs to the Engineering in Information Systems programme of the National University of Costa Rica: students attending the course lacked deep mathematical knowledge, which led to a learning problem in a subject that is essential for understanding how database searches and queries work. RAT was created to enhance this teaching-learning process. The article introduces the architectural and design principles required for its implementation, such as the language symbol table, the grammatical rules and the basic algorithms that RAT uses to translate from relational algebra to the SQL language. The tool has been used for one term and has proven effective in the teaching-learning process, which urged the investigators to publish it on the website www.slinfo.una.ac.cr so that it can be used in other university courses.
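The following is a minimal sketch of the kind of translation RAT performs, turning a relational algebra expression into SQL. The two operators and the string-based composition are illustrative assumptions; RAT's real symbol table, grammar rules and algorithms are richer than this.

```python
# Toy relational-algebra-to-SQL translation, in the spirit of RAT
# (hypothetical simplification; not RAT's actual implementation).
def project(attrs, rel):      # pi_attrs(rel)
    return f"SELECT {', '.join(attrs)} FROM ({rel}) t"

def select(condition, rel):   # sigma_condition(rel)
    return f"SELECT * FROM ({rel}) t WHERE {condition}"

# pi_name(sigma_{age > 21}(Students)) translates to nested SQL:
sql = project(["name"], select("age > 21", "SELECT * FROM Students"))
print(sql)
# SELECT name FROM (SELECT * FROM (SELECT * FROM Students) t WHERE age > 21) t
```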
Abstract:
Relational databases have been the dominant approach in large database systems since the 1980s. Over the last decade, almost all industrial and personal information exchange has moved to the electronic world, causing an enormous growth in data volumes, and this growth continues exponentially. At the same time, non-relational databases, i.e. NoSQL databases, have risen to a prominent position. Many organisations process large amounts of unstructured data, in which case using a traditional relational database alone may not be the best, or even a sufficient, option. The shift in Internet culture behind the term Web 2.0 favours the more adaptable and scalable NoSQL systems. Internet users, especially on social media, produce enormous amounts of unstructured data. The information collected is no longer shaped according to a fixed model; a single record may be associated with, for example, images, videos, references to instances created by other users, or address data. This thesis examines the structure of NoSQL systems and their role, particularly in large information systems, and compares their advantages and disadvantages relative to relational databases.
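A small sketch of the schema flexibility at issue, assuming a local MongoDB instance as a representative NoSQL document store (the thesis is not tied to any particular product): two records in the same collection carry different fields, something a relational table could only accommodate with NULL-heavy columns or additional joined tables.

```python
# Schema-less records in a document store. Assumes MongoDB is running
# locally; MongoDB is used only as a representative NoSQL system.
from pymongo import MongoClient

db = MongoClient("mongodb://localhost:27017").social_demo

# Two posts with different shapes in the same collection:
db.posts.insert_one({"user": "alice", "text": "Hello!",
                     "images": ["beach.jpg", "sunset.jpg"]})
db.posts.insert_one({"user": "bob", "text": "Nice photo",
                     "reply_to": "alice", "location": {"city": "Oulu"}})

for post in db.posts.find({"user": "alice"}):
    print(post)
```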
Abstract:
Artificial Immune Systems have been used successfully to build recommender systems for film databases. In this research, an attempt is made to extend this idea to web site recommendation. A collection of more than 1000 individuals' web profiles (alternatively called preferences / favourites / bookmarks file) will be used. URLs will be classified using the DMOZ (Directory Mozilla) database of the Open Directory Project as our ontology. This will then be used as the data for the Artificial Immune Systems rather than the actual addresses. The first attempt will involve using a simple classification code number coupled with the number of pages within that classification code. However, this implementation does not make use of the hierarchical tree-like structure of DMOZ. Consideration will then be given to the construction of a similarity measure for web profiles that makes use of this hierarchical information to build a better-informed Artificial Immune System.
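One way such a hierarchy-aware similarity measure could work, sketched below under the assumption that profiles are reduced to DMOZ category paths: two paths are scored by the depth of their shared prefix, so sibling categories score higher than distant ones. The actual measure constructed in the research may differ.

```python
# Hypothetical hierarchy-aware similarity over DMOZ category paths:
# score = normalised depth of the shared prefix of the two paths.
def category_similarity(a: str, b: str) -> float:
    pa, pb = a.split("/"), b.split("/")
    shared = 0
    for x, y in zip(pa, pb):
        if x != y:
            break
        shared += 1
    return 2 * shared / (len(pa) + len(pb))   # 1.0 iff identical paths

print(category_similarity("Top/Arts/Music/Jazz", "Top/Arts/Music/Blues"))  # 0.75
print(category_similarity("Top/Arts/Music/Jazz", "Top/Science"))           # ~0.33
```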
Abstract:
Ecological network analysis was applied to the Seine estuary ecosystem, northern France, integrating ecological data from the years 1996 to 2002. The Ecopath with Ecosim (EwE) approach was used to model the trophic flows in 6 spatial compartments, leading to 6 distinct EwE models: the navigation channel and the two channel flanks in the estuary proper, and 3 marine habitats in the eastern Seine Bay. Each model included 12 consumer groups, 2 primary producers, and one detritus group. Ecological network analysis was performed, including a set of indices, keystoneness, and trophic spectrum analysis, to describe the contribution of the 6 habitats to the functioning of the Seine estuary ecosystem. Results showed that the two habitats whose functioning was most indicative of a stressed state were the northern and central navigation channels, where building works and constant maritime traffic are considered major anthropogenic stressors. The strong top-down control highlighted in the other 4 habitats was not present in the central channel, which showed instead (i) a shift in keystone roles in the ecosystem towards sediment-based, lower trophic levels, and (ii) a higher system omnivory. The southern channel showed the highest system activity (total system throughput), the highest trophic specialisation (low system omnivory), and the lowest indication of stress (low cycling and relative redundancy). Marine habitats showed higher fish biomass proportions and higher transfer efficiencies per trophic level than the estuarine habitats, with a transition area between the two presenting an intermediate ecosystem structure. Modelling the habitats separately made it possible to disclose each one's response to the different pressures, based on a priori knowledge of them. Network indices responded to these differences, although not monotonically, and seem a promising operational tool for defining the ecological status of transitional water ecosystems.
Abstract:
SQL Injection Attack (SQLIA) remains a technique used by computer network intruders to pilfer an organisation's confidential data. An intruder re-crafts web form inputs and query strings used in web requests with malicious intent to compromise the security of the organisation's confidential data stored in the back-end database. The database is the most valuable data source, and intruders are therefore unrelenting in constantly evolving new techniques to bypass the signature-based solutions currently provided in Web Application Firewalls (WAF) to mitigate SQLIA. There is therefore a need for an automated, scalable methodology for pre-processing SQLIA features fit for a supervised learning model. However, obtaining a ready-made, scalable dataset whose items are feature-engineered into numerical attributes for training Artificial Neural Network (ANN) and Machine Learning (ML) models is a known issue in applying artificial intelligence to effectively address ever-evolving novel SQLIA signatures. The proposed approach applies a numerical-attribute encoding ontology to encode features (both legitimate web requests and SQLIA) as numerical data items, so as to extract a scalable dataset for input to a supervised learning model, moving towards an ML SQLIA detection and prevention model. In the numerical encoding of features, the proposed model explores a hybrid of static and dynamic pattern matching by implementing a Non-Deterministic Finite Automaton (NFA), combined with a proxy and a SQL parser Application Programming Interface (API) to intercept and parse web requests in transit to the back-end database. In developing a solution to SQLIA, this model allows web requests processed at the proxy and deemed to contain an injected query string to be excluded from reaching the target back-end database. This paper evaluates the performance metrics of a dataset obtained by this numerical encoding of features ontology in Microsoft Azure Machine Learning (MAML) studio using a Two-Class Support Vector Machine (TCSVM) binary classifier. This methodology then forms the subject of the empirical evaluation.
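A toy sketch of the numerical-encoding idea follows: each web request string is mapped to a numeric feature vector and a two-class SVM is trained on labelled examples. The three features and the scikit-learn classifier are illustrative assumptions; the paper's encoding ontology, NFA pattern matching and MAML/TCSVM evaluation are considerably more elaborate.

```python
# Illustrative numerical encoding of web requests for a two-class SVM.
# The feature set is hypothetical, not the paper's encoding ontology.
from sklearn.svm import SVC

KEYWORDS = ("union", "select", "or 1=1", "--", ";", "drop")

def encode(request: str):
    r = request.lower()
    return [len(r),                             # request length
            r.count("'"),                       # quote count
            sum(kw in r for kw in KEYWORDS)]    # SQL-token hits

requests = ["id=42", "name=alice", "id=1' or 1=1 --", "q=1; drop table users"]
labels   = [0, 0, 1, 1]                         # 0 = legitimate, 1 = SQLIA

model = SVC(kernel="linear").fit([encode(r) for r in requests], labels)
print(model.predict([encode("id=7' union select password from users --")]))
```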
Abstract:
Edge-labeled graphs have proliferated rapidly over the last decade due to the increased popularity of social networks and the Semantic Web. In social networks, relationships between people are represented by edges, and each edge is labeled with a semantic annotation; hence, a huge single graph can express many different relationships between entities. The Semantic Web represents each single fragment of knowledge as a triple (subject, predicate, object), which is conceptually identical to an edge from subject to object labeled with a predicate. A set of triples constitutes an edge-labeled graph on which knowledge inference is performed. Subgraph matching has been extensively used as a query language for patterns in the context of edge-labeled graphs. For example, in social networks, users can specify a subgraph matching query to find all people that have certain neighborhood relationships. Heavily used fragments of the SPARQL query language for the Semantic Web, and the graph queries of other graph DBMSs, can also be viewed as subgraph matching over large graphs. Though subgraph matching has been extensively studied as a query paradigm in the Semantic Web and in social networks, a user can get a large number of answers in response to a query. These answers can be shown to the user in accordance with an importance ranking. In this thesis proposal, we present four different scoring models along with scalable algorithms to find the top-k answers via a suite of intelligent pruning techniques. The suggested models consist of a practically important subset of the SPARQL query language augmented with some additional useful features. The first model, called Substitution Importance Query (SIQ), identifies the top-k answers whose scores are calculated from matched vertices' properties in each answer in accordance with a user-specified notion of importance. The second model, called Vertex Importance Query (VIQ), identifies important vertices in accordance with a user-defined scoring method that builds on top of various subgraphs articulated by the user. Approximate Importance Query (AIQ), our third model, allows partial and inexact matchings and returns the top-k of them under user-specified approximation terms and scoring functions. In the fourth model, called Probabilistic Importance Query (PIQ), a query consists of several sub-blocks: one mandatory block that must be mapped and other blocks that can be opportunistically mapped. The probability is calculated from various aspects of the answers, such as the number of mapped blocks and the vertices' properties in each block, and the top-k most probable answers are returned. An important distinguishing feature of our work is that we allow the user a huge amount of freedom in specifying: (i) what pattern and approximation he considers important, (ii) how to score answers, irrespective of whether they are vertices or substitutions, and (iii) how to combine and aggregate scores generated by multiple patterns and/or multiple substitutions. Because so much power is given to the user, indexing is more challenging than in situations where additional restrictions are imposed on the queries the user can ask. The proposed algorithms for the first model can also be used for answering SPARQL queries with ORDER BY and LIMIT, and the method for the second model also works for SPARQL queries with GROUP BY, ORDER BY and LIMIT. We test our algorithms on multiple real-world graph databases, showing that our algorithms are far more efficient than popular triple stores.
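Since the proposal notes that its first model can also answer SPARQL queries with ORDER BY and LIMIT, the toy example below shows that query shape over a tiny rdflib graph; the data and the ex:score ranking property are invented for illustration.

```python
# Top-k ranked subgraph matching expressed as SPARQL with ORDER BY/LIMIT.
# Toy data; the ex:score property stands in for an importance score.
import rdflib

g = rdflib.Graph()
g.parse(data="""
@prefix ex: <http://example.org/> .
ex:alice ex:knows ex:bob  ; ex:score 9 .
ex:carol ex:knows ex:bob  ; ex:score 7 .
ex:dave  ex:knows ex:erin ; ex:score 8 .
""", format="turtle")

# Top-2 people who know ex:bob, ranked by the numeric importance score.
results = g.query("""
    PREFIX ex: <http://example.org/>
    SELECT ?p ?s WHERE { ?p ex:knows ex:bob ; ex:score ?s . }
    ORDER BY DESC(?s) LIMIT 2
""")
for row in results:
    print(row.p, row.s)   # prints alice (9), then carol (7)
```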