Biblioteca Digital

917 results for structuration of lexical data bases

A new analysis of hydrographic data in the Atlantic and its application to an inverse modeling study

Relevance:

100.00%

Publisher:

Using text-mining-assisted analysis to examine the applicability of unstructured data in the context of customer complaint management

Relevance:

100.00%

Publisher:

Abstract:

Double Degree

Classical Invariants for Principal Series and Isomorphisms of Root Data

Relevance:

100.00%

Publisher:

Abstract:

We develop some new techniques to calculate the Schur indicator for self-dual irreducible Langlands quotients of the principal series representations. Using these techniques we derive some new formulas for the Schur indicator and the real-quaternionic indicator. We make progress towards developing an algorithm to decide whether or not two root data are isomorphic. When the derived group has cyclic center, we solve the isomorphism problem completely. An immediate consequence is a clean and precise classification theorem for connected complex reductive groups whose derived groups have cyclic center.

Algorithms and tools of big data: A bibliographic review

Relevance:

100.00%

Publisher:

Abstract:

66 p.

Multivalent reuse of web data about temporary art exhibitions: The Exhibitium Project

Relevance:

100.00%

Publisher:

Abstract:

Our proposal aims to present the analysis techniques and methodologies, as well as the most relevant results expected, within the framework of the Exhibitium project (http://www.exhibitium.com). Awarded by the BBVA Foundation, the Exhibitium project is being developed by an international consortium of several research groups. Its main purpose is to build a comprehensive and structured data repository about temporary art exhibitions, captured from the web, to make them useful and reusable in various domains through open and interoperable data systems.

Analysis of Sampling Data

Relevance:

100.00%

Publisher:

Abstract:

A Similar Exposure Group (SEG) can be created through the evaluation of workers performing the same or similar task, hazards they are exposed to, frequency and duration of their exposures, engineering controls available during their operations, personal protective equipment used, and exposure data. For this report, the samples of one facility that has collected nearly 40,000 various types of samples will be evaluated to determine if the creation of a SEG can be supported. The data will be reviewed for consistency with collection methods and laboratory detection limits. A subset of the samples may be selected based on the review. Data will also be statistically evaluated in order to determine whether the data is sufficient to terminate the sampling. IHDataAnalyst V1.27 will be used to assess the data. This program uses Bayesian Analysis to assist in making determinations. The 95 percent confidence interval will be calculated and evaluated in making decisions. This evaluation will be used to determine if a SEG can be created for any of the workers and determine the need for future sample collection. The data and evaluation presented in this report have been selected and evaluated specifically for the purposes of this project.

Studies of CMS data access patterns with machine learning techniques

Relevance:

100.00%

Publisher:

Abstract:

This thesis presents a study of the Grid data access patterns in distributed analysis in the CMS experiment at the LHC accelerator. This study ranges from the deep analysis of the historical patterns of access to the most relevant data types in CMS, to the exploitation of a supervised Machine Learning classification system to set up machinery able to eventually predict future data access patterns, i.e. the so-called dataset “popularity” of the CMS datasets on the Grid, with focus on specific data types. All the CMS workflows run on the Worldwide LHC Computing Grid (WLCG) computing centers (Tiers), and in particular the distributed analysis system sustains hundreds of users and applications submitted every day. These applications (or “jobs”) access different data types hosted on disk storage systems at a large set of WLCG Tiers. The detailed study of how this data is accessed, in terms of data types, hosting Tiers, and different time periods, makes it possible to gain valuable insight into storage occupancy over time and different access patterns, and ultimately to extract suggested actions based on this information (e.g. targeted disk clean-up and/or data replication). In this sense, the application of Machine Learning techniques makes it possible to learn from past data and to gain predictive potential for future CMS data access patterns. Chapter 1 provides an introduction to High Energy Physics at the LHC. Chapter 2 describes the CMS Computing Model, with special focus on the data management sector, also discussing the concept of dataset popularity. Chapter 3 describes the study of CMS data access patterns with different depth levels. Chapter 4 offers a brief introduction to basic machine learning concepts, introduces their application in CMS, and discusses the results obtained by using this approach in the context of this thesis.

How do Firms ask for Consumers’ Data Permission? The Value of Companies Data Practices.

Relevance:

100.00%

Publisher:

Abstract:

On May 25, 2018, the EU introduced the General Data Protection Regulation (GDPR), which offers EU citizens a shelter for their personal information by asking companies to explain clearly how people’s information is used. To comply with the new law, European and non-European companies interacting with EU citizens undertook a massive data re-permission-request campaign. However, while the EU Regulator was particularly specific in defining the conditions for gaining access to customers’ data, it did not specify how the communication between firms and consumers should be designed. This has left firms free to develop their re-permission emails as they liked, plausibly coupling the informative nature of these privacy-related communications with other persuasive techniques to maximize data disclosure. Consequently, we took advantage of this colossal wave of simultaneous requests to provide insights into two issues. Firstly, we investigate how companies across industries and countries chose to frame their requests. Secondly, we investigate which factors influenced the selection of alternative re-permission formats. In order to achieve these goals, we examine the content of a sample of 1506 re-permission emails sent by 1396 firms worldwide, and we identify the dominant “themes” characterizing these emails. We then relate these themes to both the expected benefits firms may derive from data usage and the possible risks they may experience from not being completely compliant with the spirit of the law. Our results show that: (1) most firms enriched their re-permission messages with persuasive arguments aiming at increasing consumers’ likelihood of relinquishing their data; (2) the use of persuasion is the outcome of a difficult tradeoff between costs and benefits; (3) most companies acted in their self-interest and “gamed the system”. Our results have important implications for policymakers, managers, and customers of the online sector.

Noises: a Nuisance and a Resource. Development of New Decomposition Methods of Noisy Data

Relevance:

100.00%

Publisher:

Abstract:

Noise is a constant presence in measurements. Its origin is related to the microscopic properties of matter. Since the seminal work of Brown in 1828, the study of stochastic processes has attracted increasing interest with the development of new mathematical and analytical tools. In the last decades, the central role that noise plays in chemical and physiological processes has become recognized. The dual role of noise as nuisance/resource pushes towards the development of new decomposition techniques that divide a signal into its deterministic and stochastic components. In this thesis I show how methods based on Singular Spectrum Analysis (SSA) have the right properties to fulfil this requirement. During my work I applied SSA to different signals of interest in chemistry: I developed a novel iterative procedure for the denoising of powder X-ray diffractograms; I “denoised” bi-dimensional images from experiments of electrochemiluminescence (ECL) imaging of micro-beads, obtaining new insight into the ECL mechanism. I also used Principal Component Analysis to investigate the relationship between brain electrophysiological signals and voice emission.

Development of JADE, a new software for the verification and validation of nuclear data libraries

Relevance:

100.00%

Publisher:

Abstract:

Nuclear cross sections are the pillars on which the transport simulation of particles and radiation is built. Since the production chain of nuclear data libraries is extremely complex and made up of different steps, stringent verification and validation (V&V) procedures must be applied to it. The work presented here focused on the development of new Python-based software called JADE, whose objective is to help increase the level of automation and standardization of these procedures in order to reduce the time between new library releases and, at the same time, increase their quality. After an introduction to nuclear fusion (the field where most of the V&V activity has been concentrated so far) and to the simulation of particle and radiation transport, the motivations leading to JADE's development are discussed. Subsequently, the general architecture of the code and the implemented benchmarks (both experimental and computational) are described. After that, the results from the major application of JADE during the research years are presented. Finally, after a discussion of the objectives reached by JADE, possible short-, mid- and long-term developments for the project are discussed.

Integration of heterogeneous data sources and automated reasoning in healthcare and domotic IoT systems

Relevance:

100.00%

Publisher:

Abstract:

In recent years, IoT technology has radically transformed many crucial industrial and service sectors such as healthcare. The multifaceted heterogeneity of the devices and the collected information provides important opportunities to develop innovative systems and services. However, the ubiquitous presence of data silos and the poor semantic interoperability in the IoT landscape constitute a significant obstacle in the pursuit of this goal. Moreover, achieving actionable knowledge from the collected data requires IoT information sources to be analysed using appropriate artificial intelligence techniques such as automated reasoning. In this thesis work, Semantic Web technologies have been investigated as an approach to address both the data integration and the reasoning aspects of modern IoT systems. In particular, the contributions presented in this thesis are the following: (1) the IoT Fitness Ontology, an OWL ontology that has been developed in order to overcome the issue of data silos and enable semantic interoperability in the IoT fitness domain; (2) a Linked Open Data web portal for collecting and sharing IoT health datasets with the research community; (3) a novel methodology for embedding knowledge in rule-defined IoT smart home scenarios; and (4) a knowledge-based IoT home automation system that supports a seamless integration of heterogeneous devices and data sources.

Decentralized systems for the protection and portability of personal data

Relevance:

100.00%

Publisher:

Abstract:

The General Data Protection Regulation (GDPR) has been designed to help promote a view in favor of the interests of individuals instead of large corporations. However, there is a need for more dedicated technologies that can help companies comply with GDPR while enabling people to exercise their rights. We argue that such a dedicated solution must address two main issues: the need for more transparency towards individuals regarding the management of their personal information, and their often hindered ability to access personal data and make it interoperable, so that exercising one's rights becomes straightforward. We aim to provide a system that helps to push personal data management towards the individual's control, i.e., a personal information management system (PIMS). By using distributed storage and decentralized computing networks to control online services, users' personal information could be shifted towards those directly concerned, i.e., the data subjects. The use of Distributed Ledger Technologies (DLTs) and Decentralized File Storage (DFS) as an implementation of decentralized systems is of paramount importance in this case. The structure of this dissertation follows an incremental approach to describing a set of decentralized systems and models that revolve around personal data and their subjects. Each chapter of this dissertation builds on the previous one and discusses the technical implementation of a system and its relation with the corresponding regulations. We refer to the EU regulatory framework, including GDPR, eIDAS, and the Data Governance Act, to build our final system architecture's functional and non-functional drivers. In our PIMS design, personal data is kept in a Personal Data Space (PDS) consisting of encrypted personal data referring to the subject stored in a DFS. On top of that, a network of authorization servers acts as a data intermediary to provide access to potential data recipients through smart contracts.

Neo-commodification of persons: exploitation of personal data and impact on the sharing economy

Relevance:

100.00%

Publisher:

Abstract:

The notion of commodification is a fascinating one. It entails many facets, ranging from subjective debates on the desirability of commodification to in-depth economic analyses of objects of value and their corresponding markets. Commodity theory is therefore not just defined by a single debate, but spans a plethora of different discussions. This thesis maps and situates those theories and debates and selects one specific strain to investigate further. This thesis argues that commodity theory in its optima forma deals with the investigation into what sets commodities apart from non-commodities. It proceeds to examine the many answers given to this question by scholars ranging from the mid-1800s to the late 2000s. Ultimately, commodification is defined as a process in which an object becomes an element of the total wealth of societies in which the capitalist mode of production prevails. In doing so, objects must meet observables, or indicia, of commodification provided by commodity theories. Problems arise when objects are clearly part of the total wealth in societies without meeting established commodity indicia. In such cases, objects are part of the total wealth of a society without counting as a commodity. This thesis examines this phenomenon in relation to the novel commodities of audiences and data. It explains how these non-commodities (according to classical theories) are still essential elements of industry. The thesis then takes a deep dive into commodity theory using the theory on the construction of social reality by John Searle.

Deployment of a data analysis workflow of the ATLAS experiment on HPC systems

Relevance:

100.00%

Publisher:

Abstract:

LHC experiments produce an enormous amount of data, estimated to be of the order of a few petabytes per year. Data management takes place using the Worldwide LHC Computing Grid (WLCG) infrastructure, both for storage and processing operations. However, in recent years, many more resources have become available on High Performance Computing (HPC) farms, which generally have many computing nodes with a high number of processors. Large collaborations are working to use these resources in the most efficient way, consistent with the constraints imposed by computing models (data distributed on the Grid, authentication, software dependencies, etc.). The aim of this thesis project is to develop a software framework that allows users to process a typical data analysis workflow of the ATLAS experiment on HPC systems. The developed analysis framework shall be deployed on the computing resources of the Open Physics Hub project and on the CINECA Marconi100 cluster, in view of the switch-on of the Leonardo supercomputer, foreseen in 2023.

L’étude des stratégies de séparations préparatrices de protéines par électrophorèse capillaire

Relevance:

100.00%

Publisher:

Abstract:

Proteomics is a subject of interest because studying the functions and structures of proteins is essential to understanding how a given organism works. This project falls within structural studies, or more precisely, the primary amino acid sequence used to identify a protein. Protein determination begins with the extraction of a protein mixture from a tissue or biological fluid that may contain more than 1000 different proteins. Next, analytical techniques such as two-dimensional polyacrylamide gel electrophoresis (2D-SDS-PAGE), which separate this mixture according to the isoelectric point and molar mass of the proteins, are used to isolate the proteins and allow their identification, typically by liquid chromatography and mass spectrometry (MS). This project draws on this process and proposes that the step of fractionating the protein extract by 2D-SDS-PAGE be replaced or supported by multiple parallel fractionations using quasi-multidimensional capillary electrophoresis (CE). The fractions obtained, containing a single protein or a protein mixture less complex than the starting extract, could then undergo protein identification by peptide mapping and protein mapping using analytical separation techniques and MS. To obtain the peptide map of a sample, the purified proteins must undergo enzymatic or chemical proteolysis, and the peptide fragments resulting from this digestion must be separated. The peptide maps thus generated can then be compared with control samples, or the exact masses of the enzymatic peptides are submitted to search engines such as MASCOT™, which allows the proteins to be identified by querying genomic databases.
The exploitable advantages of CE over 2D-SDS-PAGE are its high separation efficiency, its speed of analysis, and its ease of automation. One challenge to overcome is the small mass of protein available after CE analyses, due partly to adsorption of proteins on the capillary wall but mainly to the small sample volume in CE. To increase this volume, a 75 µm capillary was used. In addition, the volume of the collected fraction was reduced from 1000 to 100 µL and the fractions were accumulated 10 times; that is, each fraction contained the products of 10 separations. Protein adsorption, for its part, shows up as variation in the peak area and migration time of a given protein, which affects the reproducibility of the separation, a very important aspect since 10 cumulative separations are needed for fraction collection. Many approaches exist to reduce this problem (e.g. extreme pH values of the background electrolyte, dynamic or permanent capillary coatings, etc.), but in this thesis the coating studies focused on N,N-didodecyl-N,N-dimethylammonium bromide (DDAB), a surfactant that forms a semi-permanent coating on the capillary wall. Most of the thesis aimed to obtain a reproducible separation of a standard protein mixture prepared in the laboratory (containing bovine serum albumin, carbonic anhydrase, α-lactalbumin and β-lactoglobulin) by CE with the DDAB coating. The coating studies showed that it was necessary to regenerate the coating between each injection of the protein mixture under the conditions studied: collection of 5 fractions of 6 min each over a 30 min separation, following the DDAB regeneration process, all repeated 10 times.
However, CE-UV and HPLC-MS analysis of the collected fractions did not show the expected proteins, since they appeared to be below the detection limit. In addition, an MS analysis showed that DDAB accumulates in the collected fractions owing to its desorption from the capillary wall. To confirm that the efforts to collect a mass of protein were sufficient, CE with laser-induced fluorescence detection (CE-LIF) was used to separate and collect the protein, albumin labeled with fluorescein isothiocyanate (FITC), without the DDAB coating. These analyses showed that albumin-FITC was indeed present in the collected fraction. Peptide mapping was then carried out successfully, using the enzyme chymotrypsin for digestion and CE-LIF to obtain the peptide map.
