879 results for Security of data


Relevance:

100.00%

Publisher:

Abstract:

An important aspect of immune monitoring for vaccine development, clinical trials, and research is the detection, measurement, and comparison of antigen-specific T-cells from subject samples under different conditions. Antigen-specific T-cells compose a very small fraction of total T-cells. Developments in cytometry technology over the past five years have enabled the measurement of single cells in a multivariate and high-throughput manner. This growth in both the dimensionality and quantity of data continues to pose a challenge for the effective identification and visualization of rare cell subsets, such as antigen-specific T-cells. Dimension reduction and feature extraction play a pivotal role in both identifying and visualizing cell populations of interest in large, multi-dimensional cytometry datasets. However, the automated identification and visualization of rare, high-dimensional cell subsets remains challenging. Here we demonstrate how a systematic and integrated approach combining targeted feature extraction with dimension reduction can be used to identify and visualize biological differences in rare, antigen-specific cell populations. By using OpenCyto to perform semi-automated gating and feature extraction of flow cytometry data, followed by dimensionality reduction with t-SNE, we are able to identify polyfunctional subpopulations of antigen-specific T-cells and visualize treatment-specific differences between them.
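
OpenCyto itself is an R/Bioconductor framework, and the study's actual pipeline is not reproduced here; purely as a hedged illustration of the downstream dimension-reduction step, the following Python sketch runs t-SNE on a hypothetical cells-by-markers feature matrix (all data and parameter choices are invented).

```python
# Minimal sketch of the dimension-reduction step described above. The
# gating/feature-extraction step (OpenCyto) is assumed to have already
# produced a cells-by-markers matrix; `features` below is hypothetical data.
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
features = rng.normal(size=(2000, 12))        # e.g. 2000 cells x 12 markers

# arcsinh is a common variance-stabilizing transform for cytometry intensities
transformed = np.arcsinh(features / 5.0)

# Embed into 2-D for visualization of rare subpopulations
embedding = TSNE(n_components=2, perplexity=30, random_state=0)
coords = embedding.fit_transform(transformed)  # shape (2000, 2)
print(coords[:5])
```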

Relevance:

100.00%

Publisher:

Abstract:

Deliberate fires appear to be borderless and timeless events, creating a serious security problem. There have been many attempts to develop approaches to tackle this problem, but unfortunately acting effectively against deliberate fires has proven a complex challenge. This article reviews the current situation relating to deliberate fires: what do we know, how serious is the situation, how is it being dealt with, and what challenges are faced when developing a systematic and global methodology to tackle the issues? The repetitive nature of some types of deliberate fires is also discussed. Finally, drawing on the reality of repetition within deliberate fires, and encouraged by successes obtained against other repetitive crimes (such as property crimes or drug trafficking), we argue that using the intelligence process cycle as a framework for the follow-up and systematic analysis of fire events is a relevant approach. This is the first article in a series of three. It introduces the context and discusses the background issues in order to provide managers and policy makers planning to tackle this issue with a better underpinning knowledge. The second part will present a methodology developed to detect and identify repetitive fire events from a set of data, and the third part will discuss the analysis of these data to produce intelligence.

Relevance:

100.00%

Publisher:

Abstract:

Nowadays, Species Distribution Models (SDMs) are a widely used tool. Using different statistical approaches, these models reconstruct the realized niche of a species from presence data and a set of variables, often topoclimatic. Their range of uses is quite large, from understanding the requirements of a single species, to the creation of nature reserves based on species hotspots, to modelling the impact of climate change. Most of the time these models use variables at a resolution of 50 km x 50 km or 1 km x 1 km. In some cases, however, they are used at resolutions below the kilometre scale (100 m x 100 m or 25 m x 25 m) and are then called high resolution models. Quite recently a new kind of data has emerged enabling precision up to 1 m x 1 m, thus allowing very high resolution modelling. However, these new variables are very costly and require a considerable amount of time to process, especially when they are used in complex calculations such as model projections over large areas. Moreover, the importance of very high resolution data in SDMs has not yet been assessed and is not well understood; some basic knowledge of what drives species presences and absences is still missing. Indeed, it is not clear whether, in mountain areas like the Alps, coarse topoclimatic gradients drive species distributions, whether fine-scale temperature or topography are more important, or whether their importance can be neglected when balanced against competition or stochasticity. In this thesis I investigated the importance of very high resolution data (2-5 m) in species distribution models, using very high resolution topographic, climatic and edaphic variables over a 2000 m elevation gradient in the Western Swiss Alps. I also investigated the more local responses to these variables for a subset of species living in this area at two specific elevation belts. During this thesis I showed that high resolution data require very good datasets (species and variables for the models) to produce satisfactory results. Indeed, in mountain areas temperature is the most important factor driving species distributions, and it needs to be modelled at very fine resolution, rather than interpolated over large surfaces, to produce satisfactory results. Despite the intuitive idea that topography should be very important at high resolution, the results are mixed, as looking at the importance of variables over a large gradient buffers their importance. Topographic factors were shown to be highly important at the subalpine level, but their importance decreases at lower elevations: whereas at the montane level edaphic and land use factors are more important, high resolution topographic data matter more at the subalpine level. Finally, the biggest improvement in the models occurs when edaphic variables are added. Adding soil variables is of high importance, and variables such as pH surpass the usual topographic variables in SDMs in terms of importance in the models. To conclude, high resolution is very important in modelling but requires very good datasets. Merely increasing the resolution of the usual topoclimatic predictors is not sufficient, and the use of edaphic predictors proved fundamental to producing significantly better models. This is of primary importance, especially if these models are used to reconstruct communities or as a basis for biodiversity assessments.
-- In recent years, the use of species distribution models (SDMs) has continually increased. These models use different statistical tools to reconstruct the realized niche of a species from variables, notably climatic or topographic ones, and from presence data collected in the field. Their uses cover many domains, from studying the ecology of a species to reconstructing communities or assessing the impact of climate warming. Most of the time, these models use occurrences from global databases at a rather coarse resolution (1 km or even 50 km). Some databases nevertheless make it possible to work at high resolution, hence to go below the kilometre scale and to work at resolutions of 100 m x 100 m or 25 m x 25 m. Recently, a new generation of very high resolution data has appeared and makes it possible to work at the metre scale. The variables that can be generated from these new data are, however, very costly and require considerable processing time. Indeed, any complex statistical computation, such as projections of species distributions over large areas, demands powerful computing resources and a lot of time. Moreover, the factors governing species distributions at fine scales are still poorly known, and the importance in the models of high resolution variables such as microtopography or temperature is not certain. Other factors such as competition or natural stochasticity could have an equally strong influence. My thesis work is set in this context. I sought to understand the importance of high resolution in species distribution models, whether for temperature, microtopography or edaphic variables, along a large elevation gradient in the Vaud Prealps. I also sought to understand the local impact of certain variables potentially neglected because of confounding effects along the elevation gradient. During this thesis, I was able to show that high resolution variables, whether related to temperature or microtopography, provide only a limited improvement of the models. To achieve a substantial improvement, it is necessary to work with larger datasets, in terms of both the species and the variables used. For example, the usual interpolated climatic layers must be replaced by temperature layers modelled at high resolution from field data. Working along a 2000 m temperature gradient naturally makes temperature very important in the models. The importance of microtopography is negligible compared with topography at a 25 m resolution. However, at a more local scale, high resolution topography is extremely important in the subalpine belt. At the montane belt, by contrast, variables related to soils and land use are very important. Finally, the species distribution models were particularly improved by the addition of edaphic variables, mainly pH, whose importance supplants or equals the topographic variables when added to the usual species distribution models.
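
The thesis's actual models are not specified in the abstract; as a hedged, generic illustration of the kind of presence-absence SDM discussed, here is a minimal logistic-regression sketch with invented topoclimatic and edaphic predictors (all variable names and values are hypothetical).

```python
# Minimal sketch of a presence-absence SDM of the kind discussed above:
# a logistic regression linking occurrences to topoclimatic and edaphic
# predictors. Data and predictor choices are hypothetical.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
n = 500
temperature = rng.normal(8, 4, n)    # degrees C, varies along elevation
slope = rng.uniform(0, 45, n)        # topographic predictor
ph = rng.normal(6.0, 0.8, n)         # edaphic predictor (soil pH)
X = np.column_stack([temperature, slope, ph])

# Synthetic presence-absence response driven mostly by temperature and pH
logit = 0.8 * (temperature - 8) / 4 + 0.5 * (ph - 6) / 0.8
y = rng.random(n) < 1 / (1 + np.exp(-logit))

model = LogisticRegression()
print("CV AUC:", cross_val_score(model, X, y, cv=5, scoring="roc_auc").mean())
```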

Relevance:

100.00%

Publisher:

Abstract:

Due to the existence of free software and pedagogical guides, the use of Data Envelopment Analysis (DEA) has been further democratized in recent years. Nowadays, it is quite usual for practitioners and decision makers with little or no knowledge of operational research to run their own efficiency analyses. Within DEA, several alternative models allow for an environmental adjustment. Four such models, each user-friendly and easily accessible to practitioners and decision makers, are applied to empirical data on 90 primary schools in the State of Geneva, Switzerland. The results show that the majority of the alternative models deliver divergent results. From a political and a managerial standpoint, these diverging results could lead to potentially ineffective decisions. As no consensus emerges on the best model to use, practitioners and decision makers may be tempted to select the model that is right for them; in other words, the model that best reflects their own preferences. Further studies should investigate how an appropriate multi-criteria decision analysis method could help decision makers select the right model.
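
The four environmental-adjustment models compared in the article are not named in the abstract, so they are not reproduced here; as a hedged sketch of the basic DEA machinery such tools build on, the following computes input-oriented CCR efficiency scores by linear programming (the school data are invented placeholders).

```python
# Minimal sketch of one DEA variant (input-oriented CCR, envelopment form),
# solved with linear programming. The data are invented placeholders, not
# the Geneva school data, and the four models compared in the study are
# not reproduced here.
import numpy as np
from scipy.optimize import linprog

X = np.array([[20.0, 300], [30, 200], [40, 100], [20, 200]])  # inputs per DMU
Y = np.array([[60.0], [70], [80], [55]])                      # outputs per DMU
n, m, s = X.shape[0], X.shape[1], Y.shape[1]

def ccr_efficiency(o):
    # Variables: [theta, lambda_1..lambda_n]; minimize theta
    c = np.r_[1.0, np.zeros(n)]
    # Inputs:  sum_j lambda_j * x_ij - theta * x_io <= 0
    A_in = np.hstack([-X[o].reshape(m, 1), X.T])
    # Outputs: -sum_j lambda_j * y_rj <= -y_ro
    A_out = np.hstack([np.zeros((s, 1)), -Y.T])
    res = linprog(c, A_ub=np.vstack([A_in, A_out]),
                  b_ub=np.r_[np.zeros(m), -Y[o]],
                  bounds=[(None, None)] + [(0, None)] * n)
    return res.fun  # efficiency score theta* in (0, 1]

for o in range(n):
    print(f"DMU {o}: efficiency {ccr_efficiency(o):.3f}")
```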

Relevance:

100.00%

Publisher:

Abstract:

The most suitable method for the estimation of size diversity is investigated. Size diversity is computed on the basis of the Shannon diversity expression adapted for continuous variables, such as size. It takes the form of an integral involving the probability density function (pdf) of the size of the individuals. Different approaches for the estimation of the pdf are compared: parametric methods, which assume that the data come from a particular family of pdfs, and nonparametric methods, where the pdf is estimated using some kind of local evaluation. Exponential, generalized Pareto, normal, and log-normal distributions have been used to generate simulated samples using parameters estimated from real samples. Nonparametric methods include discrete computation of data histograms based on size intervals and continuous kernel estimation of the pdf. The kernel approach gives an accurate estimation of size diversity, whilst parametric methods are only useful when the reference distribution has a shape similar to the real one. Special attention is given to data standardization. Division of the data by the sample geometric mean is proposed as the most suitable standardization method, which shows additional advantages: the same size diversity value is obtained whether original sizes or log-transformed data are used, and size measurements with different dimensionality (lengths, areas, volumes or biomasses) may be immediately compared with the simple addition of ln k, where k is the dimensionality (1, 2, or 3, respectively). Thus, kernel estimation, after data standardization by division by the sample geometric mean, emerges as the most reliable and generalizable method of size diversity evaluation.
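
As a hedged sketch of the procedure the abstract recommends (geometric-mean standardization, kernel estimation of the pdf, then numerical evaluation of the continuous Shannon integral H = -∫ p(x) ln p(x) dx), the following fragment uses an invented sample of body sizes.

```python
# Minimal sketch of the kernel-based size-diversity estimate described
# above: standardize sizes by the sample geometric mean, fit a kernel
# density estimate, and evaluate H = -integral p(x) ln p(x) dx numerically.
# The sample of sizes is invented.
import numpy as np
from scipy.stats import gaussian_kde
from scipy.integrate import trapezoid

rng = np.random.default_rng(2)
sizes = rng.lognormal(mean=1.0, sigma=0.6, size=400)  # hypothetical sizes

geo_mean = np.exp(np.mean(np.log(sizes)))
z = sizes / geo_mean                  # geometric-mean standardization

kde = gaussian_kde(z)
grid = np.linspace(z.min() / 2, z.max() * 1.5, 2000)
p = np.clip(kde(grid), 1e-12, None)   # avoid log(0) in empty regions
H = -trapezoid(p * np.log(p), grid)   # continuous Shannon size diversity
print(f"size diversity H = {H:.3f}")
```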

Relevance:

100.00%

Publisher:

Abstract:

In any discipline where uncertainty and variability are present, it is important to have principles which are accepted as inviolate and which should therefore drive statistical modelling, statistical analysis of data and any inferences from such an analysis. Despite the fact that two such principles have existed over the last two decades, and from these a sensible, meaningful methodology has been developed for the statistical analysis of compositional data, the application of inappropriate and/or meaningless methods persists in many areas of application. This paper identifies at least ten common fallacies and confusions in compositional data analysis with illustrative examples, and provides readers with necessary, and hopefully sufficient, arguments to persuade the culprits why and how they should amend their ways.
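
The abstract does not spell out the methodology it defends, but the log-ratio approach usually meant in this literature can be illustrated briefly; assuming that is the intended method, here is a minimal sketch of the centred log-ratio (clr) transform on an invented three-part composition.

```python
# Minimal sketch of a log-ratio treatment of compositional data (the kind
# of methodology this literature advocates): closure to proportions
# followed by the centred log-ratio (clr) transform. Data are invented.
import numpy as np

def clr(composition):
    """Centred log-ratio transform of a strictly positive composition."""
    x = np.asarray(composition, dtype=float)
    x = x / x.sum()                    # closure: rescale to proportions
    g = np.exp(np.mean(np.log(x)))     # geometric mean of the parts
    return np.log(x / g)               # clr coordinates sum to zero

sample = [48.0, 31.0, 21.0]            # e.g. sand/silt/clay percentages
y = clr(sample)
print(y, "sum =", round(y.sum(), 12))  # ordinary statistics apply to y
```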

Relevance:

100.00%

Publisher:

Abstract:

The increase of computational power and the emergence of new computer technologies have made local communications between personal trusted devices popular. In turn, this has given rise to security problems related to the user data handled in such communications. One of the main aspects of data security assurance is the security of the software operating on mobile devices. The aim of this work was to analyze security threats to PeerHood, software intended for personal communications between mobile devices regardless of the underlying network technologies. To reach this goal, risk-based software security testing was performed. The testing showed that the project has several security vulnerabilities, so PeerHood cannot be considered secure software. The analysis made in this work is the first step towards the further implementation of PeerHood security mechanisms, as well as towards taking security into account in the development process of this project.

Relevance:

100.00%

Publisher:

Abstract:

Cyber security is one of the main topics discussed around the world today. The threat is real, and it is unlikely to diminish. People, businesses, governments, and even armed forces are networked in one way or another; thus, the cyber threat also faces military networking. On the other hand, the concept of Network Centric Warfare sets high requirements for military tactical data communications and security. A challenging networking environment and cyber threats force us to consider new approaches to building security into military communication systems. The purpose of this thesis is to develop a cyber security architecture for military networks and to evaluate the designed architecture. The architecture is described as a technical functionality. As a new approach, the thesis introduces Cognitive Networks (CN), a theoretical concept for building more intelligent, dynamic and even secure communication networks. Cognitive networks are capable of observing the networking environment, making decisions for optimal performance and adapting their system parameters according to those decisions. As a result, the thesis presents a five-layer cyber security architecture consisting of security elements controlled by a cognitive process. The proposed architecture includes the infrastructure, services and application layers, which are managed and controlled by the cognitive and management layers. The architecture defines the tasks of the security elements at a functional level, without introducing any new protocols or algorithms. Two separate methods were used for evaluation. The first is based on the SABSA framework, which uses a layered approach to analyze the overall security of an organization. The second is a scenario-based method in which a risk severity level is calculated. The evaluation results show that the proposed architecture fulfills the security requirements, at least at a high level. However, evaluating the proposed architecture proved very challenging, so the evaluation results must be considered very critically. The thesis shows that cognitive networks are a promising approach and provide many benefits when designing a cyber security architecture for tactical military networks. However, many implementation problems exist, and several details must be considered and studied in future work.
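
The thesis's actual severity formula is not given in the abstract; purely as a hedged illustration of a scenario-based risk severity calculation, the following sketch uses a common likelihood-times-impact scheme with invented scenarios and scales.

```python
# The scenario-based evaluation above calculates a risk severity level,
# but the exact formula is not given in the abstract. This sketch uses a
# common likelihood-times-impact scheme on invented scenario scores.
SEVERITY_BANDS = [(5, "low"), (10, "medium"), (15, "high"), (25, "critical")]

def risk_severity(likelihood: int, impact: int) -> tuple[int, str]:
    """Score a scenario on 1-5 scales; returns (score, band)."""
    score = likelihood * impact
    for threshold, band in SEVERITY_BANDS:
        if score <= threshold:
            return score, band
    return score, "critical"

scenarios = {"jamming of tactical link": (4, 3),   # (likelihood, impact)
             "malware on C2 node": (2, 5)}
for name, (lik, imp) in scenarios.items():
    print(name, "->", risk_severity(lik, imp))
```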

Relevance:

100.00%

Publisher:

Abstract:

Longitudinal surveys are increasingly used to collect event history data on person-specific processes such as transitions between labour market states. Survey-based event history data pose a number of challenges for statistical analysis, including survey errors due to sampling, non-response, attrition and measurement. This study deals with non-response, attrition and measurement errors in event history data and the bias they cause in event history analysis. The study also discusses some choices faced by a researcher using longitudinal survey data for event history analysis and demonstrates their effects. These choices include whether a design-based or a model-based approach is taken, which subset of data to use and, if a design-based approach is taken, which weights to use. The study takes advantage of the possibility of using combined longitudinal survey-register data: the Finnish subset of the European Community Household Panel (FI ECHP) survey for waves 1–5 was linked at person level with longitudinal register data. Unemployment spells were used as the study variables of interest. Lastly, a simulation study was conducted in order to assess the statistical properties of the Inverse Probability of Censoring Weighting (IPCW) method in a survey data context. The study shows how combined longitudinal survey-register data can be used to analyse and compare the non-response and attrition processes, test the type of missingness mechanism and estimate the size of the bias due to non-response and attrition. In our empirical analysis, initial non-response turned out to be a more important source of bias than attrition. Reported unemployment spells were subject to seam effects, omissions and, to a lesser extent, overreporting. The use of proxy interviews tended to cause spell omissions. An often-ignored phenomenon, classification error in reported spell outcomes, was also found in the data. Neither the Missing At Random (MAR) assumption about the non-response and attrition mechanisms, nor the classical assumptions about measurement errors, turned out to be valid. Measurement errors in both spell durations and spell outcomes were found to cause bias in estimates from event history models; low measurement accuracy affected the estimates of the baseline hazard most. The design-based estimates based on data from respondents to all waves of interest, weighted by the last-wave weights, displayed the largest bias. Using all the available data, including the spells of attriters up to the time of attrition, helped to reduce attrition bias. Lastly, the simulation study showed that the IPCW correction to design weights reduces bias due to dependent censoring in design-based Kaplan-Meier and Cox proportional hazards model estimators. The study discusses the implications of the results for survey organisations collecting event history data, researchers using surveys for event history analysis, and researchers who develop methods to correct for non-sampling biases in event history data.
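
As a hedged illustration of the IPCW idea (not the study's implementation), the following numpy sketch estimates the censoring-survival function G with a Kaplan-Meier estimator, weights each subject by 1/G(T−), and feeds the weights into a weighted Kaplan-Meier estimator; all data are simulated.

```python
# Minimal numpy sketch of IPCW: estimate the censoring-survival function G
# by Kaplan-Meier (censoring treated as the event), then weight subjects by
# 1/G(T-) in a weighted Kaplan-Meier estimator. Data are simulated; this is
# not the study's implementation.
import numpy as np

def kaplan_meier(times, events, weights=None):
    """Weighted Kaplan-Meier: returns (event_times, survival) steps."""
    w = np.ones_like(times, dtype=float) if weights is None else weights
    uniq = np.unique(times[events == 1])
    surv, out = 1.0, []
    for u in uniq:
        at_risk = w[times >= u].sum()              # weighted risk set at u
        d = w[(times == u) & (events == 1)].sum()  # weighted events at u
        surv *= 1.0 - d / at_risk
        out.append(surv)
    return uniq, np.array(out)

def ipcw_weights(times, events):
    """Inverse Probability of Censoring Weighting: w_i = 1 / G(T_i-)."""
    ct, cs = kaplan_meier(times, 1 - events)  # KM of the censoring process
    def G_minus(t):                           # G just before time t
        i = np.searchsorted(ct, t, side="left") - 1
        return cs[i] if i >= 0 else 1.0
    return np.array([1.0 / G_minus(t) for t in times])

rng = np.random.default_rng(3)
t_event, t_cens = rng.exponential(12, 300), rng.exponential(20, 300)
times = np.minimum(t_event, t_cens)           # observed spell durations
events = (t_event <= t_cens).astype(int)      # 1 = spell ended, 0 = censored

w = ipcw_weights(times, events)
print(kaplan_meier(times, events, weights=w)[1][:5])  # corrected survival
```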

Relevance:

100.00%

Publisher:

Abstract:

Because of the increased availability of different kinds of business intelligence technologies and tools, it is easy to fall into the illusion that new technologies will automatically solve a company's data management and reporting problems. Management, however, is not only about the management of technology but also about the management of processes and people. This thesis focuses on traditional data management and on the performance management of production processes, both of which can be seen as requirements for long-lasting development. Some operative BI solutions are also considered as part of the ideal state of the reporting system. The objectives of this study are to examine what requirements effective performance management of production processes places on a company's data management and reporting, and to see how they affect its efficiency. The research is executed as a theoretical literature review of the subjects and as a qualitative case study of a reporting development project at Finnsugar Ltd. The case study is examined through theoretical frameworks and through active participant observation. To get a better picture of the ideal state of the reporting system, simple investment calculations are performed. According to the results of the research, the requirements for effective performance management of production processes are automation in data collection, integration of operative databases, usage of efficient data management technologies such as ETL (Extract, Transform, Load) processes, a data warehouse (DW) and Online Analytical Processing (OLAP), and efficient management of processes, data and roles.
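
As a hedged, toy illustration of the ETL pattern named above (not the case company's actual pipeline), the following sketch extracts rows from an in-memory CSV export, transforms them, and loads them into a small warehouse-style table; the schema and figures are invented.

```python
# Toy sketch of the ETL (Extract, Transform, Load) pattern named above:
# extract rows from an operative CSV export, transform them, and load
# them into a warehouse-style fact table. Schema and data are invented.
import csv, sqlite3, io

raw = io.StringIO("batch_id,line,kg_produced\n1,A,950\n2,A,1020\n3,B,870\n")

# Extract
rows = list(csv.DictReader(raw))

# Transform: cast types and derive a tonnes column
records = [(int(r["batch_id"]), r["line"], float(r["kg_produced"]) / 1000.0)
           for r in rows]

# Load
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE fact_production (batch_id INT, line TEXT, tonnes REAL)")
db.executemany("INSERT INTO fact_production VALUES (?, ?, ?)", records)
print(db.execute("SELECT line, SUM(tonnes) FROM fact_production "
                 "GROUP BY line").fetchall())
```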

Relevance:

100.00%

Publisher:

Abstract:

The purpose of this research was to provide a deeper insight into the consequences of electronic human resource management (e-HRM) for line managers. The consequences are viewed as used information system (IS) potentials pertaining to the moderate voluntaristic category of consequences. Due to the need to contextualize the research and draw on line managers' personal experiences, a qualitative approach in a case study setting was selected. The empirical part of the research is loosely based on the literature on HRM and e-HRM, and it was conducted in an industrial private-sector company. Method triangulation was utilized: nine semi-structured interviews, conducted in a European setting, formed the main method of data collection and analysis, while complementary data such as HRM documentation and statistics on e-HRM system usage were utilized as background information to put the results into context. E-HRM has partly been taken into use in the case study company. Line managers tend to use e-HRM when a particular task requires it, but they are not familiar with all the features and possibilities that e-HRM has to offer. The advantages of e-HRM are in line with the company's goals: for example, transparency of data, process consistency, and having an efficient and easy-to-use tool at one's disposal. However, several unintended, even contradictory, and mainly negative outcomes can also be identified, such as over-complicated processes, insecurity in using the tool, and a lack of co-operation with HR professionals. The use of e-HRM and managers' perceptions of it affect the way in which managers perceive its consequences for their work. Overall, the consequences of e-HRM are divergent, even contradictory. The managers who considered e-HRM mostly beneficial to their work found that it affects their work by providing information and increasing efficiency, whereas those who mostly perceived challenges in e-HRM did not think that it had affected their role or their work. Even though the perceptions of e-HRM and its consequences might reflect the strategies, the distribution of work, and the ways of working in HRM in general, and cannot be generalized as such, this research contributes to the field of e-HRM and provides new perspectives on e-HRM both in the case study organization and in the academic field in general.

Relevance:

100.00%

Publisher:

Abstract:

Poster at Open Repositories 2014, Helsinki, Finland, June 9-13, 2014

Relevance:

100.00%

Publisher:

Abstract:

Presentation at Open Repositories 2014, Helsinki, Finland, June 9-13, 2014

Relevance:

100.00%

Publisher:

Abstract:

This research seeks to find out what benefits employees expect an organization to gain from organizing data governance, and how organized data governance benefits the implementation of automated marketing capabilities. The quality and usability of data are crucial for organizations to meet various business needs. Organizations have ever more data and technology available that can be utilized, for example, in automated marketing. Data governance addresses the organization of decision rights and accountabilities for the management of an organization's data assets. Automated marketing means sending the right message, to the right person, at the right time, automatically. The research is a single case study conducted in a Finnish ICT company. The case company was starting to organize data governance and implement automated marketing capabilities at the time of the research. The empirical material consists of interviews with the employees of the case company. Content analysis is used to interpret the interviews in order to answer the research questions. The theoretical framework of the research is derived from the morphology of data governance. The findings of the research indicate that the employees expect the organization of data governance to, among other things, improve customer experience, improve sales, provide the ability to identify an individual customer's life situation, ensure that data is handled according to regulations, and improve operational efficiency. The organization of data governance is expected to solve problems in customer data quality that are currently hindering the implementation of automated marketing capabilities.

Relevance:

100.00%

Publisher:

Abstract:

This thesis presents an overview of the Open Data research area and its quantity of evidence, and establishes the research evidence base through a Systematic Mapping Study (SMS). 621 publications published between 2005 and 2014 were identified, of which 243 were selected in the review process. The thesis highlights the implications of the proliferation of Open Data principles in the emerging era of accessibility, reusability and sustainability of data transparency. The findings of the mapping study are described with quantitative and qualitative measures based on organization affiliation, country, year of publication, research method, star rating and the units of analysis identified. Furthermore, the units of analysis were categorized by development lifecycle, linked open data, type of data, technical platform, organization, ontology and semantics, adoption and awareness, intermediaries, security and privacy, and supply of data, all of which are important components for providing quality open data applications and services. The results of the mapping study help organizations (such as academia, government and industry), researchers and software developers to understand the existing trends in open data, the latest research developments and the demands of future research. In addition, the proposed conceptual framework of Open Data research can be adopted and expanded to strengthen and improve current open data applications.