934 results for Data anonymization and sanitization


Relevance:

100.00%

Publisher:

Abstract:

Monte Carlo simulations were used to generate data for ABAB designs of different lengths. The points of phase change were determined at random before the behaviour measurements were gathered, which allows a randomization test to be used as the analytic technique. Data simulation and analysis can be based either on data-division-specific distributions or on a common distribution, and the choice between the two affects the results obtained after the randomization test has been applied. The goal of the study was therefore to examine these effects in more detail. The discrepancies between the approaches are evident when data with zero treatment effect are considered, which has implications for statistical power studies. Data-division-specific distributions provide more detailed information about the performance of the statistical technique.
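The randomization-test logic described above can be sketched for the simpler AB case (one change point instead of three). This is only an illustration of the principle; the function names and the minimum-phase-length parameter are assumptions, not the procedure used in the study:

```python
import numpy as np

def change_point_statistics(data, min_phase=3):
    """Test statistic |mean(B) - mean(A)| for every admissible change
    point that leaves at least `min_phase` observations in each phase."""
    n = len(data)
    candidates = range(min_phase, n - min_phase + 1)
    return [abs(np.mean(data[k:]) - np.mean(data[:k])) for k in candidates]

def randomization_p_value(data, observed_change_point, min_phase=3):
    """Randomization test: because the change point was chosen at random
    before measurement, every admissible point is equally likely under
    the null hypothesis of zero treatment effect."""
    observed = abs(np.mean(data[observed_change_point:]) -
                   np.mean(data[:observed_change_point]))
    stats = change_point_statistics(data, min_phase)
    # Proportion of admissible change points whose statistic is at
    # least as extreme as the one actually observed.
    return sum(s >= observed for s in stats) / len(stats)

# Monte Carlo data with zero treatment effect, the null case discussed above
rng = np.random.default_rng(42)
data = rng.normal(0.0, 1.0, size=20)
print(randomization_p_value(data, observed_change_point=10))
```

Under the null, repeating this over many simulated series yields roughly uniform p-values, which is what makes the comparison of data-generation schemes in power studies meaningful.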

Relevance:

100.00%

Publisher:

Abstract:

This study has two main objectives: first, to demonstrate the importance of data accuracy to business success, and second, to create a tool for monitoring and improving the accuracy of production master data in an ERP system. A sub-objective is to explain the client company's need for the new tool and what it means for the company. The theoretical part of this thesis focuses on the importance of data accuracy in decision making and its implications for business success. The basics of manufacturing planning are also introduced in order to explain the key vocabulary. The empirical part introduces the client company and its need for this study, presents the new master data report, and finally explains the analysis of the report and the actions taken based on the results of that analysis. The main results of this thesis are establishing the interdependence between data accuracy and business success, and providing a report for continuous master data improvement in the client company's ERP system.

Relevance:

100.00%

Publisher:

Abstract:

The purpose of the work was to realize a high-speed digital data transfer system for the RPC muon chambers of the CMS experiment at CERN's new LHC accelerator. This large-scale system took many years and many stages of prototyping to develop, and required the participation of tens of people. The system interfaces to the Front-end Boards (FEB) at the 200,000-channel detector and to the trigger and readout electronics in the control room of the experiment. The distance between the two is about 80 metres, and the speed required of the optical links was pushing the limits of available technology when the project was started. Here, as in many other aspects of the design, it was assumed that the features of readily available commercial components would develop in the course of the design work, as indeed they did. Choosing a high speed made it possible to multiplex the data from some of the chambers onto the same fibres, reducing the number of links needed. Further reduction was achieved by employing zero suppression and data compression, so that a total of only 660 optical links were needed. Another requirement, which conflicted somewhat with choosing the components as late as possible, was that the design needed to be radiation tolerant to an ionizing dose of 100 Gy and to have a moderate tolerance to Single Event Effects (SEEs). This required several radiation test campaigns, and eventually led to ASICs being chosen for some of the critical parts. The system was made as reconfigurable as possible. Reconfiguration needs to be done remotely, as the electronics will not be accessible, except during some short and rare service breaks, once the accelerator starts running. Reconfigurable logic is therefore used extensively, and firmware development for the FPGAs constituted a sizable part of the work. Special techniques were needed there too in order to achieve the required radiation tolerance.
The system has been demonstrated to work in several laboratory and beam tests, and we are now waiting to see it in action when the LHC starts running in autumn 2008.
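The zero-suppression step mentioned above can be illustrated with a minimal sketch. Because detector occupancy is sparse, transmitting only the channels that fired saves most of the bandwidth; the frame layout and (channel, value) encoding here are generic assumptions, not the actual CMS link-board data format:

```python
def zero_suppress(frame):
    """Zero suppression: keep only the channels that fired.

    `frame` is one readout frame, a list of channel values where 0 means
    "no hit"; the output is a list of (channel, value) pairs.
    """
    return [(ch, v) for ch, v in enumerate(frame) if v != 0]

def restore(pairs, n_channels):
    """Inverse operation on the receiving end of the link."""
    frame = [0] * n_channels
    for ch, v in pairs:
        frame[ch] = v
    return frame

frame = [0, 0, 1, 0, 0, 0, 1, 0]      # sparse detector occupancy
packed = zero_suppress(frame)
assert restore(packed, len(frame)) == frame   # lossless round trip
print(packed)                          # -> [(2, 1), (6, 1)]
```

Only 2 of the 8 channels are transmitted here; combined with multiplexing several chambers onto one fibre, this kind of reduction is what brought the link count down.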

Relevance:

100.00%

Publisher:

Abstract:

Breast cancer is the most commonly diagnosed cancer and the leading cause of cancer death among females worldwide. It is a highly heterogeneous disease and must therefore be classified into more homogeneous groups. Hence, the purpose of this study was to classify breast tumors based on variations in gene expression patterns derived from RNA sequencing, using different class discovery methods. Forty-two paired breast tumor samples were sequenced on the Illumina Genome Analyzer, and the data were analyzed and prepared with TopHat2 and htseq-count. As reported previously, breast cancer can be grouped into five main groups with distinctive expression profiles: a basal epithelial-like group, a HER2 group, a normal breast-like group, and two luminal groups. When the breast tumor samples were classified with the PAM50 method, the most common subtype was Luminal B, which was significantly associated with high ESR1 and ERBB2 expression. The Luminal A subtype had significantly high expression of ESR1 and SLC39A6, whereas the HER2 subtype had high expression of the ERBB2 and CCNE1 genes and low luminal epithelial gene expression. The basal-like and normal-like subtypes were associated with low expression of ESR1, PgR, and HER2, and had significantly high expression of cytokeratins 5 and 17. Our results were similar to the TCGA breast cancer results and to known studies of breast cancer classification. Classifying breast tumors could add significant prognostic and predictive information to standard parameters and, moreover, identify marker genes for each subtype in order to find a better therapy for patients with breast cancer.
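PAM50 assigns each tumor to the subtype whose published 50-gene centroid best matches the sample's expression profile. The nearest-centroid idea behind it can be sketched as follows; the four marker genes and all centroid values are invented for illustration and are not the real PAM50 centroids:

```python
import numpy as np

# Toy centroids over 4 marker genes (ESR1, ERBB2, KRT5, MKI67).
# Made-up values for illustration; real PAM50 uses 50 genes and
# published centroids for five subtypes.
GENES = ["ESR1", "ERBB2", "KRT5", "MKI67"]
CENTROIDS = {
    "LuminalA": np.array([ 2.0, -0.5, -1.0, -0.5]),
    "LuminalB": np.array([ 1.5,  0.5, -1.0,  1.5]),
    "HER2":     np.array([-1.0,  2.5, -0.5,  1.0]),
    "Basal":    np.array([-1.5, -1.0,  2.0,  1.5]),
}

def classify(sample):
    """Nearest-centroid rule: assign the subtype whose centroid
    correlates best with the sample's expression profile."""
    best, best_r = None, -np.inf
    for subtype, centroid in CENTROIDS.items():
        r = np.corrcoef(sample, centroid)[0, 1]
        if r > best_r:
            best, best_r = subtype, r
    return best

# A tumor with high ESR1, elevated ERBB2, and high proliferation (MKI67)
sample = np.array([1.8, 0.7, -0.8, 1.2])
print(classify(sample))   # -> LuminalB with these toy centroids
```

Pearson correlation is used here for simplicity; the choice of correlation measure and gene panel is part of what distinguishes published classifiers.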

Relevance:

100.00%

Publisher:

Abstract:

Especially in global enterprises, key data is fragmented across multiple Enterprise Resource Planning (ERP) systems, leaving it inconsistent, fragmented, and redundant across the various systems. Master Data Management (MDM) is a concept that creates cross-references between customers, suppliers, and business units, and enables corporate hierarchies and structures. The overall goal of MDM is the ability to create an enterprise-wide consistent data model that enables analyzing and reporting customer and supplier data. The goal of this study was to define the properties and success factors of a master data system. The theoretical background is based on the literature, and the case consists of enterprise-specific needs and demands. The theoretical part presents the concept, background, and principles of MDM, and then the phases of a system planning and implementation project. The case consists of the background, a definition of the as-is situation, a definition of the project, and the evaluation criteria, and concludes with the key results of the thesis. The final chapter, Conclusions, combines common principles with the results of the case. The case part ended up dividing the important factors of the system into success factors, technical requirements, and business benefits. To justify the project and find funding for it, the business benefits have to be defined and their realization has to be monitored. The thesis identified six success factors for an MDM system: a well-defined business case; data management and monitoring; data models and structures defined and maintained; customer and supplier data governance, delivery, and quality; commitment; and continuous communication with the business. Technical requirements emerged several times during the thesis and therefore cannot be ignored in the project. The Conclusions chapter goes through these factors on a general level. The success factors and technical requirements are related to the essentials of MDM: governance, action, and quality.
This chapter could be used as guidance in a master data management project.

Relevance:

100.00%

Publisher:

Abstract:

Friedelin molecular conformers were obtained by Density Functional Theory (DFT) and by ab initio structure determination from powder X-ray diffraction. According to the DFT results, the conformer with the five rings in chair-chair-chair-boat-boat and the one with all rings in chair are energy-degenerate in the gas phase. The powder diffraction data reveal that rings A, B, and C of friedelin are in the chair conformation, and rings D and E in the boat-boat conformation. The high correlation among the powder diffraction data, the DFT results, and the reported single-crystal data indicates that a conventional X-ray diffractometer can be used for routine laboratory analysis in the absence of a single-crystal diffractometer.

Relevance:

100.00%

Publisher:

Abstract:

http://elo.aalto.fi/fi/studies/elomedia/dataseminar/

Relevance:

100.00%

Publisher:

Abstract:

After decades of mergers and acquisitions and successive technology trends such as CRM, ERP, and DW, the data in enterprise systems is scattered and inconsistent. Global organizations face the challenge of addressing local uses of shared business entities, such as customer and material, while at the same time maintaining a consistent, unique, and consolidated view of financial indicators. In addition, current enterprise systems do not accommodate the pace of organizational change, and immense efforts are required to maintain data. When it comes to systems integration, ERPs are considered “closed” and expensive. Data structures are complex, and the “out-of-the-box” integration options offered are not based on industry standards. Expensive and time-consuming projects are therefore undertaken in order to have the required data flowing according to the needs of business processes. Master Data Management (MDM) emerges as a discipline focused on ensuring long-term data consistency. Presented as a technology-enabled business discipline, it emphasizes business process and governance to model and maintain the data related to key business entities. There are immense technical and organizational challenges in accomplishing the “single version of the truth” MDM mantra. Adding one central repository of master data might prove unfeasible in some scenarios, so an incremental approach is recommended, starting from the areas most critically affected by data issues. This research aims at understanding the current literature on MDM and contrasting it with views from professionals. The data collected from interviews revealed details of the complexity of data structures and data management practices in global organizations, reinforcing the call for more in-depth research on the organizational aspects of MDM.
The most difficult piece of master data to manage is the “local” part: the attributes related to the sourcing and storing of materials in one particular warehouse in the Netherlands, or a complex set of pricing rules for a subsidiary of a customer in Brazil. From a practical perspective, this research evaluates one MDM solution under development at a Finnish IT solution provider. By applying an existing assessment method, the research attempts to provide the company with one possible tool for evaluating its product from a vendor-agnostic perspective.

Relevance:

100.00%

Publisher:

Abstract:

Poster at Open Repositories 2014, Helsinki, Finland, June 9-13, 2014

Relevance:

100.00%

Publisher:

Abstract:

Poster at Open Repositories 2014, Helsinki, Finland, June 9-13, 2014

Relevance:

100.00%

Publisher:

Abstract:

Presentation at Open Repositories 2014, Helsinki, Finland, June 9-13, 2014

Relevance:

100.00%

Publisher:

Abstract:

Presentation at Open Repositories 2014, Helsinki, Finland, June 9-13, 2014

Relevance:

100.00%

Publisher:

Abstract:

The aging of the population has become a critical global issue. Data and information concerning elderly citizens are increasing and are not well organized, and this unstructured data and information causes problems for decision makers. Since we live in a digital world, Information Technology is considered a tool for solving such problems, and data, information, and knowledge are crucial components of a successful IT service system. It is therefore necessary to study how to organize, or govern, data from various sources related to elderly citizens. The research was conducted because there is no internationally accepted holistic framework for the governance of such data. The research limits its scope to the healthcare domain; however, the results can be applied to other areas. The research takes as its theoretical starting point the ongoing research of Dahlberg and Nokkala (2015), which classifies existing data sources and their characteristics with a focus on managerial perspectives. Existing frameworks at international and national organization levels were then studied to show the current frameworks that have been used, and are useful, in compiling data on elderly citizens. The international organizations in this research were selected based on their reputation and the reliability with which information could be obtained from them. The selected countries at the national level provide two different points of view: Australia is a forerunner in IT governance, while Thailand is a country whose current situation the author knows well. The discussion of the frameworks at the international and national organization levels illustrates the main characteristics of each framework. The international organization level gives precedence to interoperability in exchanging data and information between different parties, whereas the national level shows the importance of acknowledging the frameworks throughout the country in order to make them effective. After studying both levels, the thesis presents summary tables assessing the fit of the framework proposed by Dahlberg and Nokkala, that is, whether the framework helps to consolidate data from various sources with different formats, hierarchies, structures, velocities, and other attributes of data storage. In addition, suggestions and recommendations are proposed for future research.

Relevance:

100.00%

Publisher:

Abstract:

Our surrounding landscape is in a constantly dynamic state, but recently the rate of change and its effects on the environment have increased considerably. In terms of the impact on nature, this development has not been entirely positive, but has rather caused a decline in valuable species, habitats, and general biodiversity. Despite recognition of the problem and its high importance, plans and actions for stopping this detrimental development are largely lacking. This partly originates from a lack of genuine will, but is also due to difficulties in detecting many valuable landscape components, and their consequent neglect. To support knowledge extraction, various digital environmental data sources may be of substantial help, but only if all the relevant background factors are known and the data is processed in a suitable way. This dissertation concentrates on detecting ecologically valuable landscape components by using geospatial data sources, and applies this knowledge to support spatial planning and management activities. In other words, the focus is on observing regionally valuable species, habitats, and biotopes with GIS and remote sensing data, using suitable methods for their analysis. Primary emphasis is given to the hemiboreal vegetation zone and the drastic decline of its semi-natural grasslands, which were created by a long trajectory of traditional grazing and management activities. However, the applied perspective is largely methodological, which allows the obtained results to be applied in various contexts. The dissertation emphasises models based on statistical dependencies and correlations of multiple variables, which are able to extract the desired properties from a large mass of initial data. In addition, the included papers combine several data sets from different sources and dates, with the aim of detecting a wider range of environmental characteristics and pointing out their temporal dynamics.
The results of the dissertation emphasise the multidimensionality and dynamics of landscapes, which need to be understood in order to recognise their ecologically valuable components. This not only requires knowledge about the emergence of these components and an understanding of the data used, but also a focus on minute details that can indicate the existence of fragmented and partly overlapping landscape targets. It also shows that most of the existing classifications are, as such, too generalised to provide all the required details, although they can be utilized at various steps along a longer processing chain. The dissertation also emphasises the importance of landscape history as a factor that both creates and preserves ecological values, and that provides an essential standpoint for understanding present landscape characteristics. The obtained results are significant both for preserving semi-natural grasslands and for general methodological development, supporting a science-based framework for evaluating ecological values and guiding spatial planning.

Relevance:

100.00%

Publisher:

Abstract:

Spatial data representation and compression has become a focal issue in computer graphics and image processing applications. Quadtrees, hierarchical data structures based on the principle of recursive decomposition of space, offer a compact and efficient representation of an image. For a given image, the choice of the quadtree root node plays an important role in its quadtree representation and final data compression. The goal of this thesis is to present a heuristic algorithm for finding a root node of a region quadtree that reduces the number of leaf nodes compared with the standard quadtree decomposition. The empirical results indicate that the proposed algorithm improves both the quadtree representation and the data compression in comparison with the traditional method.
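The standard recursive decomposition that the thesis improves on can be sketched as follows. The example also shows why the placement of the image relative to the quadtree grid, which is what the choice of root node effectively changes, matters for the leaf count; the code is a generic illustration, not the thesis' heuristic:

```python
import numpy as np

def quadtree_leaves(img):
    """Count the leaf nodes of the region quadtree of a square binary
    image whose side is a power of two: a homogeneous block is one
    leaf; otherwise the block is split into four quadrants."""
    if img.min() == img.max():          # homogeneous block -> single leaf
        return 1
    h = img.shape[0] // 2
    return (quadtree_leaves(img[:h, :h]) + quadtree_leaves(img[:h, h:]) +
            quadtree_leaves(img[h:, :h]) + quadtree_leaves(img[h:, h:]))

# A 4x4 image with a 2x2 foreground block aligned to a quadrant
aligned = np.zeros((4, 4), dtype=int)
aligned[0:2, 0:2] = 1
print(quadtree_leaves(aligned))     # -> 4: one leaf per quadrant

# The same block shifted by one pixel straddles all four quadrants
shifted = np.zeros((4, 4), dtype=int)
shifted[1:3, 1:3] = 1
print(quadtree_leaves(shifted))     # -> 16: every quadrant splits fully
```

The same 4-pixel object costs 4 leaves in one position and 16 in another, which is the kind of sensitivity a good choice of root node can exploit to reduce the leaf count.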