948 results for Semantic file systems
Abstract:
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)
Abstract:
Graduate Program in Computer Science - IBILCE
Abstract:
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)
Abstract:
Access control is a fundamental concern in any system that manages resources, e.g., operating systems, file systems, databases and communications systems. The problem we address is how to specify, enforce, and implement access control in distributed environments. This problem occurs in many applications such as management of distributed project resources, e-newspaper and pay-TV subscription services. Starting from an access relation between users and resources, we derive a user hierarchy, a resource hierarchy, and a unified hierarchy. The unified hierarchy is then used to specify the access relation in a way that is compact and that allows efficient queries. It is also used in cryptographic schemes that enforce the access relation. We introduce three specific cryptography-based hierarchical schemes, which can effectively enforce and implement access control and are designed for distributed environments because they do not need the presence of a central authority (except perhaps for set-up).
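One way to picture the step from an access relation to a user hierarchy is sketched below. This is only an illustration under stated assumptions, not the construction used in the work above: the dictionary input format and the function name `user_hierarchy` are hypothetical, and the ordering rule (one class sits above another when its resource set strictly contains the other's) is an assumed reading of "hierarchy".

```python
from collections import defaultdict

def user_hierarchy(access):
    """Group users by resource set and order the groups by set inclusion.

    access: dict mapping user -> set of resources (hypothetical input format).
    Returns (classes, above), where classes maps a frozenset of resources to
    the users holding exactly that set, and above maps each resource set to
    the resource sets strictly contained in it.
    """
    # Users with identical resource sets fall into the same class.
    classes = defaultdict(set)
    for user, resources in access.items():
        classes[frozenset(resources)].add(user)
    # Assumed ordering: class A is above class B when B's resources are a
    # proper subset of A's.
    above = {rs: {other for other in classes if other < rs} for rs in classes}
    return dict(classes), above

# Example with made-up users and resources.
relation = {"alice": {"r1", "r2"}, "bob": {"r1"}, "carol": {"r1"}}
classes, above = user_hierarchy(relation)
```

In this toy relation, `bob` and `carol` share a class, and `alice`'s class sits above it because her resource set contains theirs. A resource hierarchy could be sketched the same way by inverting the relation.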
Abstract:
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)
Abstract:
Disk drives are the bottleneck in the processing of large amounts of data used in almost all common applications. File systems attempt to reduce this by storing data sequentially on the disk drives, thereby reducing the access latencies. Although this strategy is useful when data is retrieved sequentially, the access patterns in real-world workloads are not necessarily sequential, and this mismatch results in storage I/O performance degradation. This thesis demonstrates that one way to improve the storage performance is to reorganize data on disk drives in the same way in which it is mostly accessed. We identify two classes of accesses: static, where access patterns do not change over the lifetime of the data, and dynamic, where access patterns frequently change over short durations of time, and propose, implement and evaluate layout strategies for each of these. Our strategies are implemented in a way that they can be seamlessly integrated or removed from the system as desired. We evaluate our layout strategies for static policies using tree-structured XML data where accesses to the storage device are mostly of two kinds: parent-to-child or child-to-sibling. Our results show that for a specific class of deep-focused queries, the existing file system layout policy performs better by 5–54X. For the non-deep-focused queries, our native layout mechanism shows an improvement of 3–127X. To improve performance of the dynamic access patterns, we implement a self-optimizing storage system that rearranges popular blocks on a dedicated partition based on the observed workload characteristics. Our evaluation shows an improvement of over 80% in the disk busy times over a range of workloads. These results show that applying the knowledge of data access patterns for allocation decisions can substantially improve the I/O performance.
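The idea of placing popular blocks on a dedicated partition can be sketched roughly as follows. This is a minimal illustration, not the thesis's implementation: the trace format (a list of block numbers), the `capacity` parameter, and the function name `plan_hot_partition` are all assumptions, and plain access counting stands in for whatever workload analysis the actual system performs.

```python
from collections import Counter

def plan_hot_partition(trace, capacity):
    """Map the most frequently accessed blocks to slots on a dedicated partition.

    trace: iterable of block numbers in access order (hypothetical format).
    capacity: number of block slots on the dedicated partition (assumed).
    Returns a dict mapping original block number -> slot index.
    """
    counts = Counter(trace)
    # Rank by descending access count; ties broken by block number so the
    # plan is deterministic.
    ranked = sorted(counts, key=lambda b: (-counts[b], b))
    hot = ranked[:capacity]
    # Lay the hot blocks out contiguously, most popular first.
    return {block: slot for slot, block in enumerate(hot)}

# Example with a made-up trace: block 7 is read three times, block 3 twice.
plan = plan_hot_partition([7, 3, 7, 9, 3, 7, 1], capacity=2)
```

Here `plan` maps block 7 to slot 0 and block 3 to slot 1. A real system would also need to recompute the plan as the workload shifts, which is the "dynamic" case the abstract describes.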
Abstract:
A substantial amount of information on the Internet is present in the form of text. The value of this semi-structured and unstructured data has been widely acknowledged, with consequent scientific and commercial exploitation. The ever-increasing data production, however, pushes data analytic platforms to their limit. This thesis proposes techniques for more efficient textual big data analysis suitable for the Hadoop analytic platform. This research explores the direct processing of compressed textual data. The focus is on developing novel compression methods with a number of desirable properties to support text-based big data analysis in distributed environments. The novel contributions of this work include the following. Firstly, a Content-aware Partial Compression (CaPC) scheme is developed. CaPC makes a distinction between informational and functional content, in which only the informational content is compressed. Thus, the compressed data is made transparent to existing software libraries, which often rely on functional content to work. Secondly, a context-free bit-oriented compression scheme (Approximated Huffman Compression) based on the Huffman algorithm is developed. This uses a hybrid data structure that allows pattern searching in compressed data in linear time. Thirdly, several modern compression schemes have been extended so that the compressed data can be safely split with respect to logical data records in distributed file systems. Furthermore, an innovative two-layer compression architecture is used, in which each compression layer is appropriate for the corresponding stage of data processing. Peripheral libraries are developed that seamlessly link the proposed compression schemes to existing analytic platforms and computational frameworks, and also make the use of the compressed data transparent to developers. The compression schemes have been evaluated for a number of standard MapReduce analysis tasks using a collection of real-world datasets.
In comparison with existing solutions, they have shown substantial improvement in performance and significant reduction in system resource requirements.
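To illustrate what "safely split with respect to logical data records" can mean in practice, the sketch below packs whole records into independently compressed blocks, so any block can be handed to a separate worker without cutting a record in half. It is an assumption-laden illustration, not the thesis's scheme: it uses Python's standard `zlib` instead of the extended compressors described above, assumes newline-free records, and the names `compress_in_record_blocks`, `read_block` and `max_block_bytes` are hypothetical.

```python
import zlib

def compress_in_record_blocks(records, max_block_bytes):
    """Pack whole newline-terminated records into independently compressed blocks.

    records: iterable of bytes objects, assumed to contain no newline.
    max_block_bytes: soft limit on uncompressed bytes per block (assumed knob).
    """
    blocks, current, size = [], [], 0
    for rec in records:
        line = rec + b"\n"
        # Close the current block before a record would overflow it, so a
        # record never straddles two blocks.
        if current and size + len(line) > max_block_bytes:
            blocks.append(zlib.compress(b"".join(current)))
            current, size = [], 0
        current.append(line)
        size += len(line)
    if current:
        blocks.append(zlib.compress(b"".join(current)))
    return blocks

def read_block(block):
    """Decompress a single block on its own and return its records."""
    return zlib.decompress(block).split(b"\n")[:-1]

# Example with made-up records: the third record starts a new block.
blocks = compress_in_record_blocks([b"alpha", b"beta", b"gamma"], max_block_bytes=11)
```

Because each block decompresses without reference to its neighbours, a distributed job could assign blocks to different workers; the trade-off is a somewhat lower compression ratio than compressing the whole file at once.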
Abstract:
To explore the feasibility of processing Compact Muon Solenoid (CMS) analysis jobs across the wide area network, the FIU CMS Tier-3 center and the Florida CMS Tier-2 center designed a remote data access strategy. A Kerberized Lustre test bed was installed at the Tier-2, designed to provide storage resources to private-facing worker nodes at the Tier-3. However, the Kerberos security layer is not capable of authenticating resources behind a private network. As a remedy, an xrootd server was installed on a public-facing node at the Tier-3 to export the file system to the private-facing worker nodes. We report the performance of CMS analysis jobs processed by the Tier-3 worker nodes accessing data from a Kerberized Lustre file system. The processing performance of this configuration is benchmarked against a direct connection to the Lustre file system and, separately, against a configuration where the xrootd server is near the Lustre file system.
Abstract:
Introduction: In Portugal, the current clinical and financial reality of practising Dental Medicine is quite different from that of the late 20th century, owing to the plethora of dentists and the fall in fees per medical procedure recorded in recent years. Objectives: This work aimed to understand some of the details to consider when opening a clinic focused on treatments in the field of Endodontics, and how to advertise it in a legal and appealing way. It also aims to estimate the minimum cost of performing endodontic treatments with different equipment and to compare the efficacy of treatments using different tools. Materials and Methods: Databases such as PubMed, B-On and Cochrane Library were used as research sources for this work. Google was also used. For the PubMed search, several MeSH descriptors were used, such as “Commerce”, “Dentistry”, “Endodontics”, “Management” and “Marketing”. Various keywords were also used in PubMed and in the other search engines, such as “Anesthesia”, “Dental Office”, “Files”, “Irrigation”, “Magnification”, “Microscope”, “Obturation Systems” and “Rotatory Systems”. Searches were filtered to show only results from 2011 to 2016, and this filter was removed only when no satisfactory or relevant results were found for the topics discussed in the work. A total of 3927 articles were retrieved and 24 were selected. These articles were included on the basis of their bibliographic sources and the quality of the studies they reported. Articles reporting very old studies or outdated technologies, such as older endodontic file systems, were excluded. Conclusions: The new technologies used for endodontic treatment, although expensive, greatly improve patient care.
Abstract:
In a general-purpose cloud system, efficiencies are yet to be had from supporting diverse applications and their requirements within a storage system used for a private cloud. Supporting such diverse requirements poses a significant challenge in a storage system that supports fine-grained configuration of a variety of parameters. This paper uses the Ceph distributed file system, and in particular its global parameters, to show how a single changed parameter can affect performance for a range of access patterns when tested with an OpenStack cloud system.
Abstract:
Business practices vary from one company to another, and they often need to change as business environments change. To satisfy different business practices, enterprise systems need to be customized. To keep up with ongoing business practice changes, enterprise systems need to be adapted. Because of their rigidity and complexity, the customization and adaptation of enterprise systems often take excessive time, with potential failures and budget shortfalls. Moreover, enterprise systems often hold the business back because they cannot be rapidly adapted to support business practice changes. Extensive literature has addressed this issue by identifying success or failure factors, implementation approaches, and project management strategies. Those efforts were aimed at learning lessons from post-implementation experiences to help future projects. This research looks into this issue from a different angle. It attempts to address this issue by delivering a systematic method for developing flexible enterprise systems which can be easily tailored for different business practices or rapidly adapted when business practices change. First, this research examines the role of system models in the context of enterprise system development, and the relationship of system models with software programs in the contexts of computer aided software engineering (CASE), model driven architecture (MDA) and workflow management system (WfMS). Then, by applying the analogical reasoning method, this research introduces the concept of model driven enterprise systems. The novelty of model driven enterprise systems is that they extract system models from software programs and keep system models independent of software programs. In the paradigm of model driven enterprise systems, system models act as instructors to guide and control the behavior of software programs. Software programs function by interpreting instructions in system models.
This mechanism exposes the opportunity to tailor such a system by changing system models. To make this possible, system models should be represented in a language which can be easily understood by human beings and can also be effectively interpreted by computers. In this research, various semantic representations are investigated to support model driven enterprise systems. The significance of this research is 1) the transplantation of the successful structure for flexibility in modern machines and WfMS to enterprise systems; and 2) the advancement of MDA by extending the role of system models from guiding system development to controlling system behaviors. This research contributes to the area relevant to enterprise systems from three perspectives: 1) a new paradigm of enterprise systems, in which enterprise systems consist of two essential elements: system models and software programs. These two elements are loosely coupled and can exist independently; 2) semantic representations, which can effectively represent business entities, entity relationships, business logic and information processing logic in a semantic manner. Semantic representations are the key enabling techniques of model driven enterprise systems; and 3) a new role for system models: traditionally the role of system models is to guide developers in writing system source code, whereas this research promotes system models to controlling the behaviors of enterprise systems.
Abstract:
Mobile applications are being increasingly deployed on a massive scale in various mobile sensor grid database systems. Because mobile devices have limited resources, processing the huge number of queries from mobile users against distributed sensor grid databases becomes a critical problem for such mobile systems. While the fundamental semantic cache technique has been investigated for query optimization in sensor grid database systems, the problem is still difficult because more realistic multi-dimensional constraints have not been considered in existing methods. To solve the problem, a new semantic cache scheme is presented in this paper for location-dependent data queries in distributed sensor grid database systems. It considers multi-dimensional constraints or factors in a unified cost model architecture, determines the parameters of the cost model in the scheme by using the concept of Nash equilibrium from game theory, and makes semantic cache decisions from the established cost model. Scenarios involving three factors (semantics, time and location) are investigated as special cases, which improve on existing methods. Experiments are conducted to demonstrate the semantic cache scheme presented in this paper for distributed sensor grid database systems.
Abstract:
An understanding of application I/O access patterns is useful in several situations. First, gaining insight into what applications are doing with their data at a semantic level helps in designing efficient storage systems. Second, it helps create benchmarks that mimic realistic application behavior closely. Third, it enables autonomic systems as the information obtained can be used to adapt the system in a closed loop. All these use cases require the ability to extract the application-level semantics of I/O operations. Methods such as modifying application code to associate I/O operations with semantic tags are intrusive. It is well known that network file system traces are an important source of information that can be obtained non-intrusively and analyzed either online or offline. These traces are a sequence of primitive file system operations and their parameters. Simple counting, statistical analysis or deterministic search techniques are inadequate for discovering application-level semantics in the general case, because of the inherent variation and noise in realistic traces. In this paper, we describe a trace analysis methodology based on Profile Hidden Markov Models. We show that the methodology has powerful discriminatory capabilities that enable it to recognize applications based on the patterns in the traces, and to mark out regions in a long trace that encapsulate sets of primitive operations that represent higher-level application actions. It is robust enough that it can work around discrepancies between training and target traces such as in length and interleaving with other operations. We demonstrate the feasibility of recognizing patterns based on a small sampling of the trace, enabling faster trace analysis. Preliminary experiments show that the method is capable of learning accurate profile models on live traces in an online setting.
We present a detailed evaluation of this methodology in a UNIX environment using NFS traces of selected commonly used applications such as compilations, as well as on industrial-strength benchmarks such as TPC-C and Postmark, and discuss its capabilities and limitations in the context of the use cases mentioned above.
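The general idea of recognizing an application from its trace by model likelihood can be sketched with a plain discrete hidden Markov model and the forward algorithm. This is a simplified stand-in, not the Profile HMM methodology of the paper: profile HMMs add match, insert and delete states that are omitted here, and the operation names, probabilities and model names below are made up for illustration.

```python
def forward_likelihood(obs, start, trans, emit):
    """Probability of an observation sequence under a discrete HMM.

    obs: non-empty list of symbols (here, file system operation names).
    start[s]: initial probability of state s.
    trans[p][s]: probability of moving from state p to state s.
    emit[s][o]: probability that state s emits symbol o (0 if absent).
    """
    states = list(start)
    # Initialise with the first observation.
    alpha = {s: start[s] * emit[s].get(obs[0], 0.0) for s in states}
    # Forward recursion: sum over all predecessor states at each step.
    for symbol in obs[1:]:
        alpha = {
            s: emit[s].get(symbol, 0.0) * sum(alpha[p] * trans[p][s] for p in states)
            for s in states
        }
    return sum(alpha.values())

def classify(trace, models):
    """Return the name of the model that assigns the trace the highest likelihood."""
    return max(models, key=lambda name: forward_likelihood(trace, **models[name]))

# Hypothetical single-state models with made-up emission probabilities.
models = {
    "read_heavy": dict(start={"s": 1.0}, trans={"s": {"s": 1.0}},
                       emit={"s": {"read": 0.8, "write": 0.2}}),
    "balanced": dict(start={"s": 1.0}, trans={"s": {"s": 1.0}},
                     emit={"s": {"read": 0.5, "write": 0.5}}),
}
label = classify(["read", "read", "read", "write"], models)
```

For long traces the raw probabilities underflow, so a practical version would work with log-probabilities or per-step scaling; that detail is left out to keep the sketch short.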