980 results for Semantic file systems
Abstract:
Free software has lately been gaining increasing weight in companies, yet it remains largely unknown to many people. Since its creation in the 1980s, high-quality free software has grown exponentially, offering tools for every kind of need: office suites, mail clients, file systems, operating systems, and more. This movement has not gone unnoticed by many users and companies, who have taken advantage of it to cover their needs. As for companies, more and more of them use free software to a lesser or greater extent, whether for its lower acquisition cost, its high reliability, its easy adaptability, or to avoid technological lock-in; in short, to have more freedom. When a new company is created and starts its information technology from scratch, that is the least costly moment to implement the IT architecture with free software, since the impact on the company, users, and clients is smallest. Companies that already have an IT system will need to establish a migration plan, whether total or partial. The purpose of this project is not to say which software is better than another or which should be installed, but to introduce the world of free software, show part of this software, and make some comparisons between free and proprietary software, offering ideas and a set of solutions for companies, so that a company can draw implementation ideas from some of the IT solutions presented or follow some of the proposed advice. Many companies already use free software today. Some use it only in a small part of their installations, since running a company 100% on free software, although some are beginning to do so, I still consider somewhat risky for now; in a short time, however, it will become increasingly common.
Abstract:
Performance measurements of computer system components and software provide information that can be used to improve performance and to support hardware procurement decisions. This thesis introduces performance measurement and measurement programs, i.e., so-called benchmark software. Different types of freely available benchmark programs suitable for analyzing the performance of a Linux compute cluster were sought and evaluated. The benchmarks were grouped and assessed by testing their features on a Linux cluster. The thesis also discusses the challenges of making measurements and of parallel computing. Benchmarks were found for many purposes, and they proved to vary in quality and scope. They have also been assembled into software suites, so as to give a broader picture of hardware performance than a single program can provide. It is essential to understand the speed at which data can be moved to the processor from main memory, from disk systems, and from other compute nodes. A typical benchmark program contains a computationally intensive mathematical algorithm of the kind used in scientific software. Depending on the benchmark, understanding and exploiting the results can be challenging.
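As a minimal illustration of the kind of measurement this abstract describes (a sketch of a STREAM-style memory-bandwidth microbenchmark, not any of the benchmarks evaluated in the thesis; the array size and repetition count are arbitrary assumptions):

```python
import time
import numpy as np

def stream_triad(n=50_000_000, reps=10, scalar=3.0):
    """STREAM-style triad a = b + scalar * c; reports effective GB/s.

    n and reps are illustrative assumptions, not values from the thesis.
    """
    b = np.random.rand(n)
    c = np.random.rand(n)
    a = np.empty_like(b)

    best = float("inf")
    for _ in range(reps):
        t0 = time.perf_counter()
        np.multiply(c, scalar, out=a)   # a = scalar * c
        np.add(a, b, out=a)             # a = b + scalar * c
        best = min(best, time.perf_counter() - t0)

    # The triad touches three arrays of 8-byte doubles per pass
    # (read b and c, write a).
    gbytes = 3 * n * 8 / 1e9
    print(f"triad best: {gbytes / best:.2f} GB/s")

if __name__ == "__main__":
    stream_triad()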
Abstract:
We adapt the Shout and Act algorithm to Digital Objects Preservation, where agents explore file systems looking for digital objects to be preserved (victims). When they find something they “shout” so that their agent mates can hear it. The louder the shout, the more urgent or important the finding; louder shouts can also indicate closeness. We perform several experiments to show that this system scales very well, and that heterogeneous teams of agents outperform homogeneous ones over a wide range of task complexities. The target at-risk documents are MS Office documents (including an RTF file) with Excel content or in Excel format. An interesting conclusion from the experiments is that fewer heterogeneous (varying-skill) agents can equal the performance of many homogeneous (combined super-skilled) agents, implying significant performance increases with lower overall cost growth. Our results impact the design of Digital Objects Preservation teams: a properly designed combination of heterogeneous teams is cheaper and more scalable when confronted with uncertain maps of digital objects that need to be preserved. A cost pyramid is proposed for engineers to use for modeling the most effective agent combinations.
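A minimal sketch of the shout-and-respond idea (our illustration, not the authors' code): agents with different skill sets walk a directory tree, and when one finds an at-risk document it broadcasts a shout whose loudness encodes urgency or closeness. The skill sets and the loudness formula below are invented for illustration.

```python
import os

# Illustrative at-risk extensions, following the abstract's MS Office focus.
AT_RISK = {".doc", ".xls", ".rtf", ".ppt"}

class Agent:
    """One explorer with a (possibly partial) skill set -- hypothetical model."""
    def __init__(self, name, skills):
        self.name = name
        self.skills = skills  # extensions this agent can preserve

    def explore(self, root, shouts):
        for dirpath, _dirs, files in os.walk(root):
            for fname in files:
                ext = os.path.splitext(fname)[1].lower()
                if ext in AT_RISK and ext in self.skills:
                    # Invented loudness proxy: shallower (closer) files
                    # shout louder.
                    depth = dirpath.count(os.sep)
                    loudness = 1.0 / (1 + depth)
                    shouts.append((loudness, self.name,
                                   os.path.join(dirpath, fname)))

# Heterogeneous team: each agent handles only a subset of formats.
team = [Agent("a1", {".doc", ".rtf"}), Agent("a2", {".xls", ".ppt"})]
shouts = []
for agent in team:
    agent.explore(".", shouts)
for loudness, who, path in sorted(shouts, reverse=True):
    print(f"{who} shouts {loudness:.2f} about {path}")
```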
Abstract:
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)
Abstract:
Graduate Program in Computer Science - IBILCE
Abstract:
Access control is a fundamental concern in any system that manages resources, e.g., operating systems, file systems, databases, and communications systems. The problem we address is how to specify, enforce, and implement access control in distributed environments. This problem occurs in many applications such as management of distributed project resources, e-newspaper, and pay-TV subscription services. Starting from an access relation between users and resources, we derive a user hierarchy, a resource hierarchy, and a unified hierarchy. The unified hierarchy is then used to specify the access relation in a way that is compact and that allows efficient queries. It is also used in cryptographic schemes that enforce the access relation. We introduce three specific cryptography-based hierarchical schemes, which can effectively enforce and implement access control and are designed for distributed environments because they do not need the presence of a central authority (except perhaps for setup).
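To illustrate the general idea behind hash-based hierarchical key schemes (a sketch of the technique family, not the paper's actual constructions), each node's key can be derived one-way from its parent's key, so a user holding a key high in the unified hierarchy can locally derive keys for everything below it, with no central authority online:

```python
import hashlib

def derive_key(parent_key: bytes, child_id: str) -> bytes:
    """Derive a child's key one-way from its parent's key."""
    return hashlib.sha256(parent_key + child_id.encode()).digest()

# Toy resource hierarchy: root -> projects -> project-a (invented names).
root_key = b"secret-root-key"           # held at the top of the hierarchy
projects_key = derive_key(root_key, "projects")
project_a_key = derive_key(projects_key, "project-a")

# A user granted projects_key can derive project_a_key locally, while the
# one-way hash prevents recovering projects_key from project_a_key.
assert derive_key(projects_key, "project-a") == project_a_key
```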
Abstract:
Disk drives are the bottleneck in the processing of the large amounts of data used in almost all common applications. File systems attempt to reduce this by storing data sequentially on the disk drives, thereby reducing the access latencies. Although this strategy is useful when data is retrieved sequentially, the access patterns in real-world workloads are not necessarily sequential, and this mismatch results in storage I/O performance degradation. This thesis demonstrates that one way to improve storage performance is to reorganize data on disk drives in the same way in which it is mostly accessed. We identify two classes of accesses: static, where access patterns do not change over the lifetime of the data, and dynamic, where access patterns frequently change over short durations of time, and we propose, implement, and evaluate layout strategies for each of these. Our strategies are implemented in a way that they can be seamlessly integrated into or removed from the system as desired. We evaluate our layout strategies for static policies using tree-structured XML data, where accesses to the storage device are mostly of two kinds: parent-to-child or child-to-sibling. Our results show that for a specific class of deep-focused queries, the existing file system layout policy performs better, by 5–54X. For the non-deep-focused queries, our native layout mechanism shows an improvement of 3–127X. To improve the performance of dynamic access patterns, we implement a self-optimizing storage system that rearranges popular blocks on a dedicated partition based on the observed workload characteristics. Our evaluation shows an improvement of over 80% in disk busy times over a range of workloads. These results show that applying knowledge of data access patterns to allocation decisions can substantially improve I/O performance.
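A toy model of the dynamic strategy (a sketch under our own assumptions, not the thesis's implementation): count block accesses over an observation window and remap the hottest blocks to a contiguous fast region, so that popular data ends up laid out the way it is accessed. The window size and hot-set size below are illustrative.

```python
from collections import Counter

class HotBlockRemapper:
    """Track block popularity over a window and remap the hottest blocks
    to a contiguous 'fast' region (illustrative parameters)."""

    def __init__(self, fast_region_blocks=8, window=1000):
        self.counts = Counter()
        self.remap = {}              # logical block -> fast-region slot
        self.fast_region_blocks = fast_region_blocks
        self.window = window
        self.seen = 0

    def access(self, block: int) -> str:
        self.counts[block] += 1
        self.seen += 1
        if self.seen % self.window == 0:
            self._reorganize()
        slot = self.remap.get(block)
        return f"fast[{slot}]" if slot is not None else f"home[{block}]"

    def _reorganize(self):
        # Promote the most popular blocks observed in this window.
        hot = [b for b, _ in self.counts.most_common(self.fast_region_blocks)]
        self.remap = {b: i for i, b in enumerate(hot)}
        self.counts.clear()          # start a fresh observation window
```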
Abstract:
A substantial amount of information on the Internet is present in the form of text. The value of this semi-structured and unstructured data has been widely acknowledged, with consequent scientific and commercial exploitation. The ever-increasing data production, however, pushes data analytic platforms to their limits. This thesis proposes techniques for more efficient textual big data analysis suitable for the Hadoop analytic platform. This research explores the direct processing of compressed textual data. The focus is on developing novel compression methods with a number of desirable properties to support text-based big data analysis in distributed environments. The novel contributions of this work include the following. Firstly, a Content-aware Partial Compression (CaPC) scheme is developed. CaPC makes a distinction between informational and functional content, in which only the informational content is compressed. Thus, the compressed data is made transparent to existing software libraries, which often rely on functional content to work. Secondly, a context-free bit-oriented compression scheme (Approximated Huffman Compression) based on the Huffman algorithm is developed. This uses a hybrid data structure that allows pattern searching in compressed data in linear time. Thirdly, several modern compression schemes have been extended so that the compressed data can be safely split with respect to logical data records in distributed file systems. Furthermore, an innovative two-layer compression architecture is used, in which each compression layer is appropriate for the corresponding stage of data processing. Peripheral libraries are developed that seamlessly link the proposed compression schemes to existing analytic platforms and computational frameworks, and also make the use of the compressed data transparent to developers. The compression schemes have been evaluated for a number of standard MapReduce analysis tasks using a collection of real-world datasets. In comparison with existing solutions, they have shown substantial improvement in performance and significant reduction in system resource requirements.
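To make the informational/functional distinction concrete, here is a toy sketch of the content-aware partial compression idea (our illustration under the assumption of tab-delimited records, not the thesis's actual encoding): each field's content is compressed, while the tab and newline delimiters that record-oriented tools depend on are left intact. Note that for very short fields generic compression actually expands the data; the point here is the preserved structure, not the ratio.

```python
import base64
import zlib

def partial_compress(text: str) -> str:
    """Compress field contents but keep tab/newline structure intact,
    so line/field-oriented readers still see the same record layout.
    Base85 output contains no tabs or newlines, so delimiters stay
    unambiguous."""
    out = []
    for line in text.split("\n"):
        fields = line.split("\t")
        enc = [base64.b85encode(zlib.compress(f.encode())).decode()
               for f in fields]
        out.append("\t".join(enc))
    return "\n".join(out)

def partial_decompress(text: str) -> str:
    """Invert partial_compress, field by field."""
    return "\n".join(
        "\t".join(zlib.decompress(base64.b85decode(f)).decode()
                  for f in line.split("\t"))
        for line in text.split("\n")
    )

sample = "alice\t42\nbob\t7"
assert partial_decompress(partial_compress(sample)) == sample
```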
Abstract:
To explore the feasibility of processing Compact Muon Solenoid (CMS) analysis jobs across the wide area network, the FIU CMS Tier-3 center and the Florida CMS Tier-2 center designed a remote data access strategy. A Kerberized Lustre test bed was installed at the Tier-2, designed to provide storage resources to the private-facing worker nodes at the Tier-3. However, the Kerberos security layer is not capable of authenticating resources behind a private network. As a remedy, an xrootd server was installed on a public-facing node at the Tier-3 to export the file system to the private-facing worker nodes. We report the performance of CMS analysis jobs processed by the Tier-3 worker nodes accessing data from the Kerberized Lustre file system. The processing performance of this configuration is benchmarked against a direct connection to the Lustre file system, and, separately, against a configuration where the xrootd server is located near the Lustre file system.
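For context, a worker node behind such a setup would fetch data through the public-facing xrootd endpoint; the sketch below simply shells out to the standard xrdcp client, and the host and file path are placeholders, not the sites' actual configuration.

```python
import subprocess

# Hypothetical public-facing xrootd host and file path -- placeholders only.
XROOTD_URL = "root://xrootd.example.edu//lustre/cms/store/user/sample.root"

def fetch(url: str, dest: str) -> None:
    """Copy a remote file over xrootd using the standard xrdcp client."""
    subprocess.run(["xrdcp", url, dest], check=True)

if __name__ == "__main__":
    fetch(XROOTD_URL, "/tmp/sample.root")
```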