19 resultados para Data Storage

em Digital Commons at Florida International University


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Currently the data storage industry is facing huge challenges with respect to the conventional method of recording data known as longitudinal magnetic recording. This technology is fast approaching a fundamental physical limit, known as the superparamagnetic limit. A unique way of deferring the superparamagnetic limit incorporates the patterning of magnetic media. This method exploits the use of lithography tools to predetermine the areal density. Various nanofabrication schemes are employed to pattern the magnetic material are Focus Ion Beam (FIB), E-beam Lithography (EBL), UV-Optical Lithography (UVL), Self-assembled Media Synthesis and Nanoimprint Lithography (NIL). Although there are many challenges to manufacturing patterned media, the large potential gains offered in terms of areal density make it one of the most promising new technologies on the horizon for future hard disk drives. Thus, this dissertation contributes to the development of future alternative data storage devices and deferring the superparamagnetic limit by designing and characterizing patterned magnetic media using a novel nanoimprint replication process called "Step and Flash Imprint lithography". As opposed to hot embossing and other high temperature-low pressure processes, SFIL can be performed at low pressure and room temperature. Initial experiments carried out, consisted of process flow design for the patterned structures on sputtered Ni-Fe thin films. The main one being the defectivity analysis for the SFIL process conducted by fabricating and testing devices of varying feature sizes (50 nm to 1 μm) and inspecting them optically as well as testing them electrically. Once the SFIL process was optimized, a number of Ni-Fe coated wafers were imprinted with a template having the patterned topography. A minimum feature size of 40 nm was obtained with varying pitch (1:1, 1:1.5, 1:2, and 1:3). The Characterization steps involved extensive SEM study at each processing step as well as Atomic Force Microscopy (AFM) and Magnetic Force Microscopy (MFM) analysis.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

The increasing amount of available semistructured data demands efficient mechanisms to store, process, and search an enormous corpus of data to encourage its global adoption. Current techniques to store semistructured documents either map them to relational databases, or use a combination of flat files and indexes. These two approaches result in a mismatch between the tree-structure of semistructured data and the access characteristics of the underlying storage devices. Furthermore, the inefficiency of XML parsing methods has slowed down the large-scale adoption of XML into actual system implementations. The recent development of lazy parsing techniques is a major step towards improving this situation, but lazy parsers still have significant drawbacks that undermine the massive adoption of XML. Once the processing (storage and parsing) issues for semistructured data have been addressed, another key challenge to leverage semistructured data is to perform effective information discovery on such data. Previous works have addressed this problem in a generic (i.e. domain independent) way, but this process can be improved if knowledge about the specific domain is taken into consideration. This dissertation had two general goals: The first goal was to devise novel techniques to efficiently store and process semistructured documents. This goal had two specific aims: We proposed a method for storing semistructured documents that maps the physical characteristics of the documents to the geometrical layout of hard drives. We developed a Double-Lazy Parser for semistructured documents which introduces lazy behavior in both the pre-parsing and progressive parsing phases of the standard Document Object Model's parsing mechanism. The second goal was to construct a user-friendly and efficient engine for performing Information Discovery over domain-specific semistructured documents. This goal also had two aims: We presented a framework that exploits the domain-specific knowledge to improve the quality of the information discovery process by incorporating domain ontologies. We also proposed meaningful evaluation metrics to compare the results of search systems over semistructured documents.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

This work is the first work using patterned soft underlayers in multilevel three-dimensional vertical magnetic data storage systems. The motivation stems from an exponentially growing information stockpile, and a corresponding need for more efficient storage devices with higher density. The world information stockpile currently exceeds 150EB (ExaByte=1x1018Bytes); most of which is in analog form. Among the storage technologies (semiconductor, optical and magnetic), magnetic hard disk drives are posed to occupy a big role in personal, network as well as corporate storage. However; this mode suffers from a limit known as the Superparamagnetic limit; which limits achievable areal density due to fundamental quantum mechanical stability requirements. There are many viable techniques considered to defer superparamagnetism into the 100's of Gbit/in2 such as: patterned media, Heat-Assisted Magnetic Recording (HAMR), Self Organized Magnetic Arrays (SOMA), antiferromagnetically coupled structures (AFC), and perpendicular magnetic recording. Nonetheless, these techniques utilize a single magnetic layer; and can thusly be viewed as two-dimensional in nature. In this work a novel three-dimensional vertical magnetic recording approach is proposed. This approach utilizes the entire thickness of a magnetic multilayer structure to store information; with potential areal density well into the Tbit/in2 regime. ^ There are several possible implementations for 3D magnetic recording; each presenting its own set of requirements, merits and challenges. The issues and considerations pertaining to the development of such systems will be examined, and analyzed using empirical and numerical analysis techniques. Two novel key approaches are proposed and developed: (1) Patterned soft underlayer (SUL) which allows for enhanced recording of thicker media, (2) A combinatorial approach for 3D media development that facilitates concurrent investigation of various film parameters on a predefined performance metric. A case study is presented using combinatorial overcoats of Tantalum and Zirconium Oxides for corrosion protection in magnetic media. ^ Feasibility of 3D recording is demonstrated, and an emphasis on 3D media development is emphasized as a key prerequisite. Patterned SUL shows significant enhancement over conventional "un-patterned" SUL, and shows that geometry can be used as a design tool to achieve favorable field distribution where magnetic storage and magnetic phenomena are involved. ^

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Moving objects database systems are the most challenging sub-category among Spatio-Temporal database systems. A database system that updates in real-time the location information of GPS-equipped moving vehicles has to meet even stricter requirements. Currently existing data storage models and indexing mechanisms work well only when the number of moving objects in the system is relatively small. This dissertation research aimed at the real-time tracking and history retrieval of massive numbers of vehicles moving on road networks. A total solution has been provided for the real-time update of the vehicles' location and motion information, range queries on current and history data, and prediction of vehicles' movement in the near future. ^ To achieve these goals, a new approach called Segmented Time Associated to Partitioned Space (STAPS) was first proposed in this dissertation for building and manipulating the indexing structures for moving objects databases. ^ Applying the STAPS approach, an indexing structure of associating a time interval tree to each road segment was developed for real-time database systems of vehicles moving on road networks. The indexing structure uses affordable storage to support real-time data updates and efficient query processing. The data update and query processing performance it provides is consistent without restrictions such as a time window or assuming linear moving trajectories. ^ An application system design based on distributed system architecture with centralized organization was developed to maximally support the proposed data and indexing structures. The suggested system architecture is highly scalable and flexible. Finally, based on a real-world application model of vehicles moving in region-wide, main issues on the implementation of such a system were addressed. ^

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Experimental and theoretical studies regarding noise processes in various kinds of AlGaAs/GaAs heterostructures with a quantum well are reported. The measurement processes, involving a Fast Fourier Transform and analog wave analyzer in the frequency range from 10 Hz to 1 MHz, a computerized data storage and processing system, and cryostat in the temperature range from 78 K to 300 K are described in detail. The current noise spectra are obtained with the “three-point method”, using a Quan-Tech and avalanche noise source for calibration. ^ The properties of both GaAs and AlGaAs materials and field effect transistors, based on the two-dimensional electron gas in the interface quantum well, are discussed. Extensive measurements are performed in three types of heterostructures, viz., Hall structures with a large spacer layer, modulation-doped non-gated FETs, and more standard gated FETs; all structures are grown by MBE techniques. ^ The Hall structures show Lorentzian generation-recombination noise spectra with near temperature independent relaxation times. This noise is attributed to g-r processes in the 2D electron gas. For the TEGFET structures, we observe several Lorentzian g-r noise components which have strongly temperature dependent relaxation times. This noise is attributed to trapping processes in the doped AlGaAs layer. The trap level energies are determined from an Arrhenius plot of log (τT2) versus 1/T as well as from the plateau values. The theory to interpret these measurements and to extract the defect level data is reviewed and further developed. Good agreement with the data is found for all reported devices. ^

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Moving objects database systems are the most challenging sub-category among Spatio-Temporal database systems. A database system that updates in real-time the location information of GPS-equipped moving vehicles has to meet even stricter requirements. Currently existing data storage models and indexing mechanisms work well only when the number of moving objects in the system is relatively small. This dissertation research aimed at the real-time tracking and history retrieval of massive numbers of vehicles moving on road networks. A total solution has been provided for the real-time update of the vehicles’ location and motion information, range queries on current and history data, and prediction of vehicles’ movement in the near future. To achieve these goals, a new approach called Segmented Time Associated to Partitioned Space (STAPS) was first proposed in this dissertation for building and manipulating the indexing structures for moving objects databases. Applying the STAPS approach, an indexing structure of associating a time interval tree to each road segment was developed for real-time database systems of vehicles moving on road networks. The indexing structure uses affordable storage to support real-time data updates and efficient query processing. The data update and query processing performance it provides is consistent without restrictions such as a time window or assuming linear moving trajectories. An application system design based on distributed system architecture with centralized organization was developed to maximally support the proposed data and indexing structures. The suggested system architecture is highly scalable and flexible. Finally, based on a real-world application model of vehicles moving in region-wide, main issues on the implementation of such a system were addressed.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

The Everglades R-EMAP project for year 2005 produced large quantities of data collected at 232 sampling sites. Data collection and analysis is an on-going long-term activity conducted by scientists of different disciplines at irregular intervals of several years. The data sets collected for 2005 include bio-geo-chemical (including mercury and hydro period), fish, invertebrate, periphyton, and plant data. Each sampling site is associated with a location, a description of the site to provide a general overview and photographs to provide a pictorial impression. The Geographic Information Systems and Remote Sensing Center(GISRSC) at Florida International University (FIU) has designed and implemented an enterprise database for long-term storage of the project�s data in a central repository, providing the framework of data storage for the continuity of future sampling campaigns and allowing integration of new sample data as it becomes available. In addition GISRSC provides this interactive web application for easy, quick and effective retrieval and visualization of that data.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The virtual quadrilateral is the coalescence of novel data structures that reduces the storage requirements of spatial data without jeopardizing the quality and operability of the inherent information. The data representative of the observed area is parsed to ascertain the necessary contiguous measures that, when contained, implicitly define a quadrilateral. The virtual quadrilateral then represents a geolocated area of the observed space where all of the measures are the same. The area, contoured as a rectangle, is pseudo-delimited by the opposite coordinates of the bounding area. Once defined, the virtual quadrilateral is representative of an area in the observed space and is represented in a database by the attributes of its bounding coordinates and measure of its contiguous space. Virtual quadrilaterals have been found to ensure a lossless reduction of the physical storage, maintain the implied features of the data, facilitate the rapid retrieval of vast amount of the represented spatial data and accommodate complex queries. The methods presented herein demonstrate that virtual quadrilaterals are created quite easily, are stable and versatile objects in a database and have proven to be beneficial to exigent spatial data applications such as geographic information systems. ^

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This research presents several components encompassing the scope of the objective of Data Partitioning and Replication Management in Distributed GIS Database. Modern Geographic Information Systems (GIS) databases are often large and complicated. Therefore data partitioning and replication management problems need to be addresses in development of an efficient and scalable solution. ^ Part of the research is to study the patterns of geographical raster data processing and to propose the algorithms to improve availability of such data. These algorithms and approaches are targeting granularity of geographic data objects as well as data partitioning in geographic databases to achieve high data availability and Quality of Service(QoS) considering distributed data delivery and processing. To achieve this goal a dynamic, real-time approach for mosaicking digital images of different temporal and spatial characteristics into tiles is proposed. This dynamic approach reuses digital images upon demand and generates mosaicked tiles only for the required region according to user's requirements such as resolution, temporal range, and target bands to reduce redundancy in storage and to utilize available computing and storage resources more efficiently. ^ Another part of the research pursued methods for efficient acquiring of GIS data from external heterogeneous databases and Web services as well as end-user GIS data delivery enhancements, automation and 3D virtual reality presentation. ^ There are vast numbers of computing, network, and storage resources idling or not fully utilized available on the Internet. Proposed "Crawling Distributed Operating System "(CDOS) approach employs such resources and creates benefits for the hosts that lend their CPU, network, and storage resources to be used in GIS database context. ^ The results of this dissertation demonstrate effective ways to develop a highly scalable GIS database. The approach developed in this dissertation has resulted in creation of TerraFly GIS database that is used by US government, researchers, and general public to facilitate Web access to remotely-sensed imagery and GIS vector information. ^

Relevância:

30.00% 30.00%

Publicador:

Resumo:

With the proliferation of multimedia data and ever-growing requests for multimedia applications, there is an increasing need for efficient and effective indexing, storage and retrieval of multimedia data, such as graphics, images, animation, video, audio and text. Due to the special characteristics of the multimedia data, the Multimedia Database management Systems (MMDBMSs) have emerged and attracted great research attention in recent years. Though much research effort has been devoted to this area, it is still far from maturity and there exist many open issues. In this dissertation, with the focus of addressing three of the essential challenges in developing the MMDBMS, namely, semantic gap, perception subjectivity and data organization, a systematic and integrated framework is proposed with video database and image database serving as the testbed. In particular, the framework addresses these challenges separately yet coherently from three main aspects of a MMDBMS: multimedia data representation, indexing and retrieval. In terms of multimedia data representation, the key to address the semantic gap issue is to intelligently and automatically model the mid-level representation and/or semi-semantic descriptors besides the extraction of the low-level media features. The data organization challenge is mainly addressed by the aspect of media indexing where various levels of indexing are required to support the diverse query requirements. In particular, the focus of this study is to facilitate the high-level video indexing by proposing a multimodal event mining framework associated with temporal knowledge discovery approaches. With respect to the perception subjectivity issue, advanced techniques are proposed to support users' interaction and to effectively model users' perception from the feedback at both the image-level and object-level.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Disk drives are the bottleneck in the processing of large amounts of data used in almost all common applications. File systems attempt to reduce this by storing data sequentially on the disk drives, thereby reducing the access latencies. Although this strategy is useful when data is retrieved sequentially, the access patterns in real world workloads is not necessarily sequential and this mismatch results in storage I/O performance degradation. This thesis demonstrates that one way to improve the storage performance is to reorganize data on disk drives in the same way in which it is mostly accessed. We identify two classes of accesses: static, where access patterns do not change over the lifetime of the data and dynamic, where access patterns frequently change over short durations of time, and propose, implement and evaluate layout strategies for each of these. Our strategies are implemented in a way that they can be seamlessly integrated or removed from the system as desired. We evaluate our layout strategies for static policies using tree-structured XML data where accesses to the storage device are mostly of two kinds—parent-to-child or child-to-sibling. Our results show that for a specific class of deep-focused queries, the existing file system layout policy performs better by 5–54X. For the non-deep-focused queries, our native layout mechanism shows an improvement of 3–127X. To improve performance of the dynamic access patterns, we implement a self-optimizing storage system that performs rearranges popular block accesses on a dedicated partition based on the observed workload characteristics. Our evaluation shows an improvement of over 80% in the disk busy times over a range of workloads. These results show that applying the knowledge of data access patterns for allocation decisions can substantially improve the I/O performance.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The increasing amount of available semistructured data demands efficient mechanisms to store, process, and search an enormous corpus of data to encourage its global adoption. Current techniques to store semistructured documents either map them to relational databases, or use a combination of flat files and indexes. These two approaches result in a mismatch between the tree-structure of semistructured data and the access characteristics of the underlying storage devices. Furthermore, the inefficiency of XML parsing methods has slowed down the large-scale adoption of XML into actual system implementations. The recent development of lazy parsing techniques is a major step towards improving this situation, but lazy parsers still have significant drawbacks that undermine the massive adoption of XML. ^ Once the processing (storage and parsing) issues for semistructured data have been addressed, another key challenge to leverage semistructured data is to perform effective information discovery on such data. Previous works have addressed this problem in a generic (i.e. domain independent) way, but this process can be improved if knowledge about the specific domain is taken into consideration. ^ This dissertation had two general goals: The first goal was to devise novel techniques to efficiently store and process semistructured documents. This goal had two specific aims: We proposed a method for storing semistructured documents that maps the physical characteristics of the documents to the geometrical layout of hard drives. We developed a Double-Lazy Parser for semistructured documents which introduces lazy behavior in both the pre-parsing and progressive parsing phases of the standard Document Object Model’s parsing mechanism. ^ The second goal was to construct a user-friendly and efficient engine for performing Information Discovery over domain-specific semistructured documents. This goal also had two aims: We presented a framework that exploits the domain-specific knowledge to improve the quality of the information discovery process by incorporating domain ontologies. We also proposed meaningful evaluation metrics to compare the results of search systems over semistructured documents. ^

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Storage is a central part of computing. Driven by exponentially increasing content generation rate and a widening performance gap between memory and secondary storage, researchers are in the perennial quest to push for further innovation. This has resulted in novel ways to "squeeze" more capacity and performance out of current and emerging storage technology. Adding intelligence and leveraging new types of storage devices has opened the door to a whole new class of optimizations to save cost, improve performance, and reduce energy consumption. In this dissertation, we first develop, analyze, and evaluate three storage extensions. Our first extension tracks application access patterns and writes data in the way individual applications most commonly access it to benefit from the sequential throughput of disks. Our second extension uses a lower power flash device as a cache to save energy and turn off the disk during idle periods. Our third extension is designed to leverage the characteristics of both disks and solid state devices by placing data in the most appropriate device to improve performance and save power. In developing these systems, we learned that extending the storage stack is a complex process. Implementing new ideas incurs a prolonged and cumbersome development process and requires developers to have advanced knowledge of the entire system to ensure that extensions accomplish their goal without compromising data recoverability. Futhermore, storage administrators are often reluctant to deploy specific storage extensions without understanding how they interact with other extensions and if the extension ultimately achieves the intended goal. We address these challenges by using a combination of approaches. First, we simplify the storage extension development process with system-level infrastructure that implements core functionality commonly needed for storage extension development. Second, we develop a formal theory to assist administrators deploy storage extensions while guaranteeing that the given high level goals are satisfied. There are, however, some cases for which our theory is inconclusive. For such scenarios we present an experimental methodology that allows administrators to pick an extension that performs best for a given workload. Our evaluation demostrates the benefits of both the infrastructure and the formal theory.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Electrical energy is an essential resource for the modern world. Unfortunately, its price has almost doubled in the last decade. Furthermore, energy production is also currently one of the primary sources of pollution. These concerns are becoming more important in data-centers. As more computational power is required to serve hundreds of millions of users, bigger data-centers are becoming necessary. This results in higher electrical energy consumption. Of all the energy used in data-centers, including power distribution units, lights, and cooling, computer hardware consumes as much as 80%. Consequently, there is opportunity to make data-centers more energy efficient by designing systems with lower energy footprint. Consuming less energy is critical not only in data-centers. It is also important in mobile devices where battery-based energy is a scarce resource. Reducing the energy consumption of these devices will allow them to last longer and re-charge less frequently. Saving energy in computer systems is a challenging problem. Improving a system's energy efficiency usually comes at the cost of compromises in other areas such as performance or reliability. In the case of secondary storage, for example, spinning-down the disks to save energy can incur high latencies if they are accessed while in this state. The challenge is to be able to increase the energy efficiency while keeping the system as reliable and responsive as before. This thesis tackles the problem of improving energy efficiency in existing systems while reducing the impact on performance. First, we propose a new technique to achieve fine grained energy proportionality in multi-disk systems; Second, we design and implement an energy-efficient cache system using flash memory that increases disk idleness to save energy; Finally, we identify and explore solutions for the page fetch-before-update problem in caching systems that can: (a) control better I/O traffic to secondary storage and (b) provide critical performance improvement for energy efficient systems.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Over the past five years, XML has been embraced by both the research and industrial community due to its promising prospects as a new data representation and exchange format on the Internet. The widespread popularity of XML creates an increasing need to store XML data in persistent storage systems and to enable sophisticated XML queries over the data. The currently available approaches to addressing the XML storage and retrieval issue have the limitations of either being not mature enough (e.g. native approaches) or causing inflexibility, a lot of fragmentation and excessive join operations (e.g. non-native approaches such as the relational database approach). ^ In this dissertation, I studied the issue of storing and retrieving XML data using the Semantic Binary Object-Oriented Database System (Sem-ODB) to leverage the advanced Sem-ODB technology with the emerging XML data model. First, a meta-schema based approach was implemented to address the data model mismatch issue that is inherent in the non-native approaches. The meta-schema based approach captures the meta-data of both Document Type Definitions (DTDs) and Sem-ODB Semantic Schemas, thus enables a dynamic and flexible mapping scheme. Second, a formal framework was presented to ensure precise and concise mappings. In this framework, both schemas and the conversions between them are formally defined and described. Third, after major features of an XML query language, XQuery, were analyzed, a high-level XQuery to Semantic SQL (Sem-SQL) query translation scheme was described. This translation scheme takes advantage of the navigation-oriented query paradigm of the Sem-SQL, thus avoids the excessive join problem of relational approaches. Finally, the modeling capability of the Semantic Binary Object-Oriented Data Model (Sem-ODM) was explored from the perspective of conceptually modeling an XML Schema using a Semantic Schema. ^ It was revealed that the advanced features of the Sem-ODB, such as multi-valued attributes, surrogates, the navigation-oriented query paradigm, among others, are indeed beneficial in coping with the XML storage and retrieval issue using a non-XML approach. Furthermore, extensions to the Sem-ODB to make it work more effectively with XML data were also proposed. ^