987 results for Data placement
Abstract:
Data grid services have been used to deal with the increasing needs of applications in terms of data volume and throughput. The large scale, heterogeneity and dynamism of grid environments often make management and tuning of these data services very complex. Furthermore, current high-performance I/O approaches are characterized by their high complexity and specific features that usually require specialized administrator skills. Autonomic computing can help manage this complexity. The present paper describes an autonomic subsystem designed to provide self-management features that efficiently alleviate the I/O problem in a grid environment, thereby enhancing the quality of service (QoS) of data access and storage services in the grid. Our proposal takes into account that data produced in an I/O system is not usually required immediately. Therefore, performance improvements are related not only to current but also to future I/O accesses, as the actual data access usually occurs later on. Nevertheless, the exact time of the next I/O operations is unknown. Thus, our approach relies on long-term prediction to forecast the future workload of grid components. This enables the autonomic subsystem to determine the optimal data placement to improve both current and future I/O operations.
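A minimal sketch of the kind of long-term workload forecasting the abstract describes, not the authors' actual autonomic subsystem: each storage node's historical I/O load is smoothed exponentially, and new data is placed on the node with the lowest predicted future load. The node names, the smoothing model and the selection rule are illustrative assumptions.

```python
# Hypothetical sketch: forecast each node's future I/O load and place new data
# on the least-loaded node. Not the subsystem described in the abstract.

def forecast_load(history, alpha=0.3):
    """Exponentially smoothed estimate of a node's future I/O load."""
    estimate = history[0]
    for observed in history[1:]:
        estimate = alpha * observed + (1 - alpha) * estimate
    return estimate

def choose_placement(node_histories):
    """Pick the node whose forecast load is lowest for the next placement."""
    forecasts = {node: forecast_load(h) for node, h in node_histories.items()}
    return min(forecasts, key=forecasts.get)

if __name__ == "__main__":
    histories = {
        "storage-node-a": [120, 135, 150, 170],  # steadily growing load
        "storage-node-b": [300, 250, 180, 140],  # decreasing load
        "storage-node-c": [200, 210, 190, 205],  # roughly flat load
    }
    print(choose_placement(histories))  # -> storage-node-a
```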
Abstract:
In today’s big data world, data is being produced in massive volumes, at great velocity and from a variety of different sources such as mobile devices, sensors, a plethora of small devices hooked to the internet (Internet of Things), social networks, communication networks and many others. Interactive querying and large-scale analytics are being increasingly used to derive value out of this big data. A large portion of this data is being stored and processed in the Cloud due to the several advantages provided by the Cloud such as scalability, elasticity, availability, low cost of ownership and the overall economies of scale. There is thus a growing need for large-scale cloud-based data management systems that can support real-time ingest, storage and processing of large volumes of heterogeneous data. However, in the pay-as-you-go Cloud environment, the cost of analytics can grow linearly with the time and resources required. Reducing the cost of data analytics in the Cloud thus remains a primary challenge. In my dissertation research, I have focused on building efficient and cost-effective cloud-based data management systems for different application domains that are predominant in cloud computing environments. In the first part of my dissertation, I address the problem of reducing the cost of transactional workloads on relational databases to support database-as-a-service in the Cloud. The primary challenges in supporting such workloads include choosing how to partition the data across a large number of machines, minimizing the number of distributed transactions, providing high data availability, and tolerating failures gracefully. I have designed, built and evaluated SWORD, an end-to-end scalable online transaction processing system that utilizes workload-aware data placement and replication to minimize the number of distributed transactions, and that incorporates a suite of novel techniques to significantly reduce the overheads incurred both during the initial placement of data and during query execution at runtime. In the second part of my dissertation, I focus on sampling-based progressive analytics as a means to reduce the cost of data analytics in the relational domain. Sampling has been traditionally used by data scientists to get progressive answers to complex analytical tasks over large volumes of data. Typically, this involves manually extracting samples of increasing data size (progressive samples) for exploratory querying. This provides the data scientists with user control, repeatable semantics, and result provenance. However, such solutions result in tedious workflows that preclude the reuse of work across samples. On the other hand, existing approximate query processing systems report early results, but do not offer the above benefits for complex ad-hoc queries. I propose a new progressive data-parallel computation framework, NOW!, that provides support for progressive analytics over big data. In particular, NOW! enables progressive relational (SQL) query support in the Cloud using unique progress semantics that allow efficient and deterministic query processing over samples, providing meaningful early results and provenance to data scientists. NOW! enables the provision of early results using significantly fewer resources, thereby enabling a substantial reduction in the cost incurred during such analytics. Finally, I propose NSCALE, a system for efficient and cost-effective complex analytics on large-scale graph-structured data in the Cloud.
The system is based on the key observation that a wide range of complex analysis tasks over graph data require processing and reasoning about a large number of multi-hop neighborhoods or subgraphs in the graph; examples include ego network analysis, motif counting in biological networks, finding social circles in social networks, personalized recommendations, link prediction, etc. These tasks are not well served by existing vertex-centric graph processing frameworks, whose computation and execution models limit the user program to directly accessing the state of a single vertex, resulting in high execution overheads. Further, the lack of support for extracting the relevant portions of the graph that are of interest to an analysis task and loading them into distributed memory leads to poor scalability. NSCALE allows users to write programs at the level of neighborhoods or subgraphs rather than at the level of vertices, and to declaratively specify the subgraphs of interest. It enables the efficient distributed execution of these neighborhood-centric complex analysis tasks over large-scale graphs, while minimizing resource consumption and communication cost, thereby substantially reducing the overall cost of graph data analytics in the Cloud. The results of our extensive experimental evaluation of these prototypes with several real-world data sets and applications validate the effectiveness of our techniques, which provide orders-of-magnitude reductions in the overheads of distributed data querying and analysis in the Cloud.
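A hedged illustration of the workload-aware placement idea behind SWORD, not its actual algorithm: transactions define a co-access graph over data items, and each item is greedily assigned to the partition that already holds most of its co-accessed neighbours, subject to a capacity limit, so that fewer transactions span partitions. The function names, the greedy heuristic and the toy workload are assumptions.

```python
# Toy workload-aware placement: put each item where most of its co-accessed
# neighbours already live, so fewer transactions become distributed.
from collections import defaultdict

def place(transactions, num_partitions, capacity):
    co_access = defaultdict(lambda: defaultdict(int))
    for txn in transactions:
        for a in txn:
            for b in txn:
                if a != b:
                    co_access[a][b] += 1

    assignment, load = {}, [0] * num_partitions
    # Handle the most-connected items first so their neighbours can follow them.
    order = sorted(co_access, key=lambda item: -sum(co_access[item].values()))
    for item in order:
        scores = [0] * num_partitions
        for neighbour, weight in co_access[item].items():
            if neighbour in assignment:
                scores[assignment[neighbour]] += weight
        # Best-scoring partition that still has room.
        candidates = [p for p in range(num_partitions) if load[p] < capacity]
        best = max(candidates, key=lambda p: scores[p])
        assignment[item] = best
        load[best] += 1
    return assignment

if __name__ == "__main__":
    txns = [("u1", "u2"), ("u1", "u2"), ("u3", "u4"), ("u2", "u3")]
    print(place(txns, num_partitions=2, capacity=2))
```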
Abstract:
Datacenters have emerged as the dominant form of computing infrastructure over the last two decades. The tremendous increase in the requirements of data analysis has led to a proportional increase in power consumption, and datacenters are now one of the fastest growing electricity consumers in the United States. Another rising concern is the loss of throughput due to network congestion. Scheduling models that do not explicitly account for data placement may lead to a transfer of large amounts of data over the network, causing unacceptable delays. In this dissertation, we study different scheduling models that are inspired by the dual objectives of minimizing energy costs and network congestion in a datacenter. As datacenters are equipped to handle peak workloads, the average server utilization in most datacenters is very low. As a result, one can achieve huge energy savings by selectively shutting down machines when demand is low. In this dissertation, we introduce the network-aware machine activation problem to find a schedule that simultaneously minimizes the number of machines necessary and the congestion incurred in the network. Our model significantly generalizes well-studied combinatorial optimization problems such as hard-capacitated hypergraph covering and is thus strongly NP-hard. As a result, we focus on finding good approximation algorithms. Data-parallel computation frameworks such as MapReduce have popularized the design of applications that require a large amount of communication between different machines. Efficient scheduling of these communication demands is essential to guarantee efficient execution of the different applications. In the second part of the thesis, we study the approximability of the co-flow scheduling problem that has been recently introduced to capture these application-level demands. Finally, we also study the question, "In what order should one process jobs?" Often, precedence constraints specify a partial order over the set of jobs and the objective is to find suitable schedules that satisfy the partial order. However, in the presence of hard deadline constraints, it may be impossible to find a schedule that satisfies all precedence constraints. In this thesis we formalize different variants of job scheduling with soft precedence constraints and conduct the first systematic study of these problems.
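A toy rendering of the machine-activation trade-off studied in the first part, offered as a hedged sketch: it greedily activates the machine that can host the most still-unassigned jobs until every job has a host, capturing the set-cover flavour of the problem while ignoring the network-congestion term and the thesis's approximation guarantees. The data and function names are assumptions.

```python
# Greedy machine activation: open machines one at a time, always choosing the
# one that covers the most unassigned jobs. Ignores congestion entirely.

def activate(machines, jobs):
    """machines: dict name -> set of jobs it can run; returns activated names."""
    remaining = set(jobs)
    active = []
    while remaining:
        best = max(machines, key=lambda m: len(machines[m] & remaining))
        covered = machines[best] & remaining
        if not covered:
            raise ValueError("some jobs cannot be placed on any machine")
        active.append(best)
        remaining -= covered
    return active

if __name__ == "__main__":
    machines = {
        "m1": {"j1", "j2", "j3"},
        "m2": {"j3", "j4"},
        "m3": {"j4", "j5"},
    }
    print(activate(machines, ["j1", "j2", "j3", "j4", "j5"]))  # ['m1', 'm3']
```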
Abstract:
Study Design. In vitro biomechanical investigation of screw-holding capacity. Objective. To evaluate the effect of repetitive screw-hole use on the insertional torque and retentive strength of vertebral system screws. Summary of Background Data. Placement and removal of vertebral system screws is sometimes necessary during surgical procedures in order to assess the walls of the pilot hole. This procedure may compromise the holding capacity of the implant. Methods. Screws with outer diameters measuring 5, 6, and 7 mm were inserted into wood, polyurethane, polyethylene, and cancellous bone cylindrical blocks. The pilot holes were made with drills of a smaller, equal, or wider diameter than the inner screw diameter. Three experimental groups were established based on the number of insertions and reinsertions of the screws, and subgroups were created according to the outer diameter of the screw and the diameter of the pilot hole used. Results. A reduction of screw-holding capacity was observed between the first and the following insertions regardless of the anchorage material. The pattern of reduction of retentive strength was not similar to the pattern of torque reduction. The reduction in pullout strength was more pronounced between the first and the last insertions, while the torque decreased more proportionally from the first to the last insertion. Conclusion. Insertion and reinsertion of the screws of the vertebral fixation system used in the present study reduced the insertion torque and screw purchase.
Abstract:
A Data Analytics project has been launched in CMS and, within it, a specific pilot activity that aims to exploit Machine Learning techniques to predict the popularity of CMS datasets. This is a very delicate observable, whose prediction, if achievable, would allow CMS to build smarter data placement models and broad optimizations of storage usage at all Tier levels, and would form the basis for the introduction of a solid dynamic and adaptive data management system. This thesis describes the work done to tackle this challenge using a new pilot prototype called DCAFPilot, written entirely in Python.
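A minimal sketch in the spirit of the popularity-prediction activity described above, not DCAFPilot's actual pipeline: synthetic per-dataset features are used to train a classifier that flags datasets likely to become popular. The feature set, popularity threshold and model choice are all assumptions for illustration.

```python
# Hypothetical dataset-popularity classifier; features and labels are synthetic.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Assumed features per dataset: size (GB), number of files, past accesses,
# age (weeks). A dataset is labelled "popular" if past accesses exceed a cut.
X = rng.random((500, 4)) * [1000, 200, 50, 100]
y = (X[:, 2] > 25).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)
print("held-out accuracy:", model.score(X_test, y_test))
```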
Abstract:
Data mining is one of the hottest research areas nowadays, as it has a wide variety of applications in everyday life. It is all about finding interesting hidden patterns in a huge historical database. As an example, from a sales database one can find an interesting pattern such as "people who buy magazines tend to buy newspapers also" using data mining. From the sales point of view, the advantage is that these items can be placed together in the shop to increase sales. In this research work, data mining is applied to a domain called placement chance prediction, since making a wise career decision is crucial for anybody. In India, technical manpower analysis is carried out by an organization named the National Technical Manpower Information System (NTMIS), established in 1983-84 by India's Ministry of Education & Culture. The NTMIS comprises a lead centre in the IAMR, New Delhi, and 21 nodal centres located in different parts of the country. The Kerala State Nodal Centre is located at Cochin University of Science and Technology. The Nodal Centre collects placement information by sending postal questionnaires to graduated students on a regular basis. From this raw data available in the nodal centre, a history database was prepared. Each record in this database includes entrance rank range, reservation, sector, sex, and a particular engineering branch. For each such combination of attributes from the history database of student records, the corresponding placement chance is computed and stored in the history database. From this data, various popular data mining models are built and tested. These models can be used to predict the most suitable branch for a new student with one of the above combinations of criteria. A detailed performance comparison of the various data mining models is also carried out. This research work proposes to use a combination of data mining models, namely a hybrid stacking ensemble, for better predictions. A strategy to predict the overall absorption rate for various branches, as well as the time it takes for all the students of a particular branch to get placed, is also proposed. Finally, this research work puts forward a new data mining algorithm, namely C 4.5 * stat, for numeric data sets, which has been shown to have competitive accuracy on the standard UCI benchmark data sets. It also proposes an optimization strategy called parameter tuning to improve the standard C 4.5 algorithm. In summary, this research work passes through all four dimensions of a typical data mining research work, namely application to a domain, development of classifier models, optimization and ensemble methods.
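Since the thesis's hybrid stacking ensemble is not reproduced here, the sketch below shows a generic stacking ensemble for a placement-chance style classification task using scikit-learn; the base learners, meta-learner and synthetic data are assumptions, with a depth-limited decision tree standing in for the C 4.5 family.

```python
# Generic stacking ensemble for a placement-chance style prediction task.
# Base learners, meta-learner and data are illustrative assumptions.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

# Stand-in for student records (rank range, reservation, sector, sex, branch).
X, y = make_classification(n_samples=600, n_features=5, n_informative=3,
                           random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

stack = StackingClassifier(
    estimators=[
        ("tree", DecisionTreeClassifier(max_depth=5)),    # C4.5-like stand-in
        ("forest", RandomForestClassifier(n_estimators=50, random_state=42)),
        ("nb", GaussianNB()),
    ],
    final_estimator=LogisticRegression(),                 # meta-learner
)
stack.fit(X_train, y_train)
print("stacked accuracy:", stack.score(X_test, y_test))
```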
Abstract:
The objective of this study was to assess implant therapy after a staged guided bone regeneration procedure in the anterior maxilla by lateralization of the nasopalatine nerve and vessel bundle. Neurosensory function following the augmentative procedures and implant placement, assessed using a standardized questionnaire and clinical examination, was the primary outcome variable measured. This retrospective study included patients with a bone defect in the anterior maxilla in need of horizontal and/or vertical ridge augmentation prior to dental implant placement. The surgical sites were allowed to heal for at least 6 months before placement of dental implants. All patients received fixed implant-supported restorations and entered into a tightly scheduled maintenance program. In addition to the maintenance program, patients were recalled for a clinical examination and to fill out a questionnaire to assess any changes in the neurosensory function of the nasopalatine nerve at least 6 months after function. Twenty patients were included in the study from February 2001 to December 2010. They received a total of 51 implants after augmentation of the alveolar crest and lateralization of the nasopalatine nerve. The follow-up examination for questionnaire and neurosensory assessment was scheduled after a mean period of 4.18 years of function. None of the patients examined reported any pain, reduced or altered sensation, or a "foreign body" feeling in the area of surgery. Overall, 6 patients out of 20 (30%) showed palatal sensibility alterations of the soft tissues in the region of the maxillary canines and incisors, resulting in a risk for a neurosensory change of 0.45 mucosal tooth regions per patient after ridge augmentation with lateralization of the nasopalatine nerve. Regeneration of bone defects in the anterior maxilla by horizontal and/or vertical ridge augmentation and lateralization of the nasopalatine nerve prior to dental implant placement is a predictable surgical technique. Even where there were clinically measurable impairments of neurosensory function, the patients did not report them or were not bothered by them.
Abstract:
Few valid and reliable placement procedures are available to assess the English language proficiency of adults who enroll in English for Speakers of Other Languages (ESOL) programs. Whereas placement material exists for children and university ESOL students, the needs of students in adult community education programs have not been adequately addressed. Furthermore, the research suggests that a number of variables, such as native language, age, prior schooling, length of residence, and employment, are related to second language acquisition. Numerous studies contribute to our understanding of the relationship of these factors to second language acquisition of Spanish-speaking students. Again, there is a void in the research investigating the factors affecting second language acquisition and, consequently, appropriate placement of Haitian Creole-speaking students. This study compared a standardized instrument, the NYS Place Test, used alone and in combination with a writing sample in English, to the subjective judgement of a department coordinator for initial placement of Haitian adult ESOL students in a community education program. The study also investigated whether or not consideration of student profile data improved the accuracy of the test. Finally, the study sought to determine if a relationship existed between student profile data and those who withdrew from the program or did not enter a class after registering. Analysis of the data by crosstabulation and chi-square revealed that the standardized NYS Place Test was at least as accurate as subjective department coordinator placement and that one procedure could be substituted for the other. Although the writing sample in English improved accuracy of placement by the NYS test, the results were not significant. Of the profile variables, only length of residence was found to be significantly related to accuracy of placement using the NYS Place Test. The number of incorrect placements was higher for those students who had lived in the host country from twenty-five to one hundred ten months. A post hoc analysis of NYS test scores according to level showed that those learners who placed in level three also had a significantly higher incidence of incorrect placements. No significant relationship was observed between the profile variables and those who withdrew from the program or registered but did not enter a class.
Abstract:
Background: Polypodium hydriforme is a parasite with an unusual life cycle and peculiar morphology, both of which have made its systematic position uncertain. Polypodium has traditionally been considered a cnidarian because it possesses nematocysts, the stinging structures characteristic of this phylum. However, recent molecular phylogenetic studies using 18S rDNA sequence data have challenged this interpretation, and have shown that Polypodium is a close relative of myxozoans and that together they share a closer affinity to bilaterians than to cnidarians. Due to the variable rates of 18S rDNA sequences, these results have been suggested to be an artifact of long-branch attraction (LBA). A recent study, using multiple protein coding markers, shows that the myxozoan Buddenbrockia is nested within cnidarians. Polypodium was not included in this study. To further investigate the phylogenetic placement of Polypodium, we have performed phylogenetic analyses of metazoans with 18S and partial 28S rDNA sequences in a large dataset that includes Polypodium and a comprehensive sampling of cnidarian taxa. Results: Analyses of a combined dataset of 18S and partial 28S sequences, and of partial 28S alone, support the placement of Polypodium within Cnidaria. Removal of the long-branched myxozoans from the 18S dataset also results in Polypodium being nested within Cnidaria. These results suggest that previous reports showing that Polypodium and Myxozoa form a sister group to Bilateria were an artifact of long-branch attraction. Conclusion: By including 28S rDNA sequences and a comprehensive sampling of cnidarian taxa, we demonstrate that previously conflicting hypotheses concerning the phylogenetic placement of Polypodium can be reconciled. Specifically, the data presented provide evidence that Polypodium is indeed a cnidarian and is either the sister taxon to Hydrozoa, or part of the hydrozoan clade, Leptothecata. The former hypothesis is consistent with the traditional view that Polypodium should be placed in its own cnidarian class, Polypodiozoa.
Abstract:
Objectives: The aim of this study was to determine the precision of the measurements of 2 craniometric anatomic points (glabella and anterior nasal spine) in order to verify their possibility as potential locations for placing implants aimed at nasal prostheses retention. Methods: Twenty-six dry human skulls were scanned in a high-resolution spiral tomography with 1-mm axial slice thickness and 1-mm interval reconstruction using a bone tissue filter. Images obtained were stored and transferred to an independent workstation containing e-film imaging software. The measurements (in the glabella and anterior nasal fossa) were made independently by 2 observers twice for each measurement. Data were submitted to statistical analysis (parametric t test). Results: The results demonstrated no statistically significant difference between interobserver and intraobserver measurements (P > .05). The standard error was found to be between 0.49 mm and 0.84 mm for measurements in bone protocol, indicating a high level of precision. Conclusions: The measurements obtained in anterior nasal spine and glabella were considered precise and reproducible. Mean values of such measurements pointed to the possibility of implant placement in these regions, particularly in the anterior nasal spine.
Abstract:
Purpose: Orthodontic miniscrews are commonly used to achieve absolute anchorage during tooth movement. One of the most frequent complications is screw loss as a result of root contact. Increased precision during the process of miniscrew insertion would help prevent screw loss and potential root damage, improving treatment outcomes. Stereolithographic surgical guides have been commonly used for prosthetic implants to increase the precision of insertion. The objective of this paper was to describe the use of a stereolithographic surgical guide suitable for one-component orthodontic miniscrews based on cone beam computed tomography (CBCT) data and to evaluate implant placement accuracy. Materials and Methods: Acrylic splints were adapted to the dental arches of four patients, and six radiopaque reference points were filled with gutta-percha. The patients were submitted to CBCT while they wore the occlusal splint. Another series of images was captured with the splint alone. After superimposition and segmentation, miniscrew insertion was simulated using planning software that allowed the user to check the implant position in all planes and in three dimensions. In a rapid-prototyping machine, a stereolithographic guide was fabricated with metallic sleeves located at the insertion points to allow for three-dimensional control of the pilot bur. The surgical guide was worn during surgery. After implant insertion, each patient was submitted to CBCT a second time to verify the implant position and the accuracy of the placement of the miniscrews. Results: The average differences between the planned and inserted positions for the ten miniscrews were 0.86 mm at the coronal end, 0.71 mm at the center, and 0.87 mm at the apical tip. The average angular discrepancy was 1.76 degrees. Conclusions: The use of stereolithographic surgical guides based on CBCT data allows for accurate orthodontic miniscrew insertion without damaging neighboring anatomic structures. Int J Oral Maxillofac Implants 2011;26:860-865.
Abstract:
This paper reports on the fate of nitrogen (N) in a first ratoon sugarcane (Saccharum officinarum L.) crop in the wet tropics of Queensland when urea was either surface applied or drilled into the soil 3-4 days after harvesting the plant cane. Ammonia volatilization was measured with a micrometeorological method, and fertilizer N recovery in plants and soil, to a depth of 140 cm, was determined by mass balance in macroplots with N-labelled urea 166 and 334 days after fertilizer application. The bulk of the fertilizer and soil N uptake by the sugarcane occurred between fertilizing and the first sampling on day 166. Nitrogen use efficiency, measured as the recovery of labelled N in the plant, was very low. At the time of the final sampling (day 334), the efficiencies for the surface and subsurface treatments were 18.9% and 28.8%, respectively. The tops, leaves, stalks and roots in the subsurface treatment contained significantly more fertilizer N than the corresponding parts in the surface treatment. The total recoveries of fertilizer N for the plant-trash-soil system on day 334 indicate significant losses of N in both treatments (59.1% and 45.6% of the applied N in the surface and subsurface treatments, respectively). Drilling the urea into the soil instead of applying it to the trash surface reduced ammonia loss from 37.3% to 5.5% of the applied N. Subtracting the data for ammonia loss from total loss suggests that losses by leaching and denitrification combined increased from 21.8% to 40.1% of the applied N as a result of the change in method of application. While the treatment resulted in increased denitrification and/or leaching loss, total N loss was reduced from 59.1% to 45.6% (a saving of 13.5% of the applied N), which resulted in an extra 9.9% of the applied N being assimilated by the crop.
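The leaching-plus-denitrification figures quoted above follow from subtracting the measured ammonia loss from the total loss; the arithmetic, written out:

```latex
% leaching + denitrification = total N loss - ammonia loss (as % of applied N)
\[
\begin{aligned}
\text{surface:}    \quad & 59.1\% - 37.3\% = 21.8\% \\
\text{subsurface:} \quad & 45.6\% - 5.5\%  = 40.1\%
\end{aligned}
\]
```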
Abstract:
This thesis proposes a set of adaptive broadcast solutions and an adaptive data replication solution to support the deployment of P2P applications. P2P applications are an emerging type of distributed applications that run on top of P2P networks. Typical P2P applications are video streaming, file sharing, etc. While interesting because they are fully distributed, P2P applications suffer from several deployment problems due to the nature of the environment in which they run. Indeed, defining an application on top of a P2P network often means defining an application where peers contribute resources in exchange for their ability to use the P2P application. For example, in a P2P file-sharing application, while the user is downloading some file, the P2P application is in parallel serving that file to other users. Such peers could have limited hardware resources, e.g., CPU, bandwidth and memory, or the end user could decide a priori to limit the resources they dedicate to the P2P application. In addition, a P2P network is typically immersed in an unreliable environment, where communication links and processes are subject to message losses and crashes, respectively. To support P2P applications, this thesis proposes a set of services that address some underlying constraints related to the nature of P2P networks. The proposed services include a set of adaptive broadcast solutions and an adaptive data replication solution that can be used as the basis of several P2P applications. Our data replication solution makes it possible to increase availability and to reduce the communication overhead. The broadcast solutions aim at providing a communication substrate encapsulating one of the key communication paradigms used by P2P applications: broadcast. Our broadcast solutions typically aim at offering reliability and scalability to some upper layer, be it an end-to-end P2P application or another system-level layer, such as a data replication layer. Our contributions are organized in a protocol stack made of three layers. In each layer, we propose a set of adaptive protocols that address specific constraints imposed by the environment. Each protocol is evaluated through a set of simulations. The adaptiveness of our solutions relies on the fact that they take into account the constraints of the underlying system in a proactive manner. To model these constraints, we define an environment approximation algorithm that allows us to obtain an approximated view of the system or part of it. This approximated view includes the topology and the reliability of the components, expressed in probabilistic terms. To adapt to the underlying system constraints, the proposed broadcast solutions route messages through tree overlays that maximize the broadcast reliability. Here, the broadcast reliability is expressed as a function of the reliability of the selected paths and of the use of available resources. These resources are modeled in terms of quotas of messages that reflect the receiving and sending capacities at each node. To allow a deployment in a large-scale system, we take into account the available memory at processes by limiting the view they have to maintain about the system. Using this partial view, we propose three scalable broadcast algorithms, which are based on a propagation overlay that tends toward the global tree overlay and adapts to some constraints of the underlying system.
At a higher level, this thesis also proposes a data replication solution that is adaptive both in terms of replica placement and in terms of request routing. At the routing level, this solution takes the unreliability of the environment into account, in order to maximize reliable delivery of requests. At the replica placement level, the dynamically changing origin and frequency of read/write requests are analyzed, in order to define a set of replicas that minimizes communication cost.
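A hedged sketch of the request-driven replica placement idea, not the thesis's adaptive protocol: given how often each node issues read requests and the pairwise communication cost between nodes, it exhaustively picks the k replica locations that minimize the total cost of serving every read from its closest replica. The node names, costs and frequencies are assumptions, and write traffic is ignored for brevity.

```python
# Toy replica placement: choose k replica sites minimizing the total cost of
# serving reads from the nearest replica. Exhaustive search, small k only.
from itertools import combinations

def replica_cost(replicas, read_freq, cost):
    return sum(freq * min(cost[origin][r] for r in replicas)
               for origin, freq in read_freq.items())

def best_placement(nodes, k, read_freq, cost):
    return min(combinations(nodes, k),
               key=lambda reps: replica_cost(reps, read_freq, cost))

if __name__ == "__main__":
    nodes = ["A", "B", "C"]
    read_freq = {"A": 10, "B": 2, "C": 5}   # read requests per period
    cost = {                                # per-request link cost
        "A": {"A": 0, "B": 4, "C": 7},
        "B": {"A": 4, "B": 0, "C": 3},
        "C": {"A": 7, "B": 3, "C": 0},
    }
    print(best_placement(nodes, k=1, read_freq=read_freq, cost=cost))  # ('A',)
```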
Abstract:
The passage of the Workforce Investment Act (WIA) of 1998 [Public Law 105-220] by the 105th Congress has ushered in a new era of collaboration, coordination, cooperation and accountability. The overall goal of the Act is “to increase the employability, retention, and earnings of participants, and increase occupational skill attainment by participants, and, as a result improve the quality of the workforce, reduce welfare dependency, and enhance the productivity and competitiveness of the Nation.” The key principles inculcated in the Act are:
• Streamlining services;
• Empowering individuals;
• Universal access;
• Increased accountability;
• New roles for local boards;
• State and local flexibility;
• Improved youth programs.
The purpose of Title II, the Adult Education and Family Literacy Act (AEFLA), of the Workforce Investment Act of 1998 is to create a partnership among the federal government, states, and localities to provide, on a voluntary basis, adult education and literacy services in order to:
• Assist adults in becoming literate and obtaining the knowledge and skills necessary for employment and self-sufficiency;
• Assist adults who are parents in obtaining the educational skills necessary to become full partners in the educational development of their children;
• Assist adults in the completion of a secondary school education.
Adult education is an important part of the workforce investment system. Title II restructures and improves programs previously authorized by the Adult Education Act. AEFLA focuses on strengthening program quality by requiring States to give priority in awarding funds to local programs that are based on a solid foundation of research, address the diverse needs of adult learners, and utilize other effective practices and strategies. To promote continuous program involvement and to ensure optimal return on the Federal investment, AEFLA also establishes a State performance accountability system. Under this system, the Secretary and each State must reach agreement on annual levels of performance for a number of “core indicators” specified in the law:
• Demonstrated improvements in literacy skill levels in reading, writing, and speaking the English language, numeracy, problem solving, English language acquisition, and other literacy skills.
• Placement in, retention in, or completion of postsecondary education, training, unsubsidized employment or career advancement.
• Receipt of a secondary school diploma or its recognized equivalent.
Iowa’s community college-based adult basic education program has implemented a series of proactive strategies in order to effectively and systematically meet the challenges posed by WIA. The Iowa TOPSpro Data Dictionary is a direct result of Iowa’s proactive efforts in this educational arena.