881 resultados para Large-scale analysis


Relevância:

100.00% 100.00%

Publicador:

Resumo:

The social media classification problems draw more and more attention in the past few years. With the rapid development of Internet and the popularity of computers, there is astronomical amount of information in the social network (social media platforms). The datasets are generally large scale and are often corrupted by noise. The presence of noise in training set has strong impact on the performance of supervised learning (classification) techniques. A budget-driven One-class SVM approach is presented in this thesis that is suitable for large scale social media data classification. Our approach is based on an existing online One-class SVM learning algorithm, referred as STOCS (Self-Tuning One-Class SVM) algorithm. To justify our choice, we first analyze the noise-resilient ability of STOCS using synthetic data. The experiments suggest that STOCS is more robust against label noise than several other existing approaches. Next, to handle big data classification problem for social media data, we introduce several budget driven features, which allow the algorithm to be trained within limited time and under limited memory requirement. Besides, the resulting algorithm can be easily adapted to changes in dynamic data with minimal computational cost. Compared with two state-of-the-art approaches, Lib-Linear and kNN, our approach is shown to be competitive with lower requirements of memory and time.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

ACKNOWLEDGEMENTS This research is based upon work supported in part by the U.S. ARL and U.K. Ministry of Defense under Agreement Number W911NF-06-3-0001, and by the NSF under award CNS-1213140. Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views or represent the official policies of the NSF, the U.S. ARL, the U.S. Government, the U.K. Ministry of Defense or the U.K. Government. The U.S. and U.K. Governments are authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation hereon.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Funded by The research presented in this paper is part of the SINBAD project. Grant Number: STW (12058) and EPSRC (EP/J00507X/1, EP/J005541/1)

Relevância:

100.00% 100.00%

Publicador:

Resumo:

ACKNOWLEDGEMENTS This research is based upon work supported in part by the U.S. ARL and U.K. Ministry of Defense under Agreement Number W911NF-06-3-0001, and by the NSF under award CNS-1213140. Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views or represent the official policies of the NSF, the U.S. ARL, the U.S. Government, the U.K. Ministry of Defense or the U.K. Government. The U.S. and U.K. Governments are authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation hereon.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The main goal of this work is to determine the true cost incurred by the Republic of Ireland and Northern Ireland in order to meet their EU renewable electricity targets. The primary all-island of Ireland policy goal is that 40% of electricity will come from renewable sources in 2020. From this it is expected that wind generation on the Irish electricity system will be in the region of 32-37% of total generation. This leads to issues resulting from wind energy being a non-synchronous, unpredictable and variable source of energy use on a scale never seen before for a single synchronous system. If changes are not made to traditional operational practices, the efficient running of the electricity system will be directly affected by these issues in the coming years. Using models of the electricity system for the all-island grid of Ireland, the effects of high wind energy penetration expected to be present in 2020 are examined. These models were developed using a unit commitment, economic dispatch tool called PLEXOS which allows for a detailed representation of the electricity system to be achieved down to individual generator level. These models replicate the true running of the electricity system through use of day-ahead scheduling and semi-relaxed use of these schedules that reflects the Transmission System Operator's of real time decision making on dispatch. In addition, it carefully considers other non-wind priority dispatch generation technologies that have an effect on the overall system. In the models developed, three main issues associated with wind energy integration were selected to be examined in detail to determine the sensitivity of assumptions presented in other studies. These three issues include wind energy's non-synchronous nature, its variability and spatial correlation, and its unpredictability. This leads to an examination of the effects in three areas: the need for system operation constraints required for system security; different onshore to offshore ratios of installed wind energy; and the degrees of accuracy in wind energy forecasting. Each of these areas directly impact the way in which the electricity system is run as they address each of the three issues associated with wind energy stated above, respectively. It is shown that assumptions in these three areas have a large effect on the results in terms of total generation costs, wind curtailment and generator technology type dispatch. In particular accounting for these issues has resulted in wind curtailment being predicted in much larger quantities than had been previously reported. This would have a large effect on wind energy companies because it is already a very low profit margin industry. Results from this work have shown that the relaxation of system operation constraints is crucial to the economic running of the electricity system with large improvements shown in the reduction of wind curtailment and system generation costs. There are clear benefits in having a proportion of the wind installed offshore in Ireland which would help to reduce variability of wind energy generation on the system and therefore reduce wind curtailment. With envisaged future improvements in day-ahead wind forecasting from 8% to 4% mean absolute error, there are potential reductions in wind curtailment system costs and open cycle gas turbine usage. This work illustrates the consequences of assumptions in the areas of system operation constraints, onshore/offshore installed wind capacities and accuracy in wind forecasting to better inform the true costs associated with running Ireland's changing electricity system as it continues to decarbonise into the near future. This work also proposes to illustrate, through the use of Ireland as a case study, the effects that will become ever more prevalent in other synchronous systems as they pursue a path of increasing renewable energy generation.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Network simulation is an indispensable tool for studying Internet-scale networks due to the heterogeneous structure, immense size and changing properties. It is crucial for network simulators to generate representative traffic, which is necessary for effectively evaluating next-generation network protocols and applications. With network simulation, we can make a distinction between foreground traffic, which is generated by the target applications the researchers intend to study and therefore must be simulated with high fidelity, and background traffic, which represents the network traffic that is generated by other applications and does not require significant accuracy. The background traffic has a significant impact on the foreground traffic, since it competes with the foreground traffic for network resources and therefore can drastically affect the behavior of the applications that produce the foreground traffic. This dissertation aims to provide a solution to meaningfully generate background traffic in three aspects. First is realism. Realistic traffic characterization plays an important role in determining the correct outcome of the simulation studies. This work starts from enhancing an existing fluid background traffic model by removing its two unrealistic assumptions. The improved model can correctly reflect the network conditions in the reverse direction of the data traffic and can reproduce the traffic burstiness observed from measurements. Second is scalability. The trade-off between accuracy and scalability is a constant theme in background traffic modeling. This work presents a fast rate-based TCP (RTCP) traffic model, which originally used analytical models to represent TCP congestion control behavior. This model outperforms other existing traffic models in that it can correctly capture the overall TCP behavior and achieve a speedup of more than two orders of magnitude over the corresponding packet-oriented simulation. Third is network-wide traffic generation. Regardless of how detailed or scalable the models are, they mainly focus on how to generate traffic on one single link, which cannot be extended easily to studies of more complicated network scenarios. This work presents a cluster-based spatio-temporal background traffic generation model that considers spatial and temporal traffic characteristics as well as their correlations. The resulting model can be used effectively for the evaluation work in network studies.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Graph analytics is an important and computationally demanding class of data analytics. It is essential to balance scalability, ease-of-use and high performance in large scale graph analytics. As such, it is necessary to hide the complexity of parallelism, data distribution and memory locality behind an abstract interface. The aim of this work is to build a scalable graph analytics framework that does not demand significant parallel programming experience based on NUMA-awareness.
The realization of such a system faces two key problems:
(i)~how to develop a scale-free parallel programming framework that scales efficiently across NUMA domains; (ii)~how to efficiently apply graph partitioning in order to create separate and largely independent work items that can be distributed among threads.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Large-scale multiple-input multiple-output (MIMO) communication systems can bring substantial improvement in spectral efficiency and/or energy efficiency, due to the excessive degrees-of-freedom and huge array gain. However, large-scale MIMO is expected to deploy lower-cost radio frequency (RF) components, which are particularly prone to hardware impairments. Unfortunately, compensation schemes are not able to remove the impact of hardware impairments completely, such that a certain amount of residual impairments always exists. In this paper, we investigate the impact of residual transmit RF impairments (RTRI) on the spectral and energy efficiency of training-based point-to-point large-scale MIMO systems, and seek to determine the optimal training length and number of antennas which maximize the energy efficiency. We derive deterministic equivalents of the signal-to-noise-and-interference ratio (SINR) with zero-forcing (ZF) receivers, as well as the corresponding spectral and energy efficiency, which are shown to be accurate even for small number of antennas. Through an iterative sequential optimization, we find that the optimal training length of systems with RTRI can be smaller compared to ideal hardware systems in the moderate SNR regime, while larger in the high SNR regime. Moreover, it is observed that RTRI can significantly decrease the optimal number of transmit and receive antennas.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Two-dimensional (2D) materials have generated great interest in the last few years as a new toolbox for electronics. This family of materials includes, among others, metallic graphene, semiconducting transition metal dichalcogenides (such as MoS2) and insulating Boron Nitride. These materials and their heterostructures offer excellent mechanical flexibility, optical transparency and favorable transport properties for realizing electronic, sensing and optical systems on arbitrary surfaces. In this work, we develop several etch stop layer technologies that allow the fabrication of complex 2D devices and present for the first time the large scale integration of graphene with molybdenum disulfide (MoS2) , both grown using the fully scalable CVD technique. Transistor devices and logic circuits with MoS2 channel and graphene as contacts and interconnects are constructed and show high performances. In addition, the graphene/MoS2 heterojunction contact has been systematically compared with MoS2-metal junctions experimentally and studied using density functional theory. The tunability of the graphene work function significantly improves the ohmic contact to MoS2. These high-performance large-scale devices and circuits based on 2D heterostructure pave the way for practical flexible transparent electronics in the future. The authors acknowledge financial support from the Office of Naval Research (ONR) Young Investigator Program, the ONR GATE MURI program, and the Army Research Laboratory. This research has made use of the MI.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Otto-von-Guericke-Universität Magdeburg, Fakultät für Mathematik, Dissertation, 2016

Relevância:

100.00% 100.00%

Publicador:

Resumo:

It is increasingly recognized that ecological restoration demands conservation action beyond the borders of existing protected areas. This requires the coordination of land uses and management over a larger area, usually with a range of partners, which presents novel institutional challenges for conservation planners. Interviews were undertaken with managers of a purposive sample of large-scale conservation areas in the UK. Interviews were open-ended and analyzed using standard qualitative methods. Results show a wide variety of organizations are involved in large-scale conservation projects, and that partnerships take time to create and demand resilience in the face of different organizational practices, staff turnover, and short-term funding. Successful partnerships with local communities depend on the establishment of trust and the availability of external funds to support conservation land uses. We conclude that there is no single institutional model for large-scale conservation: success depends on finding institutional strategies that secure long-term conservation outcomes, and ensure that conservation gains are not reversed when funding runs out, private owners change priorities, or land changes hands.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Background and Aims: True Colours is an online prospective mood-monitoring system developed at the University of Oxford to assist local patients and clinicians with monitoring course of illness in bipolar disorder. We report our initial experiences of using True Colours for research purposes in the Bipolar Disorder Research Network (BDRN; www.bdrn.org), a large research network of individuals with mood disorders spread throughout the UK. Methods: After initial piloting to ensure the practicality/acceptability of using True Colours within BDRN, we invited all BDRN participants (n = 7000) to participate in weekly True Colours ratings via three postal invitations sent over an 8-month period. Results: Following the three postal invitations, 915 individuals have so far expressed an interest in joining True Colours, and, of these, 662 (72.3%) have registered. 32 of those who registered for True Colours (5%) have so far asked to leave the system. Positive feedback from participants has focused around the ease of use and convenience of True Colours and potential clinical utility of the graphical representation of weekly mood scores. Conclusions: We have demonstrated that large-scale prospective mood monitoring for research purposes using a contemporary online approach is feasible. Challenges have included: (i) variation in participants’ technological ability; (ii) management of requests for clinical advice based on mood scores within a research setting; and, (iii) resources required to provide access and on-going support for participants using True Colours. We continue to expand recruitment to True Colours within BDRN, and plan to trial email invitations in the next phase of recruitment.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In this paper various techniques in relation to large-scale systems are presented. At first, explanation of large-scale systems and differences from traditional systems are given. Next, possible specifications and requirements on hardware and software are listed. Finally, examples of large-scale systems are presented.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

We present Dithen, a novel computation-as-a-service (CaaS) cloud platform specifically tailored to the parallel ex-ecution of large-scale multimedia tasks. Dithen handles the upload/download of both multimedia data and executable items, the assignment of compute units to multimedia workloads, and the reactive control of the available compute units to minimize the cloud infrastructure cost under deadline-abiding execution. Dithen combines three key properties: (i) the reactive assignment of individual multimedia tasks to available computing units according to availability and predetermined time-to-completion constraints; (ii) optimal resource estimation based on Kalman-filter estimates; (iii) the use of additive increase multiplicative decrease (AIMD) algorithms (famous for being the resource management in the transport control protocol) for the control of the number of units servicing workloads. The deployment of Dithen over Amazon EC2 spot instances is shown to be capable of processing more than 80,000 video transcoding, face detection and image processing tasks (equivalent to the processing of more than 116 GB of compressed data) for less than $1 in billing cost from EC2. Moreover, the proposed AIMD-based control mechanism, in conjunction with the Kalman estimates, is shown to provide for more than 27% reduction in EC2 spot instance cost against methods based on reactive resource estimation. Finally, Dithen is shown to offer a 38% to 500% reduction of the billing cost against the current state-of-the-art in CaaS platforms on Amazon EC2 (Amazon Lambda and Amazon Autoscale). A baseline version of Dithen is currently available at dithen.com.