920 resultados para Distributed Control Problems


Relevância:

30.00% 30.00%

Publicador:

Resumo:

In recent years, wireless communication infrastructures have been widely deployed for both personal and business applications. IEEE 802.11 series Wireless Local Area Network (WLAN) standards attract lots of attention due to their low cost and high data rate. Wireless ad hoc networks which use IEEE 802.11 standards are one of hot spots of recent network research. Designing appropriate Media Access Control (MAC) layer protocols is one of the key issues for wireless ad hoc networks. ^ Existing wireless applications typically use omni-directional antennas. When using an omni-directional antenna, the gain of the antenna in all directions is the same. Due to the nature of the Distributed Coordination Function (DCF) mechanism of IEEE 802.11 standards, only one of the one-hop neighbors can send data at one time. Nodes other than the sender and the receiver must be either in idle or listening state, otherwise collisions could occur. The downside of the omni-directionality of antennas is that the spatial reuse ratio is low and the capacity of the network is considerably limited. ^ It is therefore obvious that the directional antenna has been introduced to improve spatial reutilization. As we know, a directional antenna has the following benefits. It can improve transport capacity by decreasing interference of a directional main lobe. It can increase coverage range due to a higher SINR (Signal Interference to Noise Ratio), i.e., with the same power consumption, better connectivity can be achieved. And the usage of power can be reduced, i.e., for the same coverage, a transmitter can reduce its power consumption. ^ To utilizing the advantages of directional antennas, we propose a relay-enabled MAC protocol. Two relay nodes are chosen to forward data when the channel condition of direct link from the sender to the receiver is poor. The two relay nodes can transfer data at the same time and a pipelined data transmission can be achieved by using directional antennas. The throughput can be improved significant when introducing the relay-enabled MAC protocol. ^ Besides the strong points, directional antennas also have some explicit drawbacks, such as the hidden terminal and deafness problems and the requirements of retaining location information for each node. Therefore, an omni-directional antenna should be used in some situations. The combination use of omni-directional and directional antennas leads to the problem of configuring heterogeneous antennas, i e., given a network topology and a traffic pattern, we need to find a tradeoff between using omni-directional and using directional antennas to obtain a better network performance over this configuration. ^ Directly and mathematically establishing the relationship between the network performance and the antenna configurations is extremely difficult, if not intractable. Therefore, in this research, we proposed several clustering-based methods to obtain approximate solutions for heterogeneous antennas configuration problem, which can improve network performance significantly. ^ Our proposed methods consist of two steps. The first step (i.e., clustering links) is to cluster the links into different groups based on the matrix-based system model. After being clustered, the links in the same group have similar neighborhood nodes and will use the same type of antenna. The second step (i.e., labeling links) is to decide the type of antenna for each group. For heterogeneous antennas, some groups of links will use directional antenna and others will adopt omni-directional antenna. Experiments are conducted to compare the proposed methods with existing methods. Experimental results demonstrate that our clustering-based methods can improve the network performance significantly. ^

Relevância:

30.00% 30.00%

Publicador:

Resumo:

With the recent explosion in the complexity and amount of digital multimedia data, there has been a huge impact on the operations of various organizations in distinct areas, such as government services, education, medical care, business, entertainment, etc. To satisfy the growing demand of multimedia data management systems, an integrated framework called DIMUSE is proposed and deployed for distributed multimedia applications to offer a full scope of multimedia related tools and provide appealing experiences for the users. This research mainly focuses on video database modeling and retrieval by addressing a set of core challenges. First, a comprehensive multimedia database modeling mechanism called Hierarchical Markov Model Mediator (HMMM) is proposed to model high dimensional media data including video objects, low-level visual/audio features, as well as historical access patterns and frequencies. The associated retrieval and ranking algorithms are designed to support not only the general queries, but also the complicated temporal event pattern queries. Second, system training and learning methodologies are incorporated such that user interests are mined efficiently to improve the retrieval performance. Third, video clustering techniques are proposed to continuously increase the searching speed and accuracy by architecting a more efficient multimedia database structure. A distributed video management and retrieval system is designed and implemented to demonstrate the overall performance. The proposed approach is further customized for a mobile-based video retrieval system to solve the perception subjectivity issue by considering individual user's profile. Moreover, to deal with security and privacy issues and concerns in distributed multimedia applications, DIMUSE also incorporates a practical framework called SMARXO, which supports multilevel multimedia security control. SMARXO efficiently combines role-based access control (RBAC), XML and object-relational database management system (ORDBMS) to achieve the target of proficient security control. A distributed multimedia management system named DMMManager (Distributed MultiMedia Manager) is developed with the proposed framework DEMUR; to support multimedia capturing, analysis, retrieval, authoring and presentation in one single framework.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Developing analytical models that can accurately describe behaviors of Internet-scale networks is difficult. This is due, in part, to the heterogeneous structure, immense size and rapidly changing properties of today's networks. The lack of analytical models makes large-scale network simulation an indispensable tool for studying immense networks. However, large-scale network simulation has not been commonly used to study networks of Internet-scale. This can be attributed to three factors: 1) current large-scale network simulators are geared towards simulation research and not network research, 2) the memory required to execute an Internet-scale model is exorbitant, and 3) large-scale network models are difficult to validate. This dissertation tackles each of these problems. ^ First, this work presents a method for automatically enabling real-time interaction, monitoring, and control of large-scale network models. Network researchers need tools that allow them to focus on creating realistic models and conducting experiments. However, this should not increase the complexity of developing a large-scale network simulator. This work presents a systematic approach to separating the concerns of running large-scale network models on parallel computers and the user facing concerns of configuring and interacting with large-scale network models. ^ Second, this work deals with reducing memory consumption of network models. As network models become larger, so does the amount of memory needed to simulate them. This work presents a comprehensive approach to exploiting structural duplications in network models to dramatically reduce the memory required to execute large-scale network experiments. ^ Lastly, this work addresses the issue of validating large-scale simulations by integrating real protocols and applications into the simulation. With an emulation extension, a network simulator operating in real-time can run together with real-world distributed applications and services. As such, real-time network simulation not only alleviates the burden of developing separate models for applications in simulation, but as real systems are included in the network model, it also increases the confidence level of network simulation. This work presents a scalable and flexible framework to integrate real-world applications with real-time simulation.^

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The present investigation examined the relationships among personality (as conceptualized by the Big Five Factors), leader-member exchange (LMX) quality, action control, organizational citizenship behaviors (OCB), and overall job performance (OJP). Two mediator variables were proposed and tested in this study: LMX and Action Control. Two-hundred and seven currently employed regular elementary school classroom teachers provided data during the 2000–2001 academic school year. Teachers provided personality, LMX quality (member or subordinate perspective), action control, job tenure, and demographic data. Nine school administrators (i.e., Principals, Assistant Principals) were the source for supervisor ratings of OCB, OJP, and LMX quality (leader or supervisor perspective). In eight of the nine total schools, teachers completed questionnaires during an after-school teacher gathering; in the remaining school location questionnaires were dropped off, distributed to teachers, and re-collected two weeks later. Results indicated a significant relationship between the OCB scale and overall supervisory ratings of OJP. The relationship among the big five factors of personality and OJP did not reach statistical significance, nor did the relationships among personality and OCB. The data indicated that none of the teacher tenure variables (i.e., teacher, school, or time worked with principal tenure) moderated the personality-OCB relationship nor the personality-OJP relationship. Finally, a review of the correlations among the variables of interest precluded conducting a mediation between personality-performance by OCB, mediation of personality-OCB by action control, and mediation of personality-OCB by LMX. In conclusion, the data reveal that personality was not significantly correlated with supervisory ratings of OJP or significantly related to supervisory ratings of overall OCB. Moreover, LMX quality and action control did not mediate the relationships between Personality-OJP nor the Personality-OCB relationship. Significant relationships were found between disengagement and overall LMX quality and between Initiative and overall LMX quality (both LMX-Teacher perspectives) as well as between personality variables and both Disengagement and Initiative action control variables. Despite the limitations inherent in this study, these latter findings suggest “lessons” for teachers and school administrators alike. ^

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Many systems and applications are continuously producing events. These events are used to record the status of the system and trace the behaviors of the systems. By examining these events, system administrators can check the potential problems of these systems. If the temporal dynamics of the systems are further investigated, the underlying patterns can be discovered. The uncovered knowledge can be leveraged to predict the future system behaviors or to mitigate the potential risks of the systems. Moreover, the system administrators can utilize the temporal patterns to set up event management rules to make the system more intelligent. With the popularity of data mining techniques in recent years, these events grad- ually become more and more useful. Despite the recent advances of the data mining techniques, the application to system event mining is still in a rudimentary stage. Most of works are still focusing on episodes mining or frequent pattern discovering. These methods are unable to provide a brief yet comprehensible summary to reveal the valuable information from the high level perspective. Moreover, these methods provide little actionable knowledge to help the system administrators to better man- age the systems. To better make use of the recorded events, more practical techniques are required. From the perspective of data mining, three correlated directions are considered to be helpful for system management: (1) Provide concise yet comprehensive summaries about the running status of the systems; (2) Make the systems more intelligence and autonomous; (3) Effectively detect the abnormal behaviors of the systems. Due to the richness of the event logs, all these directions can be solved in the data-driven manner. And in this way, the robustness of the systems can be enhanced and the goal of autonomous management can be approached. This dissertation mainly focuses on the foregoing directions that leverage tem- poral mining techniques to facilitate system management. More specifically, three concrete topics will be discussed, including event, resource demand prediction, and streaming anomaly detection. Besides the theoretic contributions, the experimental evaluation will also be presented to demonstrate the effectiveness and efficacy of the corresponding solutions.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Distributed Generation (DG) from alternate sources and smart grid technologies represent good solutions for the increase in energy demands. Employment of these DG assets requires solutions for the new technical challenges that are accompanied by the integration and interconnection into operational power systems. A DG infrastructure comprised of alternate energy sources in addition to conventional sources, is developed as a test bed. The test bed is operated by synchronizing, wind, photovoltaic, fuel cell, micro generator and energy storage assets, in addition to standard AC generators. Connectivity of these DG assets is tested for viability and for their operational characteristics. The control and communication layers for dynamic operations are developed to improve the connectivity of alternates to the power system. A real time application for the operation of alternate sources in microgrids is developed. Multi agent approach is utilized to improve stability and sequences of actions for black start are implemented. Experiments for control and stability issues related to dynamic operation under load conditions have been conducted and verified.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

A typical electrical power system is characterized by centr alization of power gene- ration. However, with the restructuring of the electric sys tem, this topology is changing with the insertion of generators in parallel with the distri bution system (distributed gene- ration) that provides several benefits to be located near to e nergy consumers. Therefore, the integration of distributed generators, especially fro m renewable sources in the Brazi- lian system has been common every year. However, this new sys tem topology may result in new challenges in the field of the power system control, ope ration, and protection. One of the main problems related to the distributed generati on is the islanding formation, witch can result in safety risk to the people and to the power g rid. Among the several islanding protection techniques, passive techniques have low implementation cost and simplicity, requiring only voltage and current measuremen ts to detect system problems. This paper proposes a protection system based on the wavelet transform with overcur- rent and under/overvoltage functions as well as infomation of fault-induced transients in order to provide a fast detection and identification of fault s in the system. The propo- sed protection scheme was evaluated through simulation and experimental studies, with performance similar to the overcurrent and under/overvolt age conventional methods, but with the additional detection of the exact moment of the fault.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Collaborative sharing of information is becoming much more needed technique to achieve complex goals in today's fast-paced tech-dominant world. Personal Health Record (PHR) system has become a popular research area for sharing patients informa- tion very quickly among health professionals. PHR systems store and process sensitive information, which should have proper security mechanisms to protect patients' private data. Thus, access control mechanisms of the PHR should be well-defined. Secondly, PHRs should be stored in encrypted form. Cryptographic schemes offering a more suitable solution for enforcing access policies based on user attributes are needed for this purpose. Attribute-based encryption can resolve these problems, we propose a patient-centric framework that protects PHRs against untrusted service providers and malicious users. In this framework, we have used Ciphertext Policy Attribute Based Encryption scheme as an efficient cryptographic technique, enhancing security and privacy of the system, as well as enabling access revocation. Patients can encrypt their PHRs and store them on untrusted storage servers. They also maintain full control over access to their PHR data by assigning attribute-based access control to selected data users, and revoking unauthorized users instantly. In order to evaluate our system, we implemented CP-ABE library and web services as part of our framework. We also developed an android application based on the framework that allows users to register into the system, encrypt their PHR data and upload to the server, and at the same time authorized users can download PHR data and decrypt it. Finally, we present experimental results and performance analysis. It shows that the deployment of the proposed system would be practical and can be applied into practice.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The study is funded by Chiesi Farmaceutici S.p.A., Parma, Italy

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Acknowledgements The authors would like to thank the Scottish Diabetes Research Network Epidemiology Group for granting permission to use this database. They also thank the data management team in the University of Aberdeen who were the initial conduit for access to these data and also provided validation to the various data cleaning criteria applied. Jeremy J Walker, University of Edinburgh, was invaluable for the original funding application and initial exploration of data. HSRU is funded by the Chief Scientist Office of the Scottish Government Health and Social Care Directorates. Funding Chief Scientist Office (CSO) reference number: CZG/2/571.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Many modern applications fall into the category of "large-scale" statistical problems, in which both the number of observations n and the number of features or parameters p may be large. Many existing methods focus on point estimation, despite the continued relevance of uncertainty quantification in the sciences, where the number of parameters to estimate often exceeds the sample size, despite huge increases in the value of n typically seen in many fields. Thus, the tendency in some areas of industry to dispense with traditional statistical analysis on the basis that "n=all" is of little relevance outside of certain narrow applications. The main result of the Big Data revolution in most fields has instead been to make computation much harder without reducing the importance of uncertainty quantification. Bayesian methods excel at uncertainty quantification, but often scale poorly relative to alternatives. This conflict between the statistical advantages of Bayesian procedures and their substantial computational disadvantages is perhaps the greatest challenge facing modern Bayesian statistics, and is the primary motivation for the work presented here.

Two general strategies for scaling Bayesian inference are considered. The first is the development of methods that lend themselves to faster computation, and the second is design and characterization of computational algorithms that scale better in n or p. In the first instance, the focus is on joint inference outside of the standard problem of multivariate continuous data that has been a major focus of previous theoretical work in this area. In the second area, we pursue strategies for improving the speed of Markov chain Monte Carlo algorithms, and characterizing their performance in large-scale settings. Throughout, the focus is on rigorous theoretical evaluation combined with empirical demonstrations of performance and concordance with the theory.

One topic we consider is modeling the joint distribution of multivariate categorical data, often summarized in a contingency table. Contingency table analysis routinely relies on log-linear models, with latent structure analysis providing a common alternative. Latent structure models lead to a reduced rank tensor factorization of the probability mass function for multivariate categorical data, while log-linear models achieve dimensionality reduction through sparsity. Little is known about the relationship between these notions of dimensionality reduction in the two paradigms. In Chapter 2, we derive several results relating the support of a log-linear model to nonnegative ranks of the associated probability tensor. Motivated by these findings, we propose a new collapsed Tucker class of tensor decompositions, which bridge existing PARAFAC and Tucker decompositions, providing a more flexible framework for parsimoniously characterizing multivariate categorical data. Taking a Bayesian approach to inference, we illustrate empirical advantages of the new decompositions.

Latent class models for the joint distribution of multivariate categorical, such as the PARAFAC decomposition, data play an important role in the analysis of population structure. In this context, the number of latent classes is interpreted as the number of genetically distinct subpopulations of an organism, an important factor in the analysis of evolutionary processes and conservation status. Existing methods focus on point estimates of the number of subpopulations, and lack robust uncertainty quantification. Moreover, whether the number of latent classes in these models is even an identified parameter is an open question. In Chapter 3, we show that when the model is properly specified, the correct number of subpopulations can be recovered almost surely. We then propose an alternative method for estimating the number of latent subpopulations that provides good quantification of uncertainty, and provide a simple procedure for verifying that the proposed method is consistent for the number of subpopulations. The performance of the model in estimating the number of subpopulations and other common population structure inference problems is assessed in simulations and a real data application.

In contingency table analysis, sparse data is frequently encountered for even modest numbers of variables, resulting in non-existence of maximum likelihood estimates. A common solution is to obtain regularized estimates of the parameters of a log-linear model. Bayesian methods provide a coherent approach to regularization, but are often computationally intensive. Conjugate priors ease computational demands, but the conjugate Diaconis--Ylvisaker priors for the parameters of log-linear models do not give rise to closed form credible regions, complicating posterior inference. In Chapter 4 we derive the optimal Gaussian approximation to the posterior for log-linear models with Diaconis--Ylvisaker priors, and provide convergence rate and finite-sample bounds for the Kullback-Leibler divergence between the exact posterior and the optimal Gaussian approximation. We demonstrate empirically in simulations and a real data application that the approximation is highly accurate, even in relatively small samples. The proposed approximation provides a computationally scalable and principled approach to regularized estimation and approximate Bayesian inference for log-linear models.

Another challenging and somewhat non-standard joint modeling problem is inference on tail dependence in stochastic processes. In applications where extreme dependence is of interest, data are almost always time-indexed. Existing methods for inference and modeling in this setting often cluster extreme events or choose window sizes with the goal of preserving temporal information. In Chapter 5, we propose an alternative paradigm for inference on tail dependence in stochastic processes with arbitrary temporal dependence structure in the extremes, based on the idea that the information on strength of tail dependence and the temporal structure in this dependence are both encoded in waiting times between exceedances of high thresholds. We construct a class of time-indexed stochastic processes with tail dependence obtained by endowing the support points in de Haan's spectral representation of max-stable processes with velocities and lifetimes. We extend Smith's model to these max-stable velocity processes and obtain the distribution of waiting times between extreme events at multiple locations. Motivated by this result, a new definition of tail dependence is proposed that is a function of the distribution of waiting times between threshold exceedances, and an inferential framework is constructed for estimating the strength of extremal dependence and quantifying uncertainty in this paradigm. The method is applied to climatological, financial, and electrophysiology data.

The remainder of this thesis focuses on posterior computation by Markov chain Monte Carlo. The Markov Chain Monte Carlo method is the dominant paradigm for posterior computation in Bayesian analysis. It has long been common to control computation time by making approximations to the Markov transition kernel. Comparatively little attention has been paid to convergence and estimation error in these approximating Markov Chains. In Chapter 6, we propose a framework for assessing when to use approximations in MCMC algorithms, and how much error in the transition kernel should be tolerated to obtain optimal estimation performance with respect to a specified loss function and computational budget. The results require only ergodicity of the exact kernel and control of the kernel approximation accuracy. The theoretical framework is applied to approximations based on random subsets of data, low-rank approximations of Gaussian processes, and a novel approximating Markov chain for discrete mixture models.

Data augmentation Gibbs samplers are arguably the most popular class of algorithm for approximately sampling from the posterior distribution for the parameters of generalized linear models. The truncated Normal and Polya-Gamma data augmentation samplers are standard examples for probit and logit links, respectively. Motivated by an important problem in quantitative advertising, in Chapter 7 we consider the application of these algorithms to modeling rare events. We show that when the sample size is large but the observed number of successes is small, these data augmentation samplers mix very slowly, with a spectral gap that converges to zero at a rate at least proportional to the reciprocal of the square root of the sample size up to a log factor. In simulation studies, moderate sample sizes result in high autocorrelations and small effective sample sizes. Similar empirical results are observed for related data augmentation samplers for multinomial logit and probit models. When applied to a real quantitative advertising dataset, the data augmentation samplers mix very poorly. Conversely, Hamiltonian Monte Carlo and a type of independence chain Metropolis algorithm show good mixing on the same dataset.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Secure Access For Everyone (SAFE), is an integrated system for managing trust

using a logic-based declarative language. Logical trust systems authorize each

request by constructing a proof from a context---a set of authenticated logic

statements representing credentials and policies issued by various principals

in a networked system. A key barrier to practical use of logical trust systems

is the problem of managing proof contexts: identifying, validating, and

assembling the credentials and policies that are relevant to each trust

decision.

SAFE addresses this challenge by (i) proposing a distributed authenticated data

repository for storing the credentials and policies; (ii) introducing a

programmable credential discovery and assembly layer that generates the

appropriate tailored context for a given request. The authenticated data

repository is built upon a scalable key-value store with its contents named by

secure identifiers and certified by the issuing principal. The SAFE language

provides scripting primitives to generate and organize logic sets representing

credentials and policies, materialize the logic sets as certificates, and link

them to reflect delegation patterns in the application. The authorizer fetches

the logic sets on demand, then validates and caches them locally for further

use. Upon each request, the authorizer constructs the tailored proof context

and provides it to the SAFE inference for certified validation.

Delegation-driven credential linking with certified data distribution provides

flexible and dynamic policy control enabling security and trust infrastructure

to be agile, while addressing the perennial problems related to today's

certificate infrastructure: automated credential discovery, scalable

revocation, and issuing credentials without relying on centralized authority.

We envision SAFE as a new foundation for building secure network systems. We

used SAFE to build secure services based on case studies drawn from practice:

(i) a secure name service resolver similar to DNS that resolves a name across

multi-domain federated systems; (ii) a secure proxy shim to delegate access

control decisions in a key-value store; (iii) an authorization module for a

networked infrastructure-as-a-service system with a federated trust structure

(NSF GENI initiative); and (iv) a secure cooperative data analytics service

that adheres to individual secrecy constraints while disclosing the data. We

present empirical evaluation based on these case studies and demonstrate that

SAFE supports a wide range of applications with low overhead.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Distributed Computing frameworks belong to a class of programming models that allow developers to

launch workloads on large clusters of machines. Due to the dramatic increase in the volume of

data gathered by ubiquitous computing devices, data analytic workloads have become a common

case among distributed computing applications, making Data Science an entire field of

Computer Science. We argue that Data Scientist's concern lays in three main components: a dataset,

a sequence of operations they wish to apply on this dataset, and some constraint they may have

related to their work (performances, QoS, budget, etc). However, it is actually extremely

difficult, without domain expertise, to perform data science. One need to select the right amount

and type of resources, pick up a framework, and configure it. Also, users are often running their

application in shared environments, ruled by schedulers expecting them to specify precisely their resource

needs. Inherent to the distributed and concurrent nature of the cited frameworks, monitoring and

profiling are hard, high dimensional problems that block users from making the right

configuration choices and determining the right amount of resources they need. Paradoxically, the

system is gathering a large amount of monitoring data at runtime, which remains unused.

In the ideal abstraction we envision for data scientists, the system is adaptive, able to exploit

monitoring data to learn about workloads, and process user requests into a tailored execution

context. In this work, we study different techniques that have been used to make steps toward

such system awareness, and explore a new way to do so by implementing machine learning

techniques to recommend a specific subset of system configurations for Apache Spark applications.

Furthermore, we present an in depth study of Apache Spark executors configuration, which highlight

the complexity in choosing the best one for a given workload.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Because the interactions between feedforward influences are inextricably linked during many motor outputs (including but not limited to walking), the contribution of descending inputs to the generation of movements is difficult to study. Here we take advantage of the relatively small number of descending neurons (DNs) in the Drosophila melanogaster model system. We first characterize the number and distribution of the DN populations, then present a novel load free preparation, which enables the study of descending control on limb movements in a context where sensory feedback can be is reduced while leaving the nervous system, musculature, and cuticle of the animal relatively intact. Lastly we use in-vivo whole cell patch clamp electrophysiology to characterize the role of individual DNs in response to specific sensory stimuli and in relationship to movement. We find that there are approximately 1100 DNs in Drosophila that are distributed across six clusters. Input from these DNs is not necessary for coordinated motor activity, which can be generated by the thoracic ganglion, but is necessary for the specific combinations of joint movements typically observed in walking. Lastly, we identify a particular cluster of DNs that are tuned to sensory stimuli and innervate the leg neuromeres. We propose that a multi-layered interaction between these DNs, other DNs, and motor circuits in the thoracic ganglia enable the diverse but well-coordinated range of motor outputs an animal might exhibit.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Requirements Engineering (RE) has received much attention in research and practice due to its importance to software project success. Its inter-disciplinary nature, the dependency to the customer, and its inherent uncertainty still render the discipline diffcult to investigate. This results in a lack of empirical data. These are necessary, however, to demonstrate which practically relevant RE problems exist and to what extent they matter. Motivated by this situation, we initiated the Naming the Pain in Requirements Engineering (NaPiRE) initiative which constitutes a globally distributed, bi-yearly replicated family of surveys on the status quo and problems in practical RE.

In this article, we report on the analysis of data obtained from 228 companies in 10 countries. We apply Grounded Theory to the data obtained from NaPiRE and reveal which contemporary problems practitioners encounter. To this end, we analyse 21 problems derived from the literature with respect to their relevance and criticality in dependency to their context, and we complement this picture with a cause-effect analysis showing the causes and effects surrounding the most critical problems.

Our results give us a better understanding of which problems exist and how they manifest themselves in practical environments. Thus, we provide a rst step to ground contributions to RE on empirical observations which, by now, were dominated by conventional wisdom only.