932 results for data complexity
Error, Bias, and Long-Branch Attraction in Data for Two Chloroplast Photosystem Genes in Seed Plants
Abstract:
Sequences of two chloroplast photosystem genes, psaA and psbB, together comprising about 3,500 bp, were obtained for all five major groups of extant seed plants and several outgroups among other vascular plants. Strongly supported, but significantly conflicting, phylogenetic signals were obtained in parsimony analyses from partitions of the data into first and second codon positions versus third positions. In the former, both genes supported a monophyletic gymnosperm clade, with Gnetales closely related to certain conifers. In the latter, Gnetales are inferred to be the sister group of all other seed plants, with gymnosperms paraphyletic. None of the data supported the modern "anthophyte hypothesis," which places Gnetales as the sister group of flowering plants. A series of simulation studies was undertaken to examine the error rate for parsimony inference. Three kinds of errors were examined: random error, systematic bias (both properties of finite data sets), and statistical inconsistency owing to long-branch attraction (an asymptotic property). Parsimony reconstructions were extremely biased for third-position data for psbB. Regardless of the true underlying tree, a tree in which Gnetales are sister to all other seed plants was likely to be reconstructed for these data. None of the combinations of genes or partitions permits the anthophyte tree to be reconstructed with high probability. Simulations of progressively larger data sets indicate the existence of long-branch attraction (statistical inconsistency) for third-position psbB data if either the anthophyte tree or the gymnosperm tree is correct. This is also true for the anthophyte tree using either psaA third positions or psbB first and second positions. A factor contributing to bias and inconsistency is extremely short branches at the base of the seed plant radiation, coupled with extremely high rates in Gnetales and nonseed plant outgroups.
M. J. Sanderson, M. F. Wojciechowski, J.-M. Hu, T. Sher Khan, and S. G. Brady
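The codon-position partitioning described above (first and second positions versus third positions) is straightforward to reproduce; the sketch below is a minimal, hypothetical Python illustration of splitting an in-frame alignment into those two character sets before running separate parsimony analyses, and is not the authors' own pipeline.

    # Minimal sketch (not the authors' pipeline): split an in-frame coding
    # alignment into first+second vs. third codon positions, the data
    # partitioning used in the parsimony analyses described above.
    def split_codon_positions(alignment):
        """alignment maps taxon name -> aligned, in-frame nucleotide sequence."""
        pos12, pos3 = {}, {}
        for taxon, seq in alignment.items():
            pos12[taxon] = "".join(seq[i:i + 2] for i in range(0, len(seq) - 2, 3))
            pos3[taxon] = seq[2::3]
        return pos12, pos3

    # Hypothetical toy fragments; the real analyses used the ~3,500 bp matrix.
    toy = {"Gnetum": "ATGGCTTTAGGT", "Pinus": "ATGGCATTGGGA"}
    p12, p3 = split_codon_positions(toy)
    print(p12["Gnetum"], p3["Gnetum"])  # ATGCTTGG GTAT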
Abstract:
Health Information Systems (HIS) make extensive use of Information and Communication Technologies (ICT). The use of ICT aids in improving the quality and efficiency of healthcare services by making healthcare information available at the point of care (Goldstein, Groen, Ponkshe, and Wine, 2007). The increasing availability of healthcare data presents security and privacy issues which have not yet been fully addressed (Liu, Caelli, May, and Croll, 2008a). Healthcare organisations have to comply with the security and privacy requirements stated in laws, regulations and ethical standards while managing healthcare information. Protecting the security and privacy of healthcare information is a very complex task (Liu, May, Caelli and Croll, 2008b). In order to reduce the complexity of providing security and privacy in HIS, appropriate information security services and mechanisms have to be implemented. Solutions at the application layer have already been implemented in HIS, such as those used in healthcare web services (Weaver et al., 2003). In addition, Discretionary Access Control (DAC) is the most commonly implemented access control model for restricting access to resources at the OS layer (Liu, Caelli, May, Croll and Henricksen, 2007a). Nevertheless, the combination of application security mechanisms and DAC at the OS layer has been argued to be insufficient to satisfy security requirements in computer systems (Loscocco et al., 1998). This thesis investigates the feasibility of implementing Security Enhanced Linux (SELinux) to enforce a Role-Based Access Control (RBAC) policy to help protect resources at the Operating System (OS) layer. SELinux provides Mandatory Access Control (MAC) mechanisms at the OS layer. These mechanisms can contain the damage from compromised applications and restrict access to resources according to the security policy implemented. The main contribution of this research is to provide a modern framework for implementing and managing SELinux in HIS. The proposed framework introduces SELinux Profiles to restrict access permissions over system resources to authorised users. The feasibility of using SELinux profiles in HIS was demonstrated through the creation of a prototype, which was subjected to various attack scenarios. The prototype was also tested under emergency scenarios, where changes to the security policies had to be made on the spot. Attack scenarios were based on vulnerabilities common at the application layer. SELinux demonstrated that it could effectively contain attacks at the application layer and provide adequate flexibility during emergency situations. However, even with the use of current tools, the development of SELinux policies can be very complex. Further research is needed to simplify the management of SELinux policies and access permissions. In addition, SELinux-related technologies, such as the Policy Management Server by Tresys Technologies, need to be investigated in order to provide solutions at different layers of protection.
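As a point of reference for the access-control model named above, the following is a minimal Python sketch of the RBAC idea (users act only through roles, and roles grant permissions over resources); it is illustrative only, with hypothetical roles and resources, and stands in for what the thesis actually enforces at the OS layer through SELinux profiles.

    # Minimal RBAC sketch (illustrative only; the thesis enforces this at the
    # OS layer with SELinux rather than in application code). Role names,
    # users and resources below are hypothetical.
    ROLE_PERMISSIONS = {
        "doctor":       {("patient_record", "read"), ("patient_record", "write")},
        "nurse":        {("patient_record", "read")},
        "receptionist": {("appointment", "read"), ("appointment", "write")},
    }

    USER_ROLES = {"alice": {"doctor"}, "bob": {"nurse"}}

    def is_allowed(user, resource, action):
        """A user may perform an action only if one of their roles grants it."""
        return any((resource, action) in ROLE_PERMISSIONS.get(role, set())
                   for role in USER_ROLES.get(user, set()))

    print(is_allowed("bob", "patient_record", "write"))    # False: nurses read only
    print(is_allowed("alice", "patient_record", "write"))  # True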
Abstract:
CFO and I/Q mismatch can cause significant performance degradation in OFDM systems. Their estimation and compensation are generally difficult because they are entangled in the received signal. In this paper, we propose low-complexity estimation and compensation schemes for the receiver which are robust to a wide range of CFO and I/Q mismatch values, although performance degrades slightly for very small CFO. These schemes consist of three steps: forming a cosine estimator free of I/Q mismatch interference, estimating the I/Q mismatch using the estimated cosine value, and forming a sine estimator using samples after I/Q mismatch compensation. These estimators are based on the observation that an estimate of the cosine serves much better as the basis for I/Q mismatch estimation than an estimate of the CFO derived from the cosine function. Simulation results show that the proposed schemes improve system performance significantly and are robust to CFO and I/Q mismatch.
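To make the impairments concrete, the sketch below simulates (in NumPy, with made-up parameter values) a baseband OFDM symbol affected by CFO and by receiver I/Q mismatch under the common y[n] = alpha*x[n] + beta*conj(x[n]) model; it illustrates the entangled distortion the proposed estimators must undo, not the estimators themselves.

    # Illustrative impairment model only (not the paper's estimator): an OFDM
    # symbol distorted by CFO and receiver I/Q mismatch. Parameter values are
    # invented for the example.
    import numpy as np

    rng = np.random.default_rng(0)
    N = 64                                     # FFT size / symbol length
    s = (rng.choice([-1, 1], N) + 1j * rng.choice([-1, 1], N)) / np.sqrt(2)  # QPSK
    x = np.fft.ifft(s) * np.sqrt(N)            # time-domain symbol (no CP, for brevity)

    eps = 0.13                                 # CFO in subcarrier spacings
    n = np.arange(N)
    x_cfo = x * np.exp(1j * 2 * np.pi * eps * n / N)

    g, phi = 1.05, np.deg2rad(5)               # amplitude and phase imbalance
    alpha = (1 + g * np.exp(-1j * phi)) / 2
    beta = (1 - g * np.exp(1j * phi)) / 2
    y = alpha * x_cfo + beta * np.conj(x_cfo)  # samples seen by the receiver

    print(abs(alpha), abs(beta))               # beta != 0 -> image interference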
Abstract:
With the advent of Service Oriented Architecture, Web services have gained tremendous popularity. Due to the availability of a large number of Web services, finding an appropriate Web service according to the requirements of the user is a challenge. This warrants the need to establish an effective and reliable process of Web service discovery. A considerable body of research has emerged to develop methods to improve the accuracy of Web service discovery to match the best service. The process of Web service discovery results in suggesting many individual services that partially fulfil the user's interest. Considering the semantic relationships of the words used to describe services, together with their input and output parameters, can lead to more accurate Web service discovery. Appropriate linking of individually matched services should then fully satisfy the requirements the user is looking for. This research proposes to integrate a semantic model and a data mining technique to enhance the accuracy of Web service discovery. A novel three-phase Web service discovery methodology is proposed. The first phase performs match-making to find semantically similar Web services for a user query. In order to perform semantic analysis on the content of the Web Service Description Language documents, a support-based latent semantic kernel is constructed using an innovative concept of binning and merging on a large quantity of text documents covering diverse domains of knowledge. The use of a generic latent semantic kernel constructed with a large number of terms helps to find the hidden meaning of query terms that could not otherwise be found. Sometimes a single Web service is unable to fully satisfy the requirement of the user. In such cases, a composition of multiple inter-related Web services is presented to the user. The task of checking the possibility of linking multiple Web services is performed in the second phase. Once the feasibility of linking Web services is checked, the objective is to provide the user with the best composition of Web services. In this link-analysis phase, the Web services are modelled as nodes of a graph and an all-pairs shortest-path algorithm is applied to find the optimum path at the minimum cost of traversal. The third phase, system integration, integrates the results from the preceding two phases using an original fusion algorithm in the fusion engine. Finally, the recommendation engine, which is an integral part of the system integration phase, makes the final recommendations, including individual and composite Web services, to the user. In order to evaluate the performance of the proposed method, extensive experimentation has been performed. Results of the proposed support-based semantic kernel method of Web service discovery are compared with those of a standard keyword-based information-retrieval method and a clustering-based machine-learning method of Web service discovery. The proposed method outperforms both the information-retrieval and machine-learning based methods. Experimental results and statistical analysis also show that the best Web service compositions are obtained by considering 10 to 15 of the Web services found in phase I for linking. Empirical results also confirm that the fusion engine boosts the accuracy of Web service discovery by combining the inputs from both the semantic analysis (phase I) and the link analysis (phase II) in a systematic fashion. Overall, the accuracy of Web service discovery with the proposed method shows a significant improvement over traditional discovery methods.
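As an illustration of the link-analysis phase described above, the sketch below models a handful of hypothetical Web services as graph nodes and runs a standard all-pairs shortest-path algorithm (Floyd-Warshall) to find the cheapest composition path; the service names and costs are invented, and the thesis's own cost model is not reproduced.

    # Illustrative sketch of the link-analysis idea: Web services as nodes,
    # all-pairs shortest paths (Floyd-Warshall) for the cheapest composition.
    INF = float("inf")

    def floyd_warshall(nodes, edges):
        """edges: dict (u, v) -> traversal cost. Returns all-pairs costs."""
        dist = {(u, v): (0 if u == v else edges.get((u, v), INF))
                for u in nodes for v in nodes}
        for k in nodes:
            for i in nodes:
                for j in nodes:
                    via_k = dist[(i, k)] + dist[(k, j)]
                    if via_k < dist[(i, j)]:
                        dist[(i, j)] = via_k
        return dist

    # Hypothetical services and link costs.
    services = ["BookFlight", "BookHotel", "CurrencyConvert", "PayInvoice"]
    links = {("BookFlight", "CurrencyConvert"): 1.0,
             ("CurrencyConvert", "PayInvoice"): 0.5,
             ("BookFlight", "BookHotel"): 2.0,
             ("BookHotel", "PayInvoice"): 1.5}
    cost = floyd_warshall(services, links)
    print(cost[("BookFlight", "PayInvoice")])  # 1.5, via CurrencyConvert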
Abstract:
Principal Topic: In this paper we seek to highlight the important intermediate role that the gestation process plays in entrepreneurship by examining its key antecedents and its consequences for new venture emergence. In doing so we take a behavioural perspective and argue that it is not only what a nascent venture is, but what it does (Katz & Gartner, 1988; Shane & Delmar, 2004; Reynolds, 2007) and when it does it during start-up (Reynolds & Miller, 1992; Lichtenstein, Carter, Dooley & Gartner, 2007) that is important. To extend an analogy from biological development, we suggest that the way a new venture is nurtured is just as fundamental as its nature. Much prior research has focused on the nature of new ventures and attempted to attribute variations in outcomes directly to the impact of resource endowments and investments. There is little doubt that venture resource attributes such as human capital, and specifically prior entrepreneurial experience (Alsos & Kolvereid, 1998), and access to social (Davidsson & Honig, 2003) and financial capital have an influence. However, resource attributes themselves are distal from successful start-up endeavours and remain inanimate if not for the actions of the nascent venture. The key contribution we make is to shift the focus from whether or not actions are taken to when those actions happen and how they are situated in the overall gestation process. Thus, we suggest that it is gestation process dynamics, or when gestation actions occur, that are more proximal to venture outcomes, and we focus on these. Recently, scholars have highlighted the complexity that exists in the start-up or gestation process, be it temporal or contextual (Liao, Welsch & Tan, 2005; Lichtenstein et al., 2007). There is great variation in how long a start-up process might take (Reynolds & Miller, 1992), some processes require less action than others (Carter, Gartner & Reynolds, 1996), and the overall intensity of the start-up effort is also deemed important (Reynolds, 2007). And, despite some evidence that particular activities are more influential than others (Delmar & Shane, 2003), the influence on success of the order in which events happen has until now remained largely indeterminate (Liao & Welsch, 2008). We suggest that it is this complexity of the intervening gestation process that attenuates the effect of resource endowment and has resulted in mixed findings in previous research. Thus, in order to reduce complexity, we take a holistic view of the gestation process and argue that it is its dynamic properties that determine nascent venture attempt outcomes. Importantly, we acknowledge that particular gestation processes would not of themselves guarantee successful start-up; more correctly, it is the fit between the process dynamics and the venture's attributes (Davidsson, 2005) that is influential. We therefore aim to examine process dynamics by comparing sub-groups of venture types by resource attributes. Thus, as an initial step toward unpacking the complexity of the gestation process, this paper aims to establish the importance of its role as an intermediary between attributes of the nascent venture and the emergence of that venture. Here, we make a contribution by empirically examining gestation process dynamics and their fit with venture attributes.
We do this by, firstly, examining the nature of the influence that venture attributes such as human and social capital have on the dynamics of the gestation process and, secondly, by investigating the effect that gestation process dynamics have on venture creation outcomes.
Methodology and Propositions: In order to explore the importance that gestation process dynamics have in nascent entrepreneurship, we conduct an empirical study of venture start-ups. Data are drawn from a screened random sample of 625 Australian nascent business ventures prior to their achieving consistent outcomes in the market. These data were collected during 2007/8 and 2008/9 as part of the Comprehensive Australian Study of Entrepreneurial Emergence (CAUSEE) project (Davidsson et al., 2008). CAUSEE is a longitudinal panel study conducted over four years, sourcing information from annually administered telephone surveys. Importantly for our study, this methodology allows for the capture and tracking of active nascent venture creation as it happens, thus reducing hindsight and selection biases. In addition, improved tests of causality can be made, given that outcome measures are temporally removed from preceding events. The data analysed in this paper represent the first two of these four years and, for the first time, include follow-up outcome measures for these venture attempts: 260 were successful, 126 were abandoned, and 191 are still in progress. With regard to venture attributes as gestation process antecedents, we examine specific human capital, measured as successful prior experience in entrepreneurship, and the direct social capital of the venture, in the form of 'team start-ups'. In assessing gestation process dynamics we follow Lichtenstein et al. (2007) in suggesting that the rate, concentration and timing of gestation activities may be used to summarise the complexity dynamics of that process. In addition, we extend this set of measures to include the interaction of discovery and exploitation, by way of changes made to the venture idea. Ventures with successful prior experience, or those conducting symbiotic parallel start-up attempts, may be able to, or be forced to, leave their gestation actions until later and still achieve a successful outcome. In addition, access to direct social capital may provide the support on which a venture can draw in order to persevere in the face of adversity, turning a seemingly futile start-up attempt into a success. On the other hand, prior experience may engender the foresight to terminate a venture attempt early should it be seen to be going nowhere. The temporal nature of these conjectures highlights the importance that process dynamics play and will be examined in this research. Statistical models are developed to examine gestation process dynamics: we use multivariate general linear modelling to analyse how human and social capital factors influence gestation process dynamics, and, in turn, event history models and stratified Cox regression to assess the influence that gestation process dynamics have on venture outcomes.
Results and Implications: What entrepreneurs do is of interest to scholars and practitioners alike. The results of this research are therefore important, since they focus on nascent behaviour and its outcomes. While venture attributes themselves may be influential, this is of little actionable assistance to practitioners. For example, it is unhelpful to say to the prospective first-time entrepreneur "you'll be more successful if you have lots of prior experience in firm start-ups". This research attempts to close this relevance gap by addressing which gestation behaviours might be appropriate, when actions are best focused, and, most importantly, in what circumstances. Further, we make a contribution to the entrepreneurship literature by examining the role that gestation process dynamics play in outcomes, rather than attributing these solely to the nature of the venture itself. This extension is, to the best of our knowledge, new to the research field.
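For readers unfamiliar with the event-history models mentioned in the methodology, the sketch below shows the general shape of a stratified Cox regression of time-to-emergence in Python; it assumes the third-party lifelines package, and the column names and data are invented, so it stands in for, rather than reproduces, the CAUSEE analysis.

    # Hedged sketch of a stratified Cox (event-history) model of venture
    # emergence. Assumes the third-party `lifelines` package; data and column
    # names are invented and do not come from CAUSEE.
    import pandas as pd
    from lifelines import CoxPHFitter

    df = pd.DataFrame({
        "months_in_gestation": [6, 22, 9, 30, 14, 11, 18, 26],  # follow-up time
        "emerged":             [1, 0, 1, 0, 0, 1, 1, 0],        # 1 = operational firm
        "activity_rate":       [0.8, 0.7, 0.3, 0.2, 0.6, 0.2, 0.5, 0.4],  # process dynamics
        "team_startup":        [1, 1, 1, 1, 0, 0, 0, 0],        # venture attribute
    })

    cph = CoxPHFitter()
    # Stratify on the venture attribute so its baseline hazard may differ,
    # while estimating the effect of the gestation-process covariate.
    cph.fit(df, duration_col="months_in_gestation", event_col="emerged",
            strata=["team_startup"])
    cph.print_summary()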
Abstract:
We introduce K-tree in an information retrieval context. It is an efficient approximation of the k-means clustering algorithm; unlike k-means, it forms a hierarchy of clusters, and it has been extended to address issues with sparse representations. We compare its performance and cluster quality to CLUTO using document collections. K-tree has a low time complexity that makes it suitable for large document collections, and its tree structure allows for efficient disk-based implementations when space requirements exceed main memory.
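The sketch below is a rough illustration of the hierarchy-of-clusters idea only: a recursive (bisecting-style) k-means over toy vectors using scikit-learn. The actual K-tree builds its tree incrementally in a B-tree-like fashion, which this deliberately simplified example does not reproduce.

    # Illustrative only: a recursive k-means that yields a cluster hierarchy,
    # approximating flat k-means. Not the K-tree insertion algorithm itself.
    import numpy as np
    from sklearn.cluster import KMeans

    def cluster_tree(vectors, branching=2, min_leaf=4, depth=0, max_depth=3):
        """Recursively split vectors into `branching` clusters; return a nested dict."""
        if len(vectors) <= max(min_leaf, branching) or depth >= max_depth:
            return {"size": len(vectors)}                       # leaf node
        km = KMeans(n_clusters=branching, n_init=10, random_state=0).fit(vectors)
        children = [cluster_tree(vectors[km.labels_ == c], branching,
                                 min_leaf, depth + 1, max_depth)
                    for c in range(branching)]
        return {"size": len(vectors), "children": children}

    docs = np.random.default_rng(1).normal(size=(100, 8))       # stand-in "documents"
    print(cluster_tree(docs))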
Abstract:
The construction industry has adopted information technology in its processes in the form of computer-aided design and drafting, construction documentation and maintenance. The data generated within the construction industry has become increasingly overwhelming. Data mining is a sophisticated data-search capability that uses classification algorithms to discover patterns and correlations within a large volume of data. This paper presents the selection and application of data mining techniques to the maintenance data of buildings. The results of applying such techniques, and the potential benefits of using them to identify useful patterns of knowledge and correlations that support decisions aimed at improving the management of the building life cycle, are presented and discussed.
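As a concrete, hypothetical example of the kind of classification the paper applies, the sketch below trains a small decision tree on invented building-maintenance records to predict whether a job turns out to be high-cost; the column names, values and the use of scikit-learn are assumptions for illustration only.

    # Hedged sketch: a decision tree over invented maintenance records,
    # illustrating classification-based pattern discovery (not the paper's data).
    import pandas as pd
    from sklearn.tree import DecisionTreeClassifier, export_text

    records = pd.DataFrame({
        "asset_type":   ["HVAC", "lift", "plumbing", "HVAC", "lift", "plumbing"],
        "building_age": [12, 30, 8, 25, 5, 18],
        "urgent":       [1, 1, 0, 0, 0, 1],
        "high_cost":    [1, 1, 0, 1, 0, 0],       # target to learn
    })

    X = pd.get_dummies(records[["asset_type", "building_age", "urgent"]])
    y = records["high_cost"]

    tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)
    print(export_text(tree, feature_names=list(X.columns)))  # human-readable rules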
Abstract:
Introduction Many bilinguals will have had the experience of unintentionally reading something in a language other than the intended one (e.g. MUG to mean mosquito in Dutch rather than a receptacle for a hot drink, as one of the possible intended English meanings), of finding themselves blocked on a word for which many alternatives suggest themselves (but, somewhat annoyingly, not in the right language), of their accent changing when stressed or tired and, occasionally, of starting to speak in a language that is not understood by those around them. These instances where lexical access appears compromised and control over language behavior is reduced hint at the intricate structure of the bilingual lexical architecture and the complexity of the processes by which knowledge is accessed and retrieved. While bilinguals might tend to blame word finding and other language problems on their bilinguality, these difficulties per se are not unique to the bilingual population. However, what is unique, and yet far more common than is appreciated by monolinguals, is the cognitive architecture that subserves bilingual language processing. With bilingualism (and multilingualism) the rule rather than the exception (Grosjean, 1982), this architecture may well be the default structure of the language processing system. As such, it is critical that we understand more fully not only how the processing of more than one language is subserved by the brain, but also how this understanding furthers our knowledge of the cognitive architecture that encapsulates the bilingual mental lexicon. The neurolinguistic approach to bilingualism focuses on determining the manner in which the two (or more) languages are stored in the brain and how they are differentially (or similarly) processed. The underlying assumption is that the acquisition of more than one language requires at the very least a change to or expansion of the existing lexicon, if not the formation of language-specific components, and this is likely to manifest in some way at the physiological level. There are many sources of information, ranging from data on bilingual aphasic patients (Paradis, 1977, 1985, 1997) to lateralization (Vaid, 1983; see Hull & Vaid, 2006, for a review), recordings of event-related potentials (ERPs) (e.g. Ardal et al., 1990; Phillips et al., 2006), and positron emission tomography (PET) and functional magnetic resonance imaging (fMRI) studies of neurologically intact bilinguals (see Indefrey, 2006; Vaid & Hull, 2002, for reviews). Following the consideration of methodological issues and interpretative limitations that characterize these approaches, the chapter focuses on how the application of these approaches has furthered our understanding of (1) selectivity of bilingual lexical access, (2) distinctions between word types in the bilingual lexicon and (3) control processes that enable language selection.
Using Agents for Mining Maintenance Data while Interacting in 3D Object-Oriented Virtual Environments
Abstract:
This report demonstrates the development of: (a) an object-oriented representation to provide a 3D interactive environment using data provided by Woods Bagot; (b) the basis of agent technology for mining building maintenance data; and (c) 3D interaction in virtual environments using the object-oriented representation. The application of data mining to an industry maintenance database was demonstrated in the previous report.
Abstract:
This report demonstrates:
• the development of software agents for data mining
• linking data mining to the building model in virtual environments
• linking knowledge development with the building model in virtual environments
• a demonstration of the software agents for data mining
• population of the building model with maintenance data
Abstract:
The building life cycle process is complex and prone to fragmentation as it moves through its various stages. The number of participants, and the diversity, specialisation and isolation, both in space and time, of their activities have dramatically increased over time. The data generated within the construction industry has become increasingly overwhelming. Most currently available computer tools for the building industry offer productivity improvements in the transmission of graphical drawings and textual specifications, without addressing more fundamental changes in building life cycle management. Facility managers and building owners are primarily concerned with highlighting areas of existing or potential maintenance problems in order to improve building performance, satisfy occupants and minimise turnover, and especially to reduce the operational cost of maintenance. In doing so, they collect large amounts of data that are stored in the building's maintenance database. The work described in this paper is targeted at adding value to the design and maintenance of buildings by turning maintenance data into information and knowledge. Data mining technology presents an opportunity to significantly increase the rate at which the volumes of data generated through the maintenance process can be turned into useful information. This can be done using classification algorithms to discover patterns and correlations within a large volume of data. This paper presents which data mining techniques can be applied to building maintenance data, and how, in order to identify the impediments to better performance of building assets. It demonstrates what sorts of knowledge can be found in maintenance records. The benefits to the construction industry lie in turning passive data in databases into knowledge that can improve the efficiency of the maintenance process and of future designs that incorporate that maintenance knowledge.
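To illustrate the "patterns and correlations" referred to above, the sketch below computes simple co-occurrence (support/confidence) rules over a handful of invented maintenance-record attributes in plain pandas; it is a toy stand-in, not the techniques or data used in the paper.

    # Illustrative sketch: simple association-style co-occurrence rules over
    # invented maintenance records (not the paper's data or tooling).
    import pandas as pd
    from itertools import permutations

    # One row per maintenance job; True marks attributes recorded for that job.
    jobs = pd.DataFrame({
        "roof_leak":      [True, True, False, True, False, True],
        "after_storm":    [True, True, False, True, False, False],
        "repeat_callout": [True, False, False, True, False, False],
        "hvac_fault":     [False, False, True, False, True, False],
    })

    # Rule "A -> B": support = P(A and B), confidence = P(B | A).
    for a, b in permutations(jobs.columns, 2):
        support = (jobs[a] & jobs[b]).mean()
        confidence = support / jobs[a].mean()
        if support >= 0.3 and confidence >= 0.7:
            print(f"{a} -> {b}: support={support:.2f}, confidence={confidence:.2f}")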
Abstract:
Qualitative research methods require transparency to ensure the ‘trustworthiness’ of the data analysis. The intricate processes of organizing, coding and analyzing the data are often rendered invisible in the presentation of the research findings, which requires a ‘leap of faith’ for the reader. Computer assisted data analysis software can be used to make the research process more transparent, without sacrificing rich, interpretive analysis by the researcher. This article describes in detail how one software package was used in a poststructural study to link and code multiple forms of data to four research questions for fine-grained analysis. This description will be useful for researchers seeking to use qualitative data analysis software as an analytic tool.
Abstract:
This project, as part of a broader Sustainable Sub-divisions research agenda, addresses the role of natural ventilation in reducing the energy required to cool dwellings.
Abstract:
In the case of industrial relations research, particularly that which sets out to examine practices within workplaces, the best way to study this real-life context is to work for the organisation. Studies conducted by researchers working within the organisation comprise some of the (broad) field’s classic research (cf. Roy, 1954; Burawoy, 1979). Participant and non-participant ethnographic research provides an opportunity to investigate workplace behaviour beyond the scope of questionnaires and interviews. However, we suggest that the data collected outside a workplace can be just as important as the data collected inside the organisation’s walls. In recent years the introduction of anti-smoking legislation in Australia has meant that people who smoke cigarettes are no longer allowed to do so inside buildings. Not only are smokers forced outside to engage in their habit, but they have to smoke prescribed distances from doorways, or in some workplaces outside the property line. This chapter considers the importance of cigarette-smoking employees in ethnographic research. Through data collected across three separate research projects, the chapter argues that smokers, as social outcasts in the workplace, can provide a wealth of important research data. We suggest that smokers also appear more likely to provide stories that contradict the ‘management’ or ‘organisational’ position. Thus, within the haze of smoke, researchers can uncover a level of discontent with the ‘corporate line’ presented inside the workplace. There are several aspects to the increased propensity of smokers to provide a contradictory or discontented story. It may be that the researcher is better able to establish a rapport with smokers, as there is a removal of the artificial wall a researcher presents as an outsider. It may also be that a research location physically outside the boundaries of the organisation provides workers with the freedom to express their discontent. The authors offer no definitive answers; rather, this chapter is intended to extend our knowledge of workplace research through highlighting the methodological value in using smokers as research subjects. We present the experience of three separate case studies where interactions with cigarette smokers have provided either important organisational data or alternatively a means of entering what Cunnison (1966) referred to as the ‘gossip circle’. The final section of the chapter draws on the evidence to demonstrate how the community of smokers, as social outcasts, are valuable in investigating workplace issues. For researchers and practitioners, these social outcasts may very well prove to be an important barometer of employee attitudes; attitudes that perhaps cannot be measured through traditional staff surveys.