976 results for Dataset
Abstract:
In this article we describe a semantic localization dataset for indoor environments named ViDRILO. The dataset provides five sequences of frames acquired with a mobile robot in two similar office buildings under different lighting conditions. Each frame consists of a point cloud representation of the scene and a perspective image. The frames in the dataset are annotated not only with the semantic category of the scene but also with the presence or absence of a list of predefined objects appearing in the scene. In addition to the frames and annotations, the dataset is distributed with a set of tools for its use in both place classification and object recognition tasks. The large number of labeled frames, in conjunction with the annotation scheme, makes this dataset different from existing ones. The ViDRILO dataset is released for use as a benchmark for problems such as multimodal place classification and object recognition, 3D reconstruction, and point cloud data compression.
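As a rough illustration of how such multimodal frames might be consumed, the following Python sketch iterates over paired point clouds and images together with their annotations. The directory layout, file names, column names, and the use of Open3D and Pillow are assumptions for illustration only; they are not ViDRILO's actual distribution format or toolbox.

```python
# Illustration only: iterating over multimodal frames (point cloud + image +
# labels). File layout and column names are hypothetical, not ViDRILO's own.
import csv
from pathlib import Path

import open3d as o3d   # reads .pcd point clouds
from PIL import Image  # reads perspective images

DATA_DIR = Path("vidrilo/sequence1")        # hypothetical directory
ANNOTATIONS = DATA_DIR / "annotations.csv"  # hypothetical annotation table

with open(ANNOTATIONS, newline="") as f:
    for row in csv.DictReader(f):
        cloud = o3d.io.read_point_cloud(str(DATA_DIR / row["pcd_file"]))
        image = Image.open(DATA_DIR / row["image_file"])
        category = row["room_category"]              # scene-level label
        objects = row["objects_present"].split(";")  # object presence list
        print(category, len(cloud.points), image.size, objects)
```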
Abstract:
This study presents the results of a systematic literature review on the combined field of Accessibility and Massive Open Online Courses (MOOCs), covering the period from 2008 to July 2016. This dataset updates the previous release, which covered 2008 to May 2016 (http://hdl.handle.net/10045/54846).
Abstract:
This thesis presents a cloud-based software platform for sharing publicly available scientific datasets. The proposed platform leverages the potential of NoSQL databases and asynchronous I/O technologies, such as Node.JS, in order to achieve high performance and flexible solutions. The solution serves two main groups of users: dataset providers, the researchers responsible for sharing and maintaining datasets, and dataset users, those who wish to access the public data. The former are given tools to easily publish and maintain large volumes of data, whereas the latter are given tools to preview and create subsets of the original data through filter and aggregation operations. The choice of NoSQL over a more traditional RDBMS emerged from an extended benchmark between relational databases (MySQL) and NoSQL (MongoDB) that is also presented in this thesis. The results confirm the theoretical expectation that NoSQL databases are more suitable for the kind of data our system's users will be handling, i.e., non-homogeneous data structures that can grow very fast. It is envisioned that a platform like this can lead the way to a new era of scientific data sharing in which researchers are able to easily share and access all kinds of datasets, and, in more advanced scenarios, be presented with recommended datasets and existing research results built on those recommendations.
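The filter and aggregation operations offered to dataset users could look roughly like the following PyMongo sketch. The database, collection, and field names are hypothetical; this illustrates standard MongoDB usage, not the platform's actual API.

```python
# Sketch of the kind of filter and aggregation operations exposed to dataset
# users, using MongoDB via PyMongo. All names below are hypothetical.
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
readings = client["open_science"]["sensor_readings"]   # hypothetical dataset

# Filter: build a subset of the original data.
subset = readings.find({"station": "A42", "year": {"$gte": 2010}})

# Aggregation: summarise the subset on the server side.
pipeline = [
    {"$match": {"station": "A42", "year": {"$gte": 2010}}},
    {"$group": {"_id": "$year", "mean_value": {"$avg": "$value"}}},
    {"$sort": {"_id": 1}},
]
for doc in readings.aggregate(pipeline):
    print(doc["_id"], doc["mean_value"])
```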
Abstract:
Driving under the influence (DUI) is a major road safety problem. Historically, alcohol has been assumed to play the larger role in crashes, and DUI education programs have reflected this assumption, although recent evidence suggests that younger drivers are becoming more likely to drive drugged than to drive drunk. This is a study of 7096 Texas clients under age 21 who were admitted to state-funded treatment programs between 1997 and 2007 with a past-year DUI arrest, DUI probation, or DUI referral. Data were obtained from the State’s administrative dataset. Multivariate logistic regression models were used to understand the differences between minors entering treatment with a DUI as compared to a non-DUI, as well as the risks for completing treatment and for being abstinent in the month prior to follow-up. A major finding was that over time, the primary problem for underage DUI drivers changed from alcohol to marijuana. Being abstinent in the month prior to discharge, having a primary problem with alcohol rather than another drug, and having more family involvement were the strongest predictors of treatment completion. Living in a household where the client was exposed to alcohol abuse or drug use, having been in residential treatment, and having more drug, alcohol, and family problems were the strongest predictors of not being abstinent at follow-up. As a result, there is a need to direct more attention towards meeting the needs of the young DUI population through programs that address drug as well as alcohol consumption problems.
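A multivariate logistic regression of the kind described could be fitted along the lines of the sketch below. The file and column names are hypothetical placeholders, not the actual variables of the Texas administrative dataset.

```python
# Sketch: logistic regression of treatment completion on hypothetical
# predictors; not the study's actual model specification.
import pandas as pd
import statsmodels.formula.api as smf

clients = pd.read_csv("dui_clients.csv")   # hypothetical extract

model = smf.logit(
    "completed_treatment ~ abstinent_at_discharge + primary_problem_alcohol"
    " + family_sessions + age + C(year_of_admission)",
    data=clients,
).fit()
print(model.summary())   # exponentiate model.params to obtain odds ratios
```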
Abstract:
Principal topic: Effectuation theory suggests that entrepreneurs develop their new ventures in an iterative way by selecting possibilities through flexibility and interactions with the market, a focus on affordability of loss rather than maximal return on the capital invested, and the development of pre-commitments and alliances with stakeholders (Sarasvathy, 2001, 2008; Sarasvathy et al., 2005, 2006). In contrast, causation may be described as a rationalistic reasoning method to create a company: after a comprehensive market analysis to discover opportunities, the entrepreneur selects the alternative with the highest expected return and implements it through the use of a business plan. However, little is known about the consequences of following either of these two processes. One aspect that remains unclear is the relationship between newness and effectuation. On the one hand, it can be argued that the combination of a means-centered, interactive (through pre-commitments and alliances with stakeholders from the early phases of venture creation) and open-minded process (through the flexibility of exploiting contingencies) should encourage and facilitate the development of innovative solutions. On the other hand, having a close relationship with their “future first customers” and focusing too much on the resources and knowledge already within the firm may be a constraint that is not conducive to innovation, or at least not to radical innovation. While it has been suggested that an effectuation strategy is more likely to be used by innovative entrepreneurs (Sarasvathy, 2001), this hypothesis has not yet been demonstrated. Method: In our attempt to capture newness in its different aspects, we considered the following four domains where newness may occur: new product/service; new method for promotion and sales; new production methods/sourcing; market creation. We identified how effectuation may be differently associated with these four domains of newness. To test our four sets of hypotheses, a dataset of 1329 firms (702 nascent and 627 young firms) randomly selected in Australia was examined using ANOVA with Tukey's HSD test. Results and Implications: Results indicate the existence of a curvilinear relationship between effectuation and newness, where low and high levels of newness are associated with low levels of effectuation, while medium levels of newness are associated with high levels of effectuation. Implications for academia, practitioners and policy makers are also discussed.
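For readers unfamiliar with the test, a Tukey HSD comparison of effectuation scores across newness levels might be run roughly as follows. The data frame, column names, and grouping levels are hypothetical, not the CAUSEE variables themselves.

```python
# Sketch: ANOVA-style pairwise comparison of a composite effectuation score
# across newness levels using Tukey's HSD. Names below are hypothetical.
import pandas as pd
from statsmodels.stats.multicomp import pairwise_tukeyhsd

firms = pd.read_csv("causee_firms.csv")     # hypothetical extract
# newness_level: e.g. "low", "medium", "high"; effectuation: composite score
result = pairwise_tukeyhsd(
    endog=firms["effectuation"],
    groups=firms["newness_level"],
    alpha=0.05,
)
print(result.summary())
```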
Abstract:
The research presented in this thesis addresses inherent problems in signature-based intrusion detection systems (IDSs) operating in heterogeneous environments. The research proposes a solution to address the difficulties associated with multi-step attack scenario specification and detection for such environments. The research has focused on two distinct problems: the representation of events derived from heterogeneous sources, and multi-step attack specification and detection. The first part of the research investigates the application of an event abstraction model to event logs collected from a heterogeneous environment. The event abstraction model comprises a hierarchy of events derived from different log sources such as system audit data, application logs, captured network traffic, and intrusion detection system alerts. Unlike existing event abstraction models, where low-level information may be discarded during the abstraction process, the event abstraction model presented in this work preserves all low-level information as well as providing high-level information in the form of abstract events. The event abstraction model presented in this work was designed independently of any particular IDS and thus may be used by any IDS, intrusion forensic tool, or monitoring tool. The second part of the research investigates the use of unification for multi-step attack scenario specification and detection. Multi-step attack scenarios are hard to specify and detect as they often involve the correlation of events from multiple sources which may be affected by time uncertainty. The unification algorithm provides a simple and straightforward scenario matching mechanism by using variable instantiation, where variables represent events as defined in the event abstraction model. The third part of the research addresses time uncertainty. Clock synchronisation is crucial for detecting multi-step attack scenarios which involve logs from multiple hosts, yet issues involving time uncertainty have been largely neglected by intrusion detection research. The system presented in this research introduces two techniques for addressing time uncertainty issues: clock skew compensation and clock drift modelling using linear regression. An off-line IDS prototype for detecting multi-step attacks has been implemented. The prototype comprises two modules: an implementation of the abstract event system architecture (AESA) and the scenario detection module. The scenario detection module implements our signature language, developed based on the Python programming language syntax, and the unification-based scenario detection engine. The prototype has been evaluated using a publicly available dataset of real attack traffic and event logs and a synthetic dataset. The distinctive feature of the public dataset is that it contains multi-step attacks which involve multiple hosts with clock skew and clock drift. These features allow us to demonstrate the application and the advantages of the contributions of this research. All instances of multi-step attacks in the dataset have been correctly identified even though there is significant clock skew and drift in the dataset. Future work identified by this research would be to develop a refined unification algorithm suitable for processing streams of events to enable on-line detection. In terms of time uncertainty, identified future work would be to develop mechanisms which allow automatic clock skew and clock drift identification and correction.
The immediate application of the research presented in this thesis is the framework of an off-line IDS which processes events from heterogeneous sources using abstraction and which can detect multi-step attack scenarios that may involve time uncertainty.
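Clock drift modelling with linear regression, one of the two time-uncertainty techniques mentioned, can be illustrated with a minimal sketch. The timestamp pairs below are made up for illustration; the thesis's own compensation procedure may differ in detail.

```python
# Sketch: fit a linear model remote = drift * reference + skew and use it to
# map remote-host timestamps back onto a reference timeline.
import numpy as np

# Observed pairs: reference clock time vs. remote host clock time (seconds).
reference = np.array([0.0, 600.0, 1200.0, 1800.0, 2400.0])
remote    = np.array([2.1, 602.4, 1202.8, 1803.1, 2403.5])

drift, skew = np.polyfit(reference, remote, deg=1)   # slope, intercept

def to_reference_time(remote_ts: float) -> float:
    """Map a remote timestamp back onto the reference timeline."""
    return (remote_ts - skew) / drift

print(f"skew={skew:.3f}s drift={drift:.6f} -> {to_reference_time(1803.1):.1f}")
```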
Abstract:
Principal Topic: The Comprehensive Australian Study of Entrepreneurial Emergence (CAUSEE) represents the first Australian study to employ and extend the longitudinal and large-scale systematic research developed for the Panel Study of Entrepreneurial Dynamics (PSED) in the US (Gartner, Shaver, Carter and Reynolds, 2004; Reynolds, 2007). This research approach addresses several shortcomings of other data sets, including under-coverage, selection bias, memory decay and hindsight bias, and lack of time separation between the assessment of causes and their assumed effects (Johnson et al 2006; Davidsson 2006). However, a remaining problem is that any random sample of start-ups will be dominated by low-potential, imitative ventures. In recognition of this issue, CAUSEE supplemented PSED-type random samples with theoretically representative samples of 'high potential' emerging ventures, employing a unique methodology using novel multiple screening criteria. We define new 'high-potential' ventures as new entrepreneurial innovative ventures with high aspirations and potential for growth. This distinguishes them from those 'lifestyle' imitative businesses that start small and remain intentionally small (Timmons, 1986). CAUSEE provides the opportunity to explore, for the first time, whether the process and outcomes of high potentials differ from those of traditional lifestyle firms. This allows us to compare process and outcome attributes of the random sample with the high-potential oversample of new firms and young firms. The attributes in which we examine potential differences include source of funding and internationalisation. This is interesting both in terms of helping to explain why different outcomes occur and in terms of assistance to future policymaking, given that high-growth-potential firms are increasingly becoming the focus of government intervention in economic development policies around the world. The first wave of data of a four-year longitudinal study has been collected using these samples, allowing us to provide some initial analysis on which to continue further research. The aim of this paper therefore is to present selected preliminary results from the first wave of the data collection, with comparisons of high potential with lifestyle firms. We expect to see, owing to greater resource requirements and higher risk profiles, more use of venture capital and angel investment, and more internationalisation activity to assist in recouping investment and to overcome Australia's smaller economic markets. Methodology/Key Propositions: In order to develop the samples of 'high potential' firms in the NF and YF categories, a set of qualification criteria was developed. Specifically, to qualify firms as nascent or young high potentials, we used multiple, partly compensating screening criteria related to the human capital and aspirations of the founders as well as the novelty of the venture idea and the venture's use of high technology. A variety of techniques were also employed to develop a multi-level dataset of sources for developing leads and firm details.
A dataset was generated from a variety of websites covering major stakeholders, including the Federal and State Governments, the Australian Chamber of Commerce, university commercialisation offices, patent and trademark attorneys, government and industry awards in entrepreneurship and innovation, industry lead associations, the Venture Capital Association, innovation directories including the Australian Technology Showcase, and business and entrepreneurs' magazines including BRW and Anthill. In total, over 480 industry, association, government and award sources were generated in this process. Of these, 74 discrete sources generated high potentials that fulfilled the criteria. 1116 firms were contacted as high potential cases. 331 cases agreed to participate in the screener, with 279 firms (134 nascent and 140 young firms) successfully passing the high potential criteria. 222 firms (108 nascent and 113 young firms) completed the full interview. For the general sample, CAUSEE conducts screening phone interviews with a very large number of adult members of households randomly selected through random digit dialling, using screening questions which determine whether respondents qualify as 'nascent entrepreneurs'. CAUSEE additionally targets 'young firms', those that commenced trading from 2004 or later. This process yielded 977 Nascent Firms (3.4%) and 1,011 Young Firms (3.6%). These were directed to the full-length interview (40-60 minutes) either directly following the screener or later by appointment. The full-length interviews were completed by 594 NF and 514 YF cases. These are the cases we use in the comparative analysis in this report. Results and Implications: The results for this paper are based on wave one of the survey, which has been completed and the data obtained. It is expected that the findings will assist in beginning to develop an understanding of high potential nascent and young firms in Australia, how they differ from the larger lifestyle entrepreneur group that makes up the vast majority of the new firms created each year, and the elements that may contribute to turning high potential growth status into high growth realities. The results have implications for government in the design of better conditions for the creation of new businesses, for firms who assist high potentials in developing better advice programs in line with a better understanding of their needs and requirements, for individuals who may be considering becoming entrepreneurs in high potential arenas, and for existing entrepreneurs in making better decisions.
Abstract:
Recent theoretical work has suggested that “entrepreneurial capabilities” themselves may provide the resource foundations to deliver competitive advantage for entrepreneurial firms. This paper empirically examines how start-ups use such entrepreneurial capabilities to build competitive advantage. We investigate the effects of technological and marketing expertise, knowledge of market trends, flexibility and networking on the ability to obtain a cost leadership or differentiation advantage. Using a large dataset of 1,108 start-ups obtained after random sampling of over 30,193 households, we find that differentiation strategies benefit from most resource advantages. Cost leadership strategies, however, seem only to benefit from technological expertise and flexibility and are not related to market-based advantages. In doing so, this study contributes to both entrepreneurship and RBV theories by showing how entrepreneurial capabilities lead to competitive advantages in nascent and early-stage start-ups.
Abstract:
To date, automatic recognition of semantic information such as salient objects and mid-level concepts from images remains a challenging task. Since real-world objects tend to exist in a context within their environment, computer vision researchers have increasingly incorporated contextual information to improve object recognition. In this paper, we present a method to build a visual contextual ontology from salient object descriptions for image annotation. The ontology includes not only partOf/kindOf relations, but also spatial and co-occurrence relations. A two-step image annotation algorithm is also proposed based on ontology relations and probabilistic inference. Unlike most existing work, we specifically explore how to combine the representation of the ontology, contextual knowledge and probabilistic inference. The experiments show that image annotation results are improved on the LabelMe dataset.
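As a toy illustration of the general idea of combining detector scores with co-occurrence knowledge, the sketch below re-ranks candidate labels. The scores, co-occurrence table, and weighting are invented for illustration and are not the paper's ontology-based algorithm.

```python
# Toy sketch: re-rank candidate labels by mixing detector confidence with
# contextual co-occurrence evidence. All values below are illustrative.
detector_scores = {"keyboard": 0.55, "horse": 0.50, "mouse": 0.40}

# Hypothetical P(label | context object already accepted in the image).
co_occurrence = {
    ("monitor", "keyboard"): 0.8,
    ("monitor", "mouse"): 0.7,
    ("monitor", "horse"): 0.01,
}
accepted_context = ["monitor"]
alpha = 0.5  # weight given to contextual evidence

def rerank(scores, context):
    out = {}
    for label, s in scores.items():
        ctx = max((co_occurrence.get((c, label), 0.0) for c in context),
                  default=0.0)
        out[label] = (1 - alpha) * s + alpha * ctx
    return dict(sorted(out.items(), key=lambda kv: kv[1], reverse=True))

print(rerank(detector_scores, accepted_context))
```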
Abstract:
Aims: The Rural and Remote Road Safety Study (RRRSS) addresses a recognised need for greater research on road trauma in rural and remote Australia, the costs of which are disproportionately high compared with urban areas. The 5-year multi-phase study with whole-of-government support concluded in June 2008. Drawing on RRRSS data, we analysed fatal motorcycle crashes which occurred over 39 months to provide a description of crash characteristics, contributing factors and the people involved. The descriptive analysis and discussion may inform the development of tailored motorcycle safety interventions. Methods: RRRSS criteria sought vehicle crashes resulting in death or hospitalisation for a minimum of 24 hours of at least one person aged 16 years or over, in the study area defined roughly as the Queensland area north from Bowen in the east and Boulia in the west (excluding the Townsville and Cairns urban areas). Fatal motorcycle crashes were selected from the RRRSS dataset. The analysis considered medical data covering injury types and severity, evidence of alcohol, drugs and prior medical conditions, as well as crash descriptions supplied by police to Queensland Transport on contributing circumstances, vehicle types, environmental conditions and the people involved. Crash data were plotted in a geographic information system (MapInfo) for spatial analysis. Results: There were 23 deaths from 22 motorcycle crashes on public roads meeting RRRSS criteria. Of these, half were single-vehicle crashes and half involved two or more vehicles. In contrast to general patterns for driver/rider age distribution in crashes, riders below 25 years of age were represented proportionally within the population. Riders in their thirties comprised 41% of fatalities, with a further 36% accounted for by riders in their fifties. Eighteen crashes occurred in the Far North Statistical Division (SD), with two crashes in each of the Northern and North West SDs. Behavioural factors comprised the vast majority of contributing circumstances cited by police, with adverse environmental conditions noted in only four cases. Conclusions: Fatal motorcycle crashes were more likely to involve another vehicle and less likely to involve a young rider than non-fatal crashes recorded by the RRRSS. Rider behaviour contributed to the majority of crashes and should be a major focus of research, education and policy development, while other road users’ behaviour and awareness also remain important. With 68% of crashes occurring on major and secondary roads within a 130 km radius of Cairns, efforts should focus on this geographic area.
Abstract:
Automatic detection of suspicious activities in CCTV camera feeds is crucial to the success of video surveillance systems. Such a capability can help transform dumb CCTV cameras into smart surveillance tools for fighting crime and terror. Learning and classification of basic human actions is a precursor to detecting suspicious activities. Most current approaches rely on the non-realistic assumption that a complete dataset of normal human actions is available. This paper presents a different approach to the problem of understanding human actions in video when no prior information is available. This is achieved by working with an incomplete dataset of basic actions which is continuously updated. Initially, all video segments are represented by the Bag-of-Words (BOW) method using only Term Frequency-Inverse Document Frequency (TF-IDF) features. Then, a data-stream clustering algorithm is applied to update the system's knowledge from the incoming video feeds. Finally, all the actions are classified into different sets. Experiments and comparisons are conducted on the well-known Weizmann and KTH datasets to show the efficacy of the proposed approach.
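The feature side of such a pipeline can be sketched as follows: video segments represented as TF-IDF weighted bags of visual words, clustered incrementally as new feeds arrive. MiniBatchKMeans stands in here for the data-stream clustering algorithm, and the "documents" of quantised visual-word ids are invented; this is not the paper's exact method.

```python
# Sketch: TF-IDF bag-of-(visual)-words representation plus incremental
# clustering. MiniBatchKMeans is a stand-in for the stream clusterer.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import MiniBatchKMeans

# Each segment is a string of visual-word ids produced by some quantiser.
segments = [
    "w3 w3 w17 w42 w42 w42",
    "w3 w17 w17 w42",
    "w8 w8 w99 w99 w100",
    "w8 w99 w100 w100",
]
tfidf = TfidfVectorizer()
X = tfidf.fit_transform(segments)

clusterer = MiniBatchKMeans(n_clusters=2, random_state=0)
clusterer.partial_fit(X)        # keep calling as new video feeds arrive
print(clusterer.predict(X))     # cluster assignment per segment
```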
Abstract:
Phase-type distributions represent the time to absorption for a finite state Markov chain in continuous time, generalising the exponential distribution and providing a flexible and useful modelling tool. We present a new reversible jump Markov chain Monte Carlo scheme for performing a fully Bayesian analysis of the popular Coxian subclass of phase-type models; the convenient Coxian representation involves fewer parameters than a more general phase-type model. The key novelty of our approach is that we model covariate dependence in the mean whilst using the Coxian phase-type model as a very general residual distribution. Such incorporation of covariates into the model has not previously been attempted in the Bayesian literature. A further novelty is that we also propose a reversible jump scheme for investigating structural changes to the model brought about by the introduction of Erlang phases. Our approach addresses more questions of inference than previous Bayesian treatments of this model and is automatic in nature. We analyse an example dataset comprising lengths of hospital stays of a sample of patients collected from two Australian hospitals to produce a model for a patient's expected length of stay which incorporates the effects of several covariates. This leads to interesting conclusions about what contributes to length of hospital stay with implications for hospital planning. We compare our results with an alternative classical analysis of these data.
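For intuition about the Coxian subclass, a short simulation of times to absorption is sketched below: in each phase the chain waits an exponential time and then either exits (absorption) or moves to the next phase. The rates are illustrative only, not estimates from the hospital length-of-stay data.

```python
# Sketch: simulate draws from a Coxian phase-type distribution.
# lambdas[i]: rate of moving from phase i to phase i+1 (length n-1)
# mus[i]:     rate of absorption (exit) from phase i   (length n)
import numpy as np

rng = np.random.default_rng(0)

def coxian_sample(lambdas, mus, rng):
    """One draw of the time to absorption."""
    t, n = 0.0, len(mus)
    for i in range(n):
        rate = mus[i] + (lambdas[i] if i < n - 1 else 0.0)
        t += rng.exponential(1.0 / rate)            # sojourn in phase i
        if rng.random() < mus[i] / rate:            # absorb vs. continue
            break
    return t

samples = [coxian_sample([1.2, 0.8], [0.3, 0.5, 0.9], rng) for _ in range(10_000)]
print(np.mean(samples))   # Monte Carlo estimate of the mean stay
```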