266 resultados para applicazione, business analysis, data mining, Facebook, PRIN, relazioni sociali, social network
Resumo:
Honig and Samuelsson (2014) and Delmar (2015) recently had an exchange in this journal related to a replication-and-extension attempt of two papers which originally arrived at different conclusions based on the same data set. This commentary provides further clarification on the issues and links the debate to broader issues scholarly culture and practices in entrepreneurship research.
Resumo:
This chapter discusses the methodological aspects and empirical findings of a large-scale, funded project investigating public communication through social media in Australia. The project concentrates on Twitter, but we approach it as representative of broader current trends toward the integration of large datasets and computational methods into media and communication studies in general, and social media scholarship in particular. The research discussed in this chapter aims to empirically describe networks of affiliation and interest in the Australian Twittersphere, while reflecting on the methodological implications and imperatives of ‘big data’ in the humanities. Using custom network crawling technology, we have conducted a snowball crawl of Twitter accounts operated by Australian users to identify more than one million users and their follower/followee relationships, and have mapped their interconnections. In itself, the map provides an overview of the major clusters of densely interlinked users, largely centred on shared topics of interest (from politics through arts to sport) and/or sociodemographic factors (geographic origins, age groups). Our map of the Twittersphere is the first of its kind for the Australian part of the global Twitter network, and also provides a first independent and scholarly estimation of the size of the total Australian Twitter population. In combination with our investigation of participation patterns in specific thematic hashtags, the map also enables us to examine which areas of the underlying follower/followee network are activated in the discussion of specific current topics – allowing new insights into the extent to which particular topics and issues are of interest to specialised niches or to the Australian public more broadly. Specifically, we examine the Twittersphere footprint of dedicated political discussion, under the #auspol hashtag, and compare it with the heightened, broader interest in Australian politics during election campaigns, using #ausvotes; we explore the different patterns of Twitter activity across the map for major television events (the popular competitive cooking show #masterchef, the British #royalwedding, and the annual #stateoforigin Rugby League sporting contest); and we investigate the circulation of links to the articles published by a number of major Australian news organisations across the network. Such analysis, which combines the ‘big data’-informed map and a close reading of individual communicative phenomena, makes it possible to trace the dynamic formation and dissolution of issue publics against the backdrop of longer-term network connections, and the circulation of information across these follower/followee links. Such research sheds light on the communicative dynamics of Twitter as a space for mediated social interaction. Our work demonstrates the possibilities inherent in the current ‘computational turn’ (Berry, 2010) in the digital humanities, as well as adding to the development and critical examination of methodologies for dealing with ‘big data’ (boyd and Crawford, 2011). Out tools and methods for doing Twitter research, released under Creative Commons licences through our project Website, provide the basis for replicable and verifiable digital humanities research on the processes of public communication which take place through this important new social network.
Resumo:
Public buildings and large infrastructure are typically monitored by tens or hundreds of cameras, all capturing different physical spaces and observing different types of interactions and behaviours. However to date, in large part due to limited data availability, crowd monitoring and operational surveillance research has focused on single camera scenarios which are not representative of real-world applications. In this paper we present a new, publicly available database for large scale crowd surveillance. Footage from 12 cameras for a full work day covering the main floor of a busy university campus building, including an internal and external foyer, elevator foyers, and the main external approach are provided; alongside annotation for crowd counting (single or multi-camera) and pedestrian flow analysis for 10 and 6 sites respectively. We describe how this large dataset can be used to perform distributed monitoring of building utilisation, and demonstrate the potential of this dataset to understand and learn the relationship between different areas of a building.
Resumo:
Product reviews are the foremost source of information for customers and manufacturers to help them make appropriate purchasing and production decisions. Natural language data is typically very sparse; the most common words are those that do not carry a lot of semantic content, and occurrences of any particular content-bearing word are rare, while co-occurrences of these words are rarer. Mining product aspects, along with corresponding opinions, is essential for Aspect-Based Opinion Mining (ABOM) as a result of the e-commerce revolution. Therefore, the need for automatic mining of reviews has reached a peak. In this work, we deal with ABOM as sequence labelling problem and propose a supervised extraction method to identify product aspects and corresponding opinions. We use Conditional Random Fields (CRFs) to solve the extraction problem and propose a feature function to enhance accuracy. The proposed method is evaluated using two different datasets. We also evaluate the effectiveness of feature function and the optimisation through multiple experiments.
Resumo:
Driving on an approach to a signalized intersection while distracted is relatively risky, as potential vehicular conflicts and resulting angle collisions tend to be relatively more severe compared to other locations. Given the prevalence and importance of this particular scenario, the objective of this study was to examine the decisions and actions of distracted drivers during the onset of yellow lights. Driving simulator data were obtained from a sample of 69 drivers under baseline and handheld cell phone conditions at the University of Iowa – National Advanced Driving Simulator. Explanatory variables included age, gender, cell phone use, distance to stop-line, and speed. Although there is extensive research on drivers’ responses to yellow traffic signals, the examinations have been conducted from a traditional regression-based approach, which do not necessary provide the underlying relations and patterns among the sampled data. In this paper, we exploit the benefits of both classical statistical inference and data mining techniques to identify the a priori relationships among main effects, non-linearities, and interaction effects. Results suggest that the probability of yellow light running increases with the increase in driving speed at the onset of yellow. Both young (18–25 years) and middle-aged (30–45 years) drivers reveal reduced propensity for yellow light running whilst distracted across the entire speed range, exhibiting possible risk compensation during this critical driving situation. The propensity for yellow light running for both distracted male and female older (50–60 years) drivers is significantly higher. Driver experience captured by age interacts with distraction, resulting in their combined effect having slower physiological response and being distracted particularly risky.
Resumo:
In fisheries managed using individual transferable quotas (ITQs) it is generally assumed that quota markets are well-functioning, allowing quota to flow on either a temporary or permanent basis to those able to make best use of it. However, despite an increasing number of fisheries being managed under ITQs, empirical assessments of the quota markets that have actually evolved in these fisheries remain scarce. The Queensland Coral Reef Fin-Fish Fishery (CRFFF) on the Great Barrier Reef has been managed under a system of ITQs since 2004. Data on individual quota holdings and trades for the period 2004-2012 were used to assess the CRFFF quota market and its evolution through time. Network analysis was applied to assess market structure and the nature of lease-trading relationships. An assessment of market participants’ abilities to balance their quota accounts, i.e., gap analysis, provided insights into market functionality and how this may have changed in the period observed. Trends in ownership and trade were determined, and market participants were identified as belonging to one out of a set of seven generalized types. The emergence of groups such as investors and lease-dependent fishers is clear. In 2011-2012, 41% of coral trout quota was owned by participants that did not fish it, and 64% of total coral trout landings were made by fishers that owned only 10% of the quota. Quota brokers emerged whose influence on the market varied with the bioeconomic conditions of the fishery. Throughout the study period some quota was found to remain inactive, implying potential market inefficiencies. Contribution to this inactivity appeared asymmetrical, with most residing in the hands of smaller quota holders. The importance of transaction costs in the operation of the quota market and the inequalities that may result are discussed in light of these findings
Resumo:
Bird species richness survey is one of the most intriguing ecological topics for evaluating environmental health. Here, bird species richness denotes the number of unique bird species in a particular area. Factors affecting the investigation of bird species richness include weather, observation bias, and most importantly, the prohibitive costs of conducting surveys at large spatiotemporal scales. Thanks to advances in recording techniques, these problems have been alleviated by deploying sensors for acoustic data collection. Although automated detection techniques have been introduced to identify various bird species, the innate complexity of bird vocalizations, the background noise present in the recording and the escalating volumes of acoustic data pose a challenging task on determination of bird species richness. In this paper we proposed a two-step computer-assisted sampling approach for determining bird species richness in one-day acoustic data. First, a classification model is built based on acoustic indices for filtering out minutes that contain few bird species. Then the classified bird minutes are ordered by an acoustic index and the redundant temporal minutes are removed from the ranked minute sequence. The experimental results show that our method is more efficient in directing experts for determination of bird species compared with the previous methods.
Resumo:
Acoustic classification of anurans (frogs) has received increasing attention for its promising application in biological and environment studies. In this study, a novel feature extraction method for frog call classification is presented based on the analysis of spectrograms. The frog calls are first automatically segmented into syllables. Then, spectral peak tracks are extracted to separate desired signal (frog calls) from background noise. The spectral peak tracks are used to extract various syllable features, including: syllable duration, dominant frequency, oscillation rate, frequency modulation, and energy modulation. Finally, a k-nearest neighbor classifier is used for classifying frog calls based on the results of principal component analysis. The experiment results show that syllable features can achieve an average classification accuracy of 90.5% which outperforms Mel-frequency cepstral coefficients features (79.0%).
Resumo:
Over past few decades, frog species have been experiencing dramatic decline around the world. The reason for this decline includes habitat loss, invasive species, climate change and so on. To better know the status of frog species, classifying frogs has become increasingly important. In this study, acoustic features are investigated for multi-level classification of Australian frogs: family, genus and species, including three families, eleven genera and eighty five species which are collected from Queensland, Australia. For each frog species, six instances are selected from which ten acoustic features are calculated. Then, the multicollinearity between ten features are studied for selecting non-correlated features for subsequent analysis. A decision tree (DT) classifier is used to visually and explicitly determine which acoustic features are relatively important for classifying family, which for genus, and which for species. Finally, a weighted support vector machines (SVMs) classifier is used for the multi- level classification with three most important acoustic features respectively. Our experiment results indicate that using different acoustic feature sets can successfully classify frogs at different levels and the average classification accuracy can be up to 85.6%, 86.1% and 56.2% for family, genus and species respectively.
Resumo:
Document clustering is one of the prominent methods for mining important information from the vast amount of data available on the web. However, document clustering generally suffers from the curse of dimensionality. Providentially in high dimensional space, data points tend to be more concentrated in some areas of clusters. We take advantage of this phenomenon by introducing a novel concept of dynamic cluster representation named as loci. Clusters’ loci are efficiently calculated using documents’ ranking scores generated from a search engine. We propose a fast loci-based semi-supervised document clustering algorithm that uses clusters’ loci instead of conventional centroids for assigning documents to clusters. Empirical analysis on real-world datasets shows that the proposed method produces cluster solutions with promising quality and is substantially faster than several benchmarked centroid-based semi-supervised document clustering methods.