74 results for Text mining, Classificazione, Stemming, Text categorization
Abstract:
This article clarifies what was done with the sub-7-man positions in data-mining Harold van der Heijden's 'HHdbIV' database of chess studies prior to its publication. It emphasises that only positions in the main lines of studies were examined and that the information about uniqueness of move was not incorporated in HHdbIV. There is some reflection on the separate technical and artistic dimensions of study evaluation.
Abstract:
Typeface design: a series of collaborative projects commissioned by Adobe, Inc. and Brill to develop extensive polytonic Greek typefaces. The two Adobe typefaces can be seen as an extension of previous research for the Garamond Premier Pro family (2005), and conclude a research theme started in 1998 with work for Adobe’s Minion Pro Greek. These typefaces together define the state of the art for text-intensive Greek typesetting for wide character set texts (from classical texts, to poetry, to essays, to prose). They serve both as exemplars for other developers, and as vehicles for developing the potential of Greek text typography, for example with the parallel inclusion of monotonic and polytonic characters, detailed localised punctuation options, fluid handling of case-conversion issues, and innovative options such as accented small caps (originally requested by bibliographers, and subsequently rolled out to a general user base). The Brill typeface (for the established academic publisher) has an exceptionally wide character set to cover several academic disciplines, and is intended to differentiate sufficiently from its partner Latin typeface, while maintaining a clear texture in both offset and low-resolution print-on-demand reproduction. This work involved substantial amounts of testing and modifying the design, especially of diacritics, to maintain clarity and the readability of unfamiliar words. All together these typefaces form a study in how Greek typesetting meets contemporary typographic requirements, while resonating with historically accurate styles, where these are present. Significant research in printing archives helped to identify appropriate styles, as well as originate variants that are coherent stylistically, even when historical equivalents were absent.
Abstract:
Background: Since their inception, Twitter and related microblogging systems have provided a rich source of information for researchers and have attracted interest in their affordances and use. Since 2009 PubMed has included 123 journal articles on medicine and Twitter, but no overview exists as to how the field uses Twitter in research. // Objective: This paper aims to identify published work relating to Twitter indexed by PubMed, and then to classify it. This classification will provide a framework in which future researchers will be able to position their work, and to provide an understanding of the current reach of research using Twitter in medical disciplines. Limiting the study to papers indexed by PubMed ensures the work provides a reproducible benchmark. // Methods: Papers, indexed by PubMed, on Twitter and related topics were identified and reviewed. The papers were then qualitatively classified based on the paper’s title and abstract to determine their focus. The work that was Twitter focused was studied in detail to determine what data, if any, it was based on, and from this a categorization of the data set size used in the studies was developed. Using open coded content analysis, additional important categories were also identified, relating to the primary methodology, domain and aspect. // Results: As of 2012, PubMed comprises more than 21 million citations from biomedical literature, and from these a corpus of 134 potentially Twitter-related papers was identified, eleven of which were subsequently found not to be relevant. There were no papers prior to 2009 relating to microblogging, a term first used in 2006. Of the remaining 123 papers which mentioned Twitter, thirty were focussed on Twitter (the others referring to it tangentially). The early Twitter focussed papers introduced the topic and highlighted the potential, not carrying out any form of data analysis.
The majority of published papers used analytic techniques to sort through thousands, if not millions, of individual tweets, often depending on automated tools to do so. Our analysis demonstrates that researchers are starting to use knowledge discovery methods and data mining techniques to understand vast quantities of tweets: the study of Twitter is becoming quantitative research. // Conclusions: This work is to the best of our knowledge the first overview study of medical related research based on Twitter and related microblogging. We have used five dimensions to categorise published medical related research on Twitter. This classification provides a framework within which researchers studying development and use of Twitter within medical related research, and those undertaking comparative studies of research relating to Twitter in the area of medicine and beyond, can position and ground their work.
Abstract:
This article examines Corporate Social Responsibility (CSR) and mining community development, sustainability and viability. These issues are considered focussing on current and former company-owned mining towns in Namibia. Historically company towns have been a feature of mining activity in Namibia. However, the fate of such towns upon mine closure has been and remains controversial. Declining former mining communities and even ghost mining towns can be found across the country. This article draws upon research undertaken in Namibia and considers these issues with reference to three case study communities. This article examines the complexities which surround decision-making about these communities, and the challenges faced in efforts to encourage their sustainability after mining. In this article, mine company engagements through CSR with the development, sustainability and viability of such communities are also critically discussed. The role, responsibilities, and actions of the state in relation to these communities are furthermore reflected upon. Finally, ways forward for these communities are considered.
Abstract:
This article examines the marginal position of artisanal miners in sub-Saharan Africa, and considers how they are incorporated into mineral sector change in the context of institutional and legal integration. Taking the case of diamond and gold mining in Tanzania, the concept of social exclusion is used to explore the consequences of marginalization on people's access to mineral resources and ability to make a living from artisanal mining. Because existing inequalities and forms of discrimination are ignored by the Tanzanian state, the institutionalization of mineral titles conceals social and power relations that perpetuate highly unequal access to resources. The article highlights the complexity of these processes, and shows that while legal integration can benefit certain wealthier categories of people, who fit into the model of an 'entrepreneurial small-scale miner', for others adverse incorporation contributes to socio-economic dependence, exploitation and insecurity. For the issue of marginality to be addressed within integration processes, the existence of local forms of organization, institutions and relationships, which underpin inequalities and discrimination, needs to be recognized.
Abstract:
This paper introduces a novel approach for free-text keystroke dynamics authentication which incorporates the use of the keyboard’s key-layout. The method extracts timing features from specific key-pairs. The Euclidean distance is then utilized to find the level of similarity between a user’s profile data and his/her test data. The results obtained from this method are reasonable for free-text authentication while maintaining the maximum level of user relaxation. Moreover, it has been proven in this study that flight time yields better authentication results when compared with dwell time. In particular, the results were obtained with only one training sample for the purpose of practicality and ease of real life application.
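The comparison step described in this abstract can be illustrated with a minimal sketch. It assumes each keystroke is recorded as a hypothetical `(key, press_time, release_time)` tuple in milliseconds, and takes flight time to be the gap between releasing one key and pressing the next. The function names are illustrative, and the paper's key-layout grouping of key-pairs is not modelled here.

```python
import math

def flight_times(events):
    """Mean flight time per key-pair.

    events: list of (key, press_time, release_time) tuples in typing order
    (an assumed format). Flight time is taken as the next key's press time
    minus the previous key's release time.
    """
    sums, counts = {}, {}
    for (k1, _, r1), (k2, p2, _) in zip(events, events[1:]):
        pair = (k1, k2)
        sums[pair] = sums.get(pair, 0.0) + (p2 - r1)
        counts[pair] = counts.get(pair, 0) + 1
    return {pair: sums[pair] / counts[pair] for pair in sums}

def euclidean_distance(profile, sample):
    """Euclidean distance over the key-pairs present in both feature sets."""
    shared = profile.keys() & sample.keys()
    if not shared:
        return math.inf  # nothing to compare, so treat as maximally dissimilar
    return math.sqrt(sum((profile[p] - sample[p]) ** 2 for p in shared))

# Illustrative timings only; these are not data from the paper.
profile_events = [("t", 0, 80), ("h", 200, 290), ("e", 400, 470)]
sample_events = [("t", 0, 90), ("h", 213, 300), ("e", 406, 480)]
distance = euclidean_distance(flight_times(profile_events),
                              flight_times(sample_events))
```

A smaller distance indicates that the test sample is closer to the stored profile. Any acceptance threshold would have to be chosen empirically.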
Abstract:
Exascale systems are the next frontier in high-performance computing and are expected to deliver a performance of the order of 10^18 operations per second using massive multicore processors. Very large- and extreme-scale parallel systems pose critical algorithmic challenges, especially related to concurrency, locality and the need to avoid global communication patterns. This work investigates a novel protocol for dynamic group communication that can be used to remove the global communication requirement and to reduce the communication cost in parallel formulations of iterative data mining algorithms. The protocol is used to provide a communication-efficient parallel formulation of the k-means algorithm for cluster analysis. The approach is based on a collective communication operation for dynamic groups of processes and exploits non-uniform data distributions. Non-uniform data distributions can be either found in real-world distributed applications or induced by means of multidimensional binary search trees. The analysis of the proposed dynamic group communication protocol has shown that it does not introduce significant communication overhead. The parallel clustering algorithm has also been extended to accommodate an approximation error, which allows a further reduction of the communication costs. The effectiveness of the exact and approximate methods has been tested in a parallel computing system with 64 processors and in simulations with 1024 processing elements.
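For context, the sketch below shows the conventional parallel formulation of one k-means iteration, in which every partition computes partial sums that are then combined globally. It is not the dynamic group communication protocol proposed in the paper. It only illustrates the global reduction step that the paper aims to avoid, and the partitions are simulated as plain lists rather than real processes.

```python
def partial_sums(points, centroids):
    """Per-partition work: assign each point to its nearest centroid and
    accumulate coordinate sums and counts per cluster."""
    k, dim = len(centroids), len(centroids[0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    for p in points:
        nearest = min(
            range(k),
            key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centroids[j])),
        )
        counts[nearest] += 1
        for d in range(dim):
            sums[nearest][d] += p[d]
    return sums, counts

def kmeans_step(partitions, centroids):
    """One iteration: combine the partial results from all partitions.

    In a message-passing implementation this loop would be a global
    all-reduce across processes; here it is simulated sequentially.
    """
    k, dim = len(centroids), len(centroids[0])
    total_sums = [[0.0] * dim for _ in range(k)]
    total_counts = [0] * k
    for part in partitions:
        sums, counts = partial_sums(part, centroids)
        for j in range(k):
            total_counts[j] += counts[j]
            for d in range(dim):
                total_sums[j][d] += sums[j][d]
    # An empty cluster keeps its previous centroid.
    return [
        [s / total_counts[j] for s in total_sums[j]]
        if total_counts[j] else list(centroids[j])
        for j in range(k)
    ]
```

Because only the per-cluster sums and counts cross partition boundaries, the communication volume depends on the number of clusters and dimensions, not on the number of points.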
Abstract:
In the context of environmental valuation of natural disasters, an important component of the evaluation procedure lies in determining the periodicity of events. This paper explores alternative methodologies for determining such periodicity, illustrating the advantages and the disadvantages of the separate methods and their comparative predictions. The procedures employ Bayesian inference and explore recent advances in computational aspects of mixtures methodology. The procedures are applied to the classic data set of Maguire et al (Biometrika, 1952) which was subsequently updated by Jarrett (Biometrika, 1979) and which comprise the seminal investigations examining the periodicity of mining disasters within the United Kingdom, 1851-1962.
Abstract:
This article critically explores the nature and purpose of relationships and inter-dependencies between stakeholders in the context of a parastatal chromite mining company in the Betsiboka Region of Northern Madagascar. An examination of the institutional arrangements at the interface between the mining company and local communities identified power hierarchies and dependencies in the context of a dominant paternalistic environment. The interactions, inter alia, limited social cohesion and intensified the fragility and weakness of community representation, which was further influenced by ethnic hierarchies between the varied community groups; namely, indigenous communities and migrants to the area from different ethnic groups. Moreover, dependencies and nepotism, which may exist at all institutional levels, can create civil society stakeholder representatives who are unrepresentative of the society they are intended to represent. Similarly, a lack of horizontal and vertical trust and reciprocity inherent in Malagasy society engenders a culture of low expectations regarding transparency and accountability, which further catalyses a cycle of nepotism and elite rent-seeking behaviour. On the other hand, leaders retain power with minimal vertical delegation or decentralisation of authority among levels of government and limit opportunities to benefit the elite, perpetuating rent-seeking behaviour within the privileged minority. Within the union movement, pluralism and the associated politicisation of individual unions restricts solidarity, which impacts on the movement’s capacity to act as a cohesive body of opinion and opposition. Nevertheless, the unions’ drive to improve their social capital has increased expectations of transparency and accountability, resulting in demands for greater engagement in decision-making processes.
Abstract:
Purpose: This paper explores the extent of site-specific and geographic segmental social, environmental and ethical reporting by mining companies operating in Ghana. We aim to: (i) establish a picture of corporate transparency relating to geographic segmentation of social, environmental and ethical reporting which is specific to operating sites and country of operation; and (ii) gauge the impact of the introduction of integrated reporting on site-specific social, environmental and ethical reporting. Methodology/Approach: We conducted an interpretive content analysis of the annual/integrated reports of mining companies for the years 2009, 2010 and 2011 in order to extract site-specific social, environmental and ethical information relating to the companies’ mining operations in Ghana. Findings and Implications: We found that site-specific social, environmental and ethical reporting is extremely patchy and inconsistent between the companies’ reports studied. We also found that there was no information relating to certain sites, which were in operation, according to the Ghana Minerals Commission. This could simply be because operations were not in progress. Alternatively, it could be that decisions are made concerning which site-specific information is reported according to a certain benchmark. One policy implication arising from this research is that IFRS should require geographic segmental reporting of material social, environmental and ethical information in order to bring IFRS into line with global developments in integrated reporting. Originality: Although there is a wealth of sustainability reporting research and an emergent literature on integrated reporting, there is currently no academic research exploring site-specific social, environmental and ethical reporting.
Abstract:
Expert systems have become increasingly popular owing to their commercial importance. A rule based system is a special type of expert system, which consists of a set of ‘if-then’ rules and can be applied as a decision support system in many areas such as healthcare, transportation and security. Rule based systems can be constructed based on both expert knowledge and data. This paper aims to introduce the theory of rule based systems, especially the categorization and construction of such systems, from a conceptual point of view. This paper also introduces rule based systems for classification tasks in detail.
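As an illustration of the ‘if-then’ structure mentioned in this abstract, the following minimal sketch classifies a record with an ordered rule list in which the first matching rule fires. The rule representation, the attribute names and the default label are all hypothetical. The paper may organise and apply rules differently.

```python
def classify(record, rules, default):
    """Return the label of the first rule whose conditions all hold.

    rules: ordered list of (conditions, label) pairs, where conditions is a
    dict mapping attribute names to required values (an assumed encoding of
    'if attribute = value and ... then label').
    """
    for conditions, label in rules:
        if all(record.get(attr) == value for attr, value in conditions.items()):
            return label
    return default  # no rule fired

# Hypothetical rules, for illustration only.
rules = [
    ({"outlook": "sunny", "humidity": "high"}, "no"),
    ({"outlook": "rainy", "windy": True}, "no"),
]

decision = classify({"outlook": "sunny", "humidity": "high"}, rules, default="yes")
```

Rules of this kind can be written by hand from expert knowledge or induced from data. The sketch covers only how they are applied once they exist.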