69 results for Document Signatures
Abstract:
We introduce K-tree in an information retrieval context. It is an efficient approximation of the k-means clustering algorithm. Unlike k-means it forms a hierarchy of clusters. It has been extended to address issues with sparse representations. We compare performance and quality to CLUTO using document collections. The K-tree has a low time complexity that is suitable for large document collections. This tree structure allows for efficient disk based implementations where space requirements exceed that of main memory.
Abstract:
This paper describes the approach taken to the XML Mining track at INEX 2008 by a group at the Queensland University of Technology. We introduce the K-tree clustering algorithm in an Information Retrieval context by adapting it for document clustering. Many large scale problems exist in document clustering. K-tree scales well with large inputs due to its low complexity. It offers promising results both in terms of efficiency and quality. Document classification was completed using Support Vector Machines.
Abstract:
Public key cryptography, and with it the ability to compute digital signatures, has made it possible for electronic commerce to flourish. It is thus unsurprising that the proposed Australian NECS will also utilise digital signatures in its system so as to provide a fully automated process from the creation of electronic land title instruments to the digital signing and electronic lodgment of these instruments. This necessitates an analysis of the fraud risks raised by the usage of digital signatures, because a compromise of the integrity of digital signatures will lead to a compromise of the Torrens system itself. This article will show that digital signatures may in fact offer greater security against fraud than handwritten signatures; but to achieve this, digital signatures require an infrastructure in which each component is properly implemented and managed.
Abstract:
Digital collections are growing exponentially in size as the information age takes a firm grip on all aspects of society. As a result, Information Retrieval (IR) has become an increasingly important area of research. It promises to provide new and more effective ways for users to find information relevant to their search intentions. Document clustering is one of the many tools in the IR toolbox and is far from being perfected. It groups documents that share common features. This grouping allows a user to quickly identify relevant information. If these groups are misleading then valuable information can accidentally be ignored. Therefore, the study and analysis of the quality of document clustering is important. With more and more digital information available, the performance of these algorithms is also of interest. An algorithm with a time complexity of O(n²) can quickly become impractical when clustering a corpus containing millions of documents. Therefore, the investigation of algorithms and data structures to perform clustering in an efficient manner is vital to its success as an IR tool. Document classification is another tool frequently used in the IR field. It predicts categories of new documents based on an existing database of (document, category) pairs. Support Vector Machines (SVM) have been found to be effective when classifying text documents. As the algorithms for classification are both efficient and of high quality, the largest gains can be made from improvements to representation. Document representations are vital for both clustering and classification. Representations exploit the content and structure of documents. Dimensionality reduction can improve the effectiveness of existing representations in terms of quality and run-time performance. Research into these areas is another way to improve the efficiency and quality of clustering and classification results. Evaluating document clustering is a difficult task. Intrinsic measures of quality such as distortion only indicate how well an algorithm minimised a similarity function in a particular vector space. Intrinsic comparisons are inherently limited by the given representation and are not comparable between different representations. Extrinsic measures of quality compare a clustering solution to a “ground truth” solution. This allows comparison between different approaches. As the “ground truth” is created by humans, it can suffer from the fact that not every human interprets a topic in the same manner. Whether a document belongs to a particular topic or not can be subjective.
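As an illustration of the intrinsic measure mentioned in the abstract above, the following minimal sketch computes distortion as the root mean squared distance between each document vector and the centroid of its cluster. The function name `rmse_distortion`, the dense list-of-floats representation, and the choice of Euclidean distance are assumptions made for this example; the abstract does not specify them.

```python
import math


def rmse_distortion(vectors, assignments):
    """Root mean squared Euclidean distance from each vector to its cluster centroid.

    vectors: list of equal-length lists of floats (hypothetical dense document vectors)
    assignments: list of cluster ids, one per vector
    """
    dims = len(vectors[0])
    sums = {}
    counts = {}
    # Accumulate per-cluster coordinate sums and member counts.
    for vec, cluster in zip(vectors, assignments):
        total = sums.setdefault(cluster, [0.0] * dims)
        for i, x in enumerate(vec):
            total[i] += x
        counts[cluster] = counts.get(cluster, 0) + 1
    # A centroid is the coordinate-wise mean of the cluster's members.
    centroids = {c: [x / counts[c] for x in total] for c, total in sums.items()}
    # Sum of squared distances from every vector to its own centroid.
    squared_error = sum(
        sum((x - m) ** 2 for x, m in zip(vec, centroids[cluster]))
        for vec, cluster in zip(vectors, assignments)
    )
    return math.sqrt(squared_error / len(vectors))


if __name__ == "__main__":
    docs = [[0.0, 0.0], [2.0, 0.0], [10.0, 0.0], [12.0, 0.0]]
    print(rmse_distortion(docs, [0, 0, 1, 1]))  # two tight clusters
    print(rmse_distortion(docs, [0, 0, 0, 0]))  # everything in one cluster
```

The value depends entirely on the vector space the documents live in, which is why, as the abstract notes, such scores cannot be compared across different representations.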
Abstract:
It is well known that a statutory requirement of formality is associated with contracts concerning land. In this regard, s 59 of the Property Law Act 1974 (Qld) provides: No action may be brought upon any contract for the sale or other disposition of land or any interest in land unless the contract upon which such action is brought, or some memorandum or note of the contract, is in writing, and signed by the party to be charged, or by some person by the party lawfully authorised. In addition to the possibility of a formal contract, the statutory wording clearly contemplates reliance on an informal note or memorandum. To constitute a sufficient note or memorandum for the purposes of the statute, the signed note or memorandum must contain details of the parties to the contract, an adequate description of the property, the price and any other essential terms. It is also accepted that the doctrine of joinder may be invoked in circumstances where the document signed by the party to be charged contains an express or implied reference to any other document. In this way, a sufficient note or memorandum may be constituted by the joinder of a number of documents.
Abstract:
Mesenchymal stem cells (MSCs) are undifferentiated, multi-potent stem cells with the ability to renew. They can differentiate into many types of terminal cells, such as osteoblasts, chondrocytes, adipocytes, myocytes, and neurons. These cells have been applied in tissue engineering as the main cell type to regenerate new tissues. However, a number of issues remain concerning the use of MSCs, such as cell surface markers, the determining factors responsible for their differentiation to terminal cells, and the mechanisms whereby growth factors stimulate MSCs. In this chapter, we will discuss how proteomic techniques have contributed to our current knowledge and how they can be used to address issues currently facing MSC research. The application of proteomics has led to the identification of a special pattern of cell surface protein expression of MSCs. The technique has also contributed to the study of a regulatory network of MSC differentiation to terminally differentiated cells, including osteocytes, chondrocytes, adipocytes, neurons, cardiomyocytes, hepatocytes, and pancreatic islet cells. It has also helped elucidate mechanisms for growth factor–stimulated differentiation of MSCs. Proteomics cannot, however, reveal the exact role of a particular pathway and must therefore be combined with other approaches for this purpose. A new generation of proteomic techniques has recently been developed, which will enable a more comprehensive study of MSCs.
Abstract:
The aetiology behind overuse injuries such as stress fractures is complex and multi-factorial. In sporting events where the loading is likely to be uneven (e.g. hurdling and jumps), research has suggested that the frequency of stress fractures seems to favour the athlete’s dominant limb. The tendency for an individual to have a preferred limb for voluntary motor acts makes limb selection a possible factor behind the development of unilateral overuse injuries, particularly when repeatedly used during high loading activities. The event of sprint hurdling is well suited for the study of loading asymmetry as the hurdling technique is repetitive and the limb movement asymmetrical. Of relevance to this study is the high incidence of Navicular Stress Fractures (NSF) in hurdlers, with suggestions there is a tendency for the fracture to develop in the trail leg foot, although this is not fully accepted. The Ground Reaction Force (GRF) with each foot contact is influenced by the hurdle action, with research finding step-to-step loading variations. However, it is unknown if this loading asymmetry extends to individual forefoot joints, thereby influencing stress fracture development. The first part of the study involved a series of investigations using a commercially available matrix style in-shoe sensor system (Fscan™, Tekscan Inc.). The suitability of insole sensor systems and custom-made discrete sensors for use in hurdling-related training activities was assessed. The methodology used to analyse foot loading with each technology was investigated. The insole and discrete sensor systems tested proved to be unsuitable for use during full pace hurdling. Instead, a running barrier task designed to replicate the four repetitive foot contacts present during hurdling was assessed. This involved the clearance of a series of 6 barriers (low training hurdles), placed in a straight line, using 4 strides between each. The second part of the study involved the analysis of "inter-limb" and "within foot" loading asymmetries using stance duration as well as vertical GRF under the Hallux (T1), the first metatarsal head (M1) and the central forefoot peak pressure site (M2), during walking, running, and running with barrier clearances. The contribution to loading asymmetry that each of the four repetitive foot contacts made during a series of barrier clearances was also assessed. Inter-limb asymmetry in forefoot loading occurred at discrete forefoot sites in a non-uniform manner across the three gait conditions. When the individual barrier foot contacts were compared, the stance duration was asymmetrical and the proportion of total forefoot load at M2 was asymmetrical. There were no significant differences between the proportion of forefoot load at M1 and that at M2 for any of the steps involved in the barrier clearance. A case study testing experimental (discrete) sensors during full pace sprinting and hurdling found that during both gait conditions, the trail limb experienced the greater vertical GRF at M1 and M2. During full pace hurdling, increased stance duration and vertical loading were characteristic of the trail limb hurdle foot contacts. Commercially available in-shoe systems are not suitable for on-field assessment of full pace hurdling. For the use of discrete sensor technology to become commonplace in the field, more robust sensors need to be developed.
Abstract:
Divergence from a random baseline is a technique for the evaluation of document clustering. It ensures that cluster quality measures are performing useful work, so that ineffective clusterings that provide no useful result cannot receive high scores. These concepts are defined and analysed using intrinsic and extrinsic approaches to the evaluation of document cluster quality. This includes the classical clusters-to-categories approach and a novel approach that uses ad hoc information retrieval. The divergence from a random baseline approach is able to differentiate ineffective clusterings encountered in the INEX XML Mining track. It also appears to perform a normalisation similar to the Normalised Mutual Information (NMI) measure, but it can be applied to any measure of cluster quality. When it is applied to the intrinsic measure of distortion as measured by RMSE, subtraction from a random baseline provides a clear optimum that is not apparent otherwise. This approach can be applied to any clustering evaluation. This paper describes its use in the context of document clustering evaluation.
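The idea in the abstract above can be sketched as follows. This is a minimal illustration rather than the paper's implementation: it assumes the random baseline is built by shuffling the cluster assignments (which keeps the cluster sizes unchanged), uses purity as the example quality measure, and averages over several shuffles. The names `purity` and `divergence_from_random`, and the `trials` and `seed` parameters, are hypothetical.

```python
import random
from collections import Counter, defaultdict


def purity(clusters, labels):
    """Fraction of documents that carry the majority label of their cluster."""
    by_cluster = defaultdict(Counter)
    for cluster, label in zip(clusters, labels):
        by_cluster[cluster][label] += 1
    majority = sum(max(counts.values()) for counts in by_cluster.values())
    return majority / len(labels)


def divergence_from_random(clusters, labels, measure=purity, trials=100, seed=0):
    """Score of the clustering minus the mean score of random baselines.

    Assumption: each baseline is a random permutation of the same assignments,
    so it has the same number of clusters and the same cluster sizes.
    """
    rng = random.Random(seed)
    shuffled = list(clusters)
    baseline = 0.0
    for _ in range(trials):
        rng.shuffle(shuffled)
        baseline += measure(shuffled, labels)
    baseline /= trials
    return measure(clusters, labels) - baseline


if __name__ == "__main__":
    labels = ["a", "a", "b", "b"]
    print(divergence_from_random([0, 0, 1, 1], labels))  # meaningful clustering
    print(divergence_from_random([0, 1, 2, 3], labels))  # one document per cluster
    print(divergence_from_random([0, 0, 0, 0], labels))  # a single cluster
```

Placing every document in its own cluster gives a raw purity of 1.0, yet any shuffle of that assignment scores the same, so its divergence is zero. This is the sense in which the baseline stops a degenerate clustering from earning a high score.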
Abstract:
Finding and labelling semantic feature patterns of documents in a large, spatial corpus is a challenging problem. Text documents have characteristics that make semantic labelling difficult, and the rapidly increasing volume of online documents creates a bottleneck in finding meaningful textual patterns. To deal with these issues, we propose an unsupervised document labelling approach based on semantic content and feature patterns. A world ontology with extensive topic coverage is exploited to supply controlled, structured subjects for labelling. An algorithm is also introduced to reduce dimensionality based on the study of ontological structure. The proposed approach was evaluated against typical machine learning methods including SVMs, Rocchio, and kNN, with promising results.
Abstract:
Enterprise Systems (ES) can be understood as the de facto standard for holistic operational and managerial support within an organization. Most commonly, ES are offered as commercial off-the-shelf packages, requiring customization in the user organization. This process is a complex and resource-intensive task, which often prevents small and midsize enterprises (SME) from undertaking configuration projects. Especially in the SME market, independent software vendors provide pre-configured ES for a small customer base. The problem of ES configuration is shifted from the customer to the vendor, but remains critical. We argue that the as yet unexplored link between process configuration and business document configuration must be examined more closely, as both types of configuration are closely tied to one another.