280 results for cosmology, clustering, AP-test


Relevance:

20.00%

Publisher:

Abstract:

This paper proposes an innovative instance similarity based evaluation metric that reduces the search map for clustering to be performed. An aggregate global score is calculated for each instance using the novel idea of Fibonacci series. The use of Fibonacci numbers is able to separate the instances effectively and, hence, the intra-cluster similarity is increased and the inter-cluster similarity is decreased during clustering. The proposed FIBCLUS algorithm is able to handle datasets with numerical, categorical and a mix of both types of attributes. Results obtained with FIBCLUS are compared with the results of existing algorithms such as k-means, x-means, expectation maximization and hierarchical algorithms that are widely used to cluster numeric, categorical and mixed data types. Empirical analysis shows that FIBCLUS is able to produce better clustering solutions in terms of entropy, purity and F-score in comparison to the above described existing algorithms.
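The abstract evaluates clusterings by entropy and purity. As a rough illustration of how those two scores are commonly computed against known class labels, here is a minimal sketch. The function names and the toy data are assumptions made for this example, and the paper's exact definitions may differ.

```python
from collections import Counter
from math import log2


def _group_labels(clusters, labels):
    """Collect the true labels of the instances assigned to each cluster."""
    by_cluster = {}
    for c, y in zip(clusters, labels):
        by_cluster.setdefault(c, []).append(y)
    return by_cluster


def purity(clusters, labels):
    """Fraction of instances that carry the majority label of their cluster."""
    groups = _group_labels(clusters, labels)
    majority = sum(Counter(ys).most_common(1)[0][1] for ys in groups.values())
    return majority / len(labels)


def entropy(clusters, labels):
    """Size-weighted average of the label entropy (in bits) within each cluster."""
    groups = _group_labels(clusters, labels)
    total = 0.0
    for ys in groups.values():
        n = len(ys)
        h = -sum((k / n) * log2(k / n) for k in Counter(ys).values())
        total += (n / len(labels)) * h
    return total


# Toy data (hypothetical): cluster 0 mixes two classes, cluster 1 is pure.
clusters = [0, 0, 0, 1, 1, 1]
labels = ["a", "a", "b", "b", "b", "b"]
print(purity(clusters, labels), entropy(clusters, labels))
```

Higher purity and lower entropy both indicate clusters that align more closely with the reference classes.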

Relevance:

20.00%

Publisher:

Abstract:

We propose to use the Tensor Space Modeling (TSM) to represent and analyze the user’s web log data that consists of multiple interests and spans across multiple dimensions. Further we propose to use the decomposition factors of the Tensors for clustering the users based on similarity of search behaviour. Preliminary results show that the proposed method outperforms the traditional Vector Space Model (VSM) based clustering.

Relevance:

20.00%

Publisher:

Abstract:

Purpose: Web search engines are frequently used by people to locate information on the Internet. However, not all queries have an informational goal. Instead of information, some people may be looking for specific web sites or may wish to conduct transactions with web services. This paper aims to focus on automatically classifying the different user intents behind web queries. Design/methodology/approach: For the research reported in this paper, 130,000 web search engine queries are categorized as informational, navigational, or transactional using a k-means clustering approach based on a variety of query traits. Findings: The research findings show that more than 75 percent of web queries (clustered into eight classifications) are informational in nature, with about 12 percent each for navigational and transactional. Results also show that web queries fall into eight clusters, six primarily informational, and one each of primarily transactional and navigational. Research limitations/implications: This study provides an important contribution to web search literature because it provides information about the goals of searchers and a method for automatically classifying the intents of the user queries. Automatic classification of user intent can lead to improved web search engines by tailoring results to specific user needs. Practical implications: The paper discusses how web search engines can use automatically classified user queries to provide more targeted and relevant results in web searching by implementing a real time classification method as presented in this research. Originality/value: This research investigates a new application of a method for automatically classifying the intent of user queries. There has been limited research to date on automatically classifying the user intent of web queries, even though the pay-off for web search engines can be quite beneficial. © Emerald Group Publishing Limited.

Relevance:

20.00%

Publisher:

Abstract:

The great male Aussie cossie is growing spots. The ‘dick’ tog, as it is colloquially referred to, is linked to Australia’s national identity with overtly masculine bronzed Aussie bodies clothed in this iconic apparel. Yet the reality is our hunger for worshiping the sun and the addiction to a beach lifestyle is tempered by the pragmatic need for neck-to-knee, or more apt head-to-toe, swimwear. Spotty Dick is an irreverent play on male swimwear – it experiments with alternate modes to sheath the body with Lycra in order to protect it from searing UV rays and at the same time light-heartedly fools around with texture and pattern; to be specific, black Swarovski crystals, jewelled in spot patterns - jewelled clothing is not characteristically aligned to menswear and even less so to the great Aussie cossie. The crystals form a matrix of spots that attempt to provoke a sense of mischievousness aligned to the Aussie beach larrikin. Ironically, spot patterns are in themselves a form of parody, as prolonged sun exposure ages the skin and sun spots can occur if appropriate sun protection is not used. ‘Spotty Dick’ – a research experiment to test design suitability for the use of jewelled spot matrix patterns for UV aware men’s swimwear. The creative work was paraded at 56 shows, over a 2 week period, and an estimated 50,000 people viewed the work.

Relevance:

20.00%

Publisher:

Abstract:

Objective: Parental illness (PI) may have adverse impacts on youth and family functioning. Research in this area has suffered from the absence of a guiding comprehensive framework. This study tested a conceptual model of the effects of PI on youth and family functioning derived from the Family Ecology Framework (FEF; Pedersen & Revenson, 2005). Method: A total of 85 parents with multiple sclerosis and 127 youth completed questionnaires at Time 1 and 12 months later at Time 2. Results: Structural equation modeling results supported the FEF with regard to physical-illness disability. Specifically, the proposed mediators (role redistribution, stress, and stigma) were implicated in the processes that link parental disability to several domains of youth adjustment. The results suggest that the effects of parental depression (PD) are not mediated through these processes; rather, PD directly affects family functioning, which in turn mediates the effects onto youth adjustment. Family functioning further mediated between PD and youth well-being and behavioral-social difficulties. Conclusions: Although results support the effects of parental-illness disability on youth and family functioning via the proposed mediational mechanisms, the additive effects of PD on youth physical and mental health occur through direct and indirect (via family functioning) pathways, respectively.

Relevance:

20.00%

Publisher:

Abstract:

Bioinformatics involves analyses of biological data such as DNA sequences, microarrays and protein-protein interaction (PPI) networks. Its two main objectives are the identification of genes or proteins and the prediction of their functions. Biological data often contain uncertain and imprecise information. Fuzzy theory provides useful tools to deal with this type of information, hence has played an important role in analyses of biological data. In this thesis, we aim to develop some new fuzzy techniques and apply them on DNA microarrays and PPI networks. We will focus on three problems: (1) clustering of microarrays; (2) identification of disease-associated genes in microarrays; and (3) identification of protein complexes in PPI networks. The first part of the thesis aims to detect, by the fuzzy C-means (FCM) method, clustering structures in DNA microarrays corrupted by noise. Because of the presence of noise, some clustering structures found in random data may not have any biological significance. In this part, we propose to combine the FCM with the empirical mode decomposition (EMD) for clustering microarray data. The purpose of EMD is to reduce, preferably to remove, the effect of noise, resulting in what is known as denoised data. We call this method the fuzzy C-means method with empirical mode decomposition (FCM-EMD). We applied this method on yeast and serum microarrays, and the silhouette values are used for assessment of the quality of clustering. The results indicate that the clustering structures of denoised data are more reasonable, implying that genes have tighter association with their clusters. Furthermore we found that the estimation of the fuzzy parameter m, which is a difficult step, can be avoided to some extent by analysing denoised microarray data. The second part aims to identify disease-associated genes from DNA microarray data which are generated under different conditions, e.g., patients and normal people. 
We developed a type-2 fuzzy membership (FM) function for identification of disease-associated genes. This approach is applied to diabetes and lung cancer data, and a comparison with the original FM test was carried out. Among the ten best-ranked genes of diabetes identified by the type-2 FM test, seven genes have been confirmed as diabetes-associated genes according to gene description information in Gene Bank and the published literature. An additional gene is further identified. Among the ten best-ranked genes identified in lung cancer data, seven are confirmed to be associated with lung cancer or its treatment. The type-2 FM-d values are significantly different, which makes the identifications more convincing than the original FM test. The third part of the thesis aims to identify protein complexes in large interaction networks. Identification of protein complexes is crucial to understand the principles of cellular organisation and to predict protein functions. In this part, we proposed a novel method which combines the fuzzy clustering method and interaction probability to identify the overlapping and non-overlapping community structures in PPI networks, then to detect protein complexes in these sub-networks. Our method is based on both the fuzzy relation model and the graph model. We applied the method on several PPI networks and compared it with a popular protein complex identification method, the clique percolation method. For the same data, we detected more protein complexes. We also applied our method on two social networks. The results showed that our method works well for detecting sub-networks and gives a reasonable understanding of these communities.
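The first part of the thesis builds on the fuzzy C-means (FCM) method with fuzzy parameter m. The sketch below shows only the textbook FCM iteration on one-dimensional data, with no empirical mode decomposition step, so it should not be read as the thesis's FCM-EMD procedure. The function names, the initial centers, and the toy data are assumptions for illustration.

```python
def memberships(xs, centers, m=2.0, eps=1e-12):
    """Standard FCM membership update: u[k][i] is the degree to which
    point k belongs to cluster i; each row sums to 1."""
    power = 2.0 / (m - 1.0)
    rows = []
    for x in xs:
        # eps avoids division by zero when a point coincides with a center
        d = [abs(x - c) + eps for c in centers]
        rows.append([1.0 / sum((di / dj) ** power for dj in d) for di in d])
    return rows


def fcm_1d(xs, centers, m=2.0, iters=50):
    """Alternate membership and center updates for a fixed number of iterations."""
    centers = list(centers)
    for _ in range(iters):
        u = memberships(xs, centers, m)
        # Each center is the mean of the data weighted by membership ** m
        centers = [
            sum(u[k][i] ** m * xs[k] for k in range(len(xs)))
            / sum(u[k][i] ** m for k in range(len(xs)))
            for i in range(len(centers))
        ]
    return centers, memberships(xs, centers, m)


# Toy data (hypothetical): two well-separated groups around 1 and 5.
xs = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
centers, u = fcm_1d(xs, centers=[0.0, 6.0])
print(centers)
```

Larger values of m make the memberships softer, which is one reason the choice of m matters in practice.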

Relevance:

20.00%

Publisher:

Abstract:

In the last few years we have observed a proliferation of approaches for clustering XML documents and schemas based on their structure and content. The presence of such a huge amount of approaches is due to the different applications requiring the XML data to be clustered. These applications need data in the form of similar contents, tags, paths, structures and semantics. In this paper, we first outline the application contexts in which clustering is useful, then we survey approaches so far proposed relying on the abstract representation of data (instances or schema), on the identified similarity measure, and on the clustering algorithm. This presentation leads to draw a taxonomy in which the current approaches can be classified and compared. We aim at introducing an integrated view that is useful when comparing XML data clustering approaches, when developing a new clustering algorithm, and when implementing an XML clustering component. Finally, the paper moves into the description of future trends and research issues that still need to be faced.

Relevance:

20.00%

Publisher:

Abstract:

With the growing number of XML documents on the Web it becomes essential to effectively organise these XML documents in order to retrieve useful information from them. A possible solution is to apply clustering on the XML documents to discover knowledge that promotes effective data management, information retrieval and query processing. However, many issues arise in discovering knowledge from these types of semi-structured documents due to their heterogeneity and structural irregularity. Most of the existing research on clustering techniques focuses only on one feature of the XML documents, this being either their structure or their content due to scalability and complexity problems. The knowledge gained in the form of clusters based on the structure or the content is not suitable for real-life datasets. It therefore becomes essential to include both the structure and content of XML documents in order to improve the accuracy and meaning of the clustering solution. However, the inclusion of both these kinds of information in the clustering process results in a huge overhead for the underlying clustering algorithm because of the high dimensionality of the data. The overall objective of this thesis is to address these issues by: (1) proposing methods to utilise frequent pattern mining techniques to reduce the dimension; (2) developing models to effectively combine the structure and content of XML documents; and (3) utilising the proposed models in clustering. This research first determines the structural similarity in the form of frequent subtrees and then uses these frequent subtrees to represent the constrained content of the XML documents in order to determine the content similarity. A clustering framework with two types of models, implicit and explicit, is developed. The implicit model uses a Vector Space Model (VSM) to combine the structure and the content information. 
The explicit model uses a higher order model, namely a 3-order Tensor Space Model (TSM), to explicitly combine the structure and the content information. This thesis also proposes a novel incremental technique to decompose large-sized tensor models to utilise the decomposed solution for clustering the XML documents. The proposed framework and its components were extensively evaluated on several real-life datasets exhibiting extreme characteristics to understand the usefulness of the proposed framework in real-life situations. Additionally, this research evaluates the outcome of the clustering process on the collection selection problem in the information retrieval on the Wikipedia dataset. The experimental results demonstrate that the proposed frequent pattern mining and clustering methods outperform the related state-of-the-art approaches. In particular, the proposed framework of utilising frequent structures for constraining the content shows an improvement in accuracy over content-only and structure-only clustering results. The scalability evaluation experiments conducted on large scaled datasets clearly show the strengths of the proposed methods over state-of-the-art methods. In particular, this thesis work contributes to effectively combining the structure and the content of XML documents for clustering, in order to improve the accuracy of the clustering solution. In addition, it also contributes by addressing the research gaps in frequent pattern mining to generate efficient and concise frequent subtrees with various node relationships that could be used in clustering.

Relevance:

20.00%

Publisher:

Abstract:

In response to the need to leverage private finance and the lack of competition in some parts of the Australian public sector infrastructure market, especially in the very large economic infrastructure sector procured using Public Private Partnerships, the Australian Federal government has demonstrated its desire to attract new sources of in-bound foreign direct investment (FDI). This paper aims to report on progress towards an investigation into the determinants of multinational contractors’ willingness to bid for Australian public sector major infrastructure projects. This research deploys Dunning’s eclectic theory for the first time in terms of in-bound FDI by multinational contractors into Australia. Elsewhere, the authors have developed Dunning’s principal hypothesis to suit the context of this research and to address a weakness arising in this hypothesis that is based on a nominal approach to the factors in Dunning's eclectic framework and which fails to speak to the relative explanatory power of these factors. In this paper, a first stage test of the authors' development of Dunning's hypothesis is presented by way of an initial review of secondary data vis-à-vis the selected sector (roads and bridges) in Australia (as the host location) and with respect to four selected home countries (China; Japan; Spain; and US). In doing so, the next stage in the research method concerning sampling and case studies is also further developed and described in this paper. In conclusion, the extent to which the initial review of secondary data suggests the relative importance of the factors in the eclectic framework is considered. It is noted that more robust conclusions are expected following the future planned stages of the research, including primary data from the case studies and a global survey of the world’s largest contractors, which are briefly previewed. 
Finally, and beyond theoretical contributions expected from the overall approach taken to developing and testing Dunning’s framework, other expected contributions concerning research method and practical implications are mentioned.

Relevance:

20.00%

Publisher:

Abstract:

This paper presents a road survey as part of a workshop conducted by the Texas Department of Transportation (TxDOT) to evaluate and improve the maintenance practices of the Texas highway system. Directors of maintenance from six peer states (California, Kansas, Georgia, Missouri, North Carolina, and Washington) were invited to this 3-day workshop. One of the important parts of this workshop was a Maintenance Test Section Survey (MTSS) to evaluate a number of pre-selected one-mile roadway sections. The workshop schedule allowed half a day to conduct the field survey and 34 sections were evaluated. Each of the evaluators was given a booklet and asked to rate the selected road sections. The goals of the MTSS were to: 1. Assess the threshold level at which maintenance activities are required as perceived by the evaluators from the peer states; 2. Assess the threshold level at which maintenance activities are required as perceived by evaluators from other TxDOT districts; and 3. Perform a pilot evaluation of the MTSS concept. This paper summarizes the information obtained from the survey and discusses the major findings based on a statistical analysis of the data and comments from the survey participants.

Relevance:

20.00%

Publisher:

Abstract:

This study explores the international entrepreneurial values influencing the intensity of Internet use in the internationalization process of small to medium sized enterprises (SMEs) within the Australian tourism industry. The findings point to a relationship between the values of international entrepreneurs and the inclination of the firm to develop and initiate international activity. This study therefore endeavors to offer insight into issues that remain unresolved in existing tourism and international entrepreneurship (IE) literature. Two effective but underutilized qualitative methods were used in this study to identify the values of international entrepreneurs: the repertory test and laddering analysis.

Relevance:

20.00%

Publisher:

Abstract:

Accurate and detailed road models play an important role in a number of geospatial applications, such as infrastructure planning, traffic monitoring, and driver assistance systems. In this thesis, an integrated approach for the automatic extraction of precise road features from high resolution aerial images and LiDAR point clouds is presented. A framework of road information modeling has been proposed, for rural and urban scenarios respectively, and an integrated system has been developed to deal with road feature extraction using image and LiDAR analysis. For road extraction in rural regions, a hierarchical image analysis is first performed to maximize the exploitation of road characteristics in different resolutions. The rough locations and directions of roads are provided by the road centerlines detected in low resolution images, both of which can be further employed to facilitate the road information generation in high resolution images. The histogram thresholding method is then chosen to classify road details in high resolution images, where color space transformation is used for data preparation. After the road surface detection, anisotropic Gaussian and Gabor filters are employed to enhance road pavement markings while constraining other ground objects, such as vegetation and houses. Afterwards, pavement markings are obtained from the filtered image using Otsu's clustering method. The final road model is generated by superimposing the lane markings on the road surfaces, where the digital terrain model (DTM) produced by LiDAR data can also be combined to obtain the 3D road model. As the extraction of roads in urban areas is greatly affected by buildings, shadows, vehicles, and parking lots, we combine high resolution aerial images and dense LiDAR data to fully exploit the precise spectral and horizontal spatial resolution of aerial images and the accurate vertical information provided by airborne LiDAR. 
Object-oriented image analysis methods are employed to process the feature classification and road detection in aerial images. In this process, we first utilize an adaptive mean shift (MS) segmentation algorithm to segment the original images into meaningful object-oriented clusters. Then the support vector machine (SVM) algorithm is further applied on the MS segmented image to extract road objects. Road surface detected in LiDAR intensity images is taken as a mask to remove the effects of shadows and trees. In addition, normalized DSM (nDSM) obtained from LiDAR is employed to filter out other above-ground objects, such as buildings and vehicles. The proposed road extraction approaches are tested using rural and urban datasets respectively. The rural road extraction method is performed using pan-sharpened aerial images of the Bruce Highway, Gympie, Queensland. The road extraction algorithm for urban regions is tested using the datasets of Bundaberg, which combine aerial imagery and LiDAR data. Quantitative evaluation of the extracted road information for both datasets has been carried out. The experiments and the evaluation results using Gympie datasets show that more than 96% of the road surfaces and over 90% of the lane markings are accurately reconstructed, and the false alarm rates for road surfaces and lane markings are below 3% and 2% respectively. For the urban test sites of Bundaberg, more than 93% of the road surface is correctly reconstructed, and the mis-detection rate is below 10%.
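The thesis obtains pavement markings from the filtered image with Otsu's method. As a rough illustration of that one step, the sketch below picks the threshold that maximizes the between-class variance of a grey-level histogram. The function name and the toy histogram are assumptions for this example, and the thesis's preprocessing (colour transformation, Gaussian and Gabor filtering) is not reproduced.

```python
def otsu_threshold(hist):
    """Return the bin index t that maximizes the between-class variance.
    Values <= t form one class (e.g. pavement), values > t the other
    (e.g. bright markings). On ties the smallest such t is returned."""
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))
    w0 = 0        # number of pixels in the lower class
    sum0 = 0      # sum of grey levels in the lower class
    best_t, best_var = 0, -1.0
    for t, h in enumerate(hist):
        w0 += h
        sum0 += t * h
        w1 = total - w0
        if w0 == 0:
            continue          # lower class still empty
        if w1 == 0:
            break             # upper class empty from here on
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        # Between-class variance, up to the constant factor 1 / total**2
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_t, best_var = t, var_between
    return best_t


# Toy 8-bin histogram (hypothetical): dark pixels in bins 0-1, bright in 6-7.
hist = [10, 10, 0, 0, 0, 0, 10, 10]
print(otsu_threshold(hist))
```

For a real image the histogram would typically have 256 bins, and the returned index would be applied as a global threshold to the filtered image.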

Relevance:

20.00%

Publisher:

Abstract:

Contends that South African universities must find admissions criteria, other than high school grades, that are both fair and valid for Black applicants severely disadvantaged by an inferior school education. The use of traditional intellectual assessments and aptitude tests for disadvantaged and minority students remains controversial as a fair assessment; they do not take account of potential for change. In this study, therefore, a measure of students' cognitive modifiability, assessed by means of an interactive assessment model, was added as a moderator of traditional intellectual assessment in predicting 1st-yr university success. Cognitive modifiability significantly moderated the predictive validity of the traditional intellectual assessment for 52 disadvantaged Black students. The higher the level of cognitive modifiability, the less effective were traditional methods for predicting academic success and vice versa.

Relevance:

20.00%

Publisher:

Abstract:

Objective To examine the risk factors for Mycobacterium tuberculosis infection (MTI) among Greenlandic children for the purpose of identifying those at highest risk of infection. Methods Between 2005 and 2007, 1797 Greenlandic schoolchildren in five different areas were tested for MTI with an interferon gamma release assay (IGRA) and a tuberculin skin test (TST). Parents or guardians were surveyed using a standardized self-administered questionnaire to obtain data on crowding in the household, parents’ educational level and the child’s health status. Demographic data for each child – i.e. parents’ place of birth, number of siblings, distance between siblings (next younger and next older), birth order and mother’s age when the child was born – were also extracted from a public registry. Logistic regression was used to check for associations between these variables and MTI, and all results were expressed as odds ratios (ORs) and 95% confidence intervals (CIs). Children were considered to have MTI if they tested positive on both the IGRA assay and the TST. Findings The overall prevalence of MTI was 8.5% (152/1797). MTI was diagnosed in 26.7% of the children with a known TB contact, as opposed to 6.4% of the children without such contact. Overall, the MTI rate was higher among Inuit children (OR: 4.22; 95% CI: 1.55–11.5) and among children born less than one year after the birth of the next older sibling (OR: 2.48; 95% CI: 1.33–4.63). Self-reported TB contact modified the profile to include household crowding and low mother’s education. Children who had an older MTI-positive sibling were much more likely to test positive for MTI themselves (OR: 14.2; 95% CI: 5.75–35.0) than children without an infected older sibling. Conclusion Ethnicity, sibling relations, number of household residents and maternal level of education are factors associated with the risk of TB infection among children in Greenland. 
The strong household clustering of MTI suggests that family sources of exposure are important.
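The study reports its associations as odds ratios (ORs) with 95% confidence intervals (CIs) from logistic regression. For a single binary exposure, an unadjusted OR and a Wald-type interval can also be computed directly from a 2x2 table, as sketched below. The function name and the counts are hypothetical and are not taken from the study, and a regression model that adjusts for other variables can give different values.

```python
from math import exp, log, sqrt


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval.
    a, b: exposed children with / without the outcome
    c, d: unexposed children with / without the outcome
    z = 1.96 corresponds to a 95% interval."""
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) for a 2x2 table
    se = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(odds_ratio) - z * se)
    upper = exp(log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


# Hypothetical counts, chosen only to illustrate the arithmetic.
odds_ratio, lower, upper = odds_ratio_ci(a=20, b=80, c=10, d=90)
print(odds_ratio, lower, upper)
```

An interval that excludes 1 is usually read as evidence of an association at the corresponding significance level.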