915 results for Model Classification
Abstract:
Multi-parametric and quantitative magnetic resonance imaging (MRI) techniques have come into the focus of interest, both as a research and diagnostic modality for the evaluation of patients suffering from mild cognitive decline and overt dementia. In this study we address the question of whether disease-related quantitative magnetization transfer (qMT) effects within the intra- and extracellular matrices of the hippocampus may aid in the differentiation between clinically diagnosed patients with Alzheimer disease (AD), patients with mild cognitive impairment (MCI) and healthy controls. We evaluated 22 patients with AD (n=12) and MCI (n=10) and 22 healthy elderly (n=12) and younger (n=10) controls with multi-parametric MRI. Neuropsychological testing was performed in patients and elderly controls (n=34). In order to quantify the qMT effects, the absorption spectrum was sampled at relevant off-resonance frequencies. The qMT parameters were calculated according to a two-pool spin-bath model including the T1 and T2 relaxation parameters of the free pool, determined in separate experiments. Histograms (fixed bin size) of the normalized qMT parameter values (z-scores) within the anterior and posterior hippocampus (hippocampal head and body) were subjected to a fuzzy c-means classification algorithm with downstream PCA projection. The within-cluster sums of point-to-centroid distances were used to examine the effects of qMT and diffusion anisotropy parameters on the discrimination of healthy volunteers, patients with AD and patients with MCI. The qMT parameters T2(r) (T2 of the restricted pool) and F (fractional pool size) differentiated between the three groups (control, MCI and AD) in the anterior hippocampus. In our cohort, the MT ratio, as proposed in previous reports, did not differentiate between MCI and AD or between healthy controls and MCI, but did differentiate between healthy controls and AD.
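The clustering step named in this abstract can be illustrated with a minimal sketch of fuzzy c-means. It is not the authors' pipeline: it works on plain one-dimensional values instead of histograms, omits the PCA projection, and the function name `fuzzy_c_means`, the fuzzifier `m = 2` and the sample z-scores are all assumptions made for illustration.

```python
def fuzzy_c_means(points, n_clusters=2, m=2.0, n_iter=100, tol=1e-9):
    """Fuzzy c-means on 1-D data; returns (centers, memberships).

    memberships[j][i] is the degree to which points[j] belongs to cluster i,
    as computed in the final iteration.
    """
    lo, hi = min(points), max(points)
    # Deterministic start: spread the initial centers evenly over the data range.
    centers = [lo + (hi - lo) * (i + 0.5) / n_clusters for i in range(n_clusters)]
    memberships = []
    power = 2.0 / (m - 1.0)
    for _ in range(n_iter):
        memberships = []
        for x in points:
            dists = [abs(x - c) for c in centers]
            if min(dists) == 0.0:
                # The point sits exactly on a center: full membership there.
                hit = dists.index(0.0)
                row = [1.0 if i == hit else 0.0 for i in range(n_clusters)]
            else:
                # Standard membership update: u_i = 1 / sum_k (d_i / d_k)^(2/(m-1)).
                row = [1.0 / sum((d_i / d_k) ** power for d_k in dists) for d_i in dists]
            memberships.append(row)
        # Center update: mean of the points weighted by membership^m.
        new_centers = []
        for i in range(n_clusters):
            weights = [row[i] ** m for row in memberships]
            new_centers.append(sum(w * x for w, x in zip(weights, points)) / sum(weights))
        shift = max(abs(a - b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, memberships


# Hypothetical z-scores forming two loose groups (illustrative values only).
z_scores = [-1.2, -1.0, -0.8, 0.9, 1.0, 1.1]
centers, memberships = fuzzy_c_means(z_scores)
print(sorted(centers))
```

Unlike hard k-means, every point keeps a graded membership in each cluster, which is why the method suits groups that overlap.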
Abstract:
BACKGROUND: Wheezing disorders in childhood vary widely in clinical presentation and disease course. In recent years, several ways to classify wheezing children into different disease phenotypes have been proposed and are increasingly used for clinical guidance, but validation of these hypothetical entities is difficult. METHODOLOGY/PRINCIPAL FINDINGS: The aim of this study was to develop a testable disease model which reflects the full spectrum of wheezing illness in preschool children. We performed a qualitative study among a panel of 7 experienced clinicians from 4 European countries working in primary, secondary and tertiary paediatric care. In a series of questionnaire surveys and structured discussions, we found a general consensus that preschool wheezing disorders consist of several phenotypes, with great heterogeneity of specific disease concepts between clinicians. Initially, 24 disease entities were described among the 7 physicians. In structured discussions, these could be narrowed down to three entities which were linked to proposed mechanisms: a) allergic wheeze, b) non-allergic wheeze due to structural airway narrowing and c) non-allergic wheeze due to increased immune response to viral infections. This disease model will serve to create an artificial dataset that allows the validation of data-driven multidimensional methods, such as cluster analysis, which have been proposed for identification of wheezing phenotypes in children. CONCLUSIONS/SIGNIFICANCE: While there appears to be wide agreement among clinicians that wheezing disorders consist of several diseases, there is less agreement regarding their number and nature. A great diversity of disease concepts exists, but a unified phenotype classification reflecting underlying disease mechanisms is lacking. We propose a disease model which may help guide future research so that proposed mechanisms are measured at the right time and their role in disease heterogeneity can be studied.
Abstract:
BACKGROUND Low-grade gliomas (LGGs) are rare brain neoplasms, with survival spanning up to a few decades. Thus, accurate evaluations on how biomarkers impact survival among patients with LGG require long-term studies on samples prospectively collected over a long period. METHODS The 210 adult LGGs collected in our databank were screened for IDH1 and IDH2 mutations (IDHmut), MGMT gene promoter methylation (MGMTmet), 1p/19q loss of heterozygosity (1p19qloh), and nuclear TP53 immunopositivity (TP53pos). Multivariate survival analyses with multiple imputation of missing data were performed using either histopathology or molecular markers. Both models were compared using Akaike's information criterion (AIC). The molecular model was reduced by stepwise model selection to filter out the most critical predictors. A third model was generated to assess for various marker combinations. RESULTS Molecular parameters were better survival predictors than histology (ΔAIC = 12.5, P< .001). Forty-five percent of studied patients died. MGMTmet was positively associated with IDHmut (P< .001). In the molecular model with marker combinations, IDHmut/MGMTmet combined status had a favorable impact on overall survival, compared with IDHwt (hazard ratio [HR] = 0.33, P< .01), and even more so the triple combination, IDHmut/MGMTmet/1p19qloh (HR = 0.18, P< .001). Furthermore, IDHmut/MGMTmet/TP53pos triple combination was a significant risk factor for malignant transformation (HR = 2.75, P< .05). CONCLUSION By integrating networks of activated molecular glioma pathways, the model based on genotype better predicts prognosis than histology and, therefore, provides a more reliable tool for standardizing future treatment strategies.
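The model comparison described in this abstract rests on Akaike's information criterion, AIC = 2k - 2 ln L, where k is the number of fitted parameters and L the maximised likelihood. The sketch below shows only the arithmetic. The log-likelihoods and parameter counts are hypothetical placeholders, not values from the study, and the variable names are chosen for illustration.

```python
def aic(log_likelihood, n_params):
    """Akaike's information criterion: lower values indicate a better trade-off
    between goodness of fit and model complexity."""
    return 2 * n_params - 2 * log_likelihood


# Hypothetical fits (NOT the study's numbers): a histology-based model and a
# molecular-marker model fitted to the same patients.
aic_histology = aic(log_likelihood=-310.0, n_params=3)
aic_molecular = aic(log_likelihood=-302.0, n_params=4)

# A positive difference favours the molecular model.
delta_aic = aic_histology - aic_molecular
print(aic_histology, aic_molecular, delta_aic)
```

Because the extra parameter costs only 2 AIC units, the molecular model wins in this toy case despite being more complex.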
Abstract:
Traditional methods do not measure people's risk attitude naturally and precisely. Therefore, a fuzzy risk attitude classification method is developed. Since prospect theory is usually considered an effective model of decision making, the personalized parameters in prospect theory are first fuzzified to distinguish people with different risk attitudes, and then a fuzzy classification database schema is applied to calculate the exact value of risk value attitude and risk behavior attitude. Finally, by applying a two-hierarchical classification model, the precise value of synthetical risk attitude can be acquired.
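The "personalized parameters in prospect theory" mentioned here can be made concrete with the standard prospect-theory value function. The sketch assumes the usual power form with curvature parameters `alpha` and `beta` and a loss-aversion coefficient. The default values 0.88, 0.88 and 2.25 are the commonly cited Tversky and Kahneman (1992) estimates and are an assumption here, since the abstract gives no parameter values. The fuzzification step is not shown.

```python
def prospect_value(x, alpha=0.88, beta=0.88, loss_aversion=2.25):
    """Subjective value of a gain or loss x relative to the reference point.

    Gains are valued as x**alpha; losses as -loss_aversion * (-x)**beta,
    so a loss weighs more heavily than a gain of the same size.
    """
    if x >= 0:
        return x ** alpha
    return -loss_aversion * ((-x) ** beta)


# A more loss-averse person (larger coefficient) values the same loss as worse.
print(prospect_value(100), prospect_value(-100), prospect_value(-100, loss_aversion=3.0))
```

Individual differences in these three parameters are what a classification of risk attitudes would operate on.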
Abstract:
A patient classification system was developed integrating a patient acuity instrument with a computerized nursing distribution method based on a linear programming model. The system was designed for real-time measurement of patient acuity (workload) and allocation of nursing personnel to optimize the utilization of resources.
The acuity instrument was a prototype tool with eight categories of patients defined by patient severity and nursing intensity parameters. From this tool, the demand for nursing care was defined in patient points, with one point equal to one hour of RN time. Validity and reliability of the instrument were determined as follows: (1) content validity by a panel of expert nurses; (2) predictive validity through a paired t-test analysis of preshift and postshift categorization of patients; (3) initial reliability by a one-month pilot of the instrument in a practice setting; and (4) interrater reliability by the Kappa statistic.
The nursing distribution system was a linear programming model using a branch and bound technique for obtaining integer solutions. The objective function was to minimize the total number of nursing personnel used by optimally assigning the staff to meet the acuity needs of the units. A penalty weight was used as a coefficient of the objective function variables to define priorities for allocation of staff.
The demand constraints were requirements to meet the total acuity points needed for each unit and to have a minimum number of RNs on each unit. Supply constraints were: (1) total availability of each type of staff and the value of that staff member (value was determined relative to that type of staff's ability to perform the job function of an RN, i.e., value for eight hours RN = 8 points, LVN = 6 points); (2) number of personnel available for floating between units.
The capability of the model to assign staff quantitatively and qualitatively equal to the manual method was established by a thirty-day comparison. Sensitivity testing demonstrated appropriate adjustment of the optimal solution to changes in penalty coefficients in the objective function and to acuity totals in the demand constraints.
Further investigation of the model documented correct adjustment of assignments in response to staff value changes, and cost minimization by the addition of a dollar coefficient to the objective function.
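The staffing problem in this abstract can be sketched for a single unit. The example below replaces the linear programming model and branch and bound technique with plain enumeration of all integer staffing combinations, which is only practical for tiny cases. The point values (RN = 8, LVN = 6 per eight hours) come from the abstract. The function name `staff_unit`, the tie-breaking rule and the sample numbers are assumptions, and penalty weights and floating staff are left out.

```python
from itertools import product

# Points contributed by one staff member in an eight-hour shift (from the abstract).
POINTS = {"RN": 8, "LVN": 6}


def staff_unit(acuity_points, min_rns, rn_available, lvn_available):
    """Return the smallest staffing mix that covers the unit's acuity points.

    Exhaustive search stands in for the integer-programming solver. Ties on
    head count are broken by the smaller surplus of points (an assumption).
    Returns None when the demand cannot be met with the available staff.
    """
    best = None
    for rns, lvns in product(range(min_rns, rn_available + 1), range(lvn_available + 1)):
        covered = rns * POINTS["RN"] + lvns * POINTS["LVN"]
        if covered < acuity_points:
            continue  # demand constraint violated
        candidate = (rns + lvns, covered, rns, lvns)
        if best is None or candidate < best:
            best = candidate
    if best is None:
        return None
    return {"RN": best[2], "LVN": best[3]}


# Hypothetical unit: 40 acuity points, at least 2 RNs, 3 RNs and 4 LVNs on hand.
print(staff_unit(acuity_points=40, min_rns=2, rn_available=3, lvn_available=4))
```

A real multi-unit allocation would add one demand constraint per unit and shared supply constraints, which is where an integer-programming solver becomes necessary.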
Abstract:
Computer vision-based food recognition could be used to estimate a meal's carbohydrate content for diabetic patients. This study proposes a methodology for automatic food recognition, based on the Bag of Features (BoF) model. An extensive technical investigation was conducted for the identification and optimization of the best performing components involved in the BoF architecture, as well as the estimation of the corresponding parameters. For the design and evaluation of the prototype system, a visual dataset with nearly 5,000 food images was created and organized into 11 classes. The optimized system computes dense local features, using the scale-invariant feature transform on the HSV color space, builds a visual dictionary of 10,000 visual words by using the hierarchical k-means clustering and finally classifies the food images with a linear support vector machine classifier. The system achieved classification accuracy of the order of 78%, thus proving the feasibility of the proposed approach in a very challenging image dataset.
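The core of the Bag of Features representation described above is that each local descriptor is assigned to its nearest visual word and the image is summarised by the resulting word histogram. The sketch below shows that step only. It uses toy two-dimensional descriptors and a three-word dictionary, whereas the system in the abstract uses SIFT descriptors, a 10,000-word dictionary built with hierarchical k-means, and a linear SVM on top. The function names are hypothetical.

```python
def nearest_word(descriptor, dictionary):
    """Index of the visual word closest to the descriptor (squared Euclidean distance)."""
    def sq_dist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))
    return min(range(len(dictionary)), key=lambda i: sq_dist(descriptor, dictionary[i]))


def bof_histogram(descriptors, dictionary):
    """Normalised histogram of visual-word occurrences for one image."""
    counts = [0] * len(dictionary)
    for descriptor in descriptors:
        counts[nearest_word(descriptor, dictionary)] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * len(dictionary)
    return [c / total for c in counts]


# Toy data: three visual words and four local descriptors from one image.
dictionary = [(0, 0), (10, 10), (0, 10)]
descriptors = [(1, 1), (9, 9), (0, 1), (1, 9)]
print(bof_histogram(descriptors, dictionary))
```

The histogram has a fixed length regardless of how many descriptors an image yields, which is what lets a standard classifier consume it.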
Abstract:
The porcine skin has striking similarities to the human skin in terms of general structure, thickness, hair follicle content, pigmentation, collagen and lipid composition. This has been the basis for numerous studies using the pig as a model for wound healing, transdermal delivery, dermal toxicology, radiation and UVB effects. Considering that the skin also represents an immune organ of utmost importance for health, immune cells present in the skin of the pig will be reviewed. The focus of this review is on dendritic cells, which play a central role in the skin immune system as they serve as sentinels in the skin, which offers a large surface area exposed to the environment. Based on a literature review and original data, we propose a classification of porcine dendritic cell subsets in the skin corresponding to the subsets described in the human skin. The equivalent of the human CD141(+) DC subset is CD1a(-)CD4(-)CD172a(-)CADM1(high), that of the CD1c(+) subset is CD1a(+)CD4(-)CD172a(+)CADM1(+/low), and porcine plasmacytoid dendritic cells are CD1a(-)CD4(+)CD172a(+)CADM1(-). CD209 and CD14 could represent markers of inflammatory monocyte-derived cells, either dendritic cells or macrophages. Future studies, for example using transcriptomic analysis of sorted populations, are required to confirm the identity of these cells.
Abstract:
Background and purpose. Breast cancer continues to be a health problem for women, representing 28 percent of all female cancers and remaining one of the leading causes of death for women. Breast cancer incidence rates become substantial before the age of 50. After menopause, breast cancer incidence rates continue to increase with age, creating a long-lasting source of concern (Harris et al., 1992). Mammography, a technique for the detection of breast tumors in their nonpalpable stage when they are most curable, has taken on considerable importance as a public health measure. The lifetime risk of breast cancer is approximately 1 in 9 and occurs over many decades. Recommendations are that screening be periodic in order to detect cancer at early stages. These recommendations, largely, are not followed. Not only are most women not getting regular mammograms, but this circumstance is particularly the case among older women, where regular mammography has been proven to reduce mortality by approximately 30 percent. The purpose of this project was to increase our understanding of factors that are associated with stage of readiness to obtain subsequent mammograms. A secondary purpose of this research was to suggest further conceptual considerations toward the extension of the Transtheoretical Model (TTM) of behavior change to repeat screening mammography.
Methods. A sample (n = 1,222) of women 50 years and older in a large multi-specialty clinic in Houston, Texas was surveyed by mail questionnaire regarding their previous screening experience and stage of readiness to obtain repeat screening. A computerized database, maintained on all women who undergo mammography at the clinic, was used to identify women who were eligible for the project. The major statistical technique employed to select the significant variables and to examine the main and interaction effects of independent variables on dependent variables was polychotomous stepwise logistic regression. A prediction model for each stage of readiness definition was estimated. The expected probabilities for stage of readiness were calculated to assess the magnitude and direction of significant predictors.
Results. Analysis showed that both ways of defining stage of readiness for obtaining a screening mammogram were associated with specific constructs, including decisional balance and the processes of change.
Conclusions. The results of the present study demonstrate that the TTM appears to translate to repeat mammography screening. Findings in the current study also support findings of previous studies that suggest that stage of readiness is associated with respondent decisional balance and the processes of change.
Abstract:
A census of 925 U.S. colleges and universities offering masters and doctorate degrees was conducted in order to study the number of elements of an environmental management system as defined by ISO 14001 possessed by small, medium and large institutions. A 30% response rate was received, with 273 responses included in the final data analysis. Overall, the number of ISO 14001 elements implemented among the 273 institutions ranged from 0 to 16, with a median of 12. There was no significant association between the number of elements implemented among institutions and the size of the institution (p = 0.18; Kruskal-Wallis test) or among USEPA regions (p = 0.12; Kruskal-Wallis test). The proportion of U.S. colleges and universities that reported having implemented a structured, comprehensive environmental management system, defined by answering yes to all 16 elements, was 10% (95% C.I. 6.6%–14.1%); however, 38% (95% C.I. 32.0%–43.8%) reported that they had implemented a structured, comprehensive environmental management system, while 30.0% (95% C.I. 24.7%–35.9%) are planning to implement a comprehensive environmental management system within the next five years. Stratified analyses were performed by institution size, Carnegie Classification and job title.
The Osnabruck model, and another under development by the South Carolina Sustainable Universities Initiative, are the only two environmental management system models that have been proposed specifically for colleges and universities, although several guides are now available. The Environmental Management System Implementation Model for U.S. Colleges and Universities developed is an adaptation of the ISO 14001 standard and USEPA recommendations and has been tailored to U.S. colleges and universities for use in streamlining the implementation process. In using this implementation model created for the U.S. research and academic setting, it is hoped that these highly specialized institutions will be provided with a clearer and more cost-effective path towards the implementation of an EMS and greater compliance with local, state and federal environmental legislation.
Abstract:
Developing a Model
Interruption is a known human factor that contributes to errors and catastrophic events in healthcare as well as other high-risk industries. The landmark Institute of Medicine (IOM) report, To Err is Human, brought attention to the significance of preventable errors in medicine and suggested that interruptions could be a contributing factor. Previous studies of interruptions in healthcare did not offer a conceptual model by which to study interruptions. Given the serious consequences of interruptions investigated in other high-risk industries, there is a need to develop a model to describe, understand, explain, and predict interruptions and their consequences in healthcare. Therefore, the purpose of this study was to develop a model grounded in the literature and to use the model to describe and explain interruptions in healthcare. Specifically, this model would be used to describe and explain interruptions occurring in a Level One Trauma Center. A trauma center was chosen because this environment is characterized as intense, unpredictable, and interrupt-driven. The first step in developing the model was a review of the literature, which revealed that the concept of interruption did not have a consistent definition in either the healthcare or non-healthcare literature. Walker and Avant’s method of concept analysis was used to clarify and define the concept. The analysis led to the identification of five defining attributes, which include (1) a human experience, (2) an intrusion of a secondary, unplanned, and unexpected task, (3) discontinuity, (4) externally or internally initiated, and (5) situated within a context. However, before an interruption could commence, five conditions known as antecedents must occur.
For an interruption to take place (1) an intent to interrupt is formed by the initiator, (2) a physical signal must pass a threshold test of detection by the recipient, (3) the sensory system of the recipient is stimulated to respond to the initiator, (4) an interruption task is presented to the recipient, and (5) the interruption task is either accepted or rejected by the recipient. An interruption was determined to be quantifiable by (1) the frequency of occurrence of an interruption, (2) the number of times the primary task has been suspended to perform an interrupting task, (3) the length of time the primary task has been suspended, and (4) the frequency of returning to the primary task or not returning to the primary task. As a result of the concept analysis, a definition of an interruption was derived from the literature. An interruption is defined as a break in the performance of a human activity, initiated internally or externally to the recipient and occurring within the context of a setting or location. This break results in the suspension of the initial task by initiating the performance of an unplanned task with the assumption that the initial task will be resumed. The definition is inclusive of all the defining attributes of an interruption. This is a standard definition that can be used by the healthcare industry. From the definition, a visual model of an interruption was developed. The model was used to describe and explain the interruptions recorded for an instrumental case study of physicians and registered nurses (RNs) working in a Level One Trauma Center. Five physicians were observed for a total of 29 hours, 31 minutes. Eight registered nurses were observed for a total of 40 hours, 9 minutes. Observations were made on either the 0700–1500 or the 1500–2300 shift using the shadowing technique. Observations were recorded in the field note format. The field notes were analyzed by a hybrid method of categorizing activities and interruptions.
The method was developed using both a deductive a priori classification framework and an inductive process utilizing line-by-line coding and constant comparison, as described in Grounded Theory. The following categories were identified as relevant to this study:
Intended Recipient - the person to be interrupted
Unintended Recipient - not the intended recipient of an interruption; e.g., receiving a phone call that was incorrectly dialed
Indirect Recipient - the incidental recipient of an interruption; e.g., talking with another, thereby suspending the original activity
Recipient Blocked - the intended recipient does not accept the interruption
Recipient Delayed - the intended recipient postpones an interruption
Self-interruption - a person, independent of another person, suspends one activity to perform another; e.g., while walking, stops abruptly and talks to another person
Distraction - briefly disengaging from a task
Organizational Design - the physical layout of the workspace that causes a disruption in workflow
Artifacts Not Available - supplies and equipment that are not available in the workspace, causing a disruption in workflow
Initiator - a person who initiates an interruption
Interruption by Organizational Design and Artifacts Not Available were identified as two new categories of interruption. These categories had not previously been cited in the literature. Analysis of the observations indicated that physicians performed slightly fewer activities per hour than RNs. This variance may be attributed to differing roles and responsibilities. Physicians were found to have more activities interrupted when compared to RNs. However, RNs experienced more interruptions per hour. Other people were determined to be the most commonly used medium through which to deliver an interruption. Additional mediums used to deliver an interruption included the telephone, pager, and one’s self.
Both physicians and RNs were observed to resume an original interrupted activity more often than not. In most interruptions, both physicians and RNs performed only one or two interrupting activities before returning to the original interrupted activity. In conclusion the model was found to explain all interruptions observed during the study. However, the model will require an even more comprehensive study in order to establish its predictive value.
Abstract:
It is well accepted that tumorigenesis is a multi-step process involving aberrant functioning of genes regulating cell proliferation, differentiation, apoptosis, genome stability, angiogenesis and motility. To obtain a full understanding of tumorigenesis, it is necessary to collect information on all aspects of cell activity. Recent advances in high-throughput technologies allow biologists to generate massive amounts of data, more than might have been imagined decades ago. These advances have made it possible to launch comprehensive projects such as The Cancer Genome Atlas (TCGA) and the International Cancer Genome Consortium (ICGC), which systematically characterize the molecular fingerprints of cancer cells using gene expression, methylation, copy number, microRNA and SNP microarrays as well as next generation sequencing assays interrogating somatic mutation, insertion, deletion, translocation and structural rearrangements. Given the massive amount of data, a major challenge is to integrate information from multiple sources and formulate testable hypotheses. This thesis focuses on developing methodologies for integrative analyses of genomic assays profiled on the same set of samples. We have developed several novel methods for integrative biomarker identification and cancer classification. We introduce a regression-based approach to identify biomarkers predictive of therapy response or survival by integrating multiple assays including gene expression, methylation and copy number data through penalized regression. To identify key cancer-specific genes accounting for multiple mechanisms of regulation, we have developed the integIRTy software that provides robust and reliable inferences about gene alteration by automatically adjusting for sample heterogeneity as well as technical artifacts using Item Response Theory.
To cope with the increasing need for accurate cancer diagnosis and individualized therapy, we have developed a robust and powerful algorithm called SIBER to systematically identify bimodally expressed genes using next generation RNAseq data. We have shown that prediction models built from these bimodal genes have the same accuracy as models built from all genes. Further, prediction models with dichotomized gene expression measurements based on their bimodal shapes still perform well. The effectiveness of outcome prediction using discretized signals paves the road for more accurate and interpretable cancer classification by integrating signals from multiple sources.
Abstract:
Feather pecking is a behaviour by which birds damage or destroy the feathers of themselves (self-pecking) or other birds (allo feather pecking), in some cases even plucking out feathers and eating these. The self-pecking is rarely seen in domestic laying hens but is not uncommon in parrots. Feather pecking in laying hens has been described as being stereotypic, i.e. a repetitive invariant motor pattern without an obvious function, and indeed the amount of self-pecking in parrots was found to correlate positively with the amount of recurrent perseveration (RP), the tendency to repeat responses inappropriately, which in humans and other animals was found to correlate with stereotypic behaviour. In the present experiment we set out to investigate the correlation between allo feather pecking and RP in laying hens. We used birds (N = 92) from the 10th and 11th generation (G10 and G11) of lines selectively bred for high feather pecking (HFP) and low feather pecking (LFP), and from an unselected control line (CON) with intermediate levels of feather pecking. We hypothesised that levels of RP would be higher, and the time taken (standardised latency) to repeat a response lower, in HFP compared to LFP hens, with CON hens in between. Using a two-choice guessing task, we found that lines differed significantly in their levels of RP, with HFP unexpectedly showing lower levels of RP than CON and LFP. Latency to make a repeat did not differ between lines. Latency to make a switch differed between lines with a shorter latency in HFP compared to LFP (in G10), or CON (in G11). Latency to peck for repeats vs. latency to peck for switches did not differ between lines. Total time to complete the test was significantly shorter in HFP compared to CON and LFP. Thus, our hypotheses were not supported by the data. 
In contrast, selection for feather pecking seems to induce effects opposite to those expected from stereotyping animals: pecking was less sequenced, and the latency to make a switch and the time to complete the test were lower in HFP. This supports the hyperactivity model of feather pecking, suggesting that feather pecking is related to a higher general activity, possibly due to changes in the dopaminergic system.
Abstract:
We present a novel approach using both sustained vowels and connected speech to detect obstructive sleep apnea (OSA) cases within a homogeneous group of speakers. The proposed scheme is based on state-of-the-art GMM-based classifiers, and specifically acknowledges the way in which acoustic models are trained on standard databases, as well as the complexity of the resulting models and their adaptation to specific data. Our experimental database contains a suitable number of utterances and sustained speech from healthy (i.e., control) and OSA Spanish speakers. Finally, a 25.1% relative reduction in classification error is achieved when fusing continuous and sustained speech classifiers. Index Terms: obstructive sleep apnea (OSA), Gaussian mixture models (GMMs), background model (BM), classifier fusion.
Abstract:
OntoTag - A Linguistic and Ontological Annotation Model Suitable for the Semantic Web
1. INTRODUCTION. LINGUISTIC TOOLS AND ANNOTATIONS: THEIR LIGHTS AND SHADOWS
Computational Linguistics is already a consolidated research area. It builds upon the results of two other major ones, namely Linguistics and Computer Science and Engineering, and it aims at developing computational models of human language (or natural language, as it is termed in this area). Possibly, its best-known applications are the different tools developed so far for processing human language, such as machine translation systems and speech recognizers or dictation programs.
These tools for processing human language are commonly referred to as linguistic tools. Apart from the examples mentioned above, there are also other types of linguistic tools that perhaps are not so well-known, but on which most of the other applications of Computational Linguistics are built. These other types of linguistic tools comprise POS taggers, natural language parsers and semantic taggers, amongst others. All of them can be termed linguistic annotation tools.
Linguistic annotation tools are important assets. In fact, POS and semantic taggers (and, to a lesser extent, also natural language parsers) have become critical resources for the computer applications that process natural language. Hence, any computer application that has to analyse a text automatically and ‘intelligently’ will include at least a module for POS tagging. The more an application needs to ‘understand’ the meaning of the text it processes, the more linguistic tools and/or modules it will incorporate and integrate.
However, linguistic annotation tools still have some limitations, which can be summarised as follows:
1. Normally, they perform annotations only at a certain linguistic level (that is, Morphology, Syntax, Semantics, etc.).
2. They usually introduce a certain rate of errors and ambiguities when tagging. This error rate ranges from 10 percent up to 50 percent of the units annotated for unrestricted, general texts.
3. Their annotations are most frequently formulated in terms of an annotation schema designed and implemented ad hoc.
A priori, it seems that the interoperation and the integration of several linguistic tools into an appropriate software architecture could most likely solve the limitations stated in (1). Besides, integrating several linguistic annotation tools and making them interoperate could also minimise the limitation stated in (2). Nevertheless, in the latter case, all these tools should produce annotations for a common level, which would have to be combined in order to correct their corresponding errors and inaccuracies. Yet, the limitation stated in (3) prevents both types of integration and interoperation from being easily achieved.
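The idea of combining several tools' annotations at a common level to correct individual errors can be sketched as a simple majority vote per token. This is an illustration only, not the OntoTag combination procedure. It assumes that all tools share one tokenisation and one tagset (which is exactly what limitation (3) makes hard in practice), and the function name `combine_tags` and the sample tags are hypothetical.

```python
from collections import Counter


def combine_tags(annotations):
    """Merge tag sequences from several tools by strict majority vote.

    annotations: one tag sequence per tool, all aligned to the same tokens.
    A token gets None when no tag is proposed by more than half of the tools,
    so that unresolved cases stay visible instead of being guessed.
    """
    combined = []
    for tags in zip(*annotations):
        tag, votes = Counter(tags).most_common(1)[0]
        combined.append(tag if votes > len(tags) / 2 else None)
    return combined


# Three hypothetical POS taggers annotating the same three tokens.
tagger_a = ["DET", "NOUN", "VERB"]
tagger_b = ["DET", "VERB", "VERB"]
tagger_c = ["DET", "NOUN", "NOUN"]
print(combine_tags([tagger_a, tagger_b, tagger_c]))
```

Each tagger makes one mistake here, yet the vote recovers a consistent sequence, which is the benefit the paragraph above attributes to interoperation.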
In addition, most high-level annotation tools rely on other lower-level annotation tools and their outputs to generate their own. For example, sense-tagging tools (operating at the semantic level) often use POS taggers (operating at a lower level, i.e., the morphosyntactic) to identify the grammatical category of the word or lexical unit they are annotating. Accordingly, if a faulty or inaccurate low-level annotation tool is to be used by another higher-level one in its process, the errors and inaccuracies of the former should be minimised in advance. Otherwise, these errors and inaccuracies would be transferred to (and even magnified in) the annotations of the high-level annotation tool.
Therefore, it would be quite useful to find a way to
(i) correct or, at least, reduce the errors and the inaccuracies of lower-level linguistic tools;
(ii) unify the annotation schemas of different linguistic annotation tools or, more generally speaking, make these tools (as well as their annotations) interoperate.
Clearly, solving (i) and (ii) should ease the automatic annotation of web pages by means of linguistic tools, and their transformation into Semantic Web pages (Berners-Lee, Hendler and Lassila, 2001). Yet, as stated above, (ii) is a type of interoperability problem. There again, ontologies (Gruber, 1993; Borst, 1997) have been successfully applied thus far to solve several interoperability problems. Hence, ontologies should also help solve the aforementioned problems and limitations of linguistic annotation tools.
Thus, to summarise, the main aim of the present work was to somehow combine these separate approaches, mechanisms and tools for annotation from Linguistics and Ontological Engineering (and the Semantic Web) in a sort of hybrid (linguistic and ontological) annotation model, suitable for both areas. This hybrid (semantic) annotation model should (a) benefit from the advances, models, techniques, mechanisms and tools of these two areas; (b) minimise (and even solve, when possible) some of the problems found in each of them; and (c) be suitable for the Semantic Web. The concrete goals that helped attain this aim are presented in the following section.
2. GOALS OF THE PRESENT WORK
As mentioned above, the main goal of this work was to specify a hybrid (that is, linguistically-motivated and ontology-based) model of annotation suitable for the Semantic Web (i.e. it had to produce a semantic annotation of web page contents). This entailed that the tags included in the annotations of the model had to (1) represent linguistic concepts (or linguistic categories, as they are termed in ISO/DCR (2008)), in order for this model to be linguistically-motivated; (2) be ontological terms (i.e., use an ontological vocabulary), in order for the model to be ontology-based; and (3) be structured (linked) as a collection of ontology-based
Abstract:
Due to recent scientific and technological advances in information systems, it is now possible to perform almost every application on a mobile device. The need to make such devices more intelligent opens an opportunity to design data mining algorithms that are able to execute autonomously on local devices to provide the device with knowledge. The problem behind autonomous mining deals with the proper configuration of the algorithm to produce the most appropriate results. Contextual information together with resource information of the device have a strong impact on both the feasibility of a particular execution and on the production of the proper patterns. On the other hand, performance of the algorithm, expressed in terms of efficacy and efficiency, highly depends on the features of the dataset to be analyzed together with the values of the parameters of a particular implementation of an algorithm. However, few existing approaches deal with autonomous configuration of data mining algorithms, and in any case they do not deal with contextual or resource information. Both issues are of particular significance, in particular for social network applications. In fact, the widespread use of social networks and consequently the amount of information shared have made the need to model context in social applications a priority. Resource consumption also has a crucial role in such platforms, as users access social networks mainly on their mobile devices. This PhD thesis addresses the aforementioned open issues, focusing on i) analyzing the behavior of algorithms, ii) mapping contextual and resource information to find the most appropriate configuration, and iii) applying the model to the case of a social recommender. Four main contributions are presented:
- The EE-Model: is able to predict the behavior of a data mining algorithm in terms of resources consumed and accuracy of the mining model it will obtain.
- The SC-Mapper: maps a situation defined by the context and resource state to a data mining configuration.
- SOMAR: is a social activity (events and informal goings-on) recommender for mobile devices.
- D-SOMAR: is an evolution of SOMAR which incorporates the configurator in order to provide updated recommendations.
Finally, the experimental validation of the proposed contributions using synthetic and real datasets allows us to achieve the objectives and answer the research questions proposed for this dissertation.