926 results for visual data analysis


Relevance: 90.00%

Abstract:

This dissertation research points out major challenges in current Knowledge Organization (KO) systems, such as subject gateways and web directories: (1) current systems use traditional knowledge organization schemes based on controlled vocabulary, which is not well suited to web resources, and (2) information is organized by professionals rather than by users, so it does not reflect users' intuitively and instantaneously expressed current needs. To explore users' needs, I examined social tags, which are user-generated uncontrolled vocabulary. As investment in professionally developed subject gateways and web directories diminishes (support for both BUBL and Intute, examined in this study, is being discontinued), understanding the characteristics of social tagging becomes even more critical. Several researchers have discussed social tagging behavior and its usefulness for classification or retrieval; however, further qualitative and quantitative investigation is needed to verify its quality and benefit. This research examined the indexing consistency of social tagging in comparison with professional indexing in order to assess the quality and efficacy of tagging. The data analysis was divided into three phases: analysis of indexing consistency, analysis of tagging effectiveness, and analysis of tag attributes. Most indexing consistency studies have been conducted with a small number of professional indexers, have tended to exclude users, and have focused mainly on physical library collections. This research bridged these gaps by (1) extending the scope of resources to various web documents indexed by users and (2) employing the Information Retrieval (IR) Vector Space Model (VSM)-based indexing consistency method, which is suitable for dealing with a large number of indexers. In the second phase, tagging effectiveness was analyzed in terms of tagging exhaustivity and tag specificity, to ameliorate the drawbacks of a consistency analysis based only on quantitative measures of vocabulary matching. Finally, to investigate tagging patterns and behaviors, a content analysis of tag attributes was conducted based on the FRBR model. The findings revealed greater consistency across all subjects among taggers than between the two groups of professionals. Examination of the exhaustivity and specificity of social tags provided insights into particular characteristics of tagging behavior and its variation across subjects. To further investigate the quality of tags, a Latent Semantic Analysis (LSA) was conducted to determine to what extent tags are conceptually related to professionals' keywords; tags of higher specificity tended to have higher semantic relatedness to professionals' keywords, which leads to the conclusion that a term's power as a differentiator is related to its semantic relatedness to documents. The findings on tag attributes identified important bibliographic attributes of tags beyond describing the subjects or topics of a document, and showed that tags have essential attributes matching those defined in FRBR.
Furthermore, in terms of specific subject areas, the findings identified that taggers exhibited different tagging behaviors, with distinctive features and tendencies, on web documents characterized by heterogeneous digital media resources. These results lead to the conclusion that awareness of diverse user needs by subject should be increased in order to improve metadata in practical applications. This dissertation research is the first necessary step toward utilizing social tagging in digital information organization, by verifying its quality and efficacy. It combined quantitative (statistical) and qualitative (content analysis using FRBR) approaches to the vocabulary analysis of tags, providing a more complete examination of tag quality. Through the detailed analysis of tag properties undertaken in this dissertation, we have a clearer understanding of the extent to which social tagging can be used to replace, and in some cases improve upon, professional indexing.
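The VSM-based consistency measure referred to above is not spelled out in this abstract; as a minimal sketch of the general idea, assuming consistency between two indexers is scored as the cosine similarity of their term vectors over a shared vocabulary, the following Python fragment uses hypothetical term lists:

```python
import math
from collections import Counter

def cosine_consistency(terms_a, terms_b):
    """Cosine similarity between two indexers' term vectors;
    repeated terms act as a crude weight."""
    va, vb = Counter(terms_a), Counter(terms_b)
    dot = sum(va[t] * vb[t] for t in set(va) | set(vb))
    norm_a = math.sqrt(sum(w * w for w in va.values()))
    norm_b = math.sqrt(sum(w * w for w in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Hypothetical example: a tagger's tags vs. a professional's descriptors.
tagger = ["visualization", "data", "analysis", "tutorial"]
professional = ["data analysis", "visualization", "statistics"]
print(f"consistency = {cosine_consistency(tagger, professional):.3f}")
```

Unlike pairwise exact-match consistency measures, a vector formulation of this kind scales naturally to many indexers, which is why the abstract highlights its suitability for large numbers of taggers.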

Relevance: 90.00%

Abstract:

In September 2013, staff from the University of the South Pacific (USP) Honiara campus, the Secretariat of the Pacific Community (SPC) and IFREMER (UR LEADNC, AMBIO project) in New Caledonia, and the French Institute for Pacific Coral Reefs (IRCP) in Moorea, French Polynesia, co-facilitated a workshop entitled "Different survey methods of coral reef fish, including the methods based on underwater video". The workshop was attended by students from USP, NGO staff and fisheries officers. They were trained in several underwater visual census techniques and in the STAVIRO video-based technique, covering both fieldwork and data analysis.

Relevance: 90.00%

Abstract:

Data without labels are commonly analysed with unsupervised machine learning techniques. Such techniques can provide representations of a problem that are more meaningful for understanding it than inspection of the data alone. Although abundant expert knowledge exists in many areas where unlabelled data are examined, such knowledge is rarely incorporated into automatic analysis. Incorporating expert knowledge is frequently a matter of combining multiple data sources from disparate hypothetical spaces, and in cases where such spaces involve different data types this task becomes even more challenging. In this paper we present a novel immune-inspired method that enables the fusion of such disparate types of data for a specific set of problems. We show that our method provides a better visual understanding of one hypothetical space with the help of data from another hypothetical space. We believe that our model has implications for the field of exploratory data analysis and knowledge discovery.
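The immune-inspired algorithm itself is not described in this abstract, so the sketch below illustrates only the underlying task with a deliberately simple stand-in: features from two hypothetical spaces describing the same objects are rescaled, concatenated, and projected to 2D so one space can be inspected in light of the other. The data and all method choices (standardization, PCA) are illustrative assumptions, not the paper's method.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Two hypothetical feature spaces describing the same 100 objects,
# e.g. continuous measurements and count-based features.
space_a = rng.normal(size=(100, 5))
space_b = rng.poisson(lam=2.0, size=(100, 20)).astype(float)

# Standardize each space so neither dominates, then fuse by concatenation.
fused = np.hstack([
    StandardScaler().fit_transform(space_a),
    StandardScaler().fit_transform(space_b),
])

# Project the fused representation to 2D for visual exploration.
coords = PCA(n_components=2).fit_transform(fused)
print(coords[:5])  # 2D coordinates, ready for a scatter plot
```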

Relevance: 90.00%

Abstract:

Doctoral dissertation—Universidade de Brasília, Faculdade de Educação, Programa de Pós-graduação em Educação, 2015.

Relevance: 90.00%

Abstract:

Objective: To determine whether the use of colored overlays helps improve reading by eliminating perceptual visual distortions and physical discomfort when reading, symptoms of Irlen syndrome. Materials and Methods: Quasi-experimental study on the effects of the Irlen® Method (use of color) in sixty-one fourth-grade students from urban schools in Cuenca, identified as severe on the Irlen range in a previous prevalence study. Participants were assessed through new observations, interviews, and four tests of the Irlen Reading Perceptual Scale. Measures of central tendency and percentages were used for data analysis. Results: The improvements attributed to the use of color in the considerable range were: 1) 59% comfort; 2) 37.7% less blurring; 3) 41% less strain and fatigue; 4) 45.9% more confidence and fluency when reading; 5) 34.4% less movement on the page; 6) 31.2% elimination of distortions; 7) 13.1% fewer reading errors; 8) 9.8% improvement in limited span; 9) 8.2% in limited attention; and 10) 1.6% improvement in reading comprehension. Conclusion: The use of colored overlays partially helps eliminate some perceptual visual distortions and physical discomfort when reading, which facilitates reading.

Relevance: 90.00%

Abstract:

Master's thesis—Universidade de Brasília, Faculdade de Educação, Programa de Pós-Graduação em Educação, 2016.

Relevance: 90.00%

Abstract:

An overview is given of a user interaction monitoring and analysis framework called BaranC. Monitoring and analysing human-digital interaction is an essential part of developing a user model as the basis for investigating user experience. The primary human-digital interaction, such as on a laptop or smartphone, is best understood and modelled in the wider context of the user and their environment. The BaranC framework provides monitoring and analysis capabilities that not only record all user interaction with a digital device (e.g. a smartphone), but also collect all available context data (such as from sensors in the digital device itself, a fitness band or smart appliances). The data collected by BaranC are recorded as a User Digital Imprint (UDI) which is, in effect, the user model and provides the basis for data analysis. BaranC provides functionality that is useful for user experience studies, user interface design evaluation, and user assistance services. An important concern for personal data is privacy, and the framework gives the user full control over the monitoring, storing and sharing of their data.
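The abstract does not define the UDI's concrete format, so the following is only a minimal sketch of what one monitored-interaction record fused with context data might look like; every field name and the overall structure are hypothetical, not BaranC's actual schema.

```python
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone
import json

@dataclass
class UdiEvent:
    """One hypothetical User Digital Imprint record: a user-interaction
    event joined with whatever context data was available at the time."""
    timestamp: str
    device: str
    interaction: dict                              # what the user did
    context: dict = field(default_factory=dict)    # sensor/context readings
    shared: bool = False                           # user-controlled privacy flag

event = UdiEvent(
    timestamp=datetime.now(timezone.utc).isoformat(),
    device="smartphone",
    interaction={"app": "mail", "action": "open", "duration_s": 42},
    context={"heart_rate_bpm": 71, "ambient_light_lux": 180},
)
print(json.dumps(asdict(event), indent=2))
```

A per-event privacy flag of this kind is one simple way to reflect the user control over storing and sharing that the abstract emphasizes.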

Relevance: 90.00%

Abstract:

Problem. This study approaches the school environment with the aim of advancing the understanding of adolescents' and teachers' imaginaries around the body, corporality, and physical activity (PA), as a relevant element in the design of effective programs and plans to promote the practice of PA. Objective. To analyze the social imaginaries of teachers and adolescents around the concepts of body, corporality, and PA. Methods. Qualitative, descriptive, and interpretive research. Semi-structured interviews were conducted with teachers and with students between 12 and 18 years of age at a public school in Bogotá. Content analysis was performed, and students' results were compared by age group and gender. Results. Teachers and students define the body in terms of biological characteristics, sexual differences, and vital functions. Students' definition of corporality is linked to image and physical appearance; teachers understand it as the possibility of interacting with the environment and as the materialization of existence. Students associate PA with the practice of exercise and sport, while teachers understand it as a self-care practice that allows health to be maintained. Conclusions. To promote PA early on as a vital experience, it is necessary to intervene in school spaces. The body must be linked to formative processes in order to develop bodily autonomy, which implies changes in the curricula.

Relevance: 90.00%

Abstract:

ABSTRACT Objective: To estimate the prevalence of the different ophthalmological diseases that appear in the context of an autoimmune disease (AID) in patients of a rheumatology referral center in Colombia, according to clinical and sociodemographic characteristics, over a 15-year period from 2000 to 2015. Methods: A descriptive, observational prevalence study was conducted. Sampling was stratified random with proportional allocation, performed in Epidat 3.4. Data were analyzed in SPSS v22.0; univariate analysis was performed for categorical variables, and measures of central tendency were computed for quantitative variables. Results: Of 1,640 medical records reviewed, 634 patients (38.65%) had ocular involvement. Excluding patients with Sjögren's syndrome (SS), who by definition present dry eye, 222 patients (13.53%) had ophthalmological involvement. Of all patients, 83.3% were women. Rheumatoid arthritis (RA) was the autoimmune disease with the greatest ophthalmological involvement, with 138 patients (62.2%), and sarcoidosis the least, with a single affected patient. Keratoconjunctivitis sicca (KCS) was the most common manifestation across all AID diagnostic groups, with 146 patients (63.5%). Of 414 patients with SS and KCS, 8 had additional ocular involvement, uveitis being the second ocular pathology associated with SS and the leading cause in the spondyloarthropathies (71.4%). Patients with cataract (4.1%) had the highest prevalence of corticosteroid use (88.8%). Of 222 patients, 28 (12.6%) presented uveitis. Of all patients, 16 (7.2%) presented antimalarial maculopathy, as did 6 (18.75%) of the patients with SLE. ANAs were present in 100% of patients with retinal vascular disorder. Patients with episcleritis had the highest proportion of positive anti-DNA antibodies. The AID that most frequently presented episcleritis was SLE, with 4 patients (12.5%). Of the patients with anti-RNP antibodies, 22% presented scleritis, and 32.1% of the patients with uveitis were HLA-B27 positive. Ophthalmological manifestations preceded systemic ones in 11.1% to 33.3% of patients. Conclusion: Ocular diseases occur frequently in Colombian patients with AID (38.65%), RA being the disease with the greatest ocular involvement (62.2%) and KCS the most prevalent ocular disease across all AIDs (63.5%). Uveitis occurred in 28 patients (12.6%). Ophthalmological manifestations may precede systemic ones. Ophthalmological examination should be included for patients with AID, since ocular disease is a frequent comorbidity. Additionally, the ophthalmological effects of the systemic medications used in AID should be closely monitored during the course of treatment.
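As background on the sampling design mentioned in Methods: proportional allocation assigns each stratum a share of the sample proportional to its population size, n_h = n * N_h / N. The study's own allocation was computed in Epidat 3.4; the sketch below merely restates the rule in Python with hypothetical strata.

```python
# Proportional allocation for stratified random sampling:
# each stratum h contributes n_h = round(n * N_h / N) sampled records.

def proportional_allocation(strata_sizes, total_sample):
    total_population = sum(strata_sizes.values())
    return {
        stratum: round(total_sample * size / total_population)
        for stratum, size in strata_sizes.items()
    }

# Hypothetical strata (e.g. diagnostic groups) and a target sample of 300.
strata = {"RA": 800, "SLE": 400, "SS": 300, "other": 140}
print(proportional_allocation(strata, total_sample=300))
# {'RA': 146, 'SLE': 73, 'SS': 55, 'other': 26}
```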

Relevance: 90.00%

Abstract:

Big data are reshaping the way we interact with technology, fostering new applications that strengthen the safety assessment of foods. An extraordinary amount of information is analysed using machine learning approaches aimed at detecting the existence, or predicting the likelihood, of future risks. Food business operators have to share the results of these analyses when applying to place regulated products on the market, while agri-food safety agencies (including the European Food Safety Authority) are exploring new avenues to increase the accuracy of their evaluations by processing Big data. Such an informational endowment brings with it opportunities and risks connected to the extraction of meaningful inferences from data. However, conflicting interests and tensions among the involved entities - the industry, food safety agencies, and consumers - hinder the adoption of shared methods to steer the processing of Big data in a sound, transparent and trustworthy way. A recent reform of the EU sectoral legislation, the lack of trust, and the presence of a considerable number of stakeholders highlight the need for ethical contributions aimed at steering the development and deployment of Big data applications. Moreover, the Artificial Intelligence guidelines and charters published by European Union institutions and Member States have to be discussed in light of applied contexts, including the one at stake here. This thesis aims to contribute to these goals by discussing which principles should be put forward when processing Big data in the context of agri-food safety risk assessment. The research focuses on two intertwined topics - data ownership and data governance - evaluating how the regulatory framework addresses the challenges raised by Big data analysis in these domains. The outcome of the project is a tentative Roadmap that identifies the principles to be observed when processing Big data in this domain and their possible implementations.

Relevance: 90.00%

Abstract:

The world of Computational Biology and Bioinformatics presently integrates many different areas of expertise, including computer science and electronic engineering. A major aim of Data Science is the development and tuning of specific computational approaches to interpret the complexity of Biology. Molecular biologists and medical doctors rely heavily on interdisciplinary experts who understand the biological background and can apply algorithms to find optimal solutions to their problems. With this problem-solving orientation, I was involved in two basic research fields: Cancer Genomics and Enzyme Proteomics. What I developed and implemented can therefore be considered a general effort to support data analysis in both Cancer Genomics and Enzyme Proteomics, focusing on enzymes, which catalyse all the biochemical reactions in cells. Specifically, in Cancer Genomics I contributed to the characterization of the intratumoral immune microenvironment in gastrointestinal stromal tumours (GISTs), correlating immune cell population levels with tumour subtypes. I was also involved in setting up strategies for the evaluation and standardization of different approaches for fusion transcript detection in sarcomas that can be applied in routine diagnostics; this was part of a coordinated effort of the Sarcoma working group of "Alleanza Contro il Cancro". In Enzyme Proteomics, I generated a derived database collecting all the human proteins and enzymes known to be associated with genetic disease. I curated data collection from freely available databases such as PDB, UniProt, Humsavar and ClinVar, and I was responsible for searching, updating and handling the information content, and for computing statistics. I also developed a web server, BENZ, which allows researchers to annotate an enzyme sequence with the corresponding Enzyme Commission number, the feature that fully describes the catalysed reaction. Moreover, I contributed substantially to the characterization of the enzyme-genetic disease association, towards a better classification of metabolic genetic diseases.
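BENZ assigns EC numbers through its own pipeline; as a much simpler illustration of working with EC annotations, the sketch below pulls one reviewed entry from UniProt's public REST API and reads any EC numbers attached to its recommended name. The JSON path used here reflects my understanding of the API's current layout and should be treated as an assumption to verify against the live response.

```python
import requests

def fetch_ec_numbers(accession):
    """Return EC numbers listed for a UniProtKB entry's recommended name.

    Assumed rest.uniprot.org JSON layout:
    proteinDescription -> recommendedName -> ecNumbers -> [{"value": ...}]
    """
    url = f"https://rest.uniprot.org/uniprotkb/{accession}.json"
    entry = requests.get(url, timeout=30).json()
    recommended = entry.get("proteinDescription", {}).get("recommendedName", {})
    return [ec["value"] for ec in recommended.get("ecNumbers", [])]

# P00698 is hen egg-white lysozyme, a classic enzyme entry (EC 3.2.1.17).
print(fetch_ec_numbers("P00698"))
```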

Relevance: 90.00%

Abstract:

Model misspecification affects the classical test statistics used to assess the fit of Item Response Theory (IRT) models. Robust tests have been derived under model misspecification, such as the Generalized Lagrange Multiplier and Hausman tests, but their use has not been widely explored in the IRT framework. In the first part of the thesis, we introduce the Generalized Lagrange Multiplier test to detect differential item response functioning in IRT models for binary data under model misspecification. By means of a simulation study and a real data analysis, we compare its performance with the classical Lagrange Multiplier test, computed using the Hessian and the cross-product matrix, and with the Generalized Jackknife Score test. The power of these tests is computed both empirically and asymptotically. The misspecifications considered are local dependence among items and a non-normal distribution of the latent variable. The results highlight that under mild model misspecification all tests perform well, while under strong model misspecification their performance deteriorates; none of the tests considered shows overall superior performance to the others. In the second part of the thesis, we extend the Generalized Hausman test to detect non-normality of the latent variable distribution. To build the test, we consider a semi-nonparametric IRT model, which assumes a more flexible latent variable distribution. By means of a simulation study and two real applications, we compare the performance of the Generalized Hausman test with the M2 limited-information goodness-of-fit test and the Likelihood-Ratio test; the information criteria are also computed. The Generalized Hausman test performs better than the Likelihood-Ratio test in terms of Type I error rates and better than the M2 test in terms of power, although the performance of the Generalized Hausman test and the information criteria deteriorates when the sample size is small and there are few items.
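For context on the Hausman idea underlying the Generalized Hausman test: the classical statistic contrasts two estimators of the same parameter vector, one efficient under correct specification and one consistent under misspecification, H = (b1 - b0)' (V1 - V0)^{-1} (b1 - b0), referred to a chi-square distribution with as many degrees of freedom as parameters. The sketch below computes the classical statistic with made-up estimates; it is not the thesis's generalized version.

```python
import numpy as np
from scipy import stats

def hausman_statistic(b_robust, b_efficient, v_robust, v_efficient):
    """Classical Hausman test: H = d' (V1 - V0)^{-1} d with d = b1 - b0.

    b_efficient / v_efficient: estimator efficient under the null model;
    b_robust / v_robust: estimator consistent under misspecification.
    """
    d = b_robust - b_efficient
    h = float(d @ np.linalg.solve(v_robust - v_efficient, d))
    dof = d.size
    return h, dof, stats.chi2.sf(h, dof)  # statistic, df, p-value

# Made-up item-parameter estimates from two hypothetical fits.
b1 = np.array([1.10, -0.42, 0.75])     # robust estimator
b0 = np.array([1.00, -0.50, 0.80])     # efficient estimator
V1 = np.diag([0.020, 0.025, 0.030])    # covariance of the robust estimator
V0 = np.diag([0.015, 0.020, 0.025])    # covariance of the efficient estimator
print(hausman_statistic(b1, b0, V1, V0))
```

A large statistic (small p-value) signals that the two estimators diverge more than sampling error allows, i.e. evidence of misspecification.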

Relevance: 90.00%

Abstract:

In this thesis, we investigate the role of applied physics in epidemiological surveillance through the application of mathematical models, network science and machine learning. The spread of a communicable disease depends on many biological, social, and health factors. The large masses of data available make it possible, on the one hand, to monitor the evolution and spread of pathogenic organisms; on the other hand, to study the behavior of people, their opinions and habits. Presented here are three lines of research in which an attempt was made to solve real epidemiological problems through data analysis and the use of statistical and mathematical models. In Chapter 1, we applied language-inspired Deep Learning models to transform influenza protein sequences into vectors encoding their information content. We then attempted to reconstruct the antigenic properties of different viral strains using regression models and to identify the mutations responsible for vaccine escape. In Chapter 2, we constructed a compartmental model to describe the spread of a bacterium within a hospital ward. The model was informed and validated on time series of clinical measurements, and a sensitivity analysis was used to assess the impact of different control measures. Finally (Chapter 3) we reconstructed the network of retweets among COVID-19 themed Twitter users in the early months of the SARS-CoV-2 pandemic. By means of community detection algorithms and centrality measures, we characterized users’ attention shifts in the network, showing that scientific communities, initially the most retweeted, lost influence over time to national political communities. In the Conclusion, we highlighted the importance of the work done in light of the main contemporary challenges for epidemiological surveillance. In particular, we present reflections on the importance of nowcasting and forecasting, the relationship between data and scientific research, and the need to unite the different scales of epidemiological surveillance.
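The compartments of the hospital transmission model in Chapter 2 are not given in this abstract; as generic background on compartmental modelling, the sketch below integrates a standard SIR model with scipy, using purely illustrative parameter values.

```python
import numpy as np
from scipy.integrate import solve_ivp

def sir(t, y, beta, gamma):
    """Generic SIR compartmental model (fractions of the population)."""
    s, i, r = y
    return [-beta * s * i, beta * s * i - gamma * i, gamma * i]

beta, gamma = 0.35, 0.10          # illustrative transmission/recovery rates
y0 = [0.99, 0.01, 0.0]            # initial S, I, R fractions
sol = solve_ivp(sir, (0, 160), y0, args=(beta, gamma),
                t_eval=np.linspace(0, 160, 9))
for t, i in zip(sol.t, sol.y[1]):
    print(f"day {t:5.1f}: infected fraction = {i:.3f}")
```

Informing such a model with clinical time series, as the thesis describes, amounts to fitting parameters like beta and gamma to observed counts and then probing them with a sensitivity analysis.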

Relevance: 90.00%

Abstract:

Artificial Intelligence (AI) and Machine Learning (ML) are novel data analysis techniques that provide very accurate prediction results. They are widely adopted in a variety of industries to improve efficiency and decision-making, and they are also being used to develop intelligent systems. Their success rests on complex mathematical models whose decisions and rationale are usually difficult for human users to comprehend, to the point of being dubbed black boxes. This is particularly relevant in sensitive and highly regulated domains. To mitigate and possibly solve this issue, the Explainable AI (XAI) field has become prominent in recent years. XAI consists of models and techniques that enable understanding of the intricate patterns discovered by black-box models. In this thesis, we consider model-agnostic XAI techniques applicable to tabular data, with a particular focus on the Credit Scoring domain. Special attention is dedicated to the LIME framework, for which we propose several modifications to the vanilla algorithm, in particular a pair of complementary Stability Indices that accurately measure LIME stability, and the OptiLIME policy, which helps the practitioner find the proper balance between explanations' stability and reliability. We subsequently put forward GLEAMS, a model-agnostic surrogate interpretable model that needs to be trained only once while providing both local and global explanations of the black-box model. GLEAMS produces feature attributions and what-if scenarios, from both the dataset and the model perspective. Finally, we argue that synthetic data are an emerging trend in AI, increasingly used in place of original data to train complex models. To explain the outcomes of such models, we must guarantee that synthetic data are reliable enough for their explanations to translate to real-world individuals. To this end we propose DAISYnt, a suite of tests to measure the quality and privacy of synthetic tabular data.
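The thesis's Stability Indices themselves are not reproduced here; the sketch below only illustrates the instability they address, by running the standard lime library twice on the same instance and comparing the returned feature weights. It assumes lime and scikit-learn are installed; the data and model are toy stand-ins.

```python
import numpy as np
from lime.lime_tabular import LimeTabularExplainer
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Toy tabular setup: a black-box model on synthetic data.
X, y = make_classification(n_samples=500, n_features=6, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X, y)

explainer = LimeTabularExplainer(
    X, mode="classification",
    feature_names=[f"x{i}" for i in range(X.shape[1])],
)

# LIME samples randomly around the instance, so two runs on the same
# point can yield different feature weights -- the instability that
# stability indices are designed to quantify.
instance = X[0]
run1 = dict(explainer.explain_instance(instance, model.predict_proba,
                                       num_features=6).as_list())
run2 = dict(explainer.explain_instance(instance, model.predict_proba,
                                       num_features=6).as_list())
for feat, weight in run1.items():
    print(f"{feat}: {weight:+.3f} vs {run2.get(feat, float('nan')):+.3f}")
```

Note that even the discretized feature descriptions returned by as_list() can differ between runs, which is itself a visible symptom of the instability.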

Relevance: 90.00%

Abstract:

The abundance of visual data and the push for robust AI are driving the need for automated visual sensemaking. Computer Vision (CV) faces growing demand for models that can discern not only what images "represent," but also what they "evoke." This is a demand for tools mimicking human perception at a high semantic level, categorizing images based on concepts like freedom, danger, or safety. However, automating this process is challenging due to entropy, scarcity, subjectivity, and ethical considerations. These challenges not only impact performance but also underscore the critical need for interpretability. This dissertation focuses on abstract-concept-based (AC) image classification, guided by three technical principles: situated grounding, performance enhancement, and interpretability. We introduce ART-stract, a novel dataset of cultural images annotated with ACs, which serves as the foundation for a series of experiments across four key domains: assessing the effectiveness of the end-to-end DL paradigm, exploring cognition-inspired semantic intermediaries, incorporating cultural and commonsense aspects, and neuro-symbolic integration of sensory-perceptual data with cognition-based knowledge. Our results demonstrate that integrating CV approaches with semantic technologies yields methods that surpass the current state of the art in AC image classification, outperforming the end-to-end deep vision paradigm. The results emphasize the role semantic technologies can play in developing systems that are both effective and interpretable, through capturing, situating, and reasoning over knowledge related to visual data. Furthermore, this dissertation explores the complex interplay between technical and socio-technical factors. By merging technical expertise with an understanding of human and societal aspects, we advocate for responsible labeling and training practices in visual media. These insights and techniques not only advance efforts in CV and explainable artificial intelligence but also propel us toward an era of AI development that harmonizes technical prowess with a deep awareness of its human and societal implications.
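As a point of reference for the "end-to-end DL paradigm" the dissertation benchmarks against, the sketch below fine-tunes a standard torchvision backbone to map images directly to abstract-concept labels. The concept list is a hypothetical stand-in (not the ART-stract label set), and the tensors are dummies in place of a real image loader.

```python
import torch
import torch.nn as nn
from torchvision import models

# Hypothetical abstract-concept label set (stand-in for ART-stract labels).
CONCEPTS = ["comfort", "danger", "death", "fitness", "freedom", "power", "safety"]

# End-to-end baseline: a pretrained ResNet with a new classification head.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.fc = nn.Linear(model.fc.in_features, len(CONCEPTS))

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

# One dummy training step on random tensors standing in for an image batch.
images = torch.randn(8, 3, 224, 224)
labels = torch.randint(0, len(CONCEPTS), (8,))
optimizer.zero_grad()
loss = criterion(model(images), labels)
loss.backward()
optimizer.step()
print(f"dummy step loss: {loss.item():.3f}")
```

A purely pixel-to-label pipeline like this has no access to the cultural, commonsense, or symbolic knowledge the dissertation argues is needed, which is precisely the gap its semantic and neuro-symbolic approaches address.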