840 resultados para Object-oriented methods (Computer science)
Resumo:
Changing or creating an organisation means creating a new process. Each process involves many risks that need to be identified and managed. The main risks considered here are procedural and legal risks. The former are related to the risks of errors that may occur during processes, while the latter are related to the compliance of processes with regulations. Managing the risks implies proposing changes to the processes that allow the desired result: an optimised process. In order to manage a company and optimise it in the best possible way, not only should the organisational aspect, risk management and legal compliance be taken into account, but it is important that they are all analysed simultaneously with the aim of finding the right balance that satisfies them all. This is the aim of this thesis, to provide methods and tools to balance these three characteristics, and to enable this type of optimisation, ICT support is used. This work isn’t a thesis in computer science or law, but rather an interdisciplinary thesis. Most of the work done so far is vertical and in a specific domain. The particularity and aim of this thesis is not to carry out an in-depth analysis of a particular aspect, but rather to combine several important aspects, normally analysed separately, which however have an impact and influence each other. In order to carry out this kind of interdisciplinary analysis, the knowledge base of both areas was involved and the combination and collaboration of different experts in the various fields was necessary. Although the methodology described is generic and can be applied to all sectors, the case study considered is a new type of healthcare service that allows patients in acute disease to be hospitalised to their home. This provide the possibility to perform experiments using real hospital database.
Resumo:
The world of Computational Biology and Bioinformatics presently integrates many different expertise, including computer science and electronic engineering. A major aim in Data Science is the development and tuning of specific computational approaches to interpret the complexity of Biology. Molecular biologists and medical doctors heavily rely on an interdisciplinary expert capable of understanding the biological background to apply algorithms for finding optimal solutions to their problems. With this problem-solving orientation, I was involved in two basic research fields: Cancer Genomics and Enzyme Proteomics. For this reason, what I developed and implemented can be considered a general effort to help data analysis both in Cancer Genomics and in Enzyme Proteomics, focusing on enzymes which catalyse all the biochemical reactions in cells. Specifically, as to Cancer Genomics I contributed to the characterization of intratumoral immune microenvironment in gastrointestinal stromal tumours (GISTs) correlating immune cell population levels with tumour subtypes. I was involved in the setup of strategies for the evaluation and standardization of different approaches for fusion transcript detection in sarcomas that can be applied in routine diagnostic. This was part of a coordinated effort of the Sarcoma working group of "Alleanza Contro il Cancro". As to Enzyme Proteomics, I generated a derived database collecting all the human proteins and enzymes which are known to be associated to genetic disease. I curated the data search in freely available databases such as PDB, UniProt, Humsavar, Clinvar and I was responsible of searching, updating, and handling the information content, and computing statistics. I also developed a web server, BENZ, which allows researchers to annotate an enzyme sequence with the corresponding Enzyme Commission number, the important feature fully describing the catalysed reaction. More to this, I greatly contributed to the characterization of the enzyme-genetic disease association, for a better classification of the metabolic genetic diseases.
Resumo:
In the framework of industrial problems, the application of Constrained Optimization is known to have overall very good modeling capability and performance and stands as one of the most powerful, explored, and exploited tool to address prescriptive tasks. The number of applications is huge, ranging from logistics to transportation, packing, production, telecommunication, scheduling, and much more. The main reason behind this success is to be found in the remarkable effort put in the last decades by the OR community to develop realistic models and devise exact or approximate methods to solve the largest variety of constrained or combinatorial optimization problems, together with the spread of computational power and easily accessible OR software and resources. On the other hand, the technological advancements lead to a data wealth never seen before and increasingly push towards methods able to extract useful knowledge from them; among the data-driven methods, Machine Learning techniques appear to be one of the most promising, thanks to its successes in domains like Image Recognition, Natural Language Processes and playing games, but also the amount of research involved. The purpose of the present research is to study how Machine Learning and Constrained Optimization can be used together to achieve systems able to leverage the strengths of both methods: this would open the way to exploiting decades of research on resolution techniques for COPs and constructing models able to adapt and learn from available data. In the first part of this work, we survey the existing techniques and classify them according to the type, method, or scope of the integration; subsequently, we introduce a novel and general algorithm devised to inject knowledge into learning models through constraints, Moving Target. In the last part of the thesis, two applications stemming from real-world projects and done in collaboration with Optit will be presented.
Resumo:
Following the approval of the 2030 Agenda for Sustainable Development in 2015, sustainability became a hotly debated topic. In order to build a better and more sustainable future by 2030, this agenda addressed several global issues, including inequality, climate change, peace, and justice, in the form of 17 Sustainable Development Goals (SDGs), that should be understood and pursued by nations, corporations, institutions, and individuals. In this thesis, we researched how to exploit and integrate Human-Computer Interaction (HCI) and Data Visualization to promote knowledge and awareness about SDG 8, which wants to encourage lasting, inclusive, and sustainable economic growth, full and productive employment, and decent work for all. In particular, we focused on three targets: green economy, sustainable tourism, employment, decent work for all, and social protection. The primary goal of this research is to determine whether HCI approaches may be used to create and validate interactive data visualization that can serve as helpful decision-making aids for specific groups and raise their knowledge of public-interest issues. To accomplish this goal, we analyzed four case studies. In the first two, we wanted to promote knowledge and awareness about green economy issues: we investigated the Human-Building Interaction inside a Smart Campus and the dematerialization process inside a University. In the third, we focused on smart tourism, investigating the relationship between locals and tourists to create meaningful connections and promote more sustainable tourism. In the fourth, we explored the industry context to highlight sustainability policies inside well-known companies. This research focuses on the hypothesis that interactive data visualization tools can make communities aware of sustainability aspects related to SDG8 and its targets. The research questions addressed are two: "how to promote awareness about SDG8 and its targets through interactive data visualizations?" and "to what extent are these interactive data visualizations effective?".
Resumo:
Deep Neural Networks (DNNs) have revolutionized a wide range of applications beyond traditional machine learning and artificial intelligence fields, e.g., computer vision, healthcare, natural language processing and others. At the same time, edge devices have become central in our society, generating an unprecedented amount of data which could be used to train data-hungry models such as DNNs. However, the potentially sensitive or confidential nature of gathered data poses privacy concerns when storing and processing them in centralized locations. To this purpose, decentralized learning decouples model training from the need of directly accessing raw data, by alternating on-device training and periodic communications. The ability of distilling knowledge from decentralized data, however, comes at the cost of facing more challenging learning settings, such as coping with heterogeneous hardware and network connectivity, statistical diversity of data, and ensuring verifiable privacy guarantees. This Thesis proposes an extensive overview of decentralized learning literature, including a novel taxonomy and a detailed description of the most relevant system-level contributions in the related literature for privacy, communication efficiency, data and system heterogeneity, and poisoning defense. Next, this Thesis presents the design of an original solution to tackle communication efficiency and system heterogeneity, and empirically evaluates it on federated settings. For communication efficiency, an original method, specifically designed for Convolutional Neural Networks, is also described and evaluated against the state-of-the-art. Furthermore, this Thesis provides an in-depth review of recently proposed methods to tackle the performance degradation introduced by data heterogeneity, followed by empirical evaluations on challenging data distributions, highlighting strengths and possible weaknesses of the considered solutions. Finally, this Thesis presents a novel perspective on the usage of Knowledge Distillation as a mean for optimizing decentralized learning systems in settings characterized by data heterogeneity or system heterogeneity. Our vision on relevant future research directions close the manuscript.
Resumo:
This thesis reports on the two main areas of our research: introductory programming as the traditional way of accessing informatics and cultural teaching informatics through unconventional pathways. The research on introductory programming aims to overcome challenges in traditional programming education, thus increasing participation in informatics. Improving access to informatics enables individuals to pursue more and better professional opportunities and contribute to informatics advancements. We aimed to balance active, student-centered activities and provide optimal support to novices at their level. Inspired by Productive Failure and exploring the concept of notional machine, our work focused on developing Necessity Learning Design, a design to help novices tackle new programming concepts. Using this design, we implemented a learning sequence to introduce arrays and evaluated it in a real high-school context. The subsequent chapters discuss our experiences teaching CS1 in a remote-only scenario during the COVID-19 pandemic and our collaborative effort with primary school teachers to develop a learning module for teaching iteration using a visual programming environment. The research on teaching informatics principles through unconventional pathways, such as cryptography, aims to introduce informatics to a broader audience, particularly younger individuals that are less technical and professional-oriented. It emphasizes the importance of understanding informatics's cultural and scientific aspects to focus on the informatics societal value and its principles for active citizenship. After reflecting on computational thinking and inspired by the big ideas of science and informatics, we describe our hands-on approach to teaching cryptography in high school, which leverages its key scientific elements to emphasize its social aspects. Additionally, we present an activity for teaching public-key cryptography using graphs to explore fundamental concepts and methods in informatics and mathematics and their interdisciplinarity. In broadening the understanding of informatics, these research initiatives also aim to foster motivation and prime for more professional learning of informatics.
Resumo:
Thanks to recent advances in molecular biology, allied to an ever increasing amount of experimental data, the functional state of thousands of genes can now be extracted simultaneously by using methods such as cDNA microarrays and RNA-Seq. Particularly important related investigations are the modeling and identification of gene regulatory networks from expression data sets. Such a knowledge is fundamental for many applications, such as disease treatment, therapeutic intervention strategies and drugs design, as well as for planning high-throughput new experiments. Methods have been developed for gene networks modeling and identification from expression profiles. However, an important open problem regards how to validate such approaches and its results. This work presents an objective approach for validation of gene network modeling and identification which comprises the following three main aspects: (1) Artificial Gene Networks (AGNs) model generation through theoretical models of complex networks, which is used to simulate temporal expression data; (2) a computational method for gene network identification from the simulated data, which is founded on a feature selection approach where a target gene is fixed and the expression profile is observed for all other genes in order to identify a relevant subset of predictors; and (3) validation of the identified AGN-based network through comparison with the original network. The proposed framework allows several types of AGNs to be generated and used in order to simulate temporal expression data. The results of the network identification method can then be compared to the original network in order to estimate its properties and accuracy. Some of the most important theoretical models of complex networks have been assessed: the uniformly-random Erdos-Renyi (ER), the small-world Watts-Strogatz (WS), the scale-free Barabasi-Albert (BA), and geographical networks (GG). The experimental results indicate that the inference method was sensitive to average degree k variation, decreasing its network recovery rate with the increase of k. The signal size was important for the inference method to get better accuracy in the network identification rate, presenting very good results with small expression profiles. However, the adopted inference method was not sensible to recognize distinct structures of interaction among genes, presenting a similar behavior when applied to different network topologies. In summary, the proposed framework, though simple, was adequate for the validation of the inferred networks by identifying some properties of the evaluated method, which can be extended to other inference methods.
Resumo:
Objective: We carry out a systematic assessment on a suite of kernel-based learning machines while coping with the task of epilepsy diagnosis through automatic electroencephalogram (EEG) signal classification. Methods and materials: The kernel machines investigated include the standard support vector machine (SVM), the least squares SVM, the Lagrangian SVM, the smooth SVM, the proximal SVM, and the relevance vector machine. An extensive series of experiments was conducted on publicly available data, whose clinical EEG recordings were obtained from five normal subjects and five epileptic patients. The performance levels delivered by the different kernel machines are contrasted in terms of the criteria of predictive accuracy, sensitivity to the kernel function/parameter value, and sensitivity to the type of features extracted from the signal. For this purpose, 26 values for the kernel parameter (radius) of two well-known kernel functions (namely. Gaussian and exponential radial basis functions) were considered as well as 21 types of features extracted from the EEG signal, including statistical values derived from the discrete wavelet transform, Lyapunov exponents, and combinations thereof. Results: We first quantitatively assess the impact of the choice of the wavelet basis on the quality of the features extracted. Four wavelet basis functions were considered in this study. Then, we provide the average accuracy (i.e., cross-validation error) values delivered by 252 kernel machine configurations; in particular, 40%/35% of the best-calibrated models of the standard and least squares SVMs reached 100% accuracy rate for the two kernel functions considered. Moreover, we show the sensitivity profiles exhibited by a large sample of the configurations whereby one can visually inspect their levels of sensitiveness to the type of feature and to the kernel function/parameter value. Conclusions: Overall, the results evidence that all kernel machines are competitive in terms of accuracy, with the standard and least squares SVMs prevailing more consistently. Moreover, the choice of the kernel function and parameter value as well as the choice of the feature extractor are critical decisions to be taken, albeit the choice of the wavelet family seems not to be so relevant. Also, the statistical values calculated over the Lyapunov exponents were good sources of signal representation, but not as informative as their wavelet counterparts. Finally, a typical sensitivity profile has emerged among all types of machines, involving some regions of stability separated by zones of sharp variation, with some kernel parameter values clearly associated with better accuracy rates (zones of optimality). (C) 2011 Elsevier B.V. All rights reserved.
Resumo:
Today several different unsupervised classification algorithms are commonly used to cluster similar patterns in a data set based only on its statistical properties. Specially in image data applications, self-organizing methods for unsupervised classification have been successfully applied for clustering pixels or group of pixels in order to perform segmentation tasks. The first important contribution of this paper refers to the development of a self-organizing method for data classification, named Enhanced Independent Component Analysis Mixture Model (EICAMM), which was built by proposing some modifications in the Independent Component Analysis Mixture Model (ICAMM). Such improvements were proposed by considering some of the model limitations as well as by analyzing how it should be improved in order to become more efficient. Moreover, a pre-processing methodology was also proposed, which is based on combining the Sparse Code Shrinkage (SCS) for image denoising and the Sobel edge detector. In the experiments of this work, the EICAMM and other self-organizing models were applied for segmenting images in their original and pre-processed versions. A comparative analysis showed satisfactory and competitive image segmentation results obtained by the proposals presented herein. (C) 2008 Published by Elsevier B.V.
Resumo:
Introduction: Internet users are increasingly using the worldwide web to search for information relating to their health. This situation makes it necessary to create specialized tools capable of supporting users in their searches. Objective: To apply and compare strategies that were developed to investigate the use of the Portuguese version of Medical Subject Headings (MeSH) for constructing an automated classifier for Brazilian Portuguese-language web-based content within or outside of the field of healthcare, focusing on the lay public. Methods: 3658 Brazilian web pages were used to train the classifier and 606 Brazilian web pages were used to validate it. The strategies proposed were constructed using content-based vector methods for text classification, such that Naive Bayes was used for the task of classifying vector patterns with characteristics obtained through the proposed strategies. Results: A strategy named InDeCS was developed specifically to adapt MeSH for the problem that was put forward. This approach achieved better accuracy for this pattern classification task (0.94 sensitivity, specificity and area under the ROC curve). Conclusions: Because of the significant results achieved by InDeCS, this tool has been successfully applied to the Brazilian healthcare search portal known as Busca Saude. Furthermore, it could be shown that MeSH presents important results when used for the task of classifying web-based content focusing on the lay public. It was also possible to show from this study that MeSH was able to map out mutable non-deterministic characteristics of the web. (c) 2010 Elsevier Inc. All rights reserved.
Resumo:
Diabetes mellitus (DM) is a disease that affects a large number of people, and the number of problems associated with the disease has been increasing in the past few decades. These problems include cardiovascular disorders, blindness and the eventual need to amputate limbs. Therefore, the quality of life for people living with DM is less than it is for healthy people. In several cases, metabolic syndrome (MS), which can be considered a disturbance of the lipid metabolism, is associated with DM. In this work, two drugs used to treat DM, pioglitazone and rosiglitazone, were studied using theoretical methods, and their molecular properties were related to the biological activity of these drugs. From the results, it was possible to correlate the properties of each substance-particularly electronic properties-with the biological interactions that are linked to their pharmacological effects. These results suggest that there are future prospects for designing or developing new drugs based on the correlation between theoretical and experimental properties.
Resumo:
A study on the possible sites of oxidation and epoxidation of nortriptyline was performed using electrochemical and quantum chemical methods; these sites are involved in the biological responses (for example, hepatotoxicity) of nortriptyline and other similar antidepressants. Quantum chemical studies and electrochemical experiments demonstrated that the oxidation and epoxidation sites are located on the apolar region of nortriptyline, which will useful for understanding the molecule`s activity. Also, for the determination of the compound in biological fluids or in pharmaceutical formulations, we propose a useful analytical methodology using a graphite-polyurethane composite electrode, which exhibited the best performance when compared with boron-doped diamond or glassy carbon surfaces.
Resumo:
We show that commutative group spherical codes in R(n), as introduced by D. Slepian, are directly related to flat tori and quotients of lattices. As consequence of this view, we derive new results on the geometry of these codes and an upper bound for their cardinality in terms of minimum distance and the maximum center density of lattices and general spherical packings in the half dimension of the code. This bound is tight in the sense it can be arbitrarily approached in any dimension. Examples of this approach and a comparison of this bound with Union and Rankin bounds for general spherical codes is also presented.
Resumo:
Motivation: Understanding the patterns of association between polymorphisms at different loci in a population ( linkage disequilibrium, LD) is of fundamental importance in various genetic studies. Many coefficients were proposed for measuring the degree of LD, but they provide only a static view of the current LD structure. Generative models (GMs) were proposed to go beyond these measures, giving not only a description of the actual LD structure but also a tool to help understanding the process that generated such structure. GMs based in coalescent theory have been the most appealing because they link LD to evolutionary factors. Nevertheless, the inference and parameter estimation of such models is still computationally challenging. Results: We present a more practical method to build GM that describe LD. The method is based on learning weighted Bayesian network structures from haplotype data, extracting equivalence structure classes and using them to model LD. The results obtained in public data from the HapMap database showed that the method is a promising tool for modeling LD. The associations represented by the learned models are correlated with the traditional measure of LD D`. The method was able to represent LD blocks found by standard tools. The granularity of the association blocks and the readability of the models can be controlled in the method. The results suggest that the causality information gained by our method can be useful to tell about the conservability of the genetic markers and to guide the selection of subset of representative markers.
Resumo:
This paper presents an approach for the active transmission losses allocation between the agents of the system. The approach uses the primal and dual variable information of the Optimal Power Flow in the losses allocation strategy. The allocation coefficients are determined via Lagrange multipliers. The paper emphasizes the necessity to consider the operational constraints and parameters of the systems in the problem solution. An example, for a 3-bus system is presented in details, as well as a comparative test with the main allocation methods. Case studies on the IEEE 14-bus systems are carried out to verify the influence of the constraints and parameters of the system in the losses allocation.