901 results for Learning techniques
Abstract:
This dissertation investigates the connection between spectral analysis and frame theory. When considering the spectral properties of a frame, we present a few novel results relating to the spectral decomposition. We first show that scalable frames have the property that the inner product of the scaling coefficients and the eigenvectors must equal the inverse eigenvalues. From this, we prove a similar result when an approximate scaling is obtained. We then focus on the optimization problems inherent to the scalable frames by first showing that there is an equivalence between scaling a frame and optimization problems with a non-restrictive objective function. Various objective functions are considered, and an analysis of the solution type is presented. For linear objectives, we can encourage sparse scalings, and with barrier objective functions, we force dense solutions. We further consider frames in high dimensions, and derive various solution techniques. From here, we restrict ourselves to various frame classes, to add more specificity to the results. Using frames generated from distributions allows for the placement of probabilistic bounds on scalability. For discrete distributions (Bernoulli and Rademacher), we bound the probability of encountering an ONB, and for continuous symmetric distributions (Uniform and Gaussian), we show that symmetry is retained in the transformed domain. We also prove several hyperplane-separation results. With the theory developed, we discuss graph applications of the scalability framework. We make a connection with graph conditioning, and show the infeasibility of the problem in the general case. After a modification, we show that any complete graph can be conditioned. We then present a modification of standard PCA (robust PCA) developed by Candès, and give some background into Electron Energy-Loss Spectroscopy (EELS).
We design a novel scheme for the processing of EELS through robust PCA and least-squares regression, and test this scheme on biological samples. Finally, we take the idea of robust PCA and apply the technique of kernel PCA to perform robust manifold learning. We derive the problem and present an algorithm for its solution. There is also discussion of the differences with RPCA that make theoretical guarantees difficult.
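As a small illustration of the scaling condition behind this abstract, the sketch below checks whether nonnegative weights w_i turn a frame {f_i} into a Parseval frame, that is, whether sum_i w_i f_i f_i^T equals the identity (with w_i the squared scaling coefficients). The function names and the example frame (the three-vector "Mercedes-Benz" frame in the plane) are assumptions chosen for illustration; they are not taken from the dissertation, and the code only verifies a candidate scaling rather than solving the optimization problems the abstract describes.

```python
import math


def scaled_frame_operator(frame, weights):
    """Return S = sum_i w_i * f_i f_i^T as a list of rows."""
    dim = len(frame[0])
    S = [[0.0] * dim for _ in range(dim)]
    for f, w in zip(frame, weights):
        for r in range(dim):
            for c in range(dim):
                S[r][c] += w * f[r] * f[c]
    return S


def is_parseval_scaling(frame, weights, tol=1e-9):
    """True if the weights are nonnegative and the scaled frame operator is the identity."""
    if any(w < 0 for w in weights):
        return False
    S = scaled_frame_operator(frame, weights)
    dim = len(S)
    return all(
        abs(S[r][c] - (1.0 if r == c else 0.0)) <= tol
        for r in range(dim)
        for c in range(dim)
    )


# Hypothetical example: three unit vectors at 120-degree spacing in R^2.
s = math.sqrt(3) / 2
mercedes_benz = [(0.0, 1.0), (-s, -0.5), (s, -0.5)]

# The unscaled frame operator is (3/2) * I, so weights of 2/3 give the identity.
print(is_parseval_scaling(mercedes_benz, [2 / 3] * 3))
print(is_parseval_scaling(mercedes_benz, [1.0] * 3))
```

Because the condition is linear in the weights, searching for a valid scaling can be posed as a feasibility problem with nonnegativity constraints, which is where the choice of objective function (linear or barrier) mentioned in the abstract comes in.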
Abstract:
Nigerian scam, also known as advance fee fraud or 419 scam, is a prevalent form of online fraudulent activity that causes financial loss to individuals and businesses. Nigerian scam has evolved from simple non-targeted email messages to more sophisticated scams targeted at users of classifieds, dating and other websites. Even though such scams are observed and reported by users frequently, the community’s understanding of Nigerian scams is limited since the scammers operate “underground”. To better understand the underground Nigerian scam ecosystem and seek effective methods to deter Nigerian scam and cybercrime in general, we conduct a series of active and passive measurement studies. Relying upon the analysis and insight gained from the measurement studies, we make four contributions: (1) we analyze the taxonomy of Nigerian scam and derive long-term trends in scams; (2) we provide insight into Nigerian scam and cybercrime ecosystems and their underground operation; (3) we propose a payment intervention as a potential deterrent to cybercrime operation in general and evaluate its effectiveness; and (4) we offer active and passive measurement tools and techniques that enable in-depth analysis and deterrence of cybercrime ecosystems. We first create and analyze a repository of more than two hundred thousand user-reported scam emails, stretching from 2006 to 2014, from four major scam reporting websites. We select the ten most commonly observed scam categories and tag 2,000 scam emails randomly selected from our repository. Based upon the manually tagged dataset, we train a machine learning classifier and cluster all scam emails in the repository. From the clustering result, we find a strong and sustained upward trend for targeted scams and a downward trend for non-targeted scams. We then focus on two types of targeted scams: sales scams and rental scams targeting users on Craigslist.
We built an automated scam data collection system and gathered sales scam emails at large scale. Using the system we posted honeypot ads on Craigslist and conversed automatically with the scammers. Through the email conversation, the system obtained additional confirmation of likely scam activities and collected additional information such as IP addresses and shipping addresses. Our analysis revealed that around 10 groups were responsible for nearly half of the over 13,000 total scam attempts we received. These groups used IP addresses and shipping addresses in both Nigeria and the U.S. We also crawled rental ads on Craigslist, identified rental scam ads amongst the large number of benign ads and conversed with the potential scammers. Through in-depth analysis of the rental scams, we found seven major scam campaigns employing various operations and monetization methods. We also found that unlike sales scammers, most rental scammers were in the U.S. The large-scale scam data and in-depth analysis provide useful insights on how to design effective deterrence techniques against cybercrime in general. We study underground DDoS-for-hire services, also known as booters, and measure the effectiveness of undermining the payment systems of DDoS services. Our analysis shows that the payment intervention can have the desired effect of limiting cybercriminals’ ability to accept payments and increasing the risk of doing so.
Abstract:
Evolutionary algorithms alone cannot solve optimization problems very efficiently, since these algorithms make many random (not very rational) decisions. Combinations of evolutionary algorithms with other techniques have proven to be an efficient optimization methodology. In this talk, I will explain the basic ideas of our three algorithms along this line: (1) the orthogonal genetic algorithm, which treats crossover/mutation as an experimental design problem; (2) the multiobjective evolutionary algorithm based on decomposition (MOEA/D), which uses decomposition techniques from traditional mathematical programming in multiobjective evolutionary optimization; and (3) the regular model based multiobjective estimation of distribution algorithm (RM-MEDA), which uses the regularity property and machine learning methods to improve multiobjective evolutionary algorithms.
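To make the decomposition idea in item (2) concrete, here is a minimal sketch of one widely used scalarization, the Tchebycheff function, which turns a vector of objective values into a single score for a given weight vector. The abstract does not say which decomposition the talk covers, so the choice of Tchebycheff, the function names, and the two-objective weight generator are all assumptions for illustration.

```python
def tchebycheff(objectives, weights, ideal):
    """Scalarize objective values: max_i w_i * |f_i - z_i|, where z is a reference (ideal) point."""
    return max(w * abs(f - z) for f, w, z in zip(objectives, weights, ideal))


def uniform_weights(n_subproblems):
    """Evenly spread weight vectors for a two-objective problem (assumes n_subproblems >= 2)."""
    step = n_subproblems - 1
    return [(i / step, 1 - i / step) for i in range(n_subproblems)]


# Hypothetical objective values for one candidate solution, with the ideal point at the origin.
candidate = (2.0, 1.0)
ideal_point = (0.0, 0.0)

# Each weight vector defines one scalar subproblem; a decomposition-based
# algorithm optimizes all of them in parallel.
for w in uniform_weights(3):
    print(w, tchebycheff(candidate, w, ideal_point))
```

Each weight vector yields a different single-objective subproblem, and minimizing them together approximates the set of trade-off solutions.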
Abstract:
The aim of this study is to investigate the effectiveness of problem-based learning (PBL) on students’ mathematical performance. This includes mathematics achievement and students’ attitudes towards mathematics for third and eighth grade students in Saudi Arabia. Mathematics achievement includes the knowing, applying, and reasoning domains, while students’ attitudes towards mathematics cover ‘liking learning mathematics’, ‘valuing mathematics’, and ‘confidence to learn mathematics’. This study goes deeper to examine the interaction of a PBL teaching strategy, with trained face-to-face and self-directed learning teachers, on students’ performance (mathematics achievement and attitudes towards mathematics). It also examines the interaction between different ability levels of students (high and low levels) with a PBL teaching strategy (with trained face-to-face or self-directed learning teachers) on students’ performance. It draws upon findings and techniques of the TIMSS international benchmarking studies. Mixed methods are used to analyse the quasi-experimental study data. One-way ANOVA, mixed ANOVA, and paired t-test models are used to analyse quantitative data, while a semi-structured interview with teachers and the author’s observations are used to enrich understanding of PBL and mathematical performance. The findings show that the PBL teaching strategy significantly improves students’ knowledge application, and is better than the traditional teaching methods among third grade students. This improvement, however, occurred only with the trained face-to-face teachers’ group. Furthermore, there is robust evidence that using a PBL teaching strategy could significantly raise students’ liking of learning mathematics, and confidence to learn mathematics, more than traditional teaching methods among third grade students.
However, there was no evidence that PBL could improve students’ performance (mathematics achievement and attitudes towards mathematics), more than traditional teaching methods, among eighth grade students. In 8th grade, the findings for low achieving students show significant improvement compared to high achieving students, whether PBL is applied or not. However, for 3rd grade students, no significant difference in mathematical achievement between high and low achieving students was found. The results were not expected for high achieving students and this is also discussed. The implications of these findings for mathematics education in Saudi Arabia are considered.
Abstract:
This document describes an experience of academic cooperation between library science professionals from West Chester University (WCU) and the National University (UNA) of Costa Rica. The event took place at West Chester University during the week of May 4th to May 8th, 2009. Its objectives revolved around the exchange of ideas and interests in the academic and cultural relations between the two universities. In addition, it showcased several services and procedures for handling information and highlighted the importance of promoting the exchange of students from both institutions. Finally, this article highlights the schedule of activities designed to integrate an international and intercultural perspective in various areas related to the teaching-learning process, the contribution of university libraries to student success, and techniques of information dissemination.
Abstract:
The present study aims to investigate the constructs of the Technological Readiness Index (TRI) and the Expectancy Disconfirmation Theory (EDT) as determinants of satisfaction and continuance use intention in e-learning services. A theoretical model is proposed that seeks to measure the phenomenon in a way suited to the needs of public organizations that offer distance learning courses to employees through virtual platforms. The research followed a quantitative analytical approach, via an online survey of a sample of 343 employees of 2 public organizations in RN who have had e-learning experience. The data analysis strategy used multivariate analysis techniques, including structural equation modeling (SEM), operationalized with AMOS© software. The results showed that quality, quality disconfirmation, value and value disconfirmation have a positive impact on satisfaction, as do usability disconfirmation, innovativeness and optimism. Likewise, satisfaction proved to be decisive for continuance use intention. In addition, technological readiness and performance are strongly related. Based on the structural model found by the study, public organizations can implement e-learning services for employees focusing on improving learning and improving skills practiced in the organizational environment.
Abstract:
Report published in the Proceedings of the National Conference on "Education and Research in the Information Society", Plovdiv, May, 2014
Abstract:
Data sources are often dispersed geographically in real-life applications. Finding a knowledge model may require joining all the data sources and running a machine learning algorithm on the joint set. We present an alternative based on a Multi-Agent System (MAS): an agent mines one data source in order to extract a local theory (knowledge model) and then merges it with the previous MAS theory using a knowledge fusion technique. This way, we obtain a global theory that summarizes the distributed knowledge without spending resources and time in joining data sources. New experiments have been executed including statistical significance analysis. The results show that, as a result of knowledge fusion, the accuracy of initial theories is significantly improved as well as the accuracy of the monolithic solution.
Abstract:
A growing body of research in higher education suggests that teachers should move away from traditional lecturing towards more active and student-focused educational approaches. Several classroom techniques are available to engage students and achieve more effective teaching and better learning experiences. The purpose of this paper is to share an example of how two of them – case-based teaching, and the use of response technologies – were implemented in a graduate-level food science course. The paper focuses in particular on teaching sensory science and sensometrics, including several concrete examples used during the course, and discussing in each case some of the observed outcomes. Overall, it was observed that the particular initiatives were effective in engaging student participation and promoting a more active way of learning. Case-based teaching provided students with the opportunity to apply their knowledge and their analytical skills to complex, real-life scenarios relevant to the subject matter. The use of audience response systems further facilitated class discussion, and was extremely well received by the students, providing a more enjoyable classroom experience.
Abstract:
That humans and animals learn from interaction with the environment is a foundational idea underlying nearly all theories of learning and intelligence. Learning that certain outcomes are associated with specific actions or stimuli (both internal and external) is at the very core of the capacity to adapt behaviour to environmental changes. In the present work, appetitive and aversive reinforcement learning paradigms have been used to investigate the fronto-striatal loops and behavioural correlates of adaptive and maladaptive reinforcement learning processes, aiming at a deeper understanding of how cortical and subcortical substrates interact with each other and with other brain systems to support learning. By combining a large variety of neuroscientific approaches, including behavioural and psychophysiological methods, EEG and neuroimaging techniques, these studies aim at clarifying and advancing the knowledge of the neural bases and computational mechanisms of reinforcement learning, both in normal and neurologically impaired populations.
Abstract:
Molecular radiotherapy (MRT) is a fast-developing and promising treatment for metastasised neuroendocrine tumours. Efficacy of MRT is based on the capability to selectively "deliver" radiation to tumour cells, minimizing administered dose to normal tissues. Outcome of MRT depends on the individual patient characteristics. For that reason, personalized treatment planning is important to improve outcomes of therapy. Dosimetry plays a key role in this setting, as it is the main physical quantity related to radiation effects on cells. Dosimetry in MRT consists of a complex series of procedures ranging from imaging quantification to dose calculation. This doctoral thesis focused on several aspects concerning the clinical implementation of absorbed dose calculations in MRT. Accuracy of SPECT/CT quantification was assessed in order to determine the optimal reconstruction parameters. A model of PVE correction was developed in order to improve the activity quantification in small volumes, such as lesions in clinical patterns. Advanced dosimetric methods were compared with the aim of defining the most accurate modality applicable in clinical routine. Also, for the first time on a large number of clinical cases, the overall uncertainty of tumour dose calculation was assessed. As part of the MRTDosimetry project, protocols for calibration of SPECT/CT systems and implementation of dosimetry were drawn up in order to provide standard guidelines to the clinics offering MRT. Estimating the risk of radio-toxicity side effects and the chance of inducing damage to neoplastic cells is crucial for patient selection and treatment planning. In this thesis, the NTCP and TCP models were derived from clinical data to help clinicians decide the pharmaceutical dosage in relation to therapy control and the limitation of damage to healthy tissues. Moreover, a model for tumour response prediction based on Machine Learning analysis was developed.
Abstract:
Reinforcement learning is a particular paradigm of machine learning that has recently proved, time and time again, to be a very effective and powerful approach. Cryptography, on the other hand, usually takes the opposite direction. While machine learning aims at analyzing data, cryptography aims at maintaining its privacy by hiding such data. However, the two techniques can be used jointly to create privacy-preserving models, able to make inferences on the data without leaking sensitive information. Despite the large number of studies on machine learning and cryptography, reinforcement learning in particular has never been applied to such cases before. Being able to successfully use reinforcement learning in an encrypted scenario would allow us to create an agent that efficiently controls a system without providing it with full knowledge of the environment it is operating in, leading the way to many possible use cases. Therefore, we have decided to apply the reinforcement learning paradigm to encrypted data. In this project we have applied one of the most well-known reinforcement learning algorithms, Deep Q-Learning, to simple simulated environments and studied how the encryption affects the training performance of the agent, in order to see if it is still able to learn how to behave even when the input data is no longer readable by humans. The results of this work highlight that the agent is still able to learn with no issues whatsoever in small state spaces with non-secure encryptions, like AES in ECB mode. For fixed environments, it is also able to reach a suboptimal solution even in the presence of secure modes, like AES in CBC mode, showing a significant improvement with respect to a random agent; however, its ability to generalize in stochastic environments or big state spaces suffers greatly.
Abstract:
The aim of this thesis project is to automatically localize HCC tumors in the human liver and subsequently predict whether the tumor will undergo microvascular infiltration (MVI), the initial stage of metastasis development. The input data for the work were partially supplied by Sant'Orsola Hospital and partially downloaded from online medical databases. Two U-Net models have been implemented for the automatic segmentation of the liver and the HCC malignancies within it. The segmentation models have been evaluated with the Intersection-over-Union and Dice Coefficient metrics. The outcomes obtained for the automatic liver segmentation are quite good (IOU = 0.82; DC = 0.35); the outcomes obtained for the automatic tumor segmentation (IOU = 0.35; DC = 0.46) are, instead, affected by some limitations: it can be stated that the algorithm is almost always able to detect the location of the tumor, but it tends to underestimate its dimensions. The purpose is to obtain the CT images of the HCC tumors, which are necessary for feature extraction. The 14 Haralick features calculated from the 3D-GLCM, the 120 Radiomic features and the patients' clinical information are collected to build a dataset of 153 features. The goal is then to build a model able to discriminate, based on the given features, between the tumors that will undergo MVI and those that will not. This task can be seen as a classification problem: each tumor needs to be classified either as “MVI positive” or “MVI negative”. Feature selection techniques are implemented to identify the most descriptive features for the problem at hand, and then a set of classification models is trained and compared. Among all of them, the models with the best performance (around 80-84% ± 8-15%) turn out to be the XGBoost Classifier, the SGD Classifier and the Logistic Regression models (without penalization and with Lasso, Ridge or Elastic Net penalization).
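For readers unfamiliar with the two segmentation metrics named in this abstract, the following minimal sketch computes Intersection-over-Union and the Dice coefficient for a pair of binary masks. The function name, the flattened-list mask representation, and the toy masks are assumptions for illustration; the thesis's own evaluation code and averaging scheme are not described in the abstract.

```python
def iou_and_dice(pred, truth):
    """Return (IoU, Dice) for two equally long binary masks given as flat sequences of 0/1."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    pred_size = sum(1 for p in pred if p)
    truth_size = sum(1 for t in truth if t)
    union = pred_size + truth_size - intersection
    if union == 0:
        # Both masks are empty: treat as a perfect match by convention.
        return 1.0, 1.0
    iou = intersection / union
    dice = 2 * intersection / (pred_size + truth_size)
    return iou, dice


# Hypothetical toy masks: the prediction covers half of the true region,
# mimicking a segmentation that finds the object but underestimates its size.
truth_mask = [1, 1, 1, 1, 0, 0]
pred_mask = [1, 1, 0, 0, 0, 0]

iou, dice = iou_and_dice(pred_mask, truth_mask)
print(f"IoU = {iou:.2f}, Dice = {dice:.2f}")
```

For a single pair of masks the two metrics are linked by Dice = 2·IoU / (1 + IoU), so Dice is never smaller than IoU; values averaged over many images need not satisfy this identity exactly.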
Abstract:
Most existing open-source search engines use keyword or tf-idf based techniques to find documents and web pages relevant to an input query. Although these methods, with the help of PageRank or knowledge graphs, have proved effective in some cases, they often fail to retrieve relevant instances for more complicated queries that require semantic understanding. In this thesis, a self-supervised information retrieval system based on transformers is employed to build a semantic search engine over the library of the Gruppo Maggioli company. Semantic search, or search with meaning, refers to understanding the query instead of simply finding word matches and, in general, represents knowledge in a way suitable for retrieval. We chose to investigate a new self-supervised strategy for training on unlabeled data, based on the creation of pairs of 'artificial' queries and their respective positive passages. We claim that by removing the reliance on labeled data, we may use the large volume of unlabeled material on the web without being limited to languages or domains where labeled data is abundant.
Abstract:
Collecting and analysing data is an important element in any field of human activity and research. In sports, too, collecting and analysing statistical data is attracting growing interest. Some example use cases are: improving technical and tactical aspects for team coaches, defining game strategies based on the opposing team's play, and evaluating the performance of players. Other advantages relate to making more precise and impartial refereeing decisions: a wrong decision can change the outcome of important matches. Finally, it can be useful for providing better representations and graphic effects that make the game more engaging for the audience during the match. Nowadays it is possible to delegate this type of task to automatic software systems that can use cameras or even hardware sensors to collect images or data and process them. One of the most efficient methods of collecting data is to process the video images of the sporting event through mixed techniques that apply machine learning to computer vision. As in other domains in which computer vision can be applied, the main tasks in sports are related to object detection, player tracking, and pose estimation of athletes. The goal of the present thesis is to apply different CNN models to analyze volleyball matches. Starting from video frames of a volleyball match, we reproduce a bird's eye view of the playing court onto which all the players are projected, also reporting for each player the type of action she or he is performing.
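One standard way to project detected players onto a bird's eye view of a flat court is a planar homography, a 3x3 matrix that maps image pixels to court coordinates. The abstract does not state which projection method the thesis uses, so the sketch below is an illustration under that assumption; the function name and the matrix values are hypothetical, and in practice the matrix would be estimated from known court landmarks.

```python
def project_point(H, x, y):
    """Apply a 3x3 homography H (list of rows) to the image point (x, y)."""
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if abs(w) < 1e-12:
        raise ValueError("point maps to infinity under this homography")
    # Divide by the homogeneous coordinate to get planar court coordinates.
    return u / w, v / w


# Hypothetical homography with a perspective term in the last row
# (points lower in the image are scaled differently from points higher up).
H_example = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.5, 1.0],
]

# Hypothetical pixel positions of two players' feet.
for foot in [(2.0, 2.0), (4.0, 0.0)]:
    print(foot, "->", project_point(H_example, *foot))
```

Projecting the point where a player touches the floor, rather than the centre of the bounding box, keeps the planar assumption valid, since the homography only holds for points on the court surface.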