10 results for Artificial neural network

in AMS Tesi di Laurea - Alm@DL - Università di Bologna


Relevance: 30.00%

Abstract:

In recent years, deep learning techniques have been shown to perform well on a large variety of problems in both Computer Vision and Natural Language Processing, reaching and often surpassing the state of the art on many tasks. The rise of deep learning is also revolutionizing the entire field of Machine Learning and Pattern Recognition, pushing forward the concepts of automatic feature extraction and unsupervised learning in general. However, despite its strong success in both science and business, deep learning has its own limitations. It is often questioned whether such techniques are merely brute-force statistical approaches and whether they can only work in the context of High Performance Computing with very large amounts of data. Another important question is whether they are really biologically inspired, as claimed in certain cases, and whether they can scale well in terms of "intelligence". The dissertation tries to answer these key questions in the context of Computer Vision and, in particular, Object Recognition, a task that has been heavily revolutionized by recent advances in the field. In practice, the answers are based on an exhaustive comparison between two very different deep learning techniques on this task: the Convolutional Neural Network (CNN) and Hierarchical Temporal Memory (HTM). They represent two different approaches and points of view under the broad umbrella of deep learning, and comparing them is a good way to understand and point out the strengths and weaknesses of each. The CNN is considered one of the most classic and powerful supervised methods used today in machine learning and pattern recognition, especially in object recognition. CNNs are widely accepted by the scientific community and are already deployed in large corporations such as Google and Facebook for solving face recognition and image auto-tagging problems.
HTM, on the other hand, is known as a new, emerging paradigm and a mainly unsupervised method that is more biologically inspired. It draws on insights from the computational neuroscience community in order to incorporate concepts such as time, context and attention, which are typical of the human brain, into the learning process. Ultimately, the thesis aims to show that in certain cases, with a smaller quantity of data, HTM can outperform CNN.

Relevance: 30.00%

Abstract:

Combinatorial optimization problems are typically tackled by the branch-and-bound paradigm. We propose to learn a variable selection policy for branch-and-bound in mixed-integer linear programming, by imitation learning on a diversified variant of the strong branching expert rule. We encode states as bipartite graphs and parameterize the policy as a graph convolutional neural network. Experiments on a series of synthetic problems demonstrate that our approach produces policies that can improve upon expert-designed branching rules on large problems, and generalize to instances significantly larger than seen during training.
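The bipartite encoding mentioned in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `milp_to_bipartite` and the node features (only the right-hand side for constraints and the objective coefficient for variables) are hypothetical simplifications, and the abstract does not say which features are actually used. One node is created per constraint, one per variable, and an edge links them wherever the constraint matrix has a non-zero coefficient.

```python
def milp_to_bipartite(A, b, c):
    """Encode `min c^T x  s.t.  A x <= b` as a bipartite graph.

    Assumed, simplified features: each constraint node carries its
    right-hand side, each variable node its objective coefficient.
    """
    constraint_nodes = [{"rhs": rhs} for rhs in b]
    variable_nodes = [{"obj": coef} for coef in c]
    # One edge per non-zero coefficient: (constraint index, variable index, value).
    edges = [
        (i, j, a_ij)
        for i, row in enumerate(A)
        for j, a_ij in enumerate(row)
        if a_ij != 0
    ]
    return constraint_nodes, variable_nodes, edges


# Toy instance: 2 constraints, 3 variables.
A = [[1, 0, 2],
     [0, 3, 0]]
b = [4, 5]
c = [1, 1, 1]
cons, variables, edges = milp_to_bipartite(A, b, c)
```

A graph convolutional network would then pass messages along these edges and output one score per variable node, from which the branching variable is chosen.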

Relevance: 30.00%

Abstract:

Most existing open-source search engines use keyword or tf-idf based techniques to find documents and web pages relevant to an input query. Although these methods, with the help of PageRank or knowledge graphs, have proved effective in some cases, they often fail to retrieve relevant results for more complicated queries that require semantic understanding. In this thesis, a self-supervised information retrieval system based on transformers is employed to build a semantic search engine over the library of the Gruppo Maggioli company. Semantic search, or search with meaning, relies on an understanding of the query instead of simply finding word matches and, in general, represents knowledge in a way suitable for retrieval. We chose to investigate a new self-supervised strategy for training on unlabeled data, based on the creation of pairs of ’artificial’ queries and their respective positive passages. We claim that, by removing the reliance on labeled data, we can exploit the large volume of unlabeled material on the web without being limited to languages or domains where labeled data is abundant.
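To make the keyword-based baseline concrete, the following is a minimal tf-idf ranking sketch in plain Python. It is an illustration only: the function name `tfidf_rank`, the whitespace tokenizer and the toy documents are assumptions, and it is not the system described in the thesis. It shows why purely lexical matching misses queries that share meaning but no words with the relevant passage.

```python
import math
from collections import Counter


def tfidf_rank(query, docs):
    """Return document indices ordered by descending tf-idf score."""
    tokenized = [d.lower().split() for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scored = []
    for idx, tokens in enumerate(tokenized):
        tf = Counter(tokens)
        score = sum(
            (tf[t] / len(tokens)) * math.log(n_docs / df[t])
            for t in query.lower().split()
            if t in tf
        )
        scored.append((score, idx))
    # Highest score first; ties keep the original document order.
    return [idx for _, idx in sorted(scored, key=lambda s: (-s[0], s[1]))]


docs = [
    "building permit rules",
    "tax form deadline",
    "permit fee and permit renewal",
]
print(tfidf_rank("permit", docs))                      # lexical match works
print(tfidf_rank("construction authorization", docs))  # no shared words: all scores are 0
```

The second query is semantically close to the first document but shares no term with it, so every score is zero and the ranking is uninformative. This is the gap a transformer-based semantic retriever is meant to close.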

Relevance: 30.00%

Abstract:

This thesis examines the state of audiovisual translation (AVT) in the aftermath of the COVID-19 emergency, highlighting new trends in the implementation of AI technologies as well as their strengths, constraints, and ethical implications. It starts with an overview of the current AVT landscape, focusing on projections about its evolution and on critical aspects such as the worsening working conditions reported by AVT professionals – especially freelancers – in recent years, and how these might be affected by the advent of AI technologies in the industry. The second chapter delves into the history and development of three AI technologies that are used in combination with neural machine translation in automatic AVT tools: automatic speech recognition, speech synthesis and deepfakes (voice cloning and visual deepfakes for lip syncing), including real examples of start-up companies that use them – or plan to do so – to localize audiovisual content automatically or semi-automatically. The third chapter explores the many ethical concerns around these innovative technologies, which extend far beyond the field of translation; at the same time, it argues for their potential to bring about immense progress in accessibility and international cooperation, provided that their use is properly regulated. Lastly, the fourth chapter describes two experiments testing the efficacy of currently available tools for automatic subtitling and automatic dubbing, respectively, in order to take a closer look at their advantages and limitations compared to more traditional approaches. This analysis aims to help distinguish legitimate concerns from unfounded speculation about the AI technologies that are entering the field of AVT; its intention is to suggest a constructive and optimistic view of the technological transformations that appear to be underway, whilst also acknowledging their potential risks.

Relevance: 30.00%

Abstract:

The neural networks customized and tested in this thesis (WaldoNet, FlowNet and PatchNet) are a first exploration of the Template Matching task. There are therefore many possible extensions, some of which are proposed below. During my thesis, I analyzed how the classical algorithms work and adapted them using deep learning. The features extracted from both the template and the query images resemble the keypoints of the SIFT algorithm. Then, instead of a similarity function or keypoint matching, WaldoNet and PatchNet use a convolutional layer to compare the features, while FlowNet uses a correlation layer. In addition, I identified the major challenges of the Template Matching task (affine and non-affine transformations, intensity changes, and so on) and addressed them with a careful design of the dataset.

Relevance: 30.00%

Abstract:

Depth estimation from images has long been regarded as a preferable alternative to expensive and intrusive active sensors, such as LiDAR and ToF. The topic has attracted an increasingly wide audience thanks to its many application domains, such as autonomous driving, robotic navigation and 3D reconstruction. Among the various techniques employed for depth estimation, stereo matching is one of the most widespread, owing to its robustness, speed and simplicity of setup. Recent developments have been aided by the abundance of annotated stereo images, which has allowed deep learning to thrive in a research area where deep networks can reach state-of-the-art sub-pixel precision in most cases. Despite these findings, stereo matching still presents many open challenges, two of which are finding pixel correspondences in the presence of objects that exhibit non-Lambertian behaviour and processing high-resolution images. Recently, a novel dataset named Booster, which contains high-resolution stereo pairs featuring a large collection of labeled non-Lambertian objects, has been released. That work showed that training state-of-the-art deep neural networks on such data improves their generalization capabilities, including in the presence of non-Lambertian surfaces. Despite being a further step towards tackling the aforementioned challenges, Booster includes a rather small number of annotated images, and thus cannot satisfy the intensive training requirements of deep learning. This thesis investigates novel view synthesis techniques to augment the Booster dataset, with the ultimate goal of improving stereo matching reliability on high-resolution images that display non-Lambertian surfaces.
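As background on why sub-pixel precision matters in stereo matching, the sketch below applies the standard relation for a rectified stereo pair, Z = f · B / d, where f is the focal length in pixels, B the baseline and d the disparity in pixels. The abstract does not state this formula or any camera parameters; the function name and the numbers are illustrative assumptions.

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Depth in metres for a rectified stereo pair: Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    return focal_px * baseline_m / disparity_px


# Hypothetical camera: 1000 px focal length, 10 cm baseline.
near = depth_from_disparity(50.0, focal_px=1000.0, baseline_m=0.1)
far = depth_from_disparity(5.0, focal_px=1000.0, baseline_m=0.1)
# The same half-pixel matching error shifts the far estimate much more.
near_err = abs(depth_from_disparity(50.5, 1000.0, 0.1) - near)
far_err = abs(depth_from_disparity(5.5, 1000.0, 0.1) - far)
print(near, far, near_err, far_err)
```

Because depth is inversely proportional to disparity, a fixed matching error costs far more accuracy on distant points, which is one reason mismatches on reflective or transparent surfaces are so damaging.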

Relevance: 30.00%

Abstract:

This thesis contributes to the ArgMining 2021 shared task on Key Point Analysis. Key Point Analysis entails extracting a concise list of the most prominent talking points from an input corpus and calculating their prevalence. These talking points are usually referred to as key points. Key Point Analysis is divided into two subtasks: Key Point Matching, which involves assigning a matching score to each key point/argument pair, and Key Point Generation, which consists of generating the key points themselves. Key Point Matching was approached with two different models: a pretrained Sentence Transformers model and a tree-constrained Graph Neural Network. The best model was the fine-tuned Sentence Transformers model, which achieved a mean Average Precision score of 0.75, ranking 12th among the participating teams. The model was then used for the Key Point Generation subtask, where key point candidates were selected with an extractive method and evaluated with the model developed for the previous subtask.
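For readers unfamiliar with the metric, here is a minimal sketch of the textbook definition of mean Average Precision over groups of scored pairs. The shared task's exact evaluation protocol (how pairs are grouped and how ambiguous labels are handled) is not given in the abstract, so the grouping keys, function names and toy scores below are assumptions.

```python
def average_precision(labels_ranked):
    """Average precision of a ranked list of 0/1 relevance labels."""
    hits, total = 0, 0.0
    for rank, relevant in enumerate(labels_ranked, start=1):
        if relevant:
            hits += 1
            total += hits / rank  # precision at this relevant position
    return total / hits if hits else 0.0


def mean_average_precision(scored_by_group):
    """scored_by_group maps a group id to a list of (score, label) pairs."""
    aps = []
    for pairs in scored_by_group.values():
        ranked = [label for _, label in sorted(pairs, key=lambda p: -p[0])]
        aps.append(average_precision(ranked))
    return sum(aps) / len(aps)


# Hypothetical matching scores for two groups of argument/key point pairs.
predictions = {
    "topic_a": [(0.9, 1), (0.8, 0), (0.7, 1)],
    "topic_b": [(0.6, 0), (0.5, 1)],
}
print(mean_average_precision(predictions))
```

In this toy case the first group has an average precision of (1/1 + 2/3) / 2 and the second of 1/2, so the mean is 2/3.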

Relevance: 30.00%

Abstract:

Neural scene representation and neural rendering are new computer vision techniques that enable the reconstruction and implicit representation of real 3D scenes from a set of 2D captured images, by fitting a deep neural network. The trained network can then be used to render novel views of the scene. A recent work in this field, Neural Radiance Fields (NeRF), presented a state-of-the-art approach, which uses a simple Multilayer Perceptron (MLP) to generate photo-realistic RGB images of a scene from arbitrary viewpoints. However, NeRF does not model any light interaction with the fitted scene; therefore, despite producing compelling results for the view synthesis task, it does not provide a solution for relighting. In this work, we propose a new architecture to enable relighting capabilities in NeRF-based representations and we introduce a new real-world dataset to train and evaluate such a model. Our method demonstrates the ability to perform realistic rendering of novel views under arbitrary lighting conditions.

Relevance: 30.00%

Abstract:

Artificial Intelligence (AI) is gaining ever more ground in every sphere of human life, to the point that it is now even used to pass sentences in courts. The use of AI in the field of Law is, however, deemed quite controversial: it could provide more objectivity, yet it could also entail an abuse of power, given that bias in the underlying algorithms may cause a lack of accuracy. As a product of AI, machine translation is being increasingly used in the field of Law too, in order to translate laws, judgements, contracts, etc. between different languages and different legal systems. In the legal setting of Company Law, accuracy of content and suitability of terminology play a crucial role in a translation task, as any addition or omission of content or mistranslation of terms could entail legal consequences for companies. The purpose of the present study is first to assess which of two neural machine translation systems, DeepL and ModernMT, produces a more suitable translation from Italian into German of the atto costitutivo of an Italian s.r.l. in terms of accuracy of content and correctness of terminology, and then to assess which translation is closer to a human reference translation. To achieve these aims, two evaluations are carried out, one human and one automatic, based on the MQM taxonomy and the BLEU metric respectively. The results of both evaluations show an overall better performance by ModernMT in terms of content accuracy, suitability of terminology, and closeness to a human translation. In the MQM-based evaluation, its accuracy and terminology errors account for just 8.43% (as opposed to DeepL’s 9.22%), while it obtains an overall BLEU score of 29.14 (against DeepL’s 27.02). The overall performance, however, shows that machines still face barriers in overcoming semantic complexity, tackling polysemy, and choosing domain-specific terminology, which suggests that the gap with human translation may still be considerable.
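The idea behind the BLEU scores quoted above can be illustrated with a deliberately simplified, sentence-level sketch: clipped n-gram precision combined with a brevity penalty. Standard BLEU is computed at corpus level with n-grams up to length 4, and the abstract does not say which tool or settings produced its scores, so this code will not reproduce them. The function names and example sentences are assumptions.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def simple_bleu(candidate, reference, max_n=2):
    """Simplified sentence-level BLEU on a 0-100 scale (single reference)."""
    cand, ref = candidate.split(), reference.split()
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts, ref_counts = ngrams(cand, n), ngrams(ref, n)
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        total = sum(cand_counts.values())
        if overlap == 0 or total == 0:
            return 0.0
        log_precisions.append(math.log(overlap / total))
    # Brevity penalty: punish candidates shorter than the reference.
    bp = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / len(cand))
    return 100 * bp * math.exp(sum(log_precisions) / max_n)


print(simple_bleu("the company is founded", "the company is established"))
```

A single wrong term lowers both the unigram and the bigram precision, which is why terminology errors of the kind discussed in the study show up in the score even when the rest of the sentence matches the reference.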

Relevance: 30.00%

Abstract:

Optical Character Recognition (OCR) systems are a widespread technology in the world of Computer Vision and Machine Learning. The topic is of interest to many fields, for example the automotive sector, where it becomes a specialized task known as License Plate Recognition, useful for many applications ranging from the automation of toll roads to intelligent payments. However, OCR systems need to be very accurate and generalizable in order to extract the text of license plates under highly variable conditions, from the type of camera used for acquisition to changes in lighting. Such variables compromise the quality of digitized real scenes, introducing noise and degradation of various types, which can be reduced by applying modern approaches to image super-resolution and noise reduction. One class of these approaches is known as Generative Neural Networks, which are a strong ally in solving this popular problem.