841 results for Object based video
Abstract:
This thesis deals with Visual Servoing and its closely connected disciplines, such as projective geometry, image processing, robotics and non-linear control. More specifically, the work addresses the problem of controlling a robotic manipulator through one of the most widely used Visual Servoing techniques: Image Based Visual Servoing (IBVS). In Image Based Visual Servoing the robot is driven on-line by a feedback control loop that is closed directly in the 2D space of the camera sensor. The work considers the case of a monocular system with a single camera mounted on the robot end effector (eye-in-hand configuration). Through IBVS the system can be positioned with respect to a fixed 3D target by minimizing the differences between its initial view and its goal view, corresponding respectively to the initial and the goal system configurations: the robot Cartesian motion is thus generated only by means of visual information. However, the execution of a positioning control task by IBVS is not straightforward, because singularity problems may occur and local minima may be reached where the reached image is very close to the target one but the 3D positioning task is far from being fulfilled: this happens in particular for large camera displacements, when the initial and the goal target views are noticeably different. To overcome the singularity and local minima drawbacks while maintaining the good robustness of IBVS with respect to modeling and camera calibration errors, suitable image path planning can be exploited. This work deals with the problem of generating suitable image plane trajectories for the tracked points of the servoing control scheme (a trajectory is made of a path plus a time law). The generated image plane paths must be feasible, i.e. they must be compliant with the rigid body motion of the camera with respect to the object, so as to avoid image Jacobian singularities and local minima problems. 
In addition, the planned image trajectories must generate camera velocity screws which are smooth and within the allowed bounds of the robot. We will show that a scaled 3D motion planning algorithm can be devised in order to generate feasible image plane trajectories. Since the paths in the image are generated off-line, it is also possible to tune the planning parameters so as to keep the target inside the camera field of view even in those unfortunate cases where the target feature points would otherwise leave the camera images due to 3D robot motions. To test the validity of the proposed approach, both experimental and simulation results are reported, also taking into account the influence of noise on the path planning strategy. The experiments were carried out with a 6-DOF anthropomorphic manipulator with a FireWire camera installed on its end effector: the results demonstrate the good performance and the feasibility of the proposed approach.
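The feedback loop described in this abstract can be illustrated with the classical point-feature IBVS law, v = -gain * pinv(L) * (s - s*). The following is only a minimal sketch under stated assumptions, not the thesis's planner: image points are given in normalized coordinates, a depth estimate `Z` is assumed to be available for each point, and the function names (`interaction_matrix`, `solve`, `ibvs_velocity`) are hypothetical. The pseudo-inverse is obtained from the normal equations, which fails if the point configuration is singular.

```python
def interaction_matrix(x, y, Z):
    """2x6 interaction matrix of a normalized image point (x, y) at depth Z."""
    return [
        [-1.0 / Z, 0.0, x / Z, x * y, -(1.0 + x * x), y],
        [0.0, -1.0 / Z, y / Z, 1.0 + y * y, -x * y, -x],
    ]


def solve(A, b):
    """Solve A v = b by Gaussian elimination with partial pivoting.

    Assumes A is square and non-singular (a zero pivot raises ZeroDivisionError).
    """
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    v = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * v[c] for c in range(r + 1, n))
        v[r] = (M[r][n] - s) / M[r][r]
    return v


def ibvs_velocity(points, goal, depths, gain=0.5):
    """Camera velocity screw (vx, vy, vz, wx, wy, wz) for one control step."""
    L, e = [], []
    for (x, y), (xg, yg), Z in zip(points, goal, depths):
        L.extend(interaction_matrix(x, y, Z))  # stack two rows per point
        e.extend([x - xg, y - yg])             # feature error s - s*
    # Least-squares solution via the normal equations:
    # (L^T L) v = -gain * L^T e
    LtL = [[sum(row[i] * row[j] for row in L) for j in range(6)] for i in range(6)]
    rhs = [-gain * sum(row[i] * ei for row, ei in zip(L, e)) for i in range(6)]
    return solve(LtL, rhs)


# Usage: four points seen slightly larger than in the goal view.
goal = [(0.2, 0.2), (-0.2, 0.2), (-0.2, -0.2), (0.2, -0.2)]
current = [(1.1 * x, 1.1 * y) for x, y in goal]
print(ibvs_velocity(current, goal, depths=[1.0] * 4))
```

With at least four non-collinear points the stacked matrix generally has full rank, which is why the example uses four; the singular and local-minimum cases mentioned in the abstract are exactly the ones this naive step does not handle.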
Abstract:
<p>Facial expression recognition is one of the most challenging research areas in the image recognition field and has been actively studied since the 1970s. For instance, smile recognition has been studied because the smile is considered an important facial expression in human communication and is therefore likely to be useful for human-machine interaction. Moreover, if a smile can be detected and its intensity estimated, this raises the possibility of new applications in the future.</p>
Abstract:
<p>[EN]This paper describes a low-cost system that allows the user to visualize different glasses models in live video. The user can also move the glasses to adjust their position on the face. The system, which runs at 9.5 frames/s on general-purpose hardware, has a homeostatic module that keeps image parameters controlled. This is achieved by using a camera with motorized zoom, iris, white balance, etc. This feature can be especially useful in environments with changing illumination and shadows, such as an optical shop. The system also includes a face and eye detection module and a glasses management module.</p>
Abstract:
<p>[EN]We present a new method, based on the idea of the meccano method and a novel T-mesh optimization procedure, to construct a T-spline parameterization of 2D geometries for the application of isogeometric analysis. The proposed method only requires a boundary representation of the geometry as input data. As a result, the algorithm obtains a high-quality parametric transformation between 2D objects and the parametric domain, the unit square. First, we define a parametric mapping between the input boundary of the object and the boundary of the parametric domain. Then, we build a T-mesh adapted to the geometric singularities of the domain in order to preserve the features of the object boundary with a desired tolerance…</p>
Abstract:
Generic programming is likely to become a new challenge for a critical mass of developers. Therefore, it is crucial to refine the support for generic programming in mainstream Object-Oriented languages, both at the design and at the implementation level, as well as to suggest novel ways to exploit the additional degree of expressiveness made available by genericity. This study is meant to contribute towards bringing Java genericity to a more mature stage with respect to mainstream programming practice, by increasing the effectiveness of its implementation and by revealing its full expressive power in real-world scenarios. With respect to the current research setting, the main contribution of the thesis is twofold. First, we propose a revised implementation for Java generics that greatly increases the expressiveness of the Java platform by adding reification support for generic types. Secondly, we show how Java genericity can be leveraged in a real-world case study in the context of multi-paradigm language integration. Several approaches have been proposed in order to overcome the lack of reification of generic types in the Java programming language. Existing approaches tackle the problem of reification of generic types by defining new translation techniques which would allow for a runtime representation of generics and wildcards. Unfortunately, most approaches suffer from several problems: heterogeneous translations are known to be problematic when considering reification of generic methods and wildcards. On the other hand, more sophisticated techniques requiring changes in the Java runtime support reified generics through a true language extension (where clauses), so that backward compatibility is compromised. 
In this thesis we develop a sophisticated type-passing technique for addressing the problem of reification of generic types in the Java programming language; this approach, first pioneered by the so-called EGO translator, is here turned into a full-blown solution which reifies generic types inside the Java Virtual Machine (JVM) itself, thus overcoming both the performance penalties and the compatibility issues of the original EGO translator. Java-Prolog integration: Integrating Object-Oriented and declarative programming has been the subject of much research and of several corresponding technologies. Such proposals come in two flavours: they either attempt to join the two paradigms, or simply provide an interface library for accessing Prolog declarative features from a mainstream Object-Oriented language such as Java. Both solutions, however, have drawbacks: in the case of hybrid languages featuring both Object-Oriented and logic traits, the resulting language is typically too complex, thus making mainstream application development a harder task; in the case of library-based integration approaches there is no true language integration, and some boilerplate code has to be written to fix the paradigm mismatch. In this thesis we develop a framework called PatJ which promotes seamless exploitation of Prolog programming in Java. A sophisticated use of generics and wildcards makes it possible to define a precise mapping between Object-Oriented and declarative features. PatJ defines a hierarchy of classes where the bidirectional semantics of Prolog terms is modelled directly at the level of the Java generic type system.
Abstract:
The term Ambient Intelligence (AmI) refers to a vision of the future of the information society in which smart electronic environments are sensitive and responsive to the presence of people and their activities (context awareness). In an ambient intelligence world, devices work in concert to support people in carrying out their everyday activities, tasks and rituals in an easy, natural way, using information and intelligence hidden in the network connecting these devices. This promotes the creation of pervasive environments that improve the quality of life of the occupants and enhance the human experience. AmI stems from the convergence of three key technologies: ubiquitous computing, ubiquitous communication and natural interfaces. Ambient intelligent systems are heterogeneous and require excellent cooperation between several hardware/software technologies and disciplines, including signal processing, networking and protocols, embedded systems, information management, and distributed algorithms. Since a large number of fixed and mobile sensors are deployed in the environment, Wireless Sensor Networks (WSNs) are one of the most relevant enabling technologies for AmI. WSNs are complex systems made up of a number of sensor nodes which can be deployed in a target area to sense physical phenomena and communicate with other nodes and base stations. These simple devices typically embed a low-power computational unit (microcontrollers, FPGAs, etc.), a wireless communication unit, one or more sensors and some form of energy supply (either batteries or energy scavenger modules). WSNs promise to revolutionize the interactions between the real physical world and human beings. Low cost, low computational power, low energy consumption and small size are characteristics that must be taken into consideration when designing and dealing with WSNs. To fully exploit the potential of distributed sensing approaches, a set of challenges must be addressed. 
Sensor nodes are inherently resource-constrained systems with very low power consumption and small size requirements, which enable them to reduce interference with the physical phenomena sensed and allow easy and low-cost deployment. They have limited processing speed, storage capacity and communication bandwidth, which must be used efficiently to increase the degree of local understanding of the observed phenomena. Video sensors are a particular case of sensor nodes. This topic holds strong interest for a wide range of contexts such as military, security, robotics and, most recently, consumer applications. Vision sensors are extremely effective for medium to long-range sensing because vision provides rich information to human operators. However, image sensors generate a huge amount of data, which must be heavily processed before it is transmitted because of the scarce bandwidth of radio interfaces. In particular, in video surveillance it has been shown that source-side compression is mandatory due to limited bandwidth and delay constraints. Moreover, there is ample opportunity for performing higher-level processing functions, such as object recognition, which has the potential to drastically reduce the required bandwidth (e.g. by transmitting compressed images only when something interesting is detected). The energy cost of image processing must, however, be carefully minimized. Imaging can play, and already plays, an important role in sensing devices for ambient intelligence. Computer vision can, for instance, be used for recognising persons and objects and for recognising behaviour such as illness and rioting. Using a wireless camera as a camera mote opens the way for distributed scene analysis. More eyes see more than one, and a camera system that can observe a scene from multiple directions would be able to overcome occlusion problems and could describe objects in their true 3D appearance. Real-time versions of these approaches are a recently opened field of research. 
In this thesis we pay attention to the realities of hardware/software technologies and the design needed to realize systems for distributed monitoring, attempting to propose solutions to open issues and to fill the gap between AmI scenarios and hardware reality. The physical implementation of an individual wireless node is constrained by three important metrics, which are outlined below. Although the design of the sensor network and its sensor nodes is strictly application dependent, a number of constraints should almost always be considered. Among them: small form factor, to reduce node intrusiveness; low power consumption, to reduce battery size and extend node lifetime; low cost, for widespread diffusion. These limitations typically result in the adoption of low-power, low-cost devices such as low-power microcontrollers with a few kilobytes of RAM and tens of kilobytes of program memory, with which only simple data processing algorithms can be implemented. However, the overall computational power of the WSN can be very large, since the network presents a high degree of parallelism that can be exploited through the adoption of ad-hoc techniques. Furthermore, through the fusion of information from the dense mesh of sensors, even complex phenomena can be monitored. In this dissertation we present our results in building several AmI applications suitable for a WSN implementation. The work can be divided into two main areas: Low Power Video Sensor Nodes and Video Processing Algorithms, and Multimodal Surveillance. Low Power Video Sensor Nodes and Video Processing Algorithms: In comparison to scalar sensors, such as temperature, pressure, humidity, velocity, and acceleration sensors, vision sensors generate much higher bandwidth data due to the two-dimensional nature of their pixel array. We have tackled all the constraints listed above and have proposed solutions to overcome the current WSN limits for video sensor nodes. 
We have designed and developed wireless video sensor nodes focusing on small size and flexibility of reuse in different applications. The video nodes target a different design point: portability (on-board power supply, wireless communication) and a scanty power budget (500 mW), while still providing a prominent level of intelligence, namely sophisticated classification algorithms and a high level of reconfigurability. We developed two different video sensor nodes. The device architecture of the first is based on a low-cost, low-power FPGA+microcontroller system-on-chip. The second is based on an ARM9 processor. Both systems, designed within the above-mentioned power envelope, could operate continuously with a Li-Polymer battery pack and a solar panel. Novel low-power, low-cost video sensor nodes are presented which, in contrast to sensors that just watch the world, are capable of comprehending the perceived information in order to interpret it locally. Featuring such intelligence, these nodes would be able to cope with tasks such as recognition of unattended bags in airports, persons carrying potentially dangerous objects, etc., which normally require a human operator. Vision algorithms for object detection and acquisition, such as human detection with Support Vector Machine (SVM) classification and abandoned/removed object detection, are implemented, described and illustrated on real-world data. Multimodal surveillance: In several setups the use of wired video cameras may not be possible. For this reason, building an energy-efficient wireless vision network for monitoring and surveillance is one of the major efforts in the sensor network community. 
Pyroelectric Infra-Red (PIR) sensors have been used to extend the lifetime of a solar-powered video sensor node by providing an energy-level-dependent trigger to the video camera and the wireless module. This approach has been shown to extend node lifetime and possibly result in continuous operation of the node. Being low-cost, passive (thus low-power) and small in form factor, PIR sensors are well suited for WSN applications. Moreover, aggressive power management policies are essential for achieving long-term operation of standalone distributed cameras and for reducing their power consumption. We have used an adaptive controller based on Model Predictive Control (MPC) to improve system performance, outperforming naive power management policies.
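The idea of an energy-level-dependent PIR trigger can be sketched as follows. This is a hypothetical policy written purely for illustration, not the controller used in the thesis (which mentions MPC): it assumes the node can read its stored energy as a fraction between 0 and 1 and count recent PIR events, and it simply demands more PIR evidence before waking the camera when energy is low. The names `wake_threshold` and `should_wake_camera`, and the default event counts, are assumptions.

```python
def wake_threshold(energy_fraction, min_events=1, max_events=5):
    """PIR events required before waking the camera (hypothetical policy).

    A full battery needs only min_events; an empty one needs max_events,
    with a linear interpolation in between.
    """
    energy_fraction = min(1.0, max(0.0, energy_fraction))
    return round(max_events - energy_fraction * (max_events - min_events))


def should_wake_camera(pir_events, energy_fraction):
    """Decide whether the camera and radio should be powered up."""
    return pir_events >= wake_threshold(energy_fraction)


# Usage: the same two PIR events wake a well-charged node but not a depleted one.
print(should_wake_camera(2, energy_fraction=1.0))
print(should_wake_camera(2, energy_fraction=0.1))
```

A threshold of this kind trades detection latency for lifetime; a predictive controller would additionally use a forecast of harvested solar energy, which this sketch leaves out.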
Abstract:
This work was carried out by the author during his PhD course in Electronics, Computer Science and Telecommunication at the University of Bologna, Faculty of Engineering, Italy. The thesis addresses important channel estimation aspects in wideband wireless communication systems, such as echo cancellation in digital video broadcasting systems and pilot-aided channel estimation through an innovative pilot design in Multi-Cell Multi-User MIMO-OFDM networks. The documentation reported here is a summary of years of work under the supervision of Prof. Oreste Andrisano, coordinator of the Wireless Communication Laboratory (WiLab) in Bologna. All the instrumentation used for the characterization of the telecommunication systems belongs to CNR (National Research Council), CNIT (Italian Inter-University Center), and DEIS (Dept. of Electronics, Computer Science, and Systems). From November 2009 to May 2010 the author worked abroad in collaboration with DOCOMO Communications Laboratories Europe GmbH (DOCOMO Euro-Labs) in Munich, Germany, in the Wireless Technologies Research Group. Several scientific papers, submitted to and/or published in IEEE journals and conferences, have been produced by the author.
Abstract:
Skype is one of the well-known applications that has guided the evolution of real-time video streaming and has become one of the most widely used pieces of software in everyday life. It provides VoIP audio/video calls as well as chat messaging and file transfer. Many versions are available, covering all the principal operating systems such as Windows, Macintosh and Linux, as well as mobile systems. Voice quality has determined Skype's success since its birth in 2003, and its peer-to-peer architecture has allowed worldwide diffusion. After the introduction of video calls in 2006, Skype became a complete solution for communication between two or more people. As primarily a video conferencing application, Skype assumes certain characteristics of the delivered video in order to optimize its perceived quality. However, in recent years, and with the recent release of SkypeKit, many new Skype video-enabled devices have come out, especially in the mobile world. This has forced a change to the traditional recording, streaming and receiving settings, allowing for a wide range of network and content dynamics. Video calls are no longer based on static chatting: mobile devices have opened new possibilities and can be used in several scenarios. For instance, lecture streaming or one-to-one mobile video conferences exhibit more dynamics, as both caller and callee might be on the move. Most of these cases differ from head-and-shoulders-only content. Therefore, Skype needs to optimize its video streaming engine to cover more video types. Heterogeneous connections require different behaviors and solutions, and Skype must cope with this variety to maintain a certain quality independently of the connection used. Part of the present work focuses on analyzing Skype's behavior depending on video content. Since the Skype protocol is proprietary, most studies so far have tried to characterize its traffic and to reverse engineer its protocol. 
However, questions related to the behavior of Skype, especially regarding quality as perceived by users, remain unanswered. We will study the capabilities of Skype's video codecs and perform video quality assessment. Another motivation of our work is the design of a mechanism that estimates the perceived cost of network conditions on Skype video delivery. To this end we will try to assess in an objective way the impact of network impairments on the perceived quality of a Skype video call. Traditional video streaming schemes lack the necessary flexibility and adaptivity that Skype tries to achieve at the edge of a network. Our contribution lies in a testbed and in the consequent objective video quality analysis that we will carry out on input videos. We will stream raw video files with Skype via an impaired channel and then record them at the receiver side for analysis with objective quality of experience metrics.
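As an illustration of what an objective, full-reference comparison between the streamed source and the recorded output can look like, the sketch below computes the peak signal-to-noise ratio (PSNR) of one frame. The abstract does not say which metrics were used, so PSNR is an assumption chosen because it is a common baseline; frames are represented here as flat lists of 8-bit luma samples, and the function name `psnr` is hypothetical.

```python
import math


def psnr(reference, received, peak=255.0):
    """PSNR in dB between two frames given as equal-length sample lists."""
    if len(reference) != len(received) or not reference:
        raise ValueError("frames must be non-empty and of equal size")
    # Mean squared error between source and received samples.
    mse = sum((a - b) ** 2 for a, b in zip(reference, received)) / len(reference)
    if mse == 0:
        return math.inf  # identical frames
    return 10.0 * math.log10(peak * peak / mse)


# Usage: a frame whose samples are all off by 10 gray levels.
source_frame = [100, 100, 100, 100]
received_frame = [110, 110, 110, 110]
print(round(psnr(source_frame, received_frame), 2))
```

In a testbed of the kind described, such a per-frame score would only be meaningful after the recorded video has been temporally aligned with the source, since an impaired channel can drop or delay frames.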
Abstract:
The Web has undergone numerous transformations compared with the past. It has moved from a static Web, where the only possibility was to read the contents of a page, to a dynamic and interactive Web such as that of social networks. The modern Web is, still today, an expanding universe. The possibility of enriching pages with interactive content, video, photos and much more makes the web experience increasingly engaging. Moreover, the ever wider spread of mobile devices has made it necessary to introduce new tools to make the most of the capabilities of such devices. There are currently a great many scripting and programming languages, as well as CMSs that allow anyone to write and administer websites. Despite the great potential they offer, these technologies often deal with specific areas and do not make it possible to create homogeneous systems that span both client and server. Dart fits precisely into this context. The language gives programmers the ability to develop on both the client side and the server side. Its main goal is in fact to solve some problems common to many web programmers. Important in this respect is that it gives structure to the construction of web programs through the use of interfaces and classes. It also provides support for integrating various features that are currently handled by different technologies. The goal of this thesis is to compare Dart with some of the technologies most widely used today for web-based programming. In particular, JavaScript, jQuery, node.js and CoffeeScript will be considered.
Abstract:
Nanoscience is an emerging and fast-growing field of science that aims at manipulating nanometric objects with dimensions below 100 nm. The top-down approach is currently used to build this type of architecture (e.g. microchips). The miniaturization process cannot proceed indefinitely because of physical and technical limitations. Those limits are focusing interest on the bottom-up approach and on the construction of nano-objects starting from nano-bricks such as atoms, molecules or nanocrystals. Unlike atoms, molecules can be fully programmable and represent the best choice for building up nanostructures. In the past twenty years many examples of functional nano-devices able to perform simple actions have been reported. Nanocrystals, which are often considered simply nanostructured materials, can play an active part in the development of those nano-devices, in combination with functional molecules. The object of this dissertation is the photophysical and photochemical investigation of nano-objects bearing molecules and semiconductor nanocrystals (QDs) as components. The first part focuses on the characterization of a bistable rotaxane. This study, in collaboration with the group of Prof. J.F. Stoddart (Northwestern University, Evanston, Illinois, USA), who synthesized the compounds, shows the ability of this artificial machine to operate as a bistable molecular-level memory under kinetic control. The second part concerns the study of the surface properties of luminescent semiconductor nanocrystals (QDs) and in particular the effect of acid and base on the spectroscopic properties of those nanoparticles. This section also reports the work carried out in the laboratory of Prof. H. Mattoussi (Florida State University, Tallahassee, Florida, USA), where I developed a novel method for the surface decoration of QDs with lipoic acid-based ligands involving the photoreduction of the dithiolane moiety.
Abstract:
Neurorehabilitation is a process through which individuals affected by neurological conditions aim to achieve a complete recovery or to realize their optimal potential for physical, mental and social well-being. Essential elements of effective rehabilitation are: a clinical assessment by a multidisciplinary team, a targeted rehabilitation programme, and the evaluation of the results achieved by means of scientific and clinically appropriate measures. The main objective of this thesis was to develop quantitative methods and tools for the treatment and motor assessment of neurological patients. Conventional rehabilitation treatments require neurological patients to perform repetitive exercises, which reduces their motivation. Virtual reality and feedback can engage them in the treatment, while allowing repeatability and standardization of protocols. A tool based on augmented feedback for trunk control was developed and evaluated. In addition, virtual reality makes it possible to individualize the treatment according to the patient's needs. A virtual application for gait rehabilitation was developed and tested during a training programme with multiple sclerosis patients, assessing its feasibility and acceptance and demonstrating the efficacy of the treatment. The quantitative assessment of patients' motor abilities is performed using motion capture systems. Since their use in clinical practice is limited, a methodology based on inertial sensors was proposed for assessing arm swing in subjects with Parkinson's disease. These sensors are small, accurate and flexible, but they accumulate errors during long measurements. This problem was addressed, and the results suggest that, if the sensor is on the foot and the accelerations are integrated starting from the mid-stance phase, the error and its consequences for the determination of spatial parameters are contained. 
Finally, a validation of the Kinect for gait tracking in a virtual environment was presented. Preliminary results make it possible to define the sensor's field of use in rehabilitation.
Abstract:
Natural stones have been widely used in the construction field since antiquity. Building materials undergo decay processes due to mechanical, chemical, physical and biological causes that can act together. Therefore an interdisciplinary approach is required in order to understand the interaction between the stone and the surrounding environment. Utilization of buildings, inadequate restoration activities and, in general, anthropogenic weathering factors may contribute to this degradation process. For these reasons, in the last few decades new technologies and techniques have been developed and introduced in the restoration field. Consolidants are widely used in the restoration and conservation of cultural heritage in order to improve the internal cohesion and to reduce the weathering rate of building materials. Determining the penetration depth of a consolidant is important for assessing its efficacy. Impregnation mainly depends on the microstructure of the stone (i.e. porosity) and on the properties of the product itself. In this study, tetraethoxysilane (TEOS) applied to globigerina limestone samples was chosen as the object of investigation. After hydrolysis and condensation, TEOS deposits silica gel inside the pores, improving the cohesion of the grains. X-ray computed tomography has been used to characterize the internal structure of the limestone samples, both treated and untreated with a TEOS-based consolidant. The aim of this work is to investigate the penetration depth and the distribution of the TEOS inside the porosity, using both traditional approaches and advanced X-ray tomographic techniques, the latter allowing the internal visualization of the materials in three dimensions. Fluid transport properties and porosity have been studied both at the macroscopic scale, by means of capillary uptake tests and radiography, and at the microscopic scale, with X-ray Tomographic Microscopy (XTM). 
This makes it possible to identify changes in the porosity, by comparing the images before and after the treatment, and to locate the consolidant inside the stone. Tests were initially run at the University of Bologna, where characterization of the stone was carried out. The research then continued in Switzerland: X-ray tomography and radiography were performed at Empa, Swiss Federal Laboratories for Materials Science and Technology, while XTM measurements with synchrotron radiation were run at the Paul Scherrer Institute in Villigen.
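The before/after comparison of porosity can be illustrated with a very small sketch. It assumes, purely for illustration, that a tomographic volume is available as nested lists of gray values (slices, rows, voxels) and that pores can be separated from solid material by a single global threshold; real tomographic data normally needs filtering and a more careful segmentation, and the function name `porosity` is hypothetical.

```python
def porosity(volume, pore_threshold):
    """Fraction of voxels whose gray value is below pore_threshold."""
    voxels = [v for slice_ in volume for row in slice_ for v in row]
    if not voxels:
        raise ValueError("empty volume")
    return sum(1 for v in voxels if v < pore_threshold) / len(voxels)


# Usage: one 2x2 slice before and after treatment (made-up gray values).
# After treatment one pore voxel has been filled and now appears brighter.
untreated = [[[10, 200], [30, 220]]]
treated = [[[10, 200], [120, 220]]]
print(porosity(untreated, pore_threshold=50))
print(porosity(treated, pore_threshold=50))
```

Voxels that count as pore before treatment but as solid afterwards are the natural candidates for locations of the deposited consolidant, provided the two scans are registered to each other.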
Abstract:
Laser Shock Peening (LSP) is a surface enhancement treatment which induces a significant layer of beneficial compressive residual stresses, up to several mm deep, underneath the surface of metal components in order to mitigate the detrimental effects of crack growth in them. The aim of this thesis is to predict the crack growth behavior in metallic specimens with one or more stripes which define the compressive residual stress area induced by the Laser Shock Peening treatment. The process was applied as crack retardation stripes perpendicular to the crack propagation direction, with the objective of slowing down the crack as it approaches the peened stripes. The finite element method has been applied to simulate the redistribution of stresses in a cracked model when it is subjected to a tension load and to a compressive residual stress field, and to evaluate the Stress Intensity Factor (SIF) in this condition. Finally, the Afgrow software is used to predict the crack growth behavior of the component following the Laser Shock Peening treatment and to quantify the improvement in fatigue life by comparison with the baseline specimen. An educational internship at the Research & Technologies Germany Hamburg department of AIRBUS helped the author gain the knowledge and experience needed to write this thesis. 
The main tasks of the thesis are the following: (1) to carry out an up-to-date literature survey related to Laser Shock Peening in metallic structures; (2) to validate the FE model developed against experimental measurements at coupon level; (3) to develop a design for crack growth slowdown in Centered Cracked Tension specimens based on a residual stress engineering approach using a laser peened strip transversal to the crack path; (4) to evaluate the Stress Intensity Factor values for Centered Cracked Tension specimens after the Laser Shock Peening treatment via Finite Element Analysis; (5) to predict the crack growth behavior in Centered Cracked Tension specimens using as input the SIF values evaluated with the FE simulations; (6) to validate the results by means of experimental tests.
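The link between a compressive residual stress stripe and slower crack growth can be sketched with a simple superposition model. This is not the Afgrow model or the FE procedure used in the thesis: it assumes the Paris law da/dN = C (ΔK)^m, a wide-plate stress intensity factor K = σ √(π a) with no finite-width correction, a residual SIF added to both K_max and K_min, and negative K values clipped to zero as a crude closure assumption. The constants `C` and `m`, the stripe position and the residual SIF value are placeholders, not material data.

```python
import math


def delta_k_eff(stress_max, stress_min, a, k_res_value=0.0):
    """Effective SIF range in MPa*sqrt(m) for half crack length a in metres."""
    geometry = math.sqrt(math.pi * a)
    k_max = stress_max * geometry + k_res_value
    k_min = stress_min * geometry + k_res_value
    # Clip negative values: a closed crack is assumed not to grow.
    return max(k_max, 0.0) - max(k_min, 0.0)


def cycles_to_grow(a0, a_final, stress_max, stress_min, C, m,
                   k_res=lambda a: 0.0, da=1e-5):
    """Integrate the Paris law from a0 to a_final in crack increments da."""
    a, cycles = a0, 0.0
    while a < a_final:
        a_mid = a + da / 2.0
        dk = delta_k_eff(stress_max, stress_min, a_mid, k_res(a_mid))
        if dk <= 0.0:
            return math.inf  # crack arrested under these assumptions
        cycles += da / (C * dk ** m)
        a += da
    return cycles


# Usage: baseline versus one peened stripe between a = 10 mm and a = 15 mm.
def stripe(a):
    """Hypothetical residual SIF: -5 MPa*sqrt(m) inside the stripe."""
    return -5.0 if 0.010 <= a <= 0.015 else 0.0


baseline = cycles_to_grow(0.005, 0.02, 100.0, 0.0, C=1e-11, m=3.0)
peened = cycles_to_grow(0.005, 0.02, 100.0, 0.0, C=1e-11, m=3.0, k_res=stripe)
print(peened > baseline)
```

In the workflow described above, the residual SIF would come from the finite element simulations rather than from a constant inside the stripe, and the growth law would be the one selected in the crack growth software.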
Abstract:
The PhD research activity took place in the space debris field. In detail, it focuses on the possibility of detecting space debris from a space-based platform. The research addresses both the software and the hardware of this detection system. For the software, a program has been developed that can detect an object in space and locate it in the sky by solving the star field. For the hardware, the possibility of adapting a ground telescope for space activity has been considered, and it has been tested on a candidate electronic board.
Abstract:
In recent years, Deep Learning techniques have been shown to perform well on a large variety of problems in both Computer Vision and Natural Language Processing, reaching and often surpassing the state of the art on many tasks. The rise of deep learning is also revolutionizing the entire field of Machine Learning and Pattern Recognition, pushing forward the concepts of automatic feature extraction and unsupervised learning in general. However, despite its strong success in both science and business, deep learning has its own limitations. It is often questioned whether such techniques are merely brute-force statistical approaches and whether they can only work in the context of High Performance Computing with very large amounts of data. Another important question is whether they are really biologically inspired, as claimed in certain cases, and whether they can scale well in terms of "intelligence". The dissertation tries to answer these key questions in the context of Computer Vision and, in particular, Object Recognition, a task that has been heavily revolutionized by recent advances in the field. Practically speaking, these answers are based on an exhaustive comparison between two very different deep learning techniques on the aforementioned task: Convolutional Neural Network (CNN) and Hierarchical Temporal Memory (HTM). They stand for two different approaches and points of view under the broad umbrella of deep learning, and are the best choices for understanding and pointing out the strengths and weaknesses of each. CNN is considered one of the most classic and powerful supervised methods used today in machine learning and pattern recognition, especially in object recognition. CNNs are well received and accepted by the scientific community and are already deployed in large corporations like Google and Facebook for solving face recognition and image auto-tagging problems. 
HTM, on the other hand, is known as a new emerging paradigm and a new, mainly unsupervised method that is more biologically inspired. It tries to gain more insights from the computational neuroscience community in order to incorporate into the learning process concepts such as time, context and attention, which are typical of the human brain. In the end, the thesis aims to show that in certain cases, with a lower quantity of data, HTM can outperform CNN.