2 resultados para Efficient Solution
em DRUM (Digital Repository at the University of Maryland)
Resumo:
Human and robots have complementary strengths in performing assembly operations. Humans are very good at perception tasks in unstructured environments. They are able to recognize and locate a part from a box of miscellaneous parts. They are also very good at complex manipulation in tight spaces. The sensory characteristics of the humans, motor abilities, knowledge and skills give the humans the ability to react to unexpected situations and resolve problems quickly. In contrast, robots are very good at pick and place operations and highly repeatable in placement tasks. Robots can perform tasks at high speeds and still maintain precision in their operations. Robots can also operate for long periods of times. Robots are also very good at applying high forces and torques. Typically, robots are used in mass production. Small batch and custom production operations predominantly use manual labor. The high labor cost is making it difficult for small and medium manufacturers to remain cost competitive in high wage markets. These manufactures are mainly involved in small batch and custom production. They need to find a way to reduce the labor cost in assembly operations. Purely robotic cells will not be able to provide them the necessary flexibility. Creating hybrid cells where humans and robots can collaborate in close physical proximities is a potential solution. The underlying idea behind such cells is to decompose assembly operations into tasks such that humans and robots can collaborate by performing sub-tasks that are suitable for them. Realizing hybrid cells that enable effective human and robot collaboration is challenging. This dissertation addresses the following three computational issues involved in developing and utilizing hybrid assembly cells: - We should be able to automatically generate plans to operate hybrid assembly cells to ensure efficient cell operation. This requires generating feasible assembly sequences and instructions for robots and human operators, respectively. Automated planning poses the following two challenges. First, generating operation plans for complex assemblies is challenging. The complexity can come due to the combinatorial explosion caused by the size of the assembly or the complex paths needed to perform the assembly. Second, generating feasible plans requires accounting for robot and human motion constraints. The first objective of the dissertation is to develop the underlying computational foundations for automatically generating plans for the operation of hybrid cells. It addresses both assembly complexity and motion constraints issues. - The collaboration between humans and robots in the assembly cell will only be practical if human safety can be ensured during the assembly tasks that require collaboration between humans and robots. The second objective of the dissertation is to evaluate different options for real-time monitoring of the state of human operator with respect to the robot and develop strategies for taking appropriate measures to ensure human safety when the planned move by the robot may compromise the safety of the human operator. In order to be competitive in the market, the developed solution will have to include considerations about cost without significantly compromising quality. - In the envisioned hybrid cell, we will be relying on human operators to bring the part into the cell. If the human operator makes an error in selecting the part or fails to place it correctly, the robot will be unable to correctly perform the task assigned to it. If the error goes undetected, it can lead to a defective product and inefficiencies in the cell operation. The reason for human error can be either confusion due to poor quality instructions or human operator not paying adequate attention to the instructions. In order to ensure smooth and error-free operation of the cell, we will need to monitor the state of the assembly operations in the cell. The third objective of the dissertation is to identify and track parts in the cell and automatically generate instructions for taking corrective actions if a human operator deviates from the selected plan. Potential corrective actions may involve re-planning if it is possible to continue assembly from the current state. Corrective actions may also involve issuing warning and generating instructions to undo the current task.
Resumo:
Image (Video) retrieval is an interesting problem of retrieving images (videos) similar to the query. Images (Videos) are represented in an input (feature) space and similar images (videos) are obtained by finding nearest neighbors in the input representation space. Numerous input representations both in real valued and binary space have been proposed for conducting faster retrieval. In this thesis, we present techniques that obtain improved input representations for retrieval in both supervised and unsupervised settings for images and videos. Supervised retrieval is a well known problem of retrieving same class images of the query. We address the practical aspects of achieving faster retrieval with binary codes as input representations for the supervised setting in the first part, where binary codes are used as addresses into hash tables. In practice, using binary codes as addresses does not guarantee fast retrieval, as similar images are not mapped to the same binary code (address). We address this problem by presenting an efficient supervised hashing (binary encoding) method that aims to explicitly map all the images of the same class ideally to a unique binary code. We refer to the binary codes of the images as `Semantic Binary Codes' and the unique code for all same class images as `Class Binary Code'. We also propose a new class based Hamming metric that dramatically reduces the retrieval times for larger databases, where only hamming distance is computed to the class binary codes. We also propose a Deep semantic binary code model, by replacing the output layer of a popular convolutional Neural Network (AlexNet) with the class binary codes and show that the hashing functions learned in this way outperforms the state of the art, and at the same time provide fast retrieval times. In the second part, we also address the problem of supervised retrieval by taking into account the relationship between classes. For a given query image, we want to retrieve images that preserve the relative order i.e. we want to retrieve all same class images first and then, the related classes images before different class images. We learn such relationship aware binary codes by minimizing the similarity between inner product of the binary codes and the similarity between the classes. We calculate the similarity between classes using output embedding vectors, which are vector representations of classes. Our method deviates from the other supervised binary encoding schemes as it is the first to use output embeddings for learning hashing functions. We also introduce new performance metrics that take into account the related class retrieval results and show significant gains over the state of the art. High Dimensional descriptors like Fisher Vectors or Vector of Locally Aggregated Descriptors have shown to improve the performance of many computer vision applications including retrieval. In the third part, we will discuss an unsupervised technique for compressing high dimensional vectors into high dimensional binary codes, to reduce storage complexity. In this approach, we deviate from adopting traditional hyperplane hashing functions and instead learn hyperspherical hashing functions. The proposed method overcomes the computational challenges of directly applying the spherical hashing algorithm that is intractable for compressing high dimensional vectors. A practical hierarchical model that utilizes divide and conquer techniques using the Random Select and Adjust (RSA) procedure to compress such high dimensional vectors is presented. We show that our proposed high dimensional binary codes outperform the binary codes obtained using traditional hyperplane methods for higher compression ratios. In the last part of the thesis, we propose a retrieval based solution to the Zero shot event classification problem - a setting where no training videos are available for the event. To do this, we learn a generic set of concept detectors and represent both videos and query events in the concept space. We then compute similarity between the query event and the video in the concept space and videos similar to the query event are classified as the videos belonging to the event. We show that we significantly boost the performance using concept features from other modalities.