996 results for indoor-scene-classification


Relevance: 30.00%

Abstract:

This paper presents an empirical study of multi-label classification methods and gives suggestions for multi-label classification that are effective for automatic image annotation applications. The study shows that the triple random ensemble multi-label classification (TREMLC) algorithm outperforms its counterparts, especially on the scene image dataset. The multi-label k-nearest neighbor (ML-kNN) and binary relevance (BR) learning algorithms perform well on the Corel image dataset. Based on the overall evaluation results, selected image examples are used to show the label prediction performance of the algorithms. This provides an indication of the suitability of different multi-label classification methods for automatic image annotation under different problem settings.
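Binary relevance, one of the methods named in this abstract, trains one independent binary classifier per label. The following is a minimal sketch of that idea, not the study's implementation: the nearest-centroid base learner, the class names, and the toy two-feature data are all assumptions made for illustration.

```python
from typing import List, Optional, Sequence


def _centroid(rows: Sequence[Sequence[float]]) -> List[float]:
    """Component-wise mean of a non-empty list of feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def _sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


class NearestCentroidBinary:
    """Hypothetical toy base learner: predicts 1 when the sample is
    closer to the positive-class centroid than to the negative one."""

    def fit(self, X, y):
        pos = [x for x, t in zip(X, y) if t == 1]
        neg = [x for x, t in zip(X, y) if t == 0]
        self.pos: Optional[List[float]] = _centroid(pos) if pos else None
        self.neg: Optional[List[float]] = _centroid(neg) if neg else None
        return self

    def predict_one(self, x) -> int:
        if self.pos is None:  # label never seen as positive
            return 0
        if self.neg is None:  # label never seen as negative
            return 1
        return 1 if _sq_dist(x, self.pos) < _sq_dist(x, self.neg) else 0


class BinaryRelevance:
    """Binary relevance: fit one independent binary model per label column."""

    def fit(self, X, Y):
        n_labels = len(Y[0])
        self.models = [
            NearestCentroidBinary().fit(X, [row[j] for row in Y])
            for j in range(n_labels)
        ]
        return self

    def predict(self, X):
        return [[m.predict_one(x) for m in self.models] for x in X]


# Toy data (assumed): two features, two labels per image.
X_train = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8], [0.9, 0.9]]
Y_train = [[1, 0], [1, 0], [0, 1], [0, 1], [1, 1]]

model = BinaryRelevance().fit(X_train, Y_train)
predictions = model.predict([[0.85, 0.85], [0.95, 0.05]])
print(predictions)
```

Because each label is handled separately, binary relevance ignores correlations between labels, which is one reason ensemble methods of the kind compared in the study can behave differently on the same data.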

Relevance: 30.00%

Abstract:

In this paper, we study the sound tracks in films and their indexical semiotic usage by developing a classification system that detects complex sound scenes and their constituent sound events in cinema. We investigate two main issues: determining what constitutes the presence of a high-level sound scene, together with the inferences about the thematic content of the scene that can be drawn from this presence; and classifying environmental sounds in the audio track of the scene to assist in the automatic detection of the high-level scene. Experiments with our classification system on pure sounds resulted in a correct event classification rate of 88.9%. When the audio content of a number of film scenes was examined, sound event detection was less accurate because of the presence of mixed sounds, but the film audio samples were generally classified with the correct high-level sound scene label, enabling correct inferences about the story content of the scenes.

Relevance: 30.00%

Abstract:

This paper presents a comparative evaluation of popular multi-label classification (MLC) methods on several multi-label problems from different domains. The methods include multi-label k-nearest neighbor, binary relevance, label power set, random k-label set ensemble learning, calibrated label ranking, hierarchy of multi-label classifiers and triple random ensemble multi-label classification algorithms. These multi-label learning algorithms are evaluated using several widely used MLC evaluation metrics. The evaluation results show that a particular MLC method can be recommended for each multi-label classification problem. The multi-label evaluation datasets used in this study relate to scene images, multimedia video frames, diagnostic medical reports, email messages, emotional music data, biological genes and multi-structural protein categorization.
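The abstract does not say which evaluation metrics were used. As an illustration only, two metrics that are common in multi-label evaluation are Hamming loss (the fraction of individual label decisions that are wrong) and subset accuracy (the fraction of samples whose whole label set is predicted exactly). The sketch below assumes label sets are encoded as equal-length 0/1 lists, and the sample data is invented.

```python
def hamming_loss(Y_true, Y_pred):
    """Fraction of individual label slots predicted incorrectly."""
    total = sum(len(row) for row in Y_true)
    wrong = sum(
        t != p
        for row_t, row_p in zip(Y_true, Y_pred)
        for t, p in zip(row_t, row_p)
    )
    return wrong / total


def subset_accuracy(Y_true, Y_pred):
    """Fraction of samples whose complete label vector matches exactly."""
    exact = sum(list(rt) == list(rp) for rt, rp in zip(Y_true, Y_pred))
    return exact / len(Y_true)


# Invented example: 4 samples, 3 labels each.
Y_true = [[1, 0, 1], [0, 1, 0], [1, 1, 0], [0, 0, 1]]
Y_pred = [[1, 0, 0], [0, 1, 0], [1, 0, 0], [0, 0, 1]]

hl = hamming_loss(Y_true, Y_pred)      # 2 wrong slots out of 12
sa = subset_accuracy(Y_true, Y_pred)   # 2 exact matches out of 4
print(hl, sa)
```

The two metrics can rank methods differently: a classifier that gets most individual labels right but rarely the full set scores well on Hamming loss and poorly on subset accuracy, which is consistent with the finding that the recommended method depends on the problem.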

Relevance: 30.00%

Abstract:

The deployment of nodes in Wireless Sensor Networks (WSNs) is one of the biggest challenges in this field, since it involves distributing a large number of embedded systems to fulfill a specific application. The connectivity of WSNs is difficult to estimate because of the irregularity of the physical environment, and this affects WSN designers' decisions on deploying sensor nodes. Therefore, in this paper, a new method is proposed to enhance the efficiency and accuracy of ZigBee propagation simulation in indoor environments. The method consists of two steps: automatic 3D indoor reconstruction and 3D ray-tracing based radio simulation. The automatic 3D indoor reconstruction employs an unattended image classification algorithm and an image vectorization algorithm to build the environment database accurately, which also significantly reduces the time and effort spent on issues unrelated to radio propagation. The 3D ray tracing is developed using a kd-tree space division algorithm and a modified polar sweep algorithm, which accelerates the search for rays over the entire space. A signal propagation model is proposed for the ray-tracing engine that considers both the materials of obstacles and the impact of positions along the radio ray path. Three different WSN deployments are realized in the indoor environment of an office, and the results are verified to be accurate. Experimental results also indicate that the proposed method is efficient in its pre-simulation strategy and 3D ray searching scheme and is suitable for different indoor environments.

Relevance: 30.00%

Abstract:

In this article, a tool for simulating the channel impulse response for indoor visible light communications using 3D computer-aided design (CAD) models is presented. The simulation tool is based on a previous Monte Carlo ray-tracing algorithm for indoor infrared channel estimation, but it also includes wavelength response evaluation. The 3D scene, or simulation environment, can be defined using any CAD software in which the user specifies, in addition to the setting geometry, the reflection characteristics of the surface materials as well as the structures of the emitters and receivers involved in the simulation. In an effort to improve computational efficiency, two optimizations are also proposed. The first consists of dividing the setting into cubic regions of equal size, which offers a calculation improvement of approximately 50% compared to not dividing the 3D scene into sub-regions. The second involves the parallelization of the simulation algorithm, which provides a computational speed-up proportional to the number of processors used.
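The first optimization, dividing the scene into equal-size cubic regions, can be illustrated with a uniform grid that buckets scene elements by cell, so that a ray only needs to test elements in the cells it passes through. This is a rough sketch under stated assumptions and not the tool's actual data structure: the function names, the use of single points as stand-ins for surface elements, and the 1.0-unit cell size are all hypothetical.

```python
import math
from collections import defaultdict


def cell_index(point, origin, cell_size):
    """Integer (i, j, k) index of the cubic cell containing `point`."""
    return tuple(
        math.floor((p - o) / cell_size) for p, o in zip(point, origin)
    )


def build_grid(points, origin, cell_size):
    """Map each occupied cell index to the ids of the points inside it."""
    grid = defaultdict(list)
    for i, pt in enumerate(points):
        grid[cell_index(pt, origin, cell_size)].append(i)
    return grid


# Assumed sample scene: four element positions in a room, in metres.
points = [
    (0.2, 0.3, 0.1),
    (1.7, 0.4, 0.9),
    (0.9, 0.9, 0.9),
    (2.5, 2.5, 0.1),
]
grid = build_grid(points, origin=(0.0, 0.0, 0.0), cell_size=1.0)

for cell, ids in sorted(grid.items()):
    print(cell, ids)
```

With this layout, an intersection query looks up only the cells along the ray instead of scanning every element in the scene. How much that saves in practice depends on the scene and the cell size; the roughly 50% figure is the article's measurement, not something this sketch reproduces.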

Relevance: 30.00%

Abstract:

The present work covers the first validation efforts of the EVA Tracking System for the assessment of minimally invasive surgery (MIS) psychomotor skills. Instrument movements were recorded for 42 surgeons (4 experts, 22 residents, 16 novice medical students) and analyzed for a box trainer peg transfer task. Construct validation was established for 7/9 motion analysis parameters (MAPs). Concurrent validation was determined for 8/9 MAPs against the TrEndo Tracking System. Finally, automatic determination of surgical proficiency based on the MAPs was sought using 3 different approaches to supervised classification (LDA, SVM, ANFIS), with accuracy results of 61.9%, 83.3% and 80.9% respectively. The results reflect not only on the validation of EVA for skills assessment, but also on the relevance of instrument motion analysis in determining surgical competence.

Relevance: 30.00%

Abstract:

As one of the most popular deep learning models, the convolutional neural network (CNN) has achieved huge success in image information extraction. Traditionally, a CNN is trained by supervised learning with labeled data and used as a classifier by adding a classification layer at the end. Its capability for extracting image features is largely limited by the difficulty of setting up a large training dataset. In this paper, we propose a new unsupervised learning CNN model, which uses a so-called convolutional sparse auto-encoder (CSAE) algorithm to pre-train the CNN. Instead of using labeled natural images for CNN training, the CSAE algorithm can be used to train the CNN with unlabeled artificial images, which enables easy expansion of training data and unsupervised learning. The CSAE algorithm is especially designed for extracting complex features from specific objects such as Chinese characters. After the features of artificial images are extracted by the CSAE algorithm, the learned parameters are used to initialize the first CNN convolutional layer, and the CNN model is then fine-trained on scene image patches with a linear classifier. The new CNN model is applied to Chinese scene text detection and is evaluated with a multilingual image dataset, which labels Chinese, English and numeral texts separately. A detection precision gain of more than 10% is observed over two CNN models.

Relevance: 30.00%

Abstract:

Human motion monitoring is an important function in numerous applications. In this dissertation, two systems for monitoring motions of multiple human targets in wide-area indoor environments are discussed, both of which use radio frequency (RF) signals to detect, localize, and classify different types of human motion. In the first system, a coherent monostatic multiple-input multiple-output (MIMO) array is used, and a joint spatial-temporal adaptive processing method is developed to resolve micro-Doppler signatures at each location in a wide area for motion mapping. The downranges are obtained by estimating time delays from the targets, and the crossranges are obtained by coherently filtering array spatial signals. Motion classification is then applied to each target based on micro-Doppler analysis. In the second system, multiple noncoherent multistatic transmitters (Tx's) and receivers (Rx's) are distributed across a wide area, and motion mapping is achieved by noncoherently combining bistatic range profiles from multiple Tx-Rx pairs. Motion classification is also applied to each target by noncoherently combining bistatic micro-Doppler signatures from multiple Tx-Rx pairs. For both systems, simulation and real data results are shown to demonstrate the ability of the proposed methods to monitor patient repositioning activities for pressure ulcer prevention.
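For a monostatic radar, the step from an estimated time delay to a downrange follows the standard round-trip relation R = c·τ/2. A Doppler shift f_d likewise maps to a radial velocity v = f_d·c/(2·f_c). The sketch below only illustrates these two textbook relations and is not the dissertation's processing chain: the function names, the 40 ns delay, the 40 Hz Doppler shift, and the 2.4 GHz carrier are assumed values chosen for the example.

```python
SPEED_OF_LIGHT = 299_792_458.0  # metres per second


def monostatic_range(round_trip_delay_s: float) -> float:
    """Downrange in metres from a round-trip delay (co-located Tx and Rx)."""
    return SPEED_OF_LIGHT * round_trip_delay_s / 2.0


def radial_velocity(doppler_hz: float, carrier_hz: float) -> float:
    """Radial velocity in m/s from a monostatic Doppler shift."""
    return doppler_hz * SPEED_OF_LIGHT / (2.0 * carrier_hz)


# Assumed example values, not taken from the dissertation.
target_range = monostatic_range(40e-9)        # 40 ns round trip
target_speed = radial_velocity(40.0, 2.4e9)   # 40 Hz shift at 2.4 GHz

print(f"range: {target_range:.3f} m, radial speed: {target_speed:.3f} m/s")
```

In the bistatic case used by the second system, a delay measures the total transmitter-to-target-to-receiver path length rather than twice a single range, so the factor of 2 above does not carry over directly.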

Relevance: 30.00%

Abstract:

Situational awareness is achieved naturally by the human senses of sight and hearing in combination. Automatic scene understanding aims at replicating this human ability using microphones and cameras in cooperation. In this paper, audio and video signals are fused and integrated at different levels of semantic abstraction. We detect and track a speaker who is relatively unconstrained, i.e., free to move indoors within an area larger than in comparable reported work, which is usually limited to round table meetings. The system is relatively simple, consisting of just 4 microphone pairs and a single camera. Results show that the overall multimodal tracker is more reliable than single-modality systems, tolerating large occlusions and cross-talk. System evaluation is performed on both single- and multi-modality tracking. The performance improvement given by the audio–video integration and fusion is quantified in terms of tracking precision and accuracy as well as speaker diarisation error rate and precision–recall (recognition). Improvements over the closest works are evaluated: 56% in sound source localisation computational cost over an audio-only system, 8% in speaker diarisation error rate over an audio-only speaker recognition unit, and 36% on the precision–recall metric over an audio–video dominant speaker recognition method.