13 results for Information retrieval
at Indian Institute of Science - Bangalore - India
Abstract:
The following topics were dealt with: document analysis and recognition; multimedia document processing; character recognition; document image processing; cheque processing; form processing; music processing; document segmentation; electronic documents; character classification; handwritten character recognition; information retrieval; postal automation; font recognition; Indian language OCR; handwriting recognition; performance evaluation; graphics recognition; oriental character recognition; and word recognition.
Abstract:
The problem of identifying user intent has received considerable attention in recent years, particularly in the context of improving the search experience via query contextualization. Intent can be characterized by multiple dimensions, which are often not observable from query words alone. Accurate identification of intent from query words remains a challenging problem, primarily because it is extremely difficult to discover these dimensions. The problem is often significantly compounded by a lack of representative training samples. We present a generic, extensible framework for learning the multi-dimensional representation of user intent from the query words. The approach models the latent relationships between facets using a tree-structured distribution, which leads to an efficient and convergent algorithm, FastQ, for identifying the multi-faceted intent of users based on just the query words. We also incorporate WordNet to extend the system's capabilities to queries containing words that do not appear in the training data. Empirical results show that FastQ identifies intent accurately when compared to a gold standard.
Abstract:
Ranking problems have become increasingly important in machine learning and data mining in recent years, with applications ranging from information retrieval and recommender systems to computational biology and drug discovery. In this paper, we describe a new ranking algorithm that directly maximizes the number of relevant objects retrieved at the absolute top of the list. The algorithm is a support vector style algorithm, but because of its different objective, it no longer leads to a quadratic programming problem. Instead, the dual optimization problem involves l1,∞ constraints; we solve this dual problem using the recent l1,∞ projection method of Quattoni et al. (2009). Our algorithm can be viewed as an l∞-norm extreme of the lp-norm based algorithm of Rudin (2009) (albeit in a support vector setting rather than a boosting setting); thus we refer to the algorithm as the ‘Infinite Push’. Experiments on real-world data sets confirm the algorithm’s focus on accuracy at the absolute top of the list.
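The quantity this abstract targets, relevant items ranked above every irrelevant one, can be illustrated with a short sketch. It only evaluates a given score list; it does not implement the support vector formulation, the dual problem, or the l1,∞ projection. The function names and the 0-1 form of the loss are assumptions made for illustration, not taken from the paper.

```python
def positives_at_top(scores, labels):
    """Count relevant items (label 1) scored strictly above every irrelevant item (label 0)."""
    neg_scores = [s for s, y in zip(scores, labels) if y == 0]
    top_negative = max(neg_scores) if neg_scores else float("-inf")
    return sum(1 for s, y in zip(scores, labels) if y == 1 and s > top_negative)


def infinite_push_loss(scores, labels):
    """Largest fraction of positives that any single negative ties or outranks.

    Hypothetical 0-1 form of an l-infinity style ranking loss (an assumption,
    not the paper's exact objective).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        return 0.0
    return max(sum(1 for p in pos if p <= n) for n in neg) / len(pos)


if __name__ == "__main__":
    scores = [0.9, 0.8, 0.7, 0.4, 0.2]
    labels = [1, 1, 0, 1, 0]
    print(positives_at_top(scores, labels))
    print(infinite_push_loss(scores, labels))
```

In this 0-1 form the loss equals one minus the fraction of positives at the top, so lowering it is the same as pushing more relevant items above the highest-scoring irrelevant one.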
Abstract:
In this paper we propose a postprocessing technique for a spectrogram-diffusion-based harmonic/percussion decomposition algorithm. The proposed technique removes harmonic instrument leakages in the percussion-enhanced outputs of the baseline algorithm. The technique uses median filtering and an adaptive detection of percussive segments in subbands, followed by piecewise signal reconstruction using envelope properties, to ensure that percussion is enhanced while harmonic leakages are suppressed. A new binary mask is created for the percussion signal which, when applied to the original signal, improves harmonic versus percussion separation. We compare our algorithm with two recent techniques and show that, on a database of polyphonic Indian music, the postprocessing algorithm improves the harmonic versus percussion decomposition significantly.
Abstract:
We propose an iterative algorithm to detect transient segments in audio signals. The short-time Fourier transform (STFT) is used to detect rapid local changes in the audio signal. The algorithm has two steps that are applied iteratively: (a) calculate a function of the STFT and (b) build a transient signal. A dynamic thresholding scheme is used to locate the potential positions of transients in the signal. The iterative procedure ensures that genuine transients are built up while localised spectral noise is suppressed by means of an energy criterion. The extracted transient signal is then compared to a ground truth dataset. The algorithm performed well on two databases: on the EBU-SQAM database of monophonic sounds it achieved an F-measure of 90%, while on our database of polyphonic audio it achieved an F-measure of 91%. This technique is being used as a preprocessing step for a tempo analysis algorithm and a TSR (Transients + Sines + Residue) decomposition scheme.
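The F-measure reported above is the harmonic mean of precision and recall. The sketch below shows one way such a score might be computed when detected transient times are compared with ground-truth times. The tolerance window, the greedy one-to-one matching, and the function name `f_measure` are assumptions for illustration; the abstract does not state how detections were matched to the ground truth.

```python
def f_measure(detected, reference, tolerance=0.05):
    """F-measure of detected event times against reference times (both in seconds).

    Assumption: a reference event counts as found if an unused detection lies
    within +/- tolerance of it; each detection may match at most one reference.
    """
    unused = sorted(detected)
    hits = 0
    for ref in sorted(reference):
        match = next((d for d in unused if abs(d - ref) <= tolerance), None)
        if match is not None:
            unused.remove(match)
            hits += 1
    precision = hits / len(detected) if detected else 0.0
    recall = hits / len(reference) if reference else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    detected = [0.10, 0.52, 1.30, 2.00]
    reference = [0.12, 0.50, 1.00, 2.01]
    print(f_measure(detected, reference))
```

Greedy matching is the simplest choice here; an evaluation that needs the best possible pairing would use an optimal assignment instead.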
Abstract:
Learning from Positive and Unlabelled examples (LPU) has emerged as an important problem in data mining and information retrieval applications. Existing techniques are not ideally suited for real world scenarios where the datasets are linearly inseparable, as they either build linear classifiers or the non-linear classifiers fail to achieve the desired performance. In this work, we propose to extend maximum margin clustering ideas and present an iterative procedure to design a non-linear classifier for LPU. In particular, we build a least squares support vector classifier, suitable for handling this problem due to symmetry of its loss function. Further, we present techniques for appropriately initializing the labels of unlabelled examples and for enforcing the ratio of positive to negative examples while obtaining these labels. Experiments on real-world datasets demonstrate that the non-linear classifier designed using the proposed approach gives significantly better generalization performance than the existing relevant approaches for LPU.
Abstract:
The Ozone Monitoring Instrument (OMI) aboard EOS-Aura and the Moderate Resolution Imaging Spectroradiometer (MODIS) onboard EOS-Aqua fly in formation as part of the A-train. Though OMI retrieves aerosol optical depth (AOD) and aerosol absorption, it must assume aerosol layer height. The MODIS cannot retrieve aerosol absorption, but MODIS aerosol retrieval is not sensitive to aerosol layer height and with its smaller pixel size is less affected by subpixel clouds. Here we demonstrate an approach that uses MODIS-retrieved AOD to constrain the OMI retrieval, freeing OMI from making an a priori estimate of aerosol height and allowing a more direct retrieval of aerosol absorption. To predict near-UV optical depths using MODIS data we rely on the spectral curvature of the MODIS-retrieved visible and near-IR spectral AODs. Application of an OMI-MODIS joint retrieval over the north tropical Atlantic shows good agreement between OMI and MODIS-predicted AODs in the UV, which implies that the aerosol height assumed in the OMI-standard algorithm is probably correct. In contrast, over the Arabian Sea, MODIS-predicted AOD deviated from the OMI-standard retrieval, but combined OMI-MODIS retrievals substantially improved information on aerosol layer height (on the basis of validation against airborne lidar measurements). This implies an improvement in the aerosol absorption retrieval, but lack of UV absorption measurements prevents a true validation. Our study demonstrates the potential of multisatellite analysis of A-train data to improve the accuracy of retrieved aerosol products and suggests that a combined OMI-MODIS-CALIPSO retrieval has large potential to further improve assessments of aerosol absorption.
Abstract:
Several investigators in the past have used the radiance depression (with respect to clear-sky infrared radiance), resulting from the presence of mineral dust aerosols in the atmosphere, as an index of dust aerosol load in the atmosphere during local noon. Here, we have used a modified approach to retrieve the dust index during the night, since assessment of diurnally averaged infrared dust forcing requires information on dust aerosols during the night. For this purpose, we used infrared radiance (10.5-12.5 μm) acquired from the METEOSAT-5 satellite (~5 km resolution). We found that the 'dust index' algorithm, valid for daytime, no longer holds during the night because dust is then hotter than the theoretical dust-free reference. Hence we followed a 'minimum reference' approach instead of the conventional 'maximum reference' approach. A detailed analysis suggests that the maximum dust load occurs during the daytime. Over the desert regions of India and Africa, the maximum change in dust load between day and night is as much as a factor of four, and factor-of-two variations are commonly observed. Recognizing the consequent impact on long-wave dust forcing, we carried out sensitivity studies, which indicate that using daytime data to estimate the diurnally averaged long-wave dust radiative forcing results in significant errors (as much as 50 to 70%). The annually and regionally averaged long-wave dust radiative forcing (which accounts for the diurnal variation of dust) at the top of the atmosphere over the Afro-Asian region is 2.6 ± 1.8 W m^-2, which is 30 to 50% lower than values reported earlier. Our studies indicate that neglecting the diurnal variation of dust while assessing its radiative impact leads to an overestimation of dust radiative forcing, which in turn results in an underestimation of the radiative impact of anthropogenic aerosols.
Abstract:
We have compared the total as well as fine-mode aerosol optical depth (τ and τ_fine) retrieved by the Moderate Resolution Imaging Spectroradiometer (MODIS) onboard Terra and Aqua (2001-2005) with the equivalent parameters derived by the Aerosol Robotic Network (AERONET) at Kanpur (26.45° N, 80.35° E), northern India. MODIS Collection 005 (C005)-derived τ_0.55 was found to be in good agreement with the AERONET measurements. The τ_fine and η (τ_fine/τ) were, however, significantly biased low in most matched cases. A new set of retrievals using an absorbing aerosol model (SSA ~0.87) with increased visible surface reflectance provided improved τ and τ_fine at Kanpur. The new derivation of η also compares well qualitatively with an independent set of in situ measurements of accumulation mass fraction over much of southern India. This suggests that although the MODIS land algorithm has limited information to derive size properties of aerosols over land, more accurate parameterization of aerosol and surface properties within the existing C005 algorithm may improve the accuracy of size-resolved aerosol optical properties. The results presented in this paper indicate that there is a need to reconsider the surface parameterization and assumed aerosol properties in the MODIS C005 algorithm over the Indian region in order to retrieve more accurate aerosol optical and size properties, which are essential to quantify the impact of human-made aerosols on climate.
Abstract:
We propose a method to encode 3D magnetic resonance image data, together with a decoder, in such a way that fast access to any 2D image is possible by decoding only the corresponding information from each subband image, thus minimizing decoding time. This will be of immense use to the medical community, because most PET and MRI data are volumetric. Preprocessing is carried out at every level before wavelet transformation to enable easier identification of coefficients from each subband image. Inclusion of special characters in the bit stream facilitates access to the corresponding information in the encoded data. Results are obtained by applying Daub4 along the x (row) and y (column) directions and Haar along the z (slice) direction. The results are comparable to those of the existing technique; in addition, decoding time is reduced by a factor of 1.98. Arithmetic coding is used to encode the corresponding information independently.
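Why a Haar transform along the slice direction permits random access can be seen in a minimal sketch: each pair of adjacent slices maps to one low-pass and one high-pass coefficient image, so any single slice can be rebuilt from just that pair. The sketch covers only a one-level Haar step along z. It omits the Daub4 in-plane transform, the preprocessing, the bit-stream markers, and the arithmetic coding, and the function names are hypothetical.

```python
import math


def haar_z_forward(slices):
    """One-level Haar transform along z for an even number of equally sized 2D slices."""
    s = math.sqrt(2.0)
    low, high = [], []
    for k in range(0, len(slices), 2):
        a, b = slices[k], slices[k + 1]
        low.append([[(x + y) / s for x, y in zip(ra, rb)] for ra, rb in zip(a, b)])
        high.append([[(x - y) / s for x, y in zip(ra, rb)] for ra, rb in zip(a, b)])
    return low, high


def decode_slice(low, high, z):
    """Rebuild slice z using only the one low/high coefficient pair that covers it."""
    s = math.sqrt(2.0)
    sign = 1.0 if z % 2 == 0 else -1.0  # even slices add the detail, odd slices subtract it
    lo, hi = low[z // 2], high[z // 2]
    return [[(l + sign * h) / s for l, h in zip(rl, rh)] for rl, rh in zip(lo, hi)]


if __name__ == "__main__":
    volume = [
        [[1, 2], [3, 4]],
        [[5, 6], [7, 8]],
        [[9, 10], [11, 12]],
        [[13, 14], [15, 16]],
    ]
    low, high = haar_z_forward(volume)
    print(decode_slice(low, high, 2))
```

Decoding slice 2 touches only `low[1]` and `high[1]`; the coefficients of the other slice pairs are never read.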
Abstract:
As rapid brain development occurs during the neonatal period, environmental manipulation during this period may have a significant impact on sleep and memory functions. Moreover, rapid eye movement (REM) sleep plays an important role in integrating new information with previously stored emotional experience. Hence, the impact of early maternal separation and isolation stress (MS) during the stress hyporesponsive period (SHRP) on fear memory retention and sleep in rats was studied. The neonatal rats were subjected to maternal separation and isolation stress during postnatal days 5-7 (6 h daily/3 d). Polysomnographic recordings and differential fear conditioning were carried out in two different sets of rats aged 2 months. The neuronal replay during REM sleep was analyzed using different parameters. MS rats spent more time in the REM stage, and their total sleep period also increased. MS rats showed fear generalization and greater fear memory retention than normal controls (NC). Detailed analysis of the local field potentials across different time periods of REM sleep showed increased theta oscillations in the hippocampus, amygdala and cortical circuits. Our findings suggest that stress during SHRP sensitizes the hippocampus-amygdala-cortical loops, which could be due to the increased release of corticosterone that generally occurs during REM sleep. These rats, when subjected to fear conditioning, exhibit increased fear memory and increased fear generalization. The development of helplessness, anxiety and sleep changes in human patients could thus be related to the effects of reduced thermal, tactile and social stimulation during SHRP on brain plasticity and fear memory functions. © 2014 Elsevier B.V. All rights reserved.
Abstract:
The current study presents an algorithm to retrieve surface Soil Moisture (SM) from multi-temporal Synthetic Aperture Radar (SAR) data. The developed algorithm is based on the Cumulative Density Function (CDF) transformation of the multi-temporal RADARSAT-2 backscatter coefficient (BC) to obtain relative SM values, which are then converted into absolute SM values using soil information. The algorithm is tested in a semi-arid tropical region in South India using 30 RADARSAT-2 satellite images, SMOS L2 SM products, and 1262 SM field measurements in 50 plots spanning 4 years. Validation against the field data showed that the developed algorithm retrieves SM with an RMSE ranging from 0.02 to 0.06 m³/m³ for the majority of plots. Comparison with the SMOS SM showed good temporal behaviour, with an RMSE of approximately 0.05 m³/m³ and a correlation coefficient of approximately 0.9. The developed model is compared with, and found to perform better than, the change detection and delta index models. The approach does not require calibration of any parameter to obtain relative SM and hence can easily be extended to any region for which a time series of SAR data is available.
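The two steps named in this abstract, a CDF transformation to relative SM followed by conversion to absolute SM, can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the empirical CDF is computed per time series, and the conversion is assumed to be a linear scaling between two hypothetical soil-derived bounds, `sm_dry` and `sm_wet`, that the abstract does not specify.

```python
def relative_sm(backscatter_series):
    """Empirical CDF value of each observation within its own time series.

    Returns, for each value, the fraction of observations less than or equal to it,
    so the smallest value maps to 1/n and the largest to 1.
    """
    n = len(backscatter_series)
    return [sum(1 for v in backscatter_series if v <= x) / n for x in backscatter_series]


def absolute_sm(relative, sm_dry, sm_wet):
    """Scale relative SM linearly between assumed dry and wet bounds (m3/m3).

    sm_dry and sm_wet are hypothetical soil-derived limits, not values from the source.
    """
    return [sm_dry + r * (sm_wet - sm_dry) for r in relative]


if __name__ == "__main__":
    backscatter_db = [-14.0, -10.0, -12.0, -8.0]  # made-up values for one plot
    rel = relative_sm(backscatter_db)
    print(rel)
    print(absolute_sm(rel, sm_dry=0.05, sm_wet=0.35))
```

Because the relative values depend only on the rank of each observation within the series, this step needs no calibrated parameter, in line with the abstract's closing claim.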