Abstract:
The black rock series of the Upper Ordovician-Lower Silurian in the Yangtze area are important source rocks with distinctive sedimentary, biological, element-geochemical, carbon and oxygen isotopic, and organic-geochemical characteristics. These characteristics reflect important geological events. Because systematic research has been scarce, many problems related to the development mechanism of these source rocks remain unsolved, and this restricts oil and gas exploration in South China. In this paper, the author studies the palaeoclimate, palaeostructure and palaeoenvironment of the Upper Ordovician-Lower Silurian in the Yangtze area through sedimentology, palaeobiology and geochemistry, especially element geochemistry and isotope geochemistry. An environmental model for the source rocks is established and several conclusions are drawn. The Upper Ordovician-Lower Silurian sediments in the Yangtze area are mostly black shales, followed by mudstone, shell limestone and siltstone. During the Late Ordovician and Early Silurian, a series of large uplifts and depressions developed in the Yangtze area, and this pattern of alternating uplift and depression isolated the Yangtze palaeosea from the open sea. A stagnant, anoxic environment, favourable for the deposition of organic-rich black shales, therefore formed in the Yangtze area. Chemical Index of Alteration (CIA) values of the lower Wufeng Formation and the Longmaxi Formation indicate moderate chemical weathering, suggesting deposition under a warm and humid climate. However, the large spread of CIA values in the N. extraordinarius-N. ojsuensis biozone suggests a changeable climate. Two different climate regimes therefore prevailed during deposition of the Wufeng and Longmaxi Formations.
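The Chemical Index of Alteration used above has a standard definition: the molar proportion of Al2O3 among the major oxides, with CaO restricted to the silicate fraction. A minimal sketch of the calculation, with hypothetical oxide values rather than data from this study:

```python
# Molar masses (g/mol) of the oxides entering the CIA
MOLAR_MASS = {"Al2O3": 101.96, "CaO": 56.08, "Na2O": 61.98, "K2O": 94.20}

def cia(wt_pct):
    """CIA = 100 * Al2O3 / (Al2O3 + CaO* + Na2O + K2O) in molar proportions.

    CaO* is the CaO of the silicate fraction only; the input here is assumed
    to be already corrected for carbonate and apatite.
    """
    mol = {ox: wt_pct[ox] / MOLAR_MASS[ox] for ox in MOLAR_MASS}
    return 100.0 * mol["Al2O3"] / (
        mol["Al2O3"] + mol["CaO"] + mol["Na2O"] + mol["K2O"])

# Hypothetical shale-like composition (weight percent)
sample = {"Al2O3": 16.0, "CaO": 1.2, "Na2O": 1.5, "K2O": 3.5}
value = cia(sample)
```

Values near 50 indicate fresh rock, while values approaching 100 indicate intense chemical weathering; intermediate values are read as moderate weathering, as for the Wufeng-Longmaxi samples in the text.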
During the Late Ordovician-Early Silurian, the surface water of the Yangtze palaeosea, rich in nutrients and abundant in bacteria and algae, sustained high palaeoproductivity that varied markedly in space and time. Sulphate content changes gradually from the surface to the deep water column: salinity in the surface water is markedly low, while salinity in the deep water is normal. This salinity stratification favours the formation of a deep anoxic environment. During the Wufeng period, an oxidized, low-sulphate environment existed in the upper Yangtze palaeosea, while an anoxic environment of normal salinity occupied the lower Yangtze palaeosea. During the late Wufeng and Guanyinqiao periods, the stable anoxic environment was replaced by an oxidized one; during the Longmaxi period, the stratified anoxic environment recurred. Studies of δ13C of sedimentary organic carbon in the Yangtze area show a positive excursion of up to 4‰ in the Guanyinqiao stage, followed by a sharp negative excursion in the early Longmaxi stage. These organic carbon isotope curves are not only an efficient means of delimiting stratigraphic boundaries but also reflect changes in primary productivity, and they record the process of enhanced global burial of organic carbon. Anoxic events are the main factor increasing the rate of organic carbon burial, and the reduced organic carbon burial in the Hirnantian stage is due to an oxygen-rich water column. The δ34S values show a gradual positive excursion from the P. pacificus biozone to the N. extraordinarius biozone, reach a maximum in the upper Hirnantian stage, and then show a negative excursion. The excursions of δ13C and δ34S reflect abrupt environmental change.
The formation of source rocks depends largely on the nature of the organisms from which kerogen is derived and on the preservation conditions of the organic matter, which in turn depend on a favourable combination of the conditions in which organisms live and are subsequently buried. These conditions include palaeoclimate, palaeostructure and palaeoenvironment. On this basis, the coupling of the source rocks with palaeoclimatic, palaeostructural and palaeoenvironmental conditions is confirmed, and an "anoxia-marginal depression-photosynthesis" environmental model is established. Anoxia played an important role in the accumulation of organic matter, the organic matter produced accumulated in marginal depressions of the Yangtze area, and photosynthesis favoured high productivity. The source rocks have good prospects, comparable to the "hot shale" deposited in North Africa.
Abstract:
In this paper, we apply the preconditioned conjugate gradient method to the solution of positive-definite Toeplitz systems. In particular, we introduce a new class of ω-circulant preconditioners Pn[ω], constructed by an embedding method. We discuss the properties of these new preconditioners and prove that many earlier preconditioners can be regarded as special cases of Pn[ω]. The introduction of the ω-circulant preconditioners Pn[ω] largely overcomes the singularity caused by circulant preconditioners. We discuss ω-circulant series and functions, and compare ordinary circularity with ω-circularity, showing that the latter can be regarded as an extension of the former; correspondingly, many methods and theorems for ordinary circulants can be extended. Furthermore, we present an ω-circulant decomposition method. With this method, any ω-circulant signal can be divided into a sum of sub-signals; among these sub-signals there are many subseries whose period is exactly 1, and these are in fact the frequency elements of the original ω-circulant signal. In this way we can establish the relationship between a signal and its frequency elements: the frequency elements in the frequency domain are in fact signals of period 1 in the spatial domain. We also prove that the ω-circulant already exists in classical Fourier theory. By using different criteria for constructing preconditioners, we can obtain many different preconditioned systems.
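A minimal sketch of the approach described above, for the ordinary circulant case (ω = 1): a symmetric positive-definite Toeplitz system is solved by conjugate gradients, with the Toeplitz matrix-vector product applied through a circulant embedding of twice the size and Strang's circulant approximation used as the preconditioner. The function names and the test matrix are illustrative, not from the thesis:

```python
import numpy as np

def toeplitz_matvec(c, x):
    # c: first column of a symmetric Toeplitz matrix T; returns T @ x.
    # T is embedded in a 2n x 2n circulant so the product costs O(n log n).
    n = len(x)
    col = np.concatenate([c, [0.0], c[:0:-1]])
    y = np.fft.ifft(np.fft.fft(col) * np.fft.fft(x, 2 * n)).real
    return y[:n]

def strang_eigs(c):
    # eigenvalues of Strang's circulant approximation to T (via one FFT)
    n = len(c)
    s = c.copy()
    for k in range(1, n):
        if k > n // 2:
            s[k] = c[n - k]
    return np.fft.fft(s).real

def pcg_toeplitz(c, b, tol=1e-10, maxit=200):
    # preconditioned conjugate gradients; the preconditioner solve is a
    # pointwise division in the Fourier domain
    n = len(b)
    lam = strang_eigs(c)
    apply_Minv = lambda r: np.fft.ifft(np.fft.fft(r) / lam).real
    x = np.zeros(n)
    r = b - toeplitz_matvec(c, x)
    z = apply_Minv(r)
    p = z.copy()
    for _ in range(maxit):
        Ap = toeplitz_matvec(c, p)
        alpha = (r @ z) / (p @ Ap)
        x += alpha * p
        r_new = r - alpha * Ap
        if np.linalg.norm(r_new) < tol * np.linalg.norm(b):
            break
        z_new = apply_Minv(r_new)
        beta = (r_new @ z_new) / (r @ z)
        p = z_new + beta * p
        r, z = r_new, z_new
    return x

# Example: a diagonally dominant SPD Toeplitz system (illustrative)
n = 64
c = np.zeros(n); c[0] = 2.5; c[1] = -1.0
b = np.sin(np.arange(n, dtype=float))
x = pcg_toeplitz(c, b)
```

The singularity problem mentioned in the abstract shows up exactly here: for some Toeplitz matrices the circulant approximation has (near-)zero eigenvalues `lam`, which the ω-circulant family is designed to avoid.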
Abstract:
This report consists of two parts. In the first part, the zonation of the complexes in terms of series, lithofacies, and the depth of magma sources and chambers is discussed in detail for the first time, based on new petrochemical, isotopic and tectono-magmatic data on the Mesozoic volcano-plutonic complexes of the southern Great Hinggan Mountains. A genetic model for the zonality, a double overlapped layer system, is then proposed. The main conclusions are as follows. The Mesozoic volcano-plutonic complexes of the southern Great Hinggan were formed by four stages of magmatic activity on the basis of the subduction system formed in the late Paleozoic. Mesozoic magmatic activity began in the Middle Jurassic, flourished in the Late Jurassic, and declined in the Early Cretaceous. The complexes consist dominantly of acidic rocks with substantial intermediate rocks and a few mafic ones, and include calc-alkaline, high-potassium calc-alkaline, shoshonitic, and a few alkaline series; most of the rocks are characterized by high potassium. The volcano-plutonic complexes are zoned and can be divided into three main zones. The west zone, located on the northwestern side of the Proterozoic gneiss zone of the Great Hinggan Mountains, is dominated by high-potassium basalts and basaltic andesites. The middle zone lies on the southeastern side of the Proterozoic gneiss zone, its southeastern margin running along Huangganliang, Wushijiazi and Baitazi; it is composed dominantly of calc-alkaline and high-potassium calc-alkaline rocks, deep-seated granites and extrusive rhyolites. The east zone, occurring along Kesheketong Qi-Balinyou Qi-Balinzuo Qi, is dominated by shoshonite. In general, southeastward from the Proterozoic gneiss zone, the Mesozoic plutons show successive zones of two-mica granitite, hornblende-mica granitite and mica-hornblende granitite, and the volcanic rocks likewise display zones of calc-alkaline to high-potassium calc-alkaline rocks and of shoshonites.
In the same space, the late Paleozoic plutons display the same zonality, with zones of binary granite, granodiorite, quartz diorite and diorite southeastward from the gneiss zone. Middle Jurassic granite plutons are distributed almost entirely in the middle zone, whereas Late Jurassic volcanic rocks are distributed in the west and east zones. This distribution of the volcano-plutonic complexes reveals that the middle zone was uplifted more intensively than the other zones in the Middle and Late Jurassic. Whole-rock Rb-Sr isochron ages of the high-potassium calc-alkaline volcanic rocks in the west zone, the calc-alkaline and high-potassium calc-alkaline granites in the middle zone, and the shoshonites in the east zone are 136 Ma, 175 Ma and 154 Ma, respectively. The alkaline rocks close to the shoshonite zone yield 143 Ma and 126 Ma. These isochron ages compare well with the K-Ar ages obtained previously by other researchers. The Sr and Nd isotopic compositions suggest that the source of the Mesozoic volcano-plutonic complexes in the Great Hinggan Mountains is mostly Paleo-Asian oceanic volcano-sedimentary rocks, probably mixed with ancient gneiss. The tectonic setting of the Mesozoic magmatism was a subducting continental margin, but it was not formed directly by the present west Pacific subduction; it was the reworking, controlled by west Pacific subduction, of the Paleozoic subduction system formed during the closure of the Paleo-Asian ocean. For this reason, although the Great Hinggan Mountains are far from the west Pacific subduction zone, their volcanic arc still developed in concert with the volcanic activity of eastern China, yet the variation trend of potassium content in the volcano-plutonic complexes of the Great Hinggan is the reverse of that of the west Pacific.
The primitive magmas occurring in the southern Great Hinggan Mountains include high-potassium calc-alkaline basalt, high-potassium calc-alkaline rhyolite, high-potassium rhyolite, and trachy-rhyolite without a negative Eu anomaly, among others. Thus all of the primitive magmas are either mafic or acidic, and most of the intermediate rocks in the area are the products of Mesozoic acidic magma contaminated by the Paleozoic volcano-sedimentary rocks. The depth of the primitive magma sources and chambers increases gradually from northwest to southeast, suggesting that the Paleozoic subduction system still controlled the Mesozoic magmatism. In summary, the lithospheric tectonic system of the southern Great Hinggan Mountains that controlled Mesozoic magmatism is a double overlapped layer system developed from the Paleozoic subduction system. For this reason, the crust of the southern Great Hinggan Mountains is thicker than that on its two sides, which causes the regional negative gravity anomaly. The second part of this report presents the continuation of research carried out during my doctoral period. The author presents new data on the Rb-Sr and Sm-Nd isotopic compositions and ages, geochemical features, genetic mineralogy and ore deposit geology of the volcanic rocks in the Kunyang rift, and on this basis presents a prospect for a copper-bearing magnetite ore deposit. The most important conclusions are as follows: 1. It is proved that all of the carbonatites controlled by a ring structure system in the Wuding-Lufeng basin of central Yunnan were formed in the Mesoproterozoic, in two identifiable stages: in the first stage, carbonatitic volcanic rocks, such as lavas (Sm-Nd, 1685 Ma), basaltic porphyrite dykes (Sm-Nd, 1645 Ma), pyroclastic rocks and volcaniclastic sedimentary rocks, formed in the outer ring; in the second stage, carbonatitic breccias and dykes (Rb-Sr, 1048 Ma) formed in the middle ring. The metamorphic age of the carbonatitic lavas in the outer ring (Rb-Sr, 893 Ma) was also determined.
The magma of the carbonatitic volcanic rocks derived mainly from enriched mantle whose precursor was depleted mantle that had been metasomatized by mantle fluids and contaminated by Archaean lower crust. Carbonatitic spheres were recently discovered in the ore-bearing layers of the Lishi copper mine in Yimen; they formed during the extrusion of calcite-carbonatitic magma. This discovery indicates that the genesis of the copper ore deposit is related to carbonatitic volcanic activity. The iron and copper ore deposits occurring in the carbonatitic volcano-sedimentary rocks of the Kunyang rift result from carbonatitic magmatism; the author terms this kind of deposit a subaqueous carbonatitic iron-copper deposit. The magnetic anomaly area north of the Lishi copper mine in Yimen was a depression lower than its surroundings. The iron and copper ores occurring on the margin of the magnetic anomaly are volcanic hydrothermal deposits, and the magnetic body causing the anomaly must be magnetite ore. Because the anomaly area is wide, it can be inferred that a large concealed ore deposit is buried there.
Abstract:
This dissertation addresses the problems of signal reconstruction and data restoration in seismic data processing, taking signal representation methods as the main thread and seismic information reconstruction (signal separation and trace interpolation) as the core. For signal representation on natural bases, I present the fundamentals and algorithms of independent component analysis (ICA) and its novel applications to the separation of natural earthquake signals and of exploration seismic signals. For signal representation on deterministic bases, the dissertation proposes least-squares inversion regularization methods, sparseness constraints and preconditioned conjugate gradient (PCG) methods for seismic data reconstruction, with applications to seismic deconvolution, Radon transformation, etc. The core content is a de-aliased reconstruction algorithm for unevenly sampled seismic data and its application to seismic interpolation. Although the dissertation discusses two cases of signal representation, they can be integrated into one framework, because both deal with signal or information restoration: the former reconstructs original signals from mixed signals, the latter reconstructs complete data from sparse or irregular data. Their common goal is to provide pre-processing and post-processing methods for seismic pre-stack depth migration. ICA can separate original signals from their mixtures, or extract the basic structure of the analyzed data. I survey the fundamentals, algorithms and applications of ICA. In comparison with the KL transformation, I propose the concept of an independent components transformation (ICT). Based on the negentropy measure of independence, I implement FastICA and improve it using the covariance matrix. By analyzing the characteristics of seismic signals, I introduce ICA into seismic signal processing, a first in the geophysical community, and implement the separation of noise from seismic signal.
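As a rough illustration of the separation step described above, here is the generic symmetric FastICA iteration with a tanh contrast (a standard negentropy approximation), not the author's covariance-matrix-improved variant; the two synthetic sources and the mixing matrix are illustrative:

```python
import numpy as np

def whiten(X):
    # zero-mean, unit-covariance transform of the mixtures (rows = channels)
    X = X - X.mean(axis=1, keepdims=True)
    d, E = np.linalg.eigh(X @ X.T / X.shape[1])
    return (E @ np.diag(1.0 / np.sqrt(d)) @ E.T) @ X

def fastica(X, n_iter=200, seed=0):
    # symmetric FastICA with the tanh nonlinearity
    Z = whiten(X)
    n, m = Z.shape
    W = np.random.default_rng(seed).standard_normal((n, n))
    for _ in range(n_iter):
        WZ = np.tanh(W @ Z)
        # fixed-point update: E{Z g(wZ)} - E{g'(wZ)} w, row by row
        W = WZ @ Z.T / m - np.diag((1.0 - WZ ** 2).mean(axis=1)) @ W
        # symmetric decorrelation: W <- (W W^T)^(-1/2) W
        d, E = np.linalg.eigh(W @ W.T)
        W = E @ np.diag(1.0 / np.sqrt(d)) @ E.T @ W
    return W @ Z

# Two synthetic sources, mixed by an unknown 2x2 matrix, then separated
t = np.linspace(0.0, 1.0, 2000)
S = np.vstack([np.sign(np.sin(14 * np.pi * t)), np.sin(6 * np.pi * t)])
A = np.array([[1.0, 0.6], [0.4, 1.0]])
Y = fastica(A @ S)
```

The recovered components match the sources up to the usual ICA ambiguities of ordering and scale.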
Synthetic and real data examples show the applicability of ICA to seismic signal processing, and initial results are achieved. ICA is applied to separating earthquake converted waves from multiples in a sedimentary area, with good results, leading to a more reasonable interpretation of subsurface discontinuities. The results show the promise of applying ICA to geophysical signal processing. Using the relationship between ICA and blind deconvolution, I survey seismic blind deconvolution and discuss the prospects of applying ICA to it, with two possible solutions. The relationship among PCA, ICA and the wavelet transform is described, and it is proved that the reconstruction of wavelet prototype functions is a Lie group representation. In addition, an over-sampled wavelet transform is proposed to enhance seismic data resolution, and it is validated by numerical examples. The key to pre-stack depth migration is the regularization of pre-stack seismic data, for which seismic interpolation and missing-data reconstruction are necessary procedures. I first review seismic imaging methods in order to argue the critical effect of regularization. Reviewing seismic interpolation algorithms, I note that de-aliased reconstruction of uneven data remains a challenge. The fundamentals of seismic reconstruction are discussed first; then the sparseness constraint on least-squares inversion and the preconditioned conjugate gradient solver are studied and implemented. Choosing a constraint term with a Cauchy distribution, I program the PCG algorithm and implement sparse seismic deconvolution and high-resolution Radon transformation by PCG, in preparation for seismic data reconstruction. As for seismic interpolation, de-aliased interpolation of even data and reconstruction of uneven data each work well separately, but previously they could not be combined.
In this dissertation, a novel Fourier-transform-based method and algorithm are proposed that can reconstruct seismic data that are both uneven and aliased. I formulate band-limited data reconstruction as a minimum-norm least-squares inversion problem with an adaptive DFT-weighted norm regularization term. The inverse problem is solved by the preconditioned conjugate gradient method, which makes the solution stable and rapidly convergent. Based on the assumption that seismic data consist of a finite number of linear events, it follows from the sampling theorem that aliased events can be attenuated via least-squares weights predicted linearly from the low frequencies. Three application issues are discussed: interpolation across even gaps, filling of uneven gaps, and reconstruction of high-frequency traces from low-frequency data constrained by a few high-frequency traces. Both synthetic and real data examples show that the proposed method is valid, efficient and applicable. The research is valuable for seismic data regularization and cross-well seismics. To meet the data requirements of 3D shot-profile depth migration, schemes must be adopted to make the data regular and consistent with the velocity dataset. The methods of this dissertation are used to interpolate and extrapolate the shot gathers instead of simply inserting zero traces; the migration aperture is thereby enlarged and the migration result improved. The results demonstrate the method's effectiveness and practicability.
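A much-simplified stand-in for the reconstruction idea above: fit a band-limited Fourier model to irregular samples by least squares (here with a dense solver rather than the adaptively DFT-weighted PCG inversion of the dissertation) and evaluate the model on a regular grid; the test signal is illustrative:

```python
import numpy as np

def bandlimited_reconstruct(t_irreg, y, n_grid, n_harm):
    """Fit y(t) on irregular times with a band-limited Fourier series
    (harmonics -n_harm..n_harm, period 1) by least squares, then evaluate
    the fitted model on a regular n_grid-point grid."""
    k = np.arange(-n_harm, n_harm + 1)
    F = np.exp(2j * np.pi * np.outer(t_irreg, k))      # design matrix
    coef, *_ = np.linalg.lstsq(F, y.astype(complex), rcond=None)
    t_reg = np.arange(n_grid) / n_grid
    return (np.exp(2j * np.pi * np.outer(t_reg, k)) @ coef).real

# Irregular samples of a band-limited signal, reconstructed on a regular grid
rng = np.random.default_rng(1)
t_irreg = np.sort(rng.uniform(0.0, 1.0, 60))
truth = lambda t: np.sin(6 * np.pi * t) + 0.5 * np.cos(10 * np.pi * t)
y_reg = bandlimited_reconstruct(t_irreg, truth(t_irreg), n_grid=100, n_harm=8)
```

Because the true signal lies inside the assumed band, the least-squares fit recovers the regular trace essentially exactly; the dissertation's contribution concerns the harder aliased case, which this sketch does not address.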
Abstract:
With the development of oil and gas exploration in China, continental exploration is shifting from structural reservoirs to subtle reservoirs. The reserves of the subtle oil and gas reservoirs found so far account for more than 60 percent of the discovered reserves, so exploration for subtle reservoirs is becoming increasingly important and can be taken as the main direction for reserve growth. The characteristics of continental sedimentary facies determine the complexity of lithological exploration. Most of the continental rift basins in East China have entered exploration stages of medium to high maturity. Although the quality of the seismic data there is relatively good, these areas feature thin sands, small faults and strata of limited extent, which demand seismic data of high resolution; an important task is to improve the signal-to-noise ratio of the high-frequency components of the seismic data. In West China, there are complex landforms, deeply buried exploration targets, complex geological structures, many fractures, traps of small extent, poor rock properties, many high-pressure strata and drilling difficulties; these conditions produce seismic records with a low signal-to-noise ratio and complex kinds of noise, requiring the development of noise-attenuation methods and techniques in data acquisition and processing. Oil and gas exploration therefore needs high-resolution geophysical techniques in order to implement the oil resources strategy of keeping production and reserves stable in East China and growing production and reserves in West China. A high signal-to-noise ratio of the seismic data is the foundation: without it, high resolution and high fidelity are impossible to achieve.
We place emphasis on research based on structural analysis for improving the signal-to-noise ratio in complex areas. Several noise-attenuation methods are put forward that truly reflect the geological features: they preserve the geological structures, keep the edges of geological features and improve the identification of oil and gas traps. The ideas of emphasizing fundamentals, giving prominence to innovation and paying attention to application run through this dissertation. The conventional dip-scanning method, centered on the scanned point, inevitably blurs the edges of geological features such as faults and fractures. We develop a new dip-scanning method, scanning from an endpoint toward two sides, to solve this problem. Using this new dip-scanning method, we put forward signal estimation with coherence, seismic waveform characterization with coherence, and most-homogeneous dip-scanning for noise attenuation. These methods preserve geological character, suppress random noise, and improve the signal-to-noise ratio and resolution. Routine dip-scanning operates in the time-space domain; we also propose a new dip-scanning method in the frequency-wavenumber domain for noise attenuation, which exploits the separation of reflection events of different dips in the f-k domain and can both reduce noise and recover dip information. We further describe a methodology for studying and developing filtering methods based on differential equations: it transforms the filtering equations from the frequency or f-k domain into the time or time-space domain and uses a finite-difference algorithm to solve them. This method does not require the seismic data to be stationary, so the filter parameters can vary at every temporal and spatial point, which enhances the adaptability of the filter; it is also computationally efficient. Finally, we put forward a matching pursuit method for noise suppression.
This method decomposes any signal into a linear expansion of waveforms selected from a redundant dictionary of functions, chosen to best match the signal structures; it can extract the effective signal from noisy data and reduce the noise. We also introduce a beamforming filtering method for noise elimination; real seismic data processing shows that it is effective in attenuating multiples, including internal multiples. The signal-to-noise ratio and resolution are improved, and the effective signals retain high fidelity. Through tests on theoretical models and application to real seismic data processing, it is demonstrated that the methods in this dissertation can effectively suppress random noise, eliminate coherent noise, and improve the resolution of the seismic data; they are highly practical and their effect is evident.
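The matching pursuit decomposition described above can be sketched in a few lines; the dictionary of unit-norm sinusoids and the two-atom test signal are illustrative choices, not the dissertation's seismic dictionary:

```python
import numpy as np

def matching_pursuit(signal, dictionary, n_atoms):
    """Greedy matching pursuit: repeatedly pick the unit-norm dictionary atom
    most correlated with the residual and subtract its projection."""
    residual = signal.astype(float).copy()
    approx = np.zeros_like(residual)
    picks = []
    for _ in range(n_atoms):
        corr = dictionary @ residual
        j = int(np.argmax(np.abs(corr)))
        approx += corr[j] * dictionary[j]
        residual -= corr[j] * dictionary[j]
        picks.append(j)
    return approx, picks

# Dictionary of unit-norm cosines; the signal is two atoms plus noise
n = 256
t = np.arange(n)
D = np.vstack([np.cos(2 * np.pi * f * t / n) for f in range(1, 40)])
D /= np.linalg.norm(D, axis=1, keepdims=True)
rng = np.random.default_rng(2)
clean = 3.0 * D[4] + 1.5 * D[11]
noisy = clean + 0.1 * rng.standard_normal(n)
denoised, picks = matching_pursuit(noisy, D, n_atoms=2)
```

Because the noise correlates only weakly with every atom, stopping after a few atoms keeps the structured signal and leaves most of the noise in the residual, which is the denoising mechanism the text relies on.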
Abstract:
The technique of energy extraction using groundwater source heat pumps, a sustainable way of utilizing low-grade thermal energy, has been widely used since the mid-1990s. Based on the basic theories of groundwater flow and heat transfer, and employing two analytic models, the relationship between the thermal breakthrough time of a production well and the factors affecting it is analyzed, and the impact on the geo-temperature field of heat transfer by conduction and convection under different groundwater velocity conditions is discussed. A mathematical model coupling the equations for groundwater flow with those for heat transfer was developed. The impact of energy mining using a single-well system of supplying and returning water on the geo-temperature field, under different hydrogeological conditions, well structures, withdrawal-and-reinjection rates and natural groundwater flow velocities, was quantitatively simulated using the finite-difference simulator HST3D, and theoretical analyses of the simulated results were made. The simulation results for the single-well system indicate that neither the permeability nor the porosity of a homogeneous aquifer has a significant effect on the temperature of the production segment, provided that the production and injection capacity of each well in the aquifer can meet the designed value. If there is an interlayer of lower permeability than the main aquifer between the production and injection segments, the temperature change of the production segment will decrease: the thicker the interlayer and the lower its permeability, the longer the thermal breakthrough time of the production segment and the smaller its temperature change.
The modeling also shows that the temperature change of the production segment declines with an increase in the aquifer thickness, in the distance between the production and injection screens or in the regional groundwater flow velocity, or with a decrease in the production-and-reinjection rate. For an aquifer of constant thickness, continuously increasing the screen lengths of the production and injection segments may reduce the distance between the screens, and the temperature change of the production segment will consequently increase. Based on the single-well simulations, the parameters with significant influence on heat transfer and the geo-temperature field were chosen for the doublet-system simulation. It is shown that the temperature change of the pumping well decreases as the aquifer thickness, the distance between the well pair or the screen lengths of the doublet increase. Where a low-permeability interlayer is embedded in the main aquifer, if the screens of the pumping and injection wells are installed below and above the interlayer respectively, the temperature change of the pumping well will be smaller than without the interlayer; the lower the permeability of the interlayer, the smaller the temperature change. The simulation results also indicate that the lower the pumping-and-reinjection rate, the greater the temperature change of the pumping well. It is further found that if the producer and the injector are sited reasonably, the temperature change of the pumping well declines as the regional groundwater flow velocity increases.
Compared with the case in which the groundwater flow direction is perpendicular to the well pair, if the regional flow is directed from the pumping well to the injection well, the temperature change of the pumping well is relatively smaller. Building on the above simulation study, a case history was conducted using data from an operating system in Beijing. By means of a conceptual model and a mathematical model, a 3-D simulation model was developed, and the hydrogeological parameters and thermal properties were calibrated. The calibrated model was used to predict the evolution of the geo-temperature field over the next five years. The simulation results indicate that the calibrated model represents the hydrogeological conditions and the nature of the aquifers well. It is also found that the temperature fronts in highly permeable aquifers move very fast and their radii of influence are large; comparatively, the temperature changes in clay layers are smaller and show an obvious lag. Under the current energy mining load, the temperature of the pumping wells will increase by 0.7 °C by the end of the next five years. This case study may provide a reliable basis for the scientific management of the operating system studied.
Abstract:
Amorphous computing is the study of programming ultra-scale computing environments of smart sensors and actuators \cite{white-paper}. The individual elements are identical, asynchronous, randomly placed, embedded, and communicate locally via wireless broadcast. Aggregating the processors into groups is a useful paradigm for programming an amorphous computer, because groups can be used for specialization, increased robustness, and efficient resource allocation. This paper presents a new algorithm, called the clubs algorithm, for efficiently aggregating processors into groups in an amorphous computer, in time proportional to the local density of processors. The clubs algorithm is well suited to the unique characteristics of an amorphous computer. In addition, the algorithm derives two properties from the physical embedding of the amorphous computer: an upper bound on the number of groups formed, and a constant upper bound on the density of groups. The clubs algorithm can also be extended to find a maximal independent set (MIS) and a $\Delta + 1$ vertex coloring in an amorphous computer in $O(\log N)$ rounds, where $N$ is the total number of elements and $\Delta$ is the maximum degree.
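A simplified, synchronous simulation of the club-formation idea (the real algorithm is asynchronous and resolves simultaneous wake-ups by message timing; here ties are broken by processing order within a round, and the random geometric graph stands in for an amorphous computer):

```python
import random

def clubs(neighbors, max_delay, rng):
    """Synchronous sketch of the clubs algorithm: every node draws a random
    wake-up round; a node that reaches its round unrecruited declares itself
    a club leader and recruits all of its still-free neighbors."""
    n = len(neighbors)
    wake = [rng.randrange(max_delay) for _ in range(n)]
    leader_of = [None] * n
    for t in range(max_delay):
        for v in range(n):
            if leader_of[v] is None and wake[v] == t:
                leader_of[v] = v                 # v founds a club
                for u in neighbors[v]:           # local broadcast
                    if leader_of[u] is None:
                        leader_of[u] = v         # u joins v's club
    return leader_of

# Random geometric graph: 200 nodes, communication radius 0.12
rng = random.Random(3)
pts = [(rng.random(), rng.random()) for _ in range(200)]
r2 = 0.12 ** 2
neighbors = [[j for j in range(200) if j != i and
              (pts[i][0] - pts[j][0]) ** 2 + (pts[i][1] - pts[j][1]) ** 2 <= r2]
             for i in range(200)]
leader_of = clubs(neighbors, max_delay=64, rng=rng)
```

In this sketch the leaders always form an independent set of the communication graph and every node joins the club of an adjacent leader (or founds its own), which mirrors the bounds on group count and group density derived in the paper.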
Abstract:
Weighted graph matching is a good way to align a pair of shapes represented by a set of descriptive local features; the set of correspondences produced by the minimum cost of matching features from one shape to the features of the other often reveals how similar the two shapes are. However, due to the complexity of computing the exact minimum cost matching, previous algorithms could only run efficiently when using a limited number of features per shape, and could not scale to perform retrievals from large databases. We present a contour matching algorithm that quickly computes the minimum weight matching between sets of descriptive local features using a recently introduced low-distortion embedding of the Earth Mover's Distance (EMD) into a normed space. Given a novel embedded contour, the nearest neighbors in a database of embedded contours are retrieved in sublinear time via approximate nearest neighbors search. We demonstrate our shape matching method on databases of 10,000 images of human figures and 60,000 images of handwritten digits.
Abstract:
There is a natural norm associated with a starting point of the homogeneous self-dual (HSD) embedding model for conic convex optimization. In this norm, two measures of the HSD model's behavior are precisely controlled independent of the problem instance: (i) the sizes of ε-optimal solutions, and (ii) the maximum distance of ε-optimal solutions to the boundary of the cone of the HSD variables. This norm is also useful in developing a stopping-rule theory for HSD-based interior-point methods such as SeDuMi. Under mild assumptions, we show that a standard stopping rule implicitly involves the sum of the sizes of the ε-optimal primal and dual solutions, as well as the size of the initial primal and dual infeasibility residuals. This theory suggests possible criteria for developing starting points for the homogeneous self-dual model that might improve the resulting solution time in practice.
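For reference, the homogeneous self-dual embedding in its standard linear-programming form (the conic case replaces componentwise nonnegativity by cone membership) seeks a nonzero solution of the self-dual system:

```latex
% HSD embedding of the primal-dual pair
% (min c^T x : Ax = b, x >= 0)  /  (max b^T y : A^T y + s = c, s >= 0)
\begin{aligned}
  Ax - b\tau                       &= 0, \\
 -A^{\top}y + c\tau - s            &= 0, \\
  b^{\top}y - c^{\top}x - \kappa   &= 0, \\
  x \ge 0, \quad \tau \ge 0, \quad s \ge 0, \quad \kappa \ge 0 .
\end{aligned}
```

A strictly complementary solution with τ > 0 yields the optimal primal-dual pair (x/τ, y/τ), while κ > 0 certifies primal or dual infeasibility; implementations typically start from an interior point such as x = s = e, τ = κ = 1, y = 0, which is the kind of starting point the norm above is attached to.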
Abstract:
A method is proposed that can generate a ranked list of plausible three-dimensional hand configurations that best match an input image. Hand pose estimation is formulated as an image database indexing problem, where the closest matches for an input hand image are retrieved from a large database of synthetic hand images. In contrast to previous approaches, the system can function in the presence of clutter, thanks to two novel clutter-tolerant indexing methods. First, a computationally efficient approximation of the image-to-model chamfer distance is obtained by embedding binary edge images into a high-dimensional Euclidean space. Second, a general-purpose, probabilistic line matching method identifies those line segment correspondences between model and input images that are the least likely to have occurred by chance. The performance of this clutter-tolerant approach is demonstrated in quantitative experiments with hundreds of real hand images.
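The chamfer distance that the embedding above approximates is, in its directed form, the average distance from each edge pixel of one image to the nearest edge pixel of the other. A brute-force sketch on tiny point sets (real systems compute it via a distance transform of the second image):

```python
import numpy as np

def chamfer(a_pts, b_pts):
    """Directed chamfer distance: for every edge pixel in A, the distance to
    the nearest edge pixel in B, averaged over A (brute force)."""
    d = np.linalg.norm(a_pts[:, None, :] - b_pts[None, :, :], axis=2)
    return d.min(axis=1).mean()

def symmetric_chamfer(a_pts, b_pts):
    # symmetrized version, averaging the two directed distances
    return 0.5 * (chamfer(a_pts, b_pts) + chamfer(b_pts, a_pts))

# Two tiny edge maps as (row, col) point sets; B is A shifted one pixel down
A = np.array([[0.0, 0.0], [0.0, 1.0], [0.0, 2.0]])
B = A + np.array([1.0, 0.0])
```

The quadratic cost of this direct computation is exactly what motivates embedding edge images into a space where a cheap norm approximates the chamfer score.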
Abstract:
This paper introduces BoostMap, a method that can significantly reduce retrieval time in image and video database systems that employ computationally expensive distance measures, metric or non-metric. Database and query objects are embedded into a Euclidean space, in which similarities can be rapidly measured using a weighted Manhattan distance. Embedding construction is formulated as a machine learning task, where AdaBoost is used to combine many simple, 1D embeddings into a multidimensional embedding that preserves a significant amount of the proximity structure in the original space. Performance is evaluated in a hand pose estimation system and a dynamic gesture recognition system, where the proposed method is used to retrieve approximate nearest neighbors under expensive image and video similarity measures. In both systems, BoostMap significantly increases efficiency, with minimal losses in accuracy. Moreover, the experiments indicate that BoostMap compares favorably with existing embedding methods that have been employed in computer vision and database applications, namely FastMap and Bourgain embeddings.
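The 1D embeddings that AdaBoost combines can be as simple as distances to fixed reference objects. A minimal sketch, with an illustrative toy metric and hand-picked weights standing in for the weights AdaBoost would learn:

```python
def reference_embedding(r, dist):
    """A simple 1-D embedding F_r(x) = dist(x, r): each object maps to
    its distance from a fixed reference object r."""
    return lambda x: dist(x, r)

def embed(x, embeddings):
    """Apply several 1-D embeddings to form a multidimensional one."""
    return [f(x) for f in embeddings]

def weighted_l1(u, v, weights):
    """Weighted Manhattan distance in the embedded Euclidean space;
    the weights are the ones boosting would assign to each 1-D map."""
    return sum(w * abs(a - b) for w, a, b in zip(weights, u, v))
```

The expensive original distance is paid only while building the embedding and embedding the query; all database comparisons then cost a cheap weighted L1.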
Abstract:
BoostMap is a recently proposed method for efficient approximate nearest neighbor retrieval in arbitrary non-Euclidean spaces with computationally expensive and possibly non-metric distance measures. Database and query objects are embedded into a Euclidean space, in which similarities can be rapidly measured using a weighted Manhattan distance. The key idea is formulating embedding construction as a machine learning task, where AdaBoost is used to combine simple, 1D embeddings into a multidimensional embedding that preserves a large amount of the proximity structure of the original space. This paper demonstrates that, using the machine learning formulation of BoostMap, we can optimize embeddings for indexing and classification, in ways that are not possible with existing methods for constructing embeddings, and without additional costs in retrieval time. First, we show how to construct embeddings that are query-sensitive, in the sense that they yield a different distance measure for different queries, so as to improve nearest neighbor retrieval accuracy for each query. Second, we show how to optimize embeddings for nearest neighbor classification tasks, by tuning them to approximate a parameter space distance measure, instead of the original feature-based distance measure.
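A query-sensitive distance can be sketched as a weighted L1 distance whose coordinates are switched on or off by the query: each coordinate has an interval of influence and contributes only when the query's coordinate falls inside it. The interval-based splitter below is an illustrative simplification of the paper's query-sensitive construction, not its exact form.

```python
def query_sensitive_l1(q, u, v, weights, intervals):
    """Weighted L1 distance between embedded objects u and v that
    depends on the embedded query q: coordinate i contributes only
    when q[i] lies inside that coordinate's area of influence
    (an interval fixed during training)."""
    total = 0.0
    for qi, ui, vi, w, (lo, hi) in zip(q, u, v, weights, intervals):
        if lo <= qi <= hi:   # splitter: is coordinate i informative for q?
            total += w * abs(ui - vi)
    return total
```

Different queries thus see different effective distance measures, while the database embeddings themselves stay fixed.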
Abstract:
Many real-world image analysis problems, such as face recognition and hand pose estimation, involve recognizing a large number of classes of objects or shapes. Large margin methods, such as AdaBoost and Support Vector Machines (SVMs), often provide competitive accuracy rates, but at the cost of evaluating a large number of binary classifiers, thus making it difficult to apply such methods when thousands or millions of classes need to be recognized. This thesis proposes a filter-and-refine framework, whereby, given a test pattern, a small number of candidate classes can be identified efficiently at the filter step, and computationally expensive large margin classifiers are used to evaluate these candidates at the refine step. Two different filtering methods are proposed, ClassMap and OVA-VS (One-vs.-All classification using Vector Search). ClassMap is an embedding-based method that works for both boosted classifiers and SVMs and tends to map patterns and their associated classes close to each other in a vector space. OVA-VS maps OVA classifiers and test patterns to vectors based on the weights and outputs of the weak classifiers of the boosting scheme. At runtime, finding the strongest-responding OVA classifier becomes a classical vector search problem, where well-known methods can be used to gain efficiency. In our experiments, the proposed methods achieve significant speed-ups, in some cases up to two orders of magnitude, compared to exhaustive evaluation of all OVA classifiers. This was achieved in hand pose recognition and face recognition systems where the number of classes ranges from 535 to 48,600.
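The OVA-VS idea reduces classifier evaluation to linear algebra: stacking each OVA classifier's voting weights as a row of a matrix, the responses of all classifiers to a test pattern are one matrix-vector product over the weak classifiers' outputs, so the strongest responder can be found by maximum-inner-product vector search. A toy sketch, with made-up weights and outputs for illustration:

```python
import numpy as np

def ova_responses(W, h):
    """Row c of W holds OVA classifier c's voting weights (alpha_j);
    h holds the weak classifiers' outputs on the test pattern.
    H_c(x) = sum_j alpha_cj * h_j(x), i.e. one matrix-vector product
    computes every classifier's response at once."""
    return W @ h

# Three OVA classifiers over three shared weak classifiers (toy values).
W = np.array([[1.0, 0.0, 2.0],
              [0.0, 3.0, 0.0],
              [1.0, 1.0, 1.0]])
h = np.array([1.0, -1.0, 1.0])   # weak classifier outputs in {-1, +1}
best = int(np.argmax(ova_responses(W, h)))
```

In the real system the argmax over rows is what gets replaced by an approximate vector search structure, which is where the speed-up comes from.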
Abstract:
This thesis elaborates on the problem of preprocessing a large graph so that single-pair shortest-path queries can be answered quickly at runtime. Computing shortest paths is a well-studied problem, but exact algorithms do not scale well to huge real-world graphs in applications that require very short response times. The focus is on approximate methods for distance estimation, in particular landmark-based distance indexing. This approach involves choosing some nodes as landmarks and computing offline, for each node in the graph, its embedding, i.e., the vector of its distances from all the landmarks. At runtime, when the distance between a pair of nodes is queried, it can be quickly estimated by combining the embeddings of the two nodes. Choosing optimal landmarks is shown to be hard, and thus heuristic solutions are employed. Given a memory budget for the index, which translates directly into a budget of landmarks, different landmark selection strategies can yield dramatically different results in terms of accuracy. A number of simple methods that scale well to large graphs are therefore developed and experimentally compared. The simplest methods choose central nodes of the graph, while the more elaborate ones select central nodes that are also far away from one another. The efficiency of the techniques presented in this thesis is tested experimentally using five different real-world graphs with millions of edges; for a given accuracy, they require as much as 250 times less space than the current approach, which selects landmarks at random. Finally, they are applied to two important problems arising naturally in large-scale graphs, namely social search and community detection.
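The offline/online split described above can be sketched for an unweighted graph: one BFS per landmark yields each node's embedding, and at query time the triangle inequality gives a lower and an upper bound on the true distance. Function names are illustrative.

```python
from collections import deque

def bfs_distances(adj, src):
    """Hop distances from src to every node of an unweighted graph."""
    dist = {src: 0}
    q = deque([src])
    while q:
        u = q.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                q.append(w)
    return dist

def landmark_embed(adj, landmarks):
    """Offline: one BFS per landmark; each node's embedding is its
    vector of distances to all landmarks."""
    tables = [bfs_distances(adj, l) for l in landmarks]
    return {v: [t[v] for t in tables] for v in adj}

def estimate(eu, ev):
    """Online: triangle-inequality bounds on d(u, v) from the two
    embeddings alone; (lower + upper) / 2 is a common point estimate."""
    upper = min(a + b for a, b in zip(eu, ev))
    lower = max(abs(a - b) for a, b in zip(eu, ev))
    return lower, upper
```

The quality of these bounds is exactly what landmark selection influences: a landmark near the true shortest path makes the upper bound tight.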
Abstract:
Nearest neighbor classification using shape context can yield highly accurate results in a number of recognition problems. Unfortunately, the approach can be too slow for practical applications, and thus approximation strategies are needed to make shape context practical. This paper proposes a method for efficient and accurate nearest neighbor classification in non-Euclidean spaces, such as the space induced by the shape context measure. First, a method is introduced for constructing a Euclidean embedding that is optimized for nearest neighbor classification accuracy. Using that embedding, multiple approximations of the underlying non-Euclidean similarity measure are obtained, at different levels of accuracy and efficiency. The approximations are automatically combined to form a cascade classifier, which applies the slower approximations only to the hardest cases. Unlike typical cascade-of-classifiers approaches, which are applied to binary classification problems, our method constructs a cascade for a multiclass problem. Experiments with a standard shape data set indicate a speed-up of two to three orders of magnitude over the standard shape context classifier, with minimal losses in classification accuracy.
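A minimal sketch of the cascade idea, assuming a list of (distance, margin) stages ordered from cheapest to most accurate; the margin-based confidence test is an illustrative stand-in for how the paper decides which cases are hard.

```python
def cascade_nn(query, labeled_db, stages):
    """Cascade nearest neighbor classifier. Each stage is a pair
    (dist, margin): a distance approximation and a confidence margin.
    A stage commits to its 1-NN label only when the nearest neighbor
    beats the nearest differently-labeled example by at least the
    margin; otherwise the query falls through to the next, slower,
    more accurate approximation."""
    label = None
    for dist, margin in stages:
        ranked = sorted(labeled_db, key=lambda item: dist(query, item[0]))
        best_x, label = ranked[0]
        best_d = dist(query, best_x)
        rival_d = next((dist(query, x) for x, y in ranked[1:] if y != label),
                       float('inf'))
        if rival_d - best_d >= margin:
            return label
    return label  # last (most accurate) stage decides the rest
```

Easy queries exit at the cheap stage, so the expensive measure is only ever evaluated on the small fraction of hard cases.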