14 results for automatic content extraction
in QUB Research Portal - Research Directory and Institutional Repository for Queen's University Belfast
Abstract:
A rapidly increasing number of Web databases are now accessible via
their HTML form-based query interfaces. Query result pages are dynamically generated
in response to user queries; they encode structured data and are displayed for human
use. Query result pages usually contain other types of information in addition to query
results, e.g., advertisements and navigation bars. The problem of extracting structured data
from query result pages is critical for web data integration applications, such as comparison
shopping and meta-search engines, and has been intensively studied. A number of approaches
have been proposed. As the structures of Web pages become more and more complex, the
existing approaches start to fail, and most of them do not remove irrelevant content, which
may affect the accuracy of data record extraction. We propose an automated approach for
Web data extraction. First, it makes use of visual features and query terms to identify data
sections and extracts data records in these sections. We also represent several content and
visual features of visual blocks in a data section, and use them to filter out noisy blocks.
Second, it measures similarity between data items in different data records based on their
visual and content features, and aligns them into different groups so that the data in the
same group have the same semantics. The results of our experiments with a large set of
Web query result pages in different domains show that our proposed approaches are highly
effective.
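The alignment step described above (grouping data items from different records by combined content and visual similarity) can be sketched roughly as follows. This is a minimal illustration, not the authors' method: the feature names (`text`, `font_size`, `left`), the equal weighting, the threshold, and both function names are assumptions made up for this example.

```python
from difflib import SequenceMatcher


def item_similarity(a, b, content_weight=0.5):
    """Blend text similarity with a crude visual match (same font size and left offset)."""
    content = SequenceMatcher(None, a["text"], b["text"]).ratio()
    visual = 1.0 if (a["font_size"], a["left"]) == (b["font_size"], b["left"]) else 0.0
    return content_weight * content + (1 - content_weight) * visual


def align_items(records, threshold=0.5):
    """Greedily put each data item in the first group whose first item is similar enough."""
    groups = []
    for record in records:
        for item in record:
            for group in groups:
                if item_similarity(group[0], item) >= threshold:
                    group.append(item)
                    break
            else:
                # No existing group matched, so this item starts a new one.
                groups.append([item])
    return groups


# Hypothetical data records (illustrative values only).
records = [
    [{"text": "Blue Widget", "font_size": 14, "left": 0},
     {"text": "$9.99", "font_size": 10, "left": 0}],
    [{"text": "Red Gadget", "font_size": 14, "left": 0},
     {"text": "$12.50", "font_size": 10, "left": 0}],
]
groups = align_items(records)
```

With these hypothetical records, items that share visual features (titles with titles, prices with prices) end up in the same group even though their text differs.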
Abstract:
Thermal comfort is defined as “that condition of mind which expresses satisfaction with the thermal environment” [1] [2]. Field studies have been completed in order to establish the governing conditions for thermal comfort [3]. These studies showed that the internal climate of a room was the strongest factor in establishing thermal comfort. Direct manipulation of the internal climate is necessary to retain an acceptable level of thermal comfort. For Building Energy Management Systems (BEMS) strategies to be utilised efficiently, it is necessary to be able to predict the effect that activating a heating/cooling source (radiators, windows and doors) will have on the room. The numerical modelling of the domain can be challenging due to the necessity to capture temperature stratification and/or different heat sources (radiators, computers and human beings). Computational Fluid Dynamics (CFD) models are usually utilised for this function because they provide the level of detail required. Although they provide the necessary level of accuracy, these models tend to be highly computationally expensive, especially when transient behaviour needs to be analysed. Consequently, they cannot be integrated in BEMS. This paper presents and describes validation of a CFD-ROM method for real-time simulations of building thermal performance. The CFD-ROM method involves the automatic extraction and solution of reduced order models (ROMs) from validated CFD simulations. The test case used in this work is a room of the Environmental Research Institute (ERI) Building at University College Cork (UCC). The ROMs are shown to be sufficiently accurate, with a total error of less than 1%, and to retain a satisfactory representation of the phenomena modelled. The number of zones in a ROM defines the size and complexity of that ROM. It has been observed that ROMs with a higher number of zones produce more accurate results.
As each ROM has a time to solution of less than 20 seconds, they can be integrated into the BEMS of a building, which opens up the potential for real-time physics-based building energy modelling.
Abstract:
Accurate modelling of the internal climate of buildings is essential if Building Energy Management Systems (BEMS) are to efficiently maintain adequate thermal comfort. Computational fluid dynamics (CFD) models are usually utilised to predict internal climate. Nevertheless, CFD models, although providing the necessary level of accuracy, are highly computationally expensive and cannot practically be integrated in BEMS. This paper presents and describes validation of a CFD-ROM method for real-time simulations of building thermal performance. The CFD-ROM method involves the automatic extraction and solution of reduced order models (ROMs) from validated CFD simulations. ROMs are shown to be adequately accurate, with a total error below 5%, and to retain satisfactory representation of the phenomena modelled. Each ROM has a time to solution under 20 seconds, which opens the potential of their integration with BEMS, giving real-time physics-based building energy modelling. A parameter study was conducted to investigate the applicability of the extracted ROM to initial boundary conditions different from those from which it was extracted. The results show that the ROMs retained satisfactory total errors when the initial conditions in the room were varied by ±5°C. This allows the production of a finite number of ROMs with the ability to rapidly model many possible scenarios.
Abstract:
A novel, fast automatic motion segmentation approach is presented. It differs from conventional pixel- or edge-based motion segmentation approaches in that the proposed method uses labelled regions (facets) to segment various video objects from the background. Facets are clustered into objects based on their motion and proximity details using Bayesian logic. Because the number of facets is usually much lower than the number of edges and points, using facets can greatly reduce the computational complexity of motion segmentation. The proposed method can efficiently tackle the complexity of video object motion tracking, and offers potential for real-time content-based video annotation.
Abstract:
Proper application of stable isotopes (e.g., δ¹⁵N and δ¹³C) to food web analysis requires an understanding of all nondietary factors that contribute to isotopic variability. Lipid extraction is often used during stable isotope analysis (SIA), because synthesized lipids have a low δ¹³C and can mask the δ¹³C of a consumer's diet. Recent studies indicate that lipid extraction intended to adjust δ¹³C may also cause shifts in δ¹⁵N, but the magnitude of and reasons for the shift are highly uncertain. We examined a large data set (n = 854) for effects of lipid extraction (using Bligh and Dyer's [1959] chloroform-methanol solvent mixtures) on the δ¹⁵N of aquatic consumers. We found no effect of chemically extracting lipids on the δ¹⁵N of whole zooplankton, unionid mussels, and fish liver samples, and found a small increase in fish muscle δ¹⁵N of approximately 0.4%. We also detected a negative relationship between the shift in δ¹⁵N following extraction and the C:N ratio in muscle tissue, suggesting that effects of extraction were greater for tissue with lower lipid content. As long as appropriate techniques such as those from Bligh and Dyer (1959) are used, effects of lipid extraction on δ¹⁵N of aquatic consumers need not be a major consideration in the SIA of food webs.
Abstract:
The importance and use of text extraction from camera-based coloured scene images is rapidly increasing with time. Text within a camera-grabbed image can contain a huge amount of metadata about that scene. Such metadata can be useful for identification, indexing and retrieval purposes. While the segmentation and recognition of text from document images is quite successful, detection of coloured scene text is a new challenge for all camera-based images. Common problems for text extraction from camera-based images are the lack of prior knowledge of any kind of text features such as colour, font, size and orientation, as well as the location of the probable text regions. In this paper, we document the development of a fully automatic and extremely robust text segmentation technique that can be used for any type of camera-grabbed frame, be it a single image or video. A new algorithm is proposed which can overcome the current problems of text segmentation. The algorithm exploits text appearance in terms of colour and spatial distribution. When the new text extraction technique was tested on a variety of camera-based images, it was found to outperform existing techniques. The proposed technique also overcomes any problems that can arise due to an unconstrained complex background. The novelty of the work arises from the fact that this is the first time that colour and spatial information are used simultaneously for the purpose of text extraction.
Abstract:
In the present study, the extraction of paralytic shellfish poisoning (PSP) toxins from a toxic strain of the marine dinoflagellate Alexandrium tamarense CCMP-1493 using various mechanical and/or physical procedures was investigated. PBS buffer was investigated as the extraction solvent in order for these procedures to be used directly with immuno-magnetic Ferrospheres-N. The extraction was performed following the determination of when the toxin content of the algae was at its highest during batch culture. The methods used for cell lysis and toxin extraction included freeze-thawing, freeze-boiling, steel ball bearing beating, glass bead beating, and sonication. The steel ball bearing beating was determined to release a similar amount of toxin when compared to a modified standard extraction method which was reported to release 100% of toxins from the algal cells, and was therefore used in the next phase of the study. This next phase was to determine the feasibility of utilising an antibody coupled to novel magnetic microspheres (Ferrospheres-N) as a simple, rapid immuno-capture procedure for PSP toxins extracted from the algae. The effects of increasing mass of Ferrospheres-N on the immuno-capture of the PSP toxins from the toxic algal strain extracts were investigated. Toxin recovery was found to increase when an increasing mass of Ferrospheres-N was used until 96.2% (+/- 1.3 SD) of the toxin extracted from the cells was captured and eluted. Toxin recovery was determined by comparison to an appropriate PSP toxin standard curve following analysis by the AOAC HPLC method. (C) 2011 Elsevier B.V. All rights reserved.
Abstract:
The potential adverse effects on health of diet-derived advanced glycation end-products (AGEs) are of current interest, due to their proposed involvement in the disease progression of diabetic and uraemic conditions. However, accurate information about levels of AGEs in foods is lacking. The objective of this investigation was to determine the level of one particular AGE, N-epsilon-(carboxymethyl)lysine (CML), a marker of AGE formation, in a wide range of foods commonly consumed in a Western style diet. Individual foods (n = 257) were mixed, lyophilised, ground, reduced, fat-extracted, hydrolysed, and underwent solid-phase extraction. Extracts were analysed by ultra-performance liquid chromatography-tandem mass spectrometry (UPLC-MS/MS). Cereal (2.6 mg/100 g food) and fruit and vegetable (0.13 mg/100 g food) categories had the highest and lowest mean level of CML, respectively, when expressed in mg/100 g food. These data can be used for estimating potential consumer intakes, and provide information that can be used to educate consumers on how to reduce their CML intake. (C) 2011 Elsevier Ltd. All rights reserved.
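As a rough illustration of how category means like those above might feed an intake estimate, the sketch below multiplies each reported mean CML level (mg/100 g) by a daily portion. Only the two mean levels come from the abstract; the portion sizes and the function name `estimate_cml_intake` are hypothetical and chosen purely for illustration.

```python
# Mean CML levels reported in the abstract, in mg per 100 g of food.
MEAN_CML_MG_PER_100G = {
    "cereal": 2.6,
    "fruit_and_vegetable": 0.13,
}


def estimate_cml_intake(portions_g):
    """Return estimated CML intake in mg for a dict of {category: grams eaten}."""
    total = 0.0
    for category, grams in portions_g.items():
        total += MEAN_CML_MG_PER_100G[category] * grams / 100.0
    return total


# Hypothetical daily portions in grams (an assumption, not data from the study).
example_portions = {"cereal": 200, "fruit_and_vegetable": 400}
print(round(estimate_cml_intake(example_portions), 2))
```

A realistic estimate would need per-food values and measured consumption data rather than two category means.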
Abstract:
Two approaches were undertaken to characterize the arsenic (As) content of Chinese rice. First, a national market basket survey (n = 240) was conducted in provincial capitals, sourcing grain from China's premier rice production areas. Second, to reflect rural diets, paddy rice (n = 195) was collected directly from farmers' fields in three regions in Hunan, a key rice producing province located in southern China. Two of the sites were within mining and smeltery districts, and the third was devoid of large-scale metal processing industries. Arsenic levels were determined in all the samples, while a subset (n = 33) was characterized for As species, using a new simple and rapid extraction method suitable for use with Hamilton PRP-X100 anion exchange columns and HPLC-ICP-MS. The vast majority (85%) of the market rice grains possessed total As levels <150 ng g⁻¹. The rice collected from mine-impacted regions, however, was found to be highly enriched in As, reaching concentrations of up to 624 ng g⁻¹. Inorganic As (As(i)) was the predominant species detected in all of the speciated grain, with As(i) levels in some samples exceeding 300 ng g⁻¹. The As(i) concentration in polished and unpolished Chinese rice was successfully predicted from total As levels. The mean baseline concentrations for As(i) in Chinese market rice based on this survey were estimated to be 96 ng g⁻¹, while levels in mine-impacted areas were higher, with ca. 50% of the rice in one region predicted to fail the national standard.
Abstract:
This article describes a practical demonstration of a complete full-duplex “amplitude shift keying (ASK)” retrodirective radio frequency identification (RFID) transceiver array. The interrogator incorporates a “retrodirective array (RDA)” with a dual-conversion phase conjugating architecture in order to achieve better performance than is possible with conventional RFID solutions. Here, mixers phase conjugate the incoming signal and a carrier recovery circuit recovers incoming angle of arrival phase information of an encoded amplitude shift keyed signal. The resulting interrogator provides a receiver sensitivity level of -109 dBm. A four element square patch RDA gives a 3 dB automatic beam steering angle of acceptance of ±45°. When compared to an RFID system operating by conventional (non-retrodirective) means, retrodirective action leads to improved range extension of up to 16 times at ±45°. Operator pointing accuracy requirements are also reduced due to automatic retrodirective self-pointing. These features significantly enhance deployment opportunities requiring long range, low equivalent isotropic radiation power (EIRP) and/or RFID tagging of moving platforms. © 2012 Wiley Periodicals, Inc. Microwave Opt Technol Lett 55:160–164, 2013. DOI 10.1002/mop.27258
Abstract:
An environmentally friendly technique for removing arsenic from contaminated soil with high iron content has been studied. A natural surfactant extracted from soapnut fruit, a phosphate solution, and their mixture were used separately as extractants. The mixture was most effective in desorbing arsenic, attaining above 70 % efficiency in the pH range of 4–5. Desorption kinetics followed the Elovich model. Micellar solubilization by soapnut and arsenic exchange by phosphate are the probable mechanisms behind arsenic desorption. Sequential extraction reveals that the mixed soapnut–phosphate system is effective in desorbing arsenic associated with amphoteric–Fe-oxide forms. No chemical change to the wash solutions was observed by Fourier transform-infrared spectra. Soil:solution ratio, surfactant and phosphate concentrations were found to affect the arsenic desorption process. Addition of phosphate boosted the performance of soapnut solution considerably. Response surface methodology approach predicted up to 80 % desorption of arsenic from soil when treated with a mixture of ≈1.5 % soapnut, ≈100 mM phosphate at a soil:solution ratio of 1:30.
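For reference, the Elovich model named above is commonly written in the integrated form below. The symbols are the usual textbook notation, not taken from this abstract: $q_t$ is the amount desorbed at time $t$, $\alpha$ is an initial rate constant, and $\beta$ is a constant of the process. The simplified form assumes $\alpha\beta t \gg 1$.

```latex
\frac{dq_t}{dt} = \alpha \, e^{-\beta q_t}
\quad\Longrightarrow\quad
q_t = \frac{1}{\beta}\ln(\alpha\beta) + \frac{1}{\beta}\ln t
```

Under this form, a plot of $q_t$ against $\ln t$ that is close to a straight line suggests Elovich-type kinetics.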
Abstract:
In the semiconductor manufacturing environment, it is very important to understand which factors have the most impact on process outcomes and to control them accordingly. This is usually achieved through design of experiments at process start-up and long-term observation of production. As such, it relies heavily on the expertise of the process engineer. In this work, we present an automatic approach to extracting useful insights about production processes and equipment based on state-of-the-art Machine Learning techniques. The main goal of this activity is to provide tools to process engineers to accelerate the learning-by-observation phase of process analysis. Using a Metal Deposition process as an example, we highlight various ways in which the extracted information can be employed.