229 results for Features extraction
Abstract:
We are currently facing overwhelming growth in the number of reliable information sources on the Internet. The quantity of information available to everyone via the Internet grows dramatically each year [15]. At the same time, the temporal and cognitive resources of human users do not change, which causes the phenomenon of information overload. The World Wide Web is one of the main sources of information for decision makers (reference to my research). However, our studies show that, at least in Poland, decision makers see some important problems when turning to the Internet as a source of decision information. One of the most commonly raised obstacles is the distribution of relevant information among many sources, and the resulting need to visit different Web sources in order to collect and analyze all important content. A few research groups have recently turned to the problem of information extraction from the Web [13]. Most effort so far has been directed toward collecting data from dispersed databases accessible via web pages (referred to as data extraction or information extraction from the Web) and toward understanding natural language texts by means of fact, entity, and association recognition (referred to as information extraction). Data extraction efforts show some interesting results; however, proper integration of web databases is still beyond us. The information extraction field has recently been very successful in retrieving information from natural language texts; however, it still lacks the ability to understand more complex information that requires common sense knowledge, discourse analysis, and disambiguation techniques.
Abstract:
We present an empirical evaluation and comparison of two content extraction methods in HTML: absolute XPath expressions and relative XPath expressions. We argue that the relative XPath expressions, although not widely used, should be used in preference to absolute XPath expressions in extracting content from human-created Web documents. Evaluation of robustness covers four thousand queries executed on several hundred webpages. We show that in referencing parts of real world dynamic HTML documents, relative XPath expressions are on average significantly more robust than absolute XPath ones.
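The robustness difference described above can be illustrated with a toy case. This is a minimal sketch and not the evaluation from the paper: the two page versions, the expressions, and the choice to anchor the relative expression on an `id` attribute are all assumptions made for illustration. It uses Python's standard `xml.etree.ElementTree`, which supports only a subset of XPath.

```python
import xml.etree.ElementTree as ET

# Two versions of the same toy page; the second gains a banner <div>.
PAGE_V1 = """<html><body>
<div><p>Menu</p></div>
<div id="content"><p>Price: 42</p></div>
</body></html>"""

PAGE_V2 = """<html><body>
<div><p>Banner</p></div>
<div><p>Menu</p></div>
<div id="content"><p>Price: 42</p></div>
</body></html>"""

# Hypothetical expressions; ElementTree paths start at the root element (<html>).
ABSOLUTE = "./body/div[2]/p[1]"        # fixed positions counted from the root
RELATIVE = ".//div[@id='content']/p"   # anchored on a nearby attribute


def extract(page, path):
    """Return the text of the first node matched by path, or None."""
    node = ET.fromstring(page).find(path)
    return None if node is None else node.text


for name, page in (("v1", PAGE_V1), ("v2", PAGE_V2)):
    print(name, "absolute:", extract(page, ABSOLUTE), "| relative:", extract(page, RELATIVE))
```

In this toy case the absolute expression silently returns the wrong node once the layout shifts, while the anchored expression still finds the intended content.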
Abstract:
Bioacoustic monitoring has become a significant research topic for species diversity conservation. Due to the development of sensing techniques, acoustic sensors are widely deployed in the field to record animal sounds over large spatial and temporal scales. With large volumes of collected audio data, it is essential to develop semi-automatic or automatic techniques to analyse the data. This can help ecologists make decisions on how to protect and promote species diversity. This paper presents generic features to characterize a range of bird species for vocalisation retrieval. In the implementation, audio recordings are first converted to spectrograms using the short-time Fourier transform, then a ridge detection method is applied to the spectrogram to detect points of interest. Based on the detected points, a new region representation is explored for describing various bird vocalisations, and a local descriptor including temporal entropy, frequency bin entropy and a histogram of counts of four ridge directions is calculated for each sub-region. To speed up the retrieval process, indexing is carried out and the retrieved results are ranked according to similarity scores. The experimental results show that our proposed feature set achieves a retrieval success rate of 0.71, which outperforms spectral ridge features alone (0.55) and Mel frequency cepstral coefficients (0.36).
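The local descriptor mentioned above can be sketched as follows. The abstract does not define temporal entropy or frequency bin entropy, so this sketch assumes both are Shannon entropies of the energy summed per time frame and per frequency bin; the function names and the 0-3 direction labels are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Entropy in bits of non-negative values treated as a distribution."""
    total = sum(values)
    if total <= 0:
        return 0.0
    probs = [v / total for v in values if v > 0]
    return -sum(p * math.log2(p) for p in probs)


def region_descriptor(region, ridge_directions):
    """Build a descriptor for one spectrogram sub-region.

    region: list of rows, one row per frequency bin, one column per time frame.
    ridge_directions: one label in {0, 1, 2, 3} per detected ridge point.
    """
    frame_energy = [sum(col) for col in zip(*region)]  # energy per time frame
    bin_energy = [sum(row) for row in region]          # energy per frequency bin
    counts = Counter(ridge_directions)
    direction_hist = [counts.get(d, 0) for d in range(4)]
    return [shannon_entropy(frame_energy), shannon_entropy(bin_energy)] + direction_hist


# Toy sub-region with uniform energy and three ridge points.
print(region_descriptor([[1, 1], [1, 1]], [0, 0, 2]))
```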
Abstract:
Fibrodysplasia Ossificans Progressiva (FOP) is a rare, heritable condition typified by progression of extensive ossification within skeletal muscle, ligament and tendon together with defects in skeletal development. The condition is easily diagnosed by the presence of shortened great toes and there is severe advancement of disability with age. FOP has been shown to result from a point mutation (c.617G>A) in the ACVR1 gene in almost all patients reported. Very recently two other mutations have been described in three FOP patients. We present here evidence for two further unique mutations (c.605G>T and c.983G>A) in this gene in two FOP patients with some atypical digit abnormalities and other clinical features. The observation of disparate missense mutations mapped to the GS and kinase domains of the protein supports the disease model of mild kinase activation and provides a potential rationale for phenotypic variation. © 2009 Petrie et al.
Abstract:
Objective: National guidelines for management of intermediate risk patients with suspected acute coronary syndrome, in whom AMI has been excluded, advocate provocative testing for final risk stratification of these patients into low risk (negative testing) or high risk (positive testing suggestive of unstable angina). Adults younger than 40 years have a low pretest probability of acute coronary syndrome. The utility of exercise stress testing in young adults with chest pain suspected of acute coronary syndrome who have National Heart Foundation intermediate risk features was evaluated. Methods: A retrospective analysis of exercise stress testing performed on patients younger than 40 years was conducted. Patients were enrolled on a chest pain pathway and had negative serial ECGs and cardiac biomarkers before exercise stress testing to rule out acute coronary syndrome. Chart review was completed on patients with positive stress tests. Results: A total of 3987 patients with suspected intermediate risk acute coronary syndrome underwent exercise stress testing. One thousand and twenty-seven (25.8%) were aged less than 40 years (age 33.3 ± 4.8 years). Four of these 1027 patients had a positive exercise stress test (0.4% incidence of positive exercise stress testing). Of those, three patients had subsequent non-invasive functional testing that yielded a negative result. One patient declined further investigations. Assuming this was a true positive exercise stress test, the incidence of true positive exercise stress testing would have been 0.097% (95% confidence interval: 0.079–0.115%) (one of 1027 patients). Conclusions: Routine exercise stress testing has limited value in the risk stratification of adults younger than 40 years with suspected intermediate risk of acute coronary syndrome.
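The percentages reported in the Results can be re-derived from the stated counts. This is a small arithmetic check only; the variable names are illustrative, and it does not attempt to reproduce the confidence interval, whose method the abstract does not state.

```python
# Counts taken from the Results section above.
total_patients = 3987         # all patients who underwent exercise stress testing
under_40 = 1027               # patients aged less than 40 years
positive_tests = 4            # positive exercise stress tests among those under 40
assumed_true_positives = 1    # the patient who declined further investigation


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


print(f"Under 40: {percent(under_40, total_patients):.1f}%")
print(f"Positive tests: {percent(positive_tests, under_40):.1f}%")
print(f"Assumed true positives: {percent(assumed_true_positives, under_40):.3f}%")
```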
Abstract:
Background: Today, finding an ideal biomaterial to treat large bone defects, delayed unions and non-unions remains a challenge for orthopaedic surgeons and researchers. Several studies have been carried out on the subject of bone regeneration, each having its own advantages. The present in vivo study was designed to evaluate the effects of cellular auto-transplantation of tail vertebrae on the healing of experimental critical bone defects in a dog model. Methods: Six dogs of indigenous breeds and of both sexes (5 males and 1 female), with an average weight of 32 ± 3.6 kg, received bilateral critical-sized ulnar segmental defects. After their health condition was determined, the dogs were divided into 2 groups: Group I was kept as the control (n = 1), while in Group II (experimental group; n = 5) bioactive bone implants were inserted. The 3-4 cm diaphyseal defects in the ulna were implanted with autogeneic coccygeal bone grafts. Defects were stabilized with internal plate fixation, and the control defects were not stabilized. Animals were euthanized at 16 weeks and analyzed by histopathology. Results: Histological evaluation of the new bone at sixteen weeks postoperatively revealed primarily lamellar bone, with the formation of new cortices and normal-appearing marrow elements. Reformation of the cortical compartment and reconstitution of the marrow space were also observed at the graft-host interface, together with graft resorption and necrosis responses. Finally, our data were consistent with the osteoconducting function of the tail autograft. Conclusions: Our results suggest that the tail vertebrae autograft is a new source of autogenous cortical bone for supporting segmental long bone defects in dogs. Furthermore, cellular autotransplantation was found to be a successful replacement for the tail vertebrae allograft bone at 3-4 cm segmental defects in the canine mid-ulna. Clinical application using graft expanders or bone autotransplantation should be undertaken carefully and requires further investigation.
Abstract:
A method was developed for the determination of tricyclazole in water using solid phase extraction (SPE) and high performance liquid chromatography (HPLC) with UV detection at 230 nm and a mobile phase of acetonitrile:water (20:80, v/v). A performance comparison was conducted between two types of solid phase sorbents: the C18 sorbent of the Supelclean ENVI-18 cartridge and the styrene-divinyl benzene copolymer sorbent of the Sep-Pak PS2-Plus cartridge. The Sep-Pak PS2-Plus cartridges were found to be more suitable for extracting tricyclazole from water samples than the Supelclean ENVI-18 cartridges. For this cartridge, both methanol and ethyl acetate produced good results. The method was validated with good linearity and with a limit of detection of 0.008gL-1 for a 500-fold concentration through the SPE procedure. The recoveries of the method were stable at 80% and the precision ranged from 1.1 to 6.0% within the range of fortified concentrations. The validated method was also applied to measure the concentrations of tricyclazole in real paddy water.
Abstract:
Sustainability practices in government regulations and within society influence the delivery of sustainable housing. The actual delivery rate of sustainable housing in Australia is not as high as in other countries. There is an absence of engagement by stakeholders in adopting sustainable housing practices. In the current Australian property market, this may be due to confusion as to which sustainability features should be considered, given the large range of environmental, economic and social sustainability options possible. One of the main problems appears to be that information demanders, especially real estate agents, valuers, insurance agents and mortgage lenders, do not include sustainability perspectives in their advice or in their decision processes. Information distribution in the Australian property market is flawed, resulting in a lack of return-on-investment value for the ‘green’ features implemented by some stakeholders. This paper reviewed the global sustainable development concept and Australian sustainability assessment methods. The review identified the possibility of a research project aimed at identifying and integrating the different perceptions and priority needs of the information demanders, in order to develop a model for the potential implementation of sustainability features distribution in the property industry. This research will reduce confusion about sustainability-related information, which can influence the decision making of stakeholders in the supply and demand of sustainable housing.
Abstract:
Over the past few decades, frog species have been experiencing a dramatic decline around the world. The reasons for this decline include habitat loss, invasive species, climate change and so on. To better understand the status of frog species, classifying frogs has become increasingly important. In this study, acoustic features are investigated for multi-level classification of Australian frogs (family, genus and species), covering three families, eleven genera and eighty-five species collected from Queensland, Australia. For each frog species, six instances are selected, from which ten acoustic features are calculated. Then, the multicollinearity among the ten features is studied in order to select non-correlated features for subsequent analysis. A decision tree (DT) classifier is used to visually and explicitly determine which acoustic features are relatively important for classifying family, which for genus, and which for species. Finally, a weighted support vector machine (SVM) classifier is used for the multi-level classification, with the three most important acoustic features for each level. Our experimental results indicate that using different acoustic feature sets can successfully classify frogs at different levels, and the average classification accuracy can be up to 85.6%, 86.1% and 56.2% for family, genus and species respectively.
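The step of selecting non-correlated features can be sketched with pairwise Pearson correlation. The abstract does not say how multicollinearity was measured, so the greedy filter, the 0.9 threshold, the feature names and the toy values below are all assumptions for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y) if spread_x and spread_y else 0.0


def select_uncorrelated(features, threshold=0.9):
    """Greedily keep features whose |r| with every kept feature is below threshold."""
    kept = []
    for name, values in features.items():
        if all(abs(pearson(values, features[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical feature columns, one value per recording instance.
FEATURES = {
    "dominant_frequency": [1.0, 2.0, 3.0, 4.0],
    "mean_frequency": [1.1, 2.1, 2.9, 4.2],   # nearly duplicates the feature above
    "call_duration": [3.0, 1.0, 4.0, 2.0],
}

print(select_uncorrelated(FEATURES))
```

A greedy filter depends on the order in which features are visited, so a different ordering can keep a different member of a correlated pair.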
Abstract:
With the explosion of information resources, there is an imminent need to understand interesting text features or topics in massive text information. This thesis proposes a theoretical model to accurately weight specific text features, such as patterns and n-grams. The proposed model achieves impressive performance in two data collections, Reuters Corpus Volume 1 (RCV1) and Reuters 21578.
Abstract:
Bioacoustic data can be used for monitoring animal species diversity. The deployment of acoustic sensors enables acoustic monitoring at large temporal and spatial scales. We describe a content-based birdcall retrieval algorithm for the exploration of large databases of acoustic recordings. In the algorithm, an event-based searching scheme and compact features are developed. In detail, ridge events are detected from audio files using event detection on spectral ridges. Then event alignment is used to search through audio files to locate candidate instances. A similarity measure is then applied to dimension-reduced spectral ridge feature vectors. The event-based searching method processes a smaller list of instances for faster retrieval. The experimental results demonstrate that our features achieve a better success rate than existing methods and that the feature dimension is greatly reduced.
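Ranking candidate instances by similarity to a query can be sketched as below. The abstract does not name the similarity measure, so cosine similarity is swapped in here as an assumption, and the feature vectors and candidate names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_candidates(query, candidates):
    """Return (name, score) pairs sorted from most to least similar."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in candidates.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy dimension-reduced feature vectors for three candidate events.
CANDIDATES = {"event_a": [1.0, 0.0], "event_b": [0.0, 1.0], "event_c": [1.0, 1.0]}

for name, score in rank_candidates([1.0, 0.0], CANDIDATES):
    print(name, round(score, 3))
```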
Abstract:
Background: Fusion transcripts are found in many tissues and have the potential to create novel functional products. Here, we investigate the genomic sequences around fusion junctions to better understand the transcriptional mechanisms mediating fusion transcription/splicing. We analyzed data from prostate (cancer) cells, as previous studies have shown extensively that these cells readily undergo fusion transcription. Results: We used the FusionMap program to identify high-confidence fusion transcripts from RNAseq data. The RNAseq datasets were from our (N = 8) and other (N = 14) clinical prostate tumors with adjacent non-cancer cells, and from the LNCaP prostate cancer cell line that was mock-, androgen- (DHT), and anti-androgen- (bicalutamide, enzalutamide) treated. In total, 185 fusion transcripts were identified from all RNAseq datasets. The majority (76 %) of these fusion transcripts were ‘read-through chimeras’ derived from adjacent genes in the genome. Characterization of sequences at fusion loci was carried out using a combination of the FusionMap program, custom Perl scripts, and the RNAfold program. Our computational analysis indicated that most fusion junctions (76 %) use the consensus GT-AG intron donor-acceptor splice site, and most fusion transcripts (85 %) maintained the open reading frame. We assessed whether the parental genes of fusion transcripts have the potential to form complementary base pairing with each other, which might bring them into physical proximity. Our computational analysis of sequences flanking fusion junctions at parental loci indicates that these loci have a similar propensity to hybridize as non-fusion loci. The abundance of repetitive sequences at fusion and non-fusion loci was also investigated, given that SINE repeats are involved in aberrant gene transcription. We found few instances of repetitive sequences at both fusion and non-fusion junctions. Finally, RT-qPCR was performed on RNA from both clinical prostate tumors and adjacent non-cancer cells (N = 7), and from LNCaP cells treated as above, to validate the expression of seven fusion transcripts and their respective parental genes. We reveal that fusion transcript expression is similar to the expression of parental genes. Conclusions: Fusion transcripts maintain the open reading frame, and likely use the same transcriptional machinery as non-fusion transcripts, as they share many genomic features at splice/fusion junctions.
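Two of the junction properties reported above, use of the GT-AG donor-acceptor consensus and preservation of the reading frame, can be expressed as simple checks. This is a simplified sketch under stated assumptions and not the FusionMap or Perl pipeline used in the study; the function names, the coordinate convention and the example sequences are hypothetical.

```python
def uses_gt_ag(intron):
    """True if the spliced-out sequence starts with GT and ends with AG."""
    seq = intron.upper()
    return len(seq) >= 4 and seq.startswith("GT") and seq.endswith("AG")


def keeps_reading_frame(upstream_coding_len, downstream_cds_pos):
    """Check whether the 3' partner stays in its native frame after fusion.

    upstream_coding_len: coding bases contributed by the 5' partner
        before the junction.
    downstream_cds_pos: 0-based position, within the 3' partner's own
        coding sequence, of the first base after the junction.
    The frame is kept when both values share the same codon phase.
    """
    return upstream_coding_len % 3 == downstream_cds_pos % 3


# Toy examples with made-up sequences and coordinates.
print(uses_gt_ag("GTAAGTCCAG"), uses_gt_ag("GCAAGTCCAG"))
print(keeps_reading_frame(300, 150), keeps_reading_frame(301, 150))
```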
Abstract:
Age estimation from facial images is receiving increasing attention for applications such as age-based access control and age-adaptive targeted marketing. Since even humans can be misled by the complex biological processes involved, finding a robust method remains a research challenge today. In this paper, we propose a new framework for the integration of Active Appearance Models (AAM), Local Binary Patterns (LBP), Gabor wavelets (GW) and Local Phase Quantization (LPQ) in order to obtain a highly discriminative feature representation that is able to model shape, appearance, wrinkles and skin spots. In addition, this paper proposes a novel flexible hierarchical age estimation approach consisting of a multi-class Support Vector Machine (SVM) to classify a subject into an age group, followed by a Support Vector Regression (SVR) to estimate a specific age. The errors that may occur in the classification step, caused by the hard boundaries between age classes, are compensated for in the specific age estimation by a flexible overlapping of the age ranges. The performance of the proposed approach was evaluated on the FG-NET Aging and MORPH Album 2 datasets, and mean absolute errors (MAE) of 4.50 and 5.86 years were achieved respectively. The robustness of the proposed approach was also evaluated on a merge of both datasets, and an MAE of 5.20 years was achieved. Furthermore, we compared age estimation made by humans with the proposed approach, and the results show that the machine outperforms humans. The proposed approach is competitive with the current state of the art and provides additional robustness to blur, lighting and expression variance, brought about by the local phase features.
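The idea of overlapping age ranges, together with the MAE metric, can be sketched as follows. The group boundaries, the 5-year overlap and the function names are hypothetical, since the abstract does not give them; the sketch only shows how widening each group's range lets more than one regressor cover ages near a class boundary.

```python
# Hypothetical hard age-group boundaries (inclusive) and overlap in years.
AGE_GROUPS = [(0, 19), (20, 39), (40, 69)]
OVERLAP = 5


def overlapping_ranges(groups, overlap):
    """Widen each group's range by overlap years on both sides (floor at 0)."""
    return [(max(0, lo - overlap), hi + overlap) for lo, hi in groups]


def regressors_covering(age, groups, overlap):
    """Indices of the group regressors whose widened range contains age."""
    widened = overlapping_ranges(groups, overlap)
    return [i for i, (lo, hi) in enumerate(widened) if lo <= age <= hi]


def mean_absolute_error(true_ages, predicted_ages):
    """Average absolute difference between true and predicted ages."""
    pairs = list(zip(true_ages, predicted_ages))
    return sum(abs(t - p) for t, p in pairs) / len(pairs)


print(overlapping_ranges(AGE_GROUPS, OVERLAP))
print(regressors_covering(22, AGE_GROUPS, OVERLAP))
print(mean_absolute_error([20, 30], [24, 25]))
```

Under these assumptions, a 22-year-old wrongly classified into the youngest group still falls inside that group's widened range, so the group's regressor can still return an age near 22.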
Abstract:
User-generated information such as product reviews has been booming due to the advent of Web 2.0. In particular, rich information associated with reviewed products is buried in such big data. In order to facilitate identifying useful information from product (e.g., camera) reviews, opinion mining has been proposed and widely used in recent years. As the most critical step of opinion mining, feature extraction aims to extract significant product features from review texts. However, most existing approaches only find individual features rather than identifying the hierarchical relationships between the product features. In this paper, we propose an approach which finds both features and feature relationships, structured as a feature hierarchy which is referred to as a feature taxonomy in the remainder of the paper. Specifically, by making use of frequent patterns and association rules, we construct the feature taxonomy to profile the product at multiple levels instead of a single level, which provides more detailed information about the product. Experiments conducted on real-world review datasets show that our proposed method is capable of identifying product features and relations effectively.
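The support and confidence measures behind frequent patterns and association rules can be sketched on toy data. The reviews below are invented, and the abstract does not describe how rules are turned into a taxonomy, so reading an asymmetric confidence as a hint of a parent-child relation is an assumption made here for illustration.

```python
# Each toy review is reduced to the set of candidate feature terms it mentions.
REVIEWS = [
    {"camera", "lens", "zoom"},
    {"camera", "lens"},
    {"camera", "battery"},
    {"camera", "lens", "zoom"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Estimated probability of consequent given antecedent."""
    base = support(antecedent, transactions)
    return support(antecedent | consequent, transactions) / base if base else 0.0


# "zoom" always co-occurs with "lens", but not the other way round, which
# (under the assumption above) hints that "lens" is the more general feature.
print(confidence({"zoom"}, {"lens"}, REVIEWS))
print(confidence({"lens"}, {"zoom"}, REVIEWS))
```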
Abstract:
Organochlorine pesticides (OCPs) are ubiquitous environmental contaminants with adverse impacts on aquatic biota, wildlife and human health even at low concentrations. However, conventional methods for their determination in river sediments are resource intensive. This paper presents an approach for the detection of OCPs that is both rapid and reliable. Accelerated Solvent Extraction (ASE) with in-cell silica gel clean-up followed by Triple Quadrupole Gas Chromatograph Mass Spectrometry (GCMS/MS) was used to recover OCPs from sediment samples. Variables such as temperature, solvent ratio, adsorbent mass and extraction cycle were evaluated and optimised for the extraction. With the exception of Aldrin, which was unaffected by any of the variables evaluated, the recovery of OCPs from sediment samples was largely influenced by solvent ratio and adsorbent mass and, to some extent, the number of cycles and temperature. The optimised conditions for OCP extraction from sediment with good recoveries were determined to be 4 cycles, 4.5 g of silica gel, 105 °C, and a 4:3 v/v DCM:hexane mixture. With the exception of two compounds (α-BHC and Aldrin) whose recoveries were low (59.73 and 47.66 % respectively), the recoveries of the other pesticides were in the range 85.35-117.97 % with precision < 10 % RSD. The developed method significantly reduces sample preparation time, solvent use and matrix interference, and is highly sensitive and selective.
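Recovery and precision figures of the kind quoted above are conventionally computed as percent recovery and relative standard deviation (RSD). The sketch below shows those two formulas on invented numbers; none of the values come from the paper, and the function names are illustrative.

```python
import statistics


def recovery_percent(measured, spiked):
    """Measured amount as a percentage of the amount spiked into the sample."""
    return 100.0 * measured / spiked


def rsd_percent(values):
    """Relative standard deviation: sample standard deviation over mean, in percent."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)


# Hypothetical replicate measurements of a sample spiked at 10.0 units.
replicates = [9.0, 10.0, 11.0]
print([recovery_percent(m, 10.0) for m in replicates])
print(rsd_percent(replicates))
```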