905 results for Extraction socket
Abstract:
Besides the article that forms the main content, most HTML documents on the WWW contain additional content such as navigation menus, design elements or commercial banners. Several applications need to distinguish main from additional content automatically. Content extraction and template detection are the two approaches to this task. This thesis gives an extensive overview of existing algorithms from both areas. It contributes an objective way to measure and evaluate the performance of content extraction algorithms under different aspects. These evaluation measures make possible the first objective comparison of existing extraction solutions. The newly introduced content code blurring algorithm overcomes several drawbacks of previous approaches and proves to be the best content extraction algorithm at the moment. An analysis of methods to cluster web documents according to their underlying templates is the third major contribution of this thesis. In combination with a localised crawling process, this clustering analysis can be used to automatically create sets of training documents for template detection algorithms. As the whole process can be automated, template detection can be performed on a single document, thereby combining the advantages of single- and multi-document algorithms.
Abstract:
This research work is aimed at the valorization of two types of pomace deriving from the mechanical extraction of extra virgin olive oil, namely olive pomace and a new by-product named “paté”, as important sources of antioxidants and unsaturated fatty acids in the livestock sector. The first study evaluated the suitability of dried stoned olive pomace as a dietary supplement for dairy buffaloes. Its effectiveness in modifying the fatty acid composition and improving the oxidative stability of buffalo milk and mozzarella cheese was demonstrated through the analysis of qualitative and quantitative parameters. The second study examined the use of paté in dietary feed supplementation for dairy ewes already fed with a source of unsaturated fatty acids (extruded linseed), in order to assess the effect of this combination on the dairy products obtained. Paté was also characterized as a new by-product, including the optimal conditions for its stabilization and preservation. The main result, common to both studies, was the detection and characterization of hydrophilic phenols in the milk. The analytical detection of hydroxytyrosol and tyrosol in the milk of ewes fed with paté, and of hydroxytyrosol in the milk of buffaloes fed with pomace, showed for the first time the presence in milk of hydroxytyrosol, one of the most important bioactive compounds of olive oil industry products. The transfer of these antioxidants and the proven improvement in the quality of milk fat could contribute to the prevention of some human cardiovascular diseases and some tumours, thereby increasing the quality of dairy products and improving their shelf-life. These results also provide important information on the bioavailability of these phenolic compounds.
Abstract:
This thesis investigates methods and software architectures for discovering the typical, frequently occurring structures used for organizing knowledge on the Web. We identify these structures as Knowledge Patterns (KPs). KP discovery needs to address two main research problems: the heterogeneity of sources, formats and semantics on the Web (i.e., the knowledge soup problem) and the difficulty of drawing a relevant boundary around data that captures the meaningful knowledge with respect to a certain context (i.e., the knowledge boundary problem). Hence, we introduce two methods that provide different solutions to these two problems by tackling KP discovery from two different perspectives: (i) the transformation of KP-like artifacts to KPs formalized as OWL2 ontologies; (ii) the bottom-up extraction of KPs by analyzing how data are organized in Linked Data. The two methods address the knowledge soup and boundary problems in different ways. The first method is based on a purely syntactic transformation of the original source to RDF, followed by a refactoring step whose aim is to add semantics to the RDF by selecting meaningful RDF triples. The second method draws boundaries around RDF in Linked Data by analyzing type paths. A type path is a possible route through an RDF graph that takes into account the types associated with the nodes of a path. We then present K~ore, a software architecture conceived to be the basis for developing KP discovery systems and designed according to two software architectural styles, i.e., Component-based and REST. Finally, we provide an example of KP reuse based on Aemoo, an exploratory search tool which exploits KPs for performing entity summarization.
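To make the notion of a type path concrete, the following is a minimal sketch, not the thesis's method. It assumes paths of length one only, represents triples as plain Python tuples, and uses made-up example data; the function name `type_paths` and the prefixed identifiers are hypothetical.

```python
from collections import Counter

RDF_TYPE = "rdf:type"  # assumed shorthand for the rdf:type property


def type_paths(triples):
    """Count length-one type paths: (subject type, property, object type)."""
    # First pass: collect the declared types of every node.
    types = {}
    for s, p, o in triples:
        if p == RDF_TYPE:
            types.setdefault(s, set()).add(o)

    # Second pass: lift each non-typing triple to the types of its endpoints.
    counts = Counter()
    for s, p, o in triples:
        if p == RDF_TYPE:
            continue
        for s_type in types.get(s, ()):
            for o_type in types.get(o, ()):
                counts[(s_type, p, o_type)] += 1
    return counts


# Hypothetical toy graph, invented for illustration only.
triples = [
    ("ex:Plato", RDF_TYPE, "ex:Philosopher"),
    ("ex:Aristotle", RDF_TYPE, "ex:Philosopher"),
    ("ex:Socrates", RDF_TYPE, "ex:Philosopher"),
    ("ex:Athens", RDF_TYPE, "ex:City"),
    ("ex:Socrates", "ex:influenced", "ex:Plato"),
    ("ex:Plato", "ex:influenced", "ex:Aristotle"),
    ("ex:Plato", "ex:birthPlace", "ex:Athens"),
]
paths = type_paths(triples)
```

Frequently occurring type paths in such a count are the kind of regularity from which a boundary around the data could be drawn; longer paths would need a graph traversal that this sketch omits.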
Abstract:
Over the past ten years, the cross-correlation of long time series of ambient seismic noise (ASN) has been widely adopted to extract the surface-wave part of the Green’s functions (GF). This stochastic procedure relies on the assumption that the ASN wavefield is diffuse and stationary. At frequencies <1 Hz, the ASN is mainly composed of surface waves, whose origin is attributed to the sea-wave climate. Consequently, marked directional properties may be observed, which call for accurate investigation of the location and temporal evolution of the ASN sources before attempting any GF retrieval. Within this general context, this thesis is aimed at a thorough investigation of the feasibility and robustness of noise-based methods for imaging complex geological structures at the local (∼10-50 km) scale. The study focused on the analysis of an extended (11 months) seismological data set collected at the Larderello-Travale geothermal field (Italy), an area for which the underground geological structures are well constrained thanks to decades of geothermal exploration. Focusing on the secondary microseism band (SM; f > 0.1 Hz), I first investigate the spectral features and the kinematic properties of the noise wavefield using beamforming analysis, highlighting a marked variability with time and frequency. For the 0.1-0.3 Hz frequency band and during spring and summer, the SM waves propagate with high apparent velocities and from well-defined directions, likely associated with ocean storms in the southern hemisphere. Conversely, at frequencies >0.3 Hz the distribution of back-azimuths is more scattered, indicating that this frequency band is the most appropriate for the application of stochastic techniques. For this latter frequency interval, I tested two correlation-based methods, acting in the time (NCF) and frequency (modified-SPAC) domains, respectively yielding estimates of the group- and phase-velocity dispersions. Velocity data provided by the two methods are markedly discordant; comparison with independent geological and geophysical constraints suggests that NCF results are more robust and reliable.
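The time-domain idea behind a noise correlation function can be illustrated with a small synthetic sketch. It is not the thesis's processing chain: it assumes a single window, two records of the same random source offset by a known delay, and no whitening or stacking. The function name `noise_cross_correlation` and all numeric values are illustrative assumptions.

```python
import numpy as np


def noise_cross_correlation(rec_a, rec_b, max_lag):
    """Normalized cross-correlation of two equal-length records.

    Returns lags (in samples) and correlation values for lags in
    [-max_lag, +max_lag]; a positive lag means rec_b lags rec_a.
    """
    a = rec_a - rec_a.mean()
    b = rec_b - rec_b.mean()
    # c[k] = sum_n b[n + k] * a[n]; zero lag sits at index len(a) - 1.
    full = np.correlate(b, a, mode="full")
    full = full / (len(a) * a.std() * b.std())
    zero = len(a) - 1
    lags = np.arange(-max_lag, max_lag + 1)
    return lags, full[zero - max_lag: zero + max_lag + 1]


# Synthetic test: the same random "noise source" reaches station B
# true_delay samples after station A (values chosen arbitrarily).
rng = np.random.default_rng(0)
source = rng.standard_normal(5000)
true_delay = 25
rec_a = source[100:4100] + 0.1 * rng.standard_normal(4000)
rec_b = source[100 - true_delay:4100 - true_delay] + 0.1 * rng.standard_normal(4000)

lags, ncf = noise_cross_correlation(rec_a, rec_b, max_lag=100)
estimated_delay = int(lags[np.argmax(ncf)])  # should recover true_delay
```

With real data the correlation is typically computed over many windows and stacked, and the directional noise sources described above would bias the result; the sketch only shows why the correlation peak carries travel-time information between two stations.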
Abstract:
This thesis discusses technologies that make it possible to approach sentiment analysis of newspaper articles. The final goal of this work is to help social scholars carry out content analysis on large text corpora more quickly, with the support of automatic text classification.
Abstract:
This work introduces the basic concepts of Natural Language Processing, focusing on Information Extraction and analysing its application areas, its main tasks and how it differs from Information Retrieval. It then examines the Named Entity Recognition process, concentrating on the main problems of text annotation and on methods for evaluating the quality of entity extraction. Finally, it gives an overview of the open-source language processing software platform GATE/ANNIE, describing its architecture and main components, with a closer look at the tools GATE offers for the rule-based approach to Named Entity Recognition.
Abstract:
Early implant placement is one of the treatment options after tooth extraction. Implant surgery is performed after a healing period of 4 to 8 weeks and combined with a simultaneous contour augmentation using the guided bone regeneration technique to rebuild stable esthetic facial hard- and soft-tissue contours.
Abstract:
In most pathology laboratories worldwide, formalin-fixed paraffin-embedded (FFPE) samples are the only tissue specimens available for routine diagnostics. Although commercial kits for diagnostic molecular pathology testing are becoming available, most of the current diagnostic tests are laboratory-based assays. Thus, there is a need for standardized procedures in molecular pathology, starting from the extraction of nucleic acids. To evaluate the current methods for extracting nucleic acids from FFPE tissues, 13 European laboratories participating in the European FP6 program IMPACTS (www.impactsnetwork.eu) isolated nucleic acids from four diagnostic FFPE tissues using their routine methods, followed by quality assessment. The DNA-extraction protocols ranged from homemade protocols to commercial kits. With the exception of one homemade protocol, the protocols gave comparable results in terms of the quality of the extracted DNA, measured by the ability to amplify differently sized control gene fragments by PCR. For array applications or tests that require an accurately determined DNA input, we recommend using silica-based adsorption columns for DNA recovery. For RNA extractions, the best results were obtained using chromatography-column-based commercial kits, which yielded the highest quantity and the best assayable RNA. Quality testing using RT-PCR gave successful amplification of 200 bp-250 bp PCR products from most tested tissues. Modifications of the proteinase-K digestion time led to better results, even when commercial kits were applied. The results of the study emphasize the need for quality control of the nucleic acid extracts with standardized methods to prevent false negative results and to allow data comparison among different diagnostic laboratories.
Abstract:
The aim of this study was to assess the changes in inclination of the maxillary second (M2) and third (M3) molars after orthodontic treatment of Class II Division 1 malocclusion with extraction of maxillary first molars.
Abstract:
This paper utilizes a Contingent Valuation Method survey of a random sample of residents to estimate that households are willing to pay an average of $12.00 per month for public projects designed to improve river access and $10.46 per month for additional safety measures that would eliminate risks to local watersheds from drilling for natural gas from underground shale formations. These estimates can be compared to the costs of providing each of these two amenities to help foster the formation of efficient policy decisions.
Abstract:
Extraction of natural gas by hydraulic fracturing of the Middle Devonian Marcellus Shale, a major gas-bearing unit in the Appalachian Basin, results in significant quantities of produced water containing high total dissolved solids (TDS). We carried out a strontium (Sr) isotope investigation to determine the utility of Sr isotopes in identifying and quantifying the interaction of Marcellus Formation produced waters with other waters in the Appalachian Basin in the event of an accidental release, and to provide information about the source of the dissolved solids. Strontium isotopic ratios of Marcellus produced waters collected over a geographic range of ∼375 km from southwestern to northeastern Pennsylvania define a relatively narrow set of values (εSr SW = +13.8 to +41.6, where εSr SW is the deviation of the ⁸⁷Sr/⁸⁶Sr ratio from that of seawater in parts per 10⁴); this isotopic range falls above that of Middle Devonian seawater, and is distinct from most western Pennsylvania acid mine drainage and Upper Devonian Venango Group oil and gas brines. The uniformity of the isotope ratios suggests a basin-wide source of dissolved solids with a component that is more radiogenic than seawater. Mixing models indicate that Sr isotope ratios can be used to sensitively differentiate between Marcellus Formation produced water and other potential sources of TDS into ground or surface waters.
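Written out, the epsilon notation described in the abstract (deviation of the sample ratio from the seawater ratio, in parts per ten thousand) would take the following form. The placement of the sub- and superscripts on ε and the labels "sample" and "seawater" are assumptions, since the original typography did not survive.

```latex
\varepsilon_{\mathrm{Sr}}^{\mathrm{SW}}
  = 10^{4}\left(
      \frac{\left({}^{87}\mathrm{Sr}/{}^{86}\mathrm{Sr}\right)_{\text{sample}}}
           {\left({}^{87}\mathrm{Sr}/{}^{86}\mathrm{Sr}\right)_{\text{seawater}}}
      - 1
    \right)
```

Under this definition a positive value, such as the reported range of +13.8 to +41.6, means the sample is more radiogenic than seawater.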
Abstract:
Biliary cast syndrome (BCS) is the presence of casts within the intrahepatic or extrahepatic biliary system after orthotopic liver transplantation. Our work compares two percutaneous methods for BCS treatment: the mechanical cast-extraction technique (MCE) versus the hydraulic cast-extraction (HCE) technique using a rheolytic system.