41 results for Data Mining and its Application
Abstract:
The objective of the PANACEA ICT-2007.2.2 EU project is to build a platform that automates the stages involved in the acquisition, production, updating and maintenance of the large language resources required by, among others, MT systems. The development of a Corpus Acquisition Component (CAC) for extracting monolingual and bilingual data from the web is one of the most innovative building blocks of PANACEA. The CAC, which is the first stage in the PANACEA pipeline for building Language Resources, adopts an efficient and distributed methodology to crawl the web for documents with rich textual content in specific languages and predefined domains. The CAC includes modules that can acquire parallel data from sites with in-domain content available in more than one language. In order to extrinsically evaluate the CAC methodology, we conducted several experiments that used crawled parallel corpora for the identification and extraction of parallel sentences using sentence alignment. The corpora were then successfully used for domain adaptation of Machine Translation systems.
Abstract:
Floods are the natural hazards that produce the highest number of casualties and the most material damage in the Western Mediterranean. An improvement in flood risk assessment and a study of a possible increase in flooding occurrence are therefore needed. To carry out these tasks it is important to have extensive knowledge of historical floods and to find an efficient way to manage these geographical data. In this paper we present a complete flood database spanning the 20th century for the whole of Catalonia (NE Spain), which includes documentary information (affected areas and damage) and instrumental information (meteorological and hydrological records). This geodatabase, named Inungama, has been implemented on a GIS (Geographical Information System) in order to display all the information within a given geographical scenario, as well as to analyse it using queries, overlays and calculations. Following a description of the type and amount of information stored in the database and the structure of the information system, the first applications of Inungama are presented. The geographical distribution of floods shows the localities that are most likely to be flooded, confirming that the most affected municipalities are the most densely populated ones in coastal areas. Regarding a possible increase in flooding occurrence, a temporal analysis has been carried out, showing a steady increase over the last 30 years.
Abstract:
The existence of fluids and partial melt in the lower crust of the seismically active Kutch rift basin (on the western continental margin of India) owing to underplating has been proposed in previous geological and geophysical studies. This hypothesis is examined using magnetotelluric (MT) data acquired at 23 stations along two profiles across the Kutch Mainland Uplift and the Wagad Uplift. A detailed upper crustal structure is also presented using two-dimensional inversion of MT data in the area of the Bhuj earthquake (2001). The prominent reflection boundaries in the upper crust at 5, 10 and 20 km obtained in previous seismic reflection profiles correlate with conductive structures in our models. The MT study reveals 1-2 km thick Mesozoic sediments under the Deccan trap cover. The Deccan trap thickness in this region varies from a few meters to 1.5 km. The basement is shallower on the northern side than on the southern side, in good agreement with geological models as well as drilling information. The models for these profiles indicate that the thickness of sediments would further increase southwards into the Gulf of Kutch. The significant findings of the present study are: (1) the hypocentre region of the earthquake is devoid of fluids; (2) melt (emplaced during rifting, as suggested by passive seismological studies) is absent from the lower crust; and (3) a low-resistivity zone is present in the depth range of 5-20 km. The present MT study rules out fluids and melt (magma) as the causative factors that triggered the Bhuj earthquake. An estimated porosity of 0.02% explains the 100-500 ohm·m resistivity values observed in the lower crust. Based on the seismic velocities and geochemical studies, the presence of garnet is inferred. The lower crust consists of basalts - probably generated by partial melting of metasomatised garnet peridotite at greater depths in the lithosphere - and their composition might be modified by reaction with the spinel peridotites.
Abstract:
A raga is a collective melodic expression consisting of motifs. A raga can be identified using motifs which are unique to it. Motifs can be thought of as signature prosodic phrases. Different ragas may be composed of the same set of notes, or even phrases, but the prosody may be completely different. In this paper, an attempt is made to determine the characteristic motifs that enable identification of a raga and distinguish one raga from another. To this end, motifs are first manually marked for a set of five popular ragas by a professional musician. The motifs are then normalised with respect to the tonic. HMMs are trained for each motif using 80% of the data, and the remaining 20% or so is used for testing. The results indicate that about 80% of the motifs are correctly identified as belonging to a specific raga.
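The tonic normalisation step mentioned above can be illustrated with a minimal sketch. It assumes that a motif is available as a pitch contour in Hz and that normalisation means expressing each pitch in cents relative to the tonic; the abstract does not state the exact representation, so the function name `normalise_to_tonic` and the handling of unvoiced frames are assumptions made for illustration only.

```python
import math


def normalise_to_tonic(pitch_hz, tonic_hz):
    """Express a pitch contour in cents relative to the tonic.

    Assumption: frames with a non-positive pitch value are treated as
    unvoiced and dropped. The source abstract does not specify this.
    """
    if tonic_hz <= 0:
        raise ValueError("tonic_hz must be positive")
    # 1200 cents per octave; an octave is a doubling of frequency.
    return [1200.0 * math.log2(f / tonic_hz) for f in pitch_hz if f > 0]


# The same phrase sung at two different tonics maps to the same contour.
low_voice = normalise_to_tonic([100.0, 150.0, 200.0], tonic_hz=100.0)
high_voice = normalise_to_tonic([200.0, 300.0, 400.0], tonic_hz=200.0)
```

Working in cents relative to the tonic makes motifs comparable across performers, which is presumably why the normalisation precedes HMM training.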
Abstract:
This paper reviews the concept of presence in immersive virtual environments: the sense of being there, signalled by people acting and responding realistically to virtual situations and events. We argue that presence is a unique phenomenon that must be distinguished from the degree of engagement or involvement in the portrayed environment. We argue that there are three necessary conditions for presence: (a) a consistent low-latency sensorimotor loop between sensory data and proprioception; (b) statistical plausibility: images must be statistically plausible in relation to the probability distribution of images over natural scenes, and a constraint on this plausibility is the level of immersion; (c) behaviour-response correlations: presence may be enhanced and maintained over time by appropriate correlations between the state and behaviour of participants and responses within the environment, correlations that show appropriate responses to the activity of the participants. We conclude with a discussion of methods for assessing whether presence occurs, and in particular recommend the approach of comparison with ground truth and give some examples of this.
Abstract:
Peer-reviewed
Abstract:
In the present research we set forth a new, simple Trade-Off model to calculate how much debt and, consequently, how much equity a company should have. The model uses easily available information and calculates the cost of debt dynamically, based on the effect that the company's capital structure has on its risk of bankruptcy. The proposed model has been applied to the companies that made up the Dow Jones Industrial Average (DJIA) in 2007. We used consolidated financial data from 1996 to 2006, published by Bloomberg. We used the simplex optimization method to find the debt level that maximizes firm value. We then compared the estimated debt with the real debt of the companies using the nonparametric Mann-Whitney test. The results indicate that 63% of companies do not show a statistically significant difference between the real and the estimated debt.
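As a rough illustration of the comparison step, the Mann-Whitney U statistic for two samples can be computed directly from its pairwise definition. This is a minimal sketch, not the authors' procedure: the function name `mann_whitney_u` and the sample values are hypothetical, and the step from U to a significance decision (a critical-value table or a normal approximation) is left out.

```python
def mann_whitney_u(sample_a, sample_b):
    """Return (U_a, U_b) for two independent samples.

    U_a counts the pairs (x, y), x from sample_a and y from sample_b,
    in which x exceeds y; tied pairs contribute one half.
    """
    u_a = 0.0
    for x in sample_a:
        for y in sample_b:
            if x > y:
                u_a += 1.0
            elif x == y:
                u_a += 0.5
    # Every pair is counted exactly once across U_a and U_b.
    u_b = len(sample_a) * len(sample_b) - u_a
    return u_a, u_b


# Hypothetical debt ratios, invented purely for illustration.
real_debt = [0.30, 0.42, 0.55]
estimated_debt = [0.28, 0.45, 0.60]
u_real, u_estimated = mann_whitney_u(real_debt, estimated_debt)
```

The smaller of the two U values is the one conventionally compared against the critical value; a value close to half the number of pairs suggests that the two samples are not clearly separated.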
Abstract:
This paper aims to provide insights into the phenomenon of knowledge flows. We study one of the main mechanisms through which these flows occur, i.e., the mobility of highly skilled individuals. We focus on the geographical mobility of inventors across European regions. Patent data are used to trace the pattern of inventors' mobility across European regions, to locate the focal points that attract talent throughout the continent, and to study their spatial distribution. To do so, we gather information from PCT patent documents. We first match the names that seem to belong to the same inventor, and then we create a new algorithm to decide whether the patents applied for under each name belong to the same inventor.
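The first stage described above, matching names that seem to belong to the same inventor, might look like the following minimal sketch. The normalisation rules (stripping accents, case, punctuation and token order) and the names `name_key` and `group_by_name` are assumptions made for illustration; the abstract does not describe the matching rules, and the second-stage algorithm that decides whether same-name patents belong to one person is not specified, so it is not sketched here.

```python
import unicodedata
from collections import defaultdict


def name_key(name):
    """Build a comparison key that ignores accents, case, punctuation and token order."""
    decomposed = unicodedata.normalize("NFKD", name)
    without_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    cleaned = "".join(c if c.isalnum() else " " for c in without_accents.lower())
    return " ".join(sorted(cleaned.split()))


def group_by_name(records):
    """Bucket (patent_id, inventor_name) pairs by normalised inventor name."""
    buckets = defaultdict(list)
    for patent_id, inventor_name in records:
        buckets[name_key(inventor_name)].append(patent_id)
    return dict(buckets)


# Hypothetical records, invented purely for illustration.
records = [("P1", "García, José"), ("P2", "Jose Garcia"), ("P3", "Anna Müller")]
candidates = group_by_name(records)
```

Each bucket is only a set of candidates: two different people can share a name, which is why a second decision step over each bucket is still needed.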
Abstract:
Fibrinolytic therapy with Recombinant Tissue-Plasminogen Activator (rt-PA) is currently the only effective treatment for ischaemic stroke in its acute phase. Even though its use generally improves the prognosis of those patients likely to receive it, rt-PA administration is associated with several risks, such as haemorrhagic transformation of the ischaemic lesion and activation of excitotoxic mechanisms that may contribute to an increase in mortality or to a poor outcome on certain occasions, especially when arterial recanalisation is not achieved or the rt-PA is administered late. In the last few years the role of glutamate in the neurotoxicity associated with ischaemia has been widely studied, and it is known that high plasma glutamate levels are predictors of ischaemic lesion growth and poor neurological outcome; it is therefore necessary to find out which factors can contribute to glutamate release in the brain. The aim of this study is to determine whether rt-PA administration is related to an increase in plasma glutamate levels, and to establish whether higher plasma glutamate levels at admission are related to a different evolution and prognosis of our patients, both in those in whom recanalisation is achieved and in those in whom it is not. A series of cases of patients with hemispheric cerebral infarction admitted to our hospital over one year will be studied, and the data obtained from them will be compared with the data obtained from a control group whose samples were taken years ago, before rt-PA was routinely used.
Abstract:
Antioxidant enzymes are involved in important processes of cell detoxification during oxidative stress and have, therefore, been used as biomarkers in algae. Nevertheless, their limited use in fluvial biofilms may be due to the complexity of such communities. Here, different extraction methods were compared to obtain a reliable method for catalase extraction from fluvial biofilms. Homogenization followed by glass bead disruption appeared to be the best compromise for catalase extraction. This method was then applied to a field study in a metal-polluted stream (Riou Mort, France). The most polluted sites were characterized by a catalase activity 4–6 times lower than in the low-polluted site. The results of the comparison and of its application are promising for the use of catalase activity as an early warning biomarker of toxicity using biofilms in the laboratory and in the field.