998 results for translated data


Relevance:

70.00%

Publisher:

Abstract:

Many educational researchers conducting studies in non-English-speaking settings attempt to report on their projects in English to boost their scholarly impact. Doing so requires preparing and presenting translations of data collected from interviews and observations. This paper discusses the process and ethical considerations involved in this invisible methodological phase. The process includes activities that the bilingual researcher, acting as translator, undertakes before data analysis and before presentation in order to convey participants’ original meanings and to establish and fulfil translation ethics. This paper offers strategies to address such issues, identifies the most appropriate translation method for qualitative study, and suggests approaches to address political issues when presenting such data.

Relevance:

70.00%

Publisher:

Abstract:

This paper investigates several approaches to bootstrapping a new spoken language understanding (SLU) component in a target language given a large dataset of semantically annotated utterances in some other source language. The aim is to reduce the cost associated with porting a spoken dialogue system from one language to another by minimising the amount of data required in the target language. Since word-level semantic annotations are costly, Semantic Tuple Classifiers (STCs) are used in conjunction with statistical machine translation models, both of which are trained from unaligned data, to further reduce development time. The paper presents experiments in which a French SLU component in the tourist information domain is bootstrapped from English data. Results show that training STCs on automatically translated data produced the best performance for predicting the utterance's dialogue act type; however, individual slot/value pairs are best predicted by training STCs on the source language and using them to decode translated utterances. © 2010 ISCA.

Relevance:

30.00%

Publisher:

Abstract:

Estimates of potential and actual C sequestration require areal information about various types of management activities. Forest surveys, land use data, and agricultural statistics contribute information enabling calculation of the impacts of current and historical land management on C sequestration in biomass (in forests) or in soil (in agricultural systems). Unfortunately, little information exists on the distribution of the various management activities that can affect soil C content in grassland systems. This lack of information restricts our ability to carry out bottom-up estimates of the current C balance of grasslands or to assess the potential for grasslands to act as C sinks with changes in management. Here we review currently available information about grassland management, how that information could be related to information about the impacts of management on soil C stocks, information that may be available in the future, and needs that remain to be filled before in-depth assessments can be carried out. We also evaluate constraints induced by variability in information sources within and between countries. It is readily apparent that activity data for grassland management is collected less frequently and on a coarser scale than data for forest or agricultural inventories, and that grassland activity data cannot be directly translated into IPCC-type factors as is done for IPCC inventories of agricultural soils. However, those management data that are available can serve to delineate broad-scale differences in management activities within regions in which soil C is likely to change in response to changes in management. This, coupled with the distinct possibility of more intensive surveys planned in the future, may enable more accurate assessments of grassland C dynamics with higher resolution both spatially and in the number of management activities.

Relevance:

30.00%

Publisher:

Abstract:

Mesoscale eddies play an important role in the ocean circulation. To improve the simulation accuracy of mesoscale eddies, a three-dimensional variational (3DVAR) data assimilation system called the Ocean Variational Analysis System (OVALS) is coupled with a POM model to simulate the mesoscale eddies in the Northwest Pacific Ocean. In this system, the sea surface height anomaly (SSHA) data from satellite altimeters are assimilated and translated into pseudo temperature and salinity (T-S) profile data. These profile data are then taken as observations to be assimilated again, producing the three-dimensional analysis T-S field. The most appropriate assimilation parameters are set and tested in this system according to the characteristics of mesoscale eddies. A ten-year mesoscale eddy simulation and comparison experiment is carried out with two schemes: assimilation and non-assimilation. Comparison of the two schemes with the observations shows that the simulation accuracy of the assimilation scheme is much better than that of the non-assimilation scheme, which verifies that the altimetry data assimilation method can dramatically improve the simulation accuracy of mesoscale eddies and indicates that this system could be used to forecast mesoscale eddies in the future.

Relevance:

30.00%

Publisher:

Abstract:

Assigning uncertainty to ocean-color satellite products is a requirement to allow informed use of these data. Here, uncertainty estimates are derived using the comparison on a 12th-degree grid of coincident daily records of the remote-sensing reflectance RRS obtained with the same processing chain from three satellite missions, MERIS, MODIS and SeaWiFS. The approach is spatially resolved and produces σ, the part of the RRS uncertainty budget associated with random effects. The global average of σ decreases with wavelength from approximately 0.7–0.9 × 10⁻³ sr⁻¹ at 412 nm to 0.05–0.1 × 10⁻³ sr⁻¹ at the red band, with uncertainties on σ evaluated as 20–30% between 412 and 555 nm, and 30–40% at 670 nm. The distribution of σ shows a restricted spatial variability and small variations with season, which makes the multi-annual global distribution of σ an estimate applicable to all retrievals of the considered missions. The comparison of σ with other uncertainty estimates derived from field data or with the support of algorithms provides a consistent picture. When translated in relative terms, and assuming a relatively low bias, the distribution of σ suggests that the objective of a 5% uncertainty is fulfilled between 412 and 490 nm for oligotrophic waters (chlorophyll-a concentration below 0.1 mg m⁻³). This study also provides comparison statistics. Spectrally, the mean absolute relative difference between RRS from different missions shows a characteristic U-shape with both ends at blue and red wavelengths inversely related to the amplitude of RRS. On average and for the considered data sets, SeaWiFS RRS tend to be slightly higher than MODIS RRS, which in turn appear higher than MERIS RRS. Biases between mission-specific RRS may exhibit a seasonal dependence, particularly in the subtropical belt.

Relevance:

30.00%

Publisher:

Abstract:

Master's dissertation, Informatics Engineering, Faculdade de Ciências e Tecnologia, Universidade do Algarve, 2015

Relevance:

30.00%

Publisher:

Abstract:

Interest in using information to improve the quality of life in large urban areas and the efficiency of their governance has been around for decades. Nevertheless, improvements in Information and Communications Technology have sparked a new dynamic in academic research, usually under the umbrella term of Smart Cities. The concept of a Smart City can probably be translated, in a simplified version, into cities that are lived, managed and developed in an information-saturated environment. While this makes perfect sense and the benefits of such a concept are easy to foresee, several significant challenges still need to be tackled before this vision can materialize. In this work we aim to provide a small contribution in this direction, one that maximizes the relevance of the available information resources. One of the most detailed and geographically relevant information resources available for the study of cities is the census, more specifically the data available at block level (Subsecção Estatística). In this work, we use Self-Organizing Maps (SOM) and the variant Geo-SOM to explore the block-level data from the Portuguese census of Lisbon city for the years 2001 and 2011. We focus on gauging change, proposing ways to compare the two time periods, which have two different underlying geographical bases. We proceed with the analysis of the data using different SOM variants, aiming to produce a two-fold portrait: one of the evolution of Lisbon during the first decade of the 21st century, and another of how the census dataset and SOMs can be used to produce an informational framework for the study of cities.

Relevance:

30.00%

Publisher:

Abstract:

Computational Biology is the research area that contributes to the analysis of biological data through the development of algorithms that address significant research problems. The data from molecular biology include DNA, RNA, protein and gene expression data. Gene expression data provide the expression level of genes under different conditions. Gene expression is the process of transcribing the DNA sequence of a gene into mRNA sequences, which are later translated into proteins. The number of copies of mRNA produced is called the expression level of a gene. Gene expression data are organized in the form of a matrix. Rows in the matrix represent genes and columns represent experimental conditions. Experimental conditions can be different tissue types or time points. Entries in the gene expression matrix are real values. Through the analysis of gene expression data it is possible to determine the behavioral patterns of genes, such as the similarity of their behavior, the nature of their interaction, their respective contribution to the same pathways and so on. Genes participating in the same biological process exhibit similar expression patterns. These patterns have immense relevance and application in bioinformatics and clinical research. They are used in the medical domain to aid more accurate diagnosis, prognosis, treatment planning, drug discovery and protein network analysis. To identify various patterns from gene expression data, data mining techniques are essential. Clustering is an important data mining technique for the analysis of gene expression data. To overcome the problems associated with clustering, biclustering is introduced. Biclustering refers to the simultaneous clustering of both rows and columns of a data matrix.
Clustering is a global model, whereas biclustering is a local model. Discovering local expression patterns is essential for identifying many genetic pathways that are not otherwise apparent. It is therefore necessary to move beyond the clustering paradigm towards approaches that are capable of discovering local patterns in gene expression data. A bicluster is a submatrix of the gene expression data matrix. The rows and columns in the submatrix need not be contiguous in the gene expression data matrix. Biclusters are not disjoint. Computation of biclusters is costly because one has to consider all the combinations of columns and rows in order to find all the biclusters. The search space for the biclustering problem is 2^(m+n), where m and n are the number of genes and conditions respectively. Usually m+n is more than 3000. The biclustering problem is NP-hard. Biclustering is a powerful analytical tool for the biologist. The research reported in this thesis addresses the problem of biclustering. Ten algorithms are developed for the identification of coherent biclusters from gene expression data. All these algorithms make use of a measure called mean squared residue to search for biclusters. The objective is to identify the biclusters of maximum size with a mean squared residue lower than a given threshold.
All these algorithms begin the search from tightly coregulated submatrices called seeds. These seeds are generated by the K-Means clustering algorithm. The algorithms developed can be classified as constraint-based, greedy and metaheuristic. Constraint-based algorithms use one or more of the various constraints, namely the MSR threshold and the MSR difference threshold. The greedy approach makes a locally optimal choice at each stage with the objective of finding the global optimum. In the metaheuristic approaches, Particle Swarm Optimization (PSO) and variants of the Greedy Randomized Adaptive Search Procedure (GRASP) are used for the identification of biclusters. These algorithms are implemented on the Yeast and Lymphoma datasets. All these algorithms identify biologically relevant and statistically significant biclusters, which are validated using the Gene Ontology database. All these algorithms are compared with some other biclustering algorithms. Algorithms developed in this work overcome some of the problems associated with the existing algorithms. With the help of some of the algorithms developed in this work, biclusters with very high row variance, higher than the row variance obtained by any other algorithm using mean squared residue, are identified from both the Yeast and Lymphoma datasets. Such biclusters, which show significant change in the expression level, are highly relevant biologically.
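
The mean squared residue mentioned in this abstract can be illustrated with a short sketch. The abstract does not spell out the formula, so the code below assumes the usual definition of the measure (each entry minus its row mean and column mean within the submatrix, plus the overall submatrix mean, squared and averaged). The function name, the toy matrix and the chosen row and column indices are hypothetical and are not taken from the thesis.

```python
def mean_squared_residue(matrix, rows, cols):
    """Mean squared residue of the submatrix selected by rows and cols.

    Assumes the standard definition: average over the submatrix of
    (a_ij - row_mean_i - col_mean_j + overall_mean) squared.
    """
    n_rows, n_cols = len(rows), len(cols)
    # Means are taken over the selected submatrix only, not the full matrix.
    row_mean = {i: sum(matrix[i][j] for j in cols) / n_cols for i in rows}
    col_mean = {j: sum(matrix[i][j] for i in rows) / n_rows for j in cols}
    overall_mean = sum(row_mean[i] for i in rows) / n_rows
    total = sum(
        (matrix[i][j] - row_mean[i] - col_mean[j] + overall_mean) ** 2
        for i in rows
        for j in cols
    )
    return total / (n_rows * n_cols)


# Hypothetical toy expression matrix: rows are genes, columns are conditions.
# In the first three columns every gene follows the same pattern, shifted by
# a constant; the fourth column breaks that pattern.
expression = [
    [1.0, 2.0, 3.0, 9.0],
    [2.0, 3.0, 4.0, 1.0],
    [5.0, 6.0, 7.0, 4.0],
]

coherent = mean_squared_residue(expression, rows=[0, 1, 2], cols=[0, 1, 2])
whole = mean_squared_residue(expression, rows=[0, 1, 2], cols=[0, 1, 2, 3])
print(coherent, whole)
```

Under this definition a submatrix whose rows differ only by an additive shift has a residue of essentially zero, so the three-column selection scores far lower than the full matrix. A threshold-based search of the kind the abstract describes would keep the former and reject the latter if the threshold sat between the two values.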

Relevance:

30.00%

Publisher:

Abstract:

Results of a search for new phenomena in events with an energetic photon and large missing transverse momentum in proton-proton collisions at √s = 7 TeV are reported. Data collected by the ATLAS experiment at the LHC corresponding to an integrated luminosity of 4.6 fb⁻¹ are used. Good agreement is observed between the data and the standard model predictions. The results are translated into exclusion limits on models with large extra spatial dimensions and on pair production of weakly interacting dark matter candidates.

Relevance:

30.00%

Publisher:

Abstract:

The geochemical behavior of the Rb-Sr and K-Ar systems in Upper Vendian clayey rocks of the Russian Platform is considered. Additional data on grain size fractions of sedimentary rocks recovered from boreholes drilled in the Gavrilov Yam area made it possible to confirm the previous conclusion on two stages of epigenetic matter transformation (approximately 600 and 400 Ma ago). Distortions are related to transformation of sediments due to interaction in the water-rock system. The interaction was more intense in the upper part of the sedimentary section than in its lower strata. These conclusions are substantiated by materials from boreholes that characterize different types of Vendian sections and different tectonic zones.