990 results for multidimensional data
Abstract:
Measuring inequality of opportunity with the PISA databases involves several limitations: (i) the sample represents only a limited fraction of the cohorts of 15-year-olds in developing countries, and (ii) these fractions are not uniform across countries or across periods. This raises doubts about the reliability of such measurements when used for international comparisons: greater equity may simply be the result of a more restricted and more homogeneous sample. Unlike previous approaches based on reconstructing the samples, the approach of this paper is to provide a two-dimensional index that includes both achievement and access as dimensions. Several aggregation methods are used, and considerable changes are observed in the rankings of (in)equality of opportunity when only achievement is considered versus when both dimensions are considered in the PISA 2006/2009 tests. Finally, a generalisation of the approach is proposed, allowing additional dimensions and alternative weights in the aggregation.
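The abstract does not spell out the aggregation formulas. As a purely illustrative sketch, assuming both dimensions are normalised to [0, 1], the following Python snippet shows two common aggregation rules and how a ranking based on achievement alone can flip once access is included; all names, weights and numbers are hypothetical:

```python
def equity_index(achievement: float, access: float,
                 w_achievement: float = 0.5, w_access: float = 0.5,
                 method: str = "arithmetic") -> float:
    """Aggregate two normalised dimensions (both in [0, 1]) into a single index."""
    if method == "arithmetic":
        # Perfect substitutability between the two dimensions.
        return w_achievement * achievement + w_access * access
    if method == "geometric":
        # Penalises imbalance: a low score on either dimension drags the
        # index down more strongly than under the arithmetic mean.
        return achievement ** w_achievement * access ** w_access
    raise ValueError(f"unknown aggregation method: {method}")

# A country with higher achievement but much lower access (coverage of the
# 15-year-old cohort) can rank below a more balanced country:
print(equity_index(0.80, 0.40, method="geometric"))  # ~0.57
print(equity_index(0.65, 0.70, method="geometric"))  # ~0.67
```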
Abstract:
A program is provided to determine structural parameters of atoms in or adsorbed on surfaces by refining atomistic models against experimentally determined data generated by the normal-incidence X-ray standing wave (NIXSW) technique. The method employs a combination of Differential Evolution genetic algorithms and steepest-descent line minimisations to provide a fast, reliable and user-friendly tool for experimentalists to interpret complex multidimensional NIXSW data sets.
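The program described combines differential evolution for global search with steepest-descent refinement; neither implementation is reproduced in the abstract. As a rough sketch of the global-search half only, here is a generic DE/rand/1/bin minimiser in Python (the parameter values and the toy objective are illustrative assumptions, not taken from the program):

```python
import random

def differential_evolution(objective, bounds, pop_size=20, f=0.8, cr=0.9,
                           generations=200):
    """Minimise `objective` over box `bounds` with a basic DE/rand/1/bin scheme."""
    dim = len(bounds)
    pop = [[random.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    scores = [objective(x) for x in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct individuals other than the current one.
            a, b, c = random.sample([x for j, x in enumerate(pop) if j != i], 3)
            # Mutation plus binomial crossover, clipped to the search box.
            trial = [
                min(max(a[d] + f * (b[d] - c[d]), bounds[d][0]), bounds[d][1])
                if random.random() < cr else pop[i][d]
                for d in range(dim)
            ]
            s = objective(trial)
            if s < scores[i]:  # Greedy selection: keep the better vector.
                pop[i], scores[i] = trial, s
    best = min(range(pop_size), key=scores.__getitem__)
    return pop[best], scores[best]

# Toy usage: recover the minimum of a quadratic bowl in three dimensions.
x, fx = differential_evolution(lambda v: sum(t * t for t in v), [(-5, 5)] * 3)
```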
Abstract:
Background: Expression microarrays are increasingly used to obtain large-scale transcriptomic information on a wide range of biological samples. Nevertheless, there is still much debate on the best ways to process data, to design experiments and to analyse the output. Furthermore, many of the more sophisticated mathematical approaches to data analysis in the literature remain inaccessible to much of the biological research community. In this study we examine ways of extracting and analysing a large data set obtained using the Agilent long oligonucleotide transcriptomics platform, applied to a set of human macrophage and dendritic cell samples. Results: We describe and validate a series of data extraction, transformation and normalisation steps which are implemented via a new R function. Analysis of replicate normalised reference data demonstrates that intra-array variability is small (only around 2% of the mean log signal), while inter-array variability from replicate array measurements has a standard deviation (SD) of around 0.5 log2 units (6% of the mean). The common practice of working with ratios of Cy5/Cy3 signal offers little further improvement in terms of reducing error. Comparison to expression data obtained using Arabidopsis samples demonstrates that the large number of genes in each sample showing a low level of transcription reflects the real complexity of the cellular transcriptome. Multidimensional scaling is used to show that the processed data identify an underlying structure which reflects some of the key biological variables that define the data set. This structure is robust, allowing reliable comparison of samples collected over a number of years and by a variety of operators. Conclusions: This study outlines a robust and easily implemented pipeline for extracting, transforming, normalising and visualising transcriptomic array data from the Agilent expression platform. The analysis is used to obtain quantitative estimates of the SD arising from experimental (non-biological) intra- and inter-array variability, and a lower threshold for determining whether an individual gene is expressed. The study provides a reliable basis for further, more extensive studies of the systems biology of eukaryotic cells.
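The actual pipeline is an R function not reproduced here. As an illustrative Python sketch of the kind of transformation and normalisation step the abstract names, the following applies a log2 transform followed by quantile normalisation across arrays; the specific normalisation choice is an assumption, not necessarily the paper's method:

```python
import numpy as np

def normalise_arrays(raw: np.ndarray) -> np.ndarray:
    """Toy two-step transformation for an arrays-by-genes intensity matrix.

    Illustrative only: log2-transform, then quantile-normalise so that every
    array shares the same intensity distribution.
    """
    logged = np.log2(raw + 1.0)                        # variance-stabilising transform
    ranks = logged.argsort(axis=1).argsort(axis=1)     # per-array rank of each gene
    mean_dist = np.sort(logged, axis=1).mean(axis=0)   # reference distribution
    return mean_dist[ranks]                            # map each rank onto the reference

arrays = np.random.default_rng(0).lognormal(mean=6, sigma=2, size=(4, 1000))
norm = normalise_arrays(arrays)
# After normalisation, every array has an identical sorted intensity profile:
assert np.allclose(np.sort(norm, axis=1), np.sort(norm, axis=1)[0])
```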
Abstract:
Exascale systems are the next frontier in high-performance computing and are expected to deliver a performance of the order of 10^18 operations per second using massive multicore processors. Very large- and extreme-scale parallel systems pose critical algorithmic challenges, especially related to concurrency, locality and the need to avoid global communication patterns. This work investigates a novel protocol for dynamic group communication that can be used to remove the global communication requirement and to reduce the communication cost in parallel formulations of iterative data mining algorithms. The protocol is used to provide a communication-efficient parallel formulation of the k-means algorithm for cluster analysis. The approach is based on a collective communication operation for dynamic groups of processes and exploits non-uniform data distributions. Non-uniform data distributions can either be found in real-world distributed applications or induced by means of multidimensional binary search trees. The analysis of the proposed dynamic group communication protocol has shown that it does not introduce significant communication overhead. The parallel clustering algorithm has also been extended to accommodate an approximation error, which allows a further reduction of the communication cost. The effectiveness of the exact and approximate methods has been tested on a parallel computing system with 64 processors and in simulations with 1024 processing elements.
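For context, a baseline parallel k-means update is sketched below, assuming mpi4py: each process assigns its local points and a global allreduce combines the per-cluster sums. This global collective is exactly the pattern the paper's dynamic-group protocol is designed to replace, so the sketch shows the starting point, not the proposed method:

```python
# Run with e.g.:  mpirun -n 4 python kmeans_step.py
import numpy as np
from mpi4py import MPI

def kmeans_step(local_points: np.ndarray, centroids: np.ndarray) -> np.ndarray:
    """One k-means iteration where each rank holds a local block of points."""
    comm = MPI.COMM_WORLD
    k, dim = centroids.shape
    # Assign each local point to its nearest centroid.
    dists = np.linalg.norm(local_points[:, None, :] - centroids[None, :, :], axis=2)
    labels = dists.argmin(axis=1)
    # Local partial sums and counts per cluster.
    sums = np.zeros((k, dim))
    counts = np.zeros(k)
    for j in range(k):
        members = local_points[labels == j]
        sums[j] = members.sum(axis=0)
        counts[j] = len(members)
    # Global communication: every rank learns every cluster's totals.
    sums = comm.allreduce(sums, op=MPI.SUM)
    counts = comm.allreduce(counts, op=MPI.SUM)
    return sums / np.maximum(counts, 1)[:, None]
```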
Abstract:
Three methodological limitations of English-Chinese contrastive rhetoric research have been identified in previous work, namely: the failure to control for the quality of L1 data; an inference-based approach to interpreting the relationship between L1 and L2 writing; and a focus on national cultural factors in interpreting rhetorical differences. Addressing these limitations, the current study examined the presence or absence and the placement of thesis statements and topic sentences in four sets of argumentative texts produced by three groups of university students. We found that Chinese students tended to favour a direct/deductive approach in their English and Chinese writing, while native English writers typically adopted an indirect/inductive approach. This study argues for a dynamic and ecological interpretation of rhetorical practices in different languages and cultures.
Abstract:
Purpose – This paper aims to address the gaps in service recovery strategy assessment. An effective service recovery strategy that prevents customer defection after a service failure is a powerful managerial instrument. The literature to date does not present a comprehensive assessment of service recovery strategy. It also lacks a clear picture of the service recovery actions at managers’ disposal in case of failure and the effectiveness of individual strategies on customer outcomes. Design/methodology/approach – Based on service recovery theory, this paper proposes a formative index of service recovery strategy and empirically validates this measure using partial least-squares path modelling with survey data from 437 complainants in the telecommunications industry in Egypt. Findings – The CURE scale (CUstomer REcovery scale) presents evidence of reliability as well as convergent, discriminant and nomological validity. Findings also reveal that problem-solving, speed of response, effort, facilitation and apology are the actions that have an impact on the customer’s satisfaction with service recovery. Practical implications – This new formative index is of potential value in investigating links between strategy and customer evaluations of service by helping managers identify which actions contribute most to changes in the overall service recovery strategy as well as satisfaction with service recovery. Ultimately, the CURE scale facilitates the long-term planning of effective complaint management. Originality/value – This is the first study in the service marketing literature to propose a comprehensive assessment of service recovery strategy and clearly identify the service recovery actions that contribute most to changes in the overall service recovery strategy.
Abstract:
Bloom filters are a data structure for storing data in compressed form. They offer excellent space and time efficiency at the cost of some loss of accuracy (so-called lossy compression). This work presents a yes-no Bloom filter, a data structure consisting of two parts: the yes-filter, which is a standard Bloom filter, and the no-filter, which is another Bloom filter whose purpose is to represent those objects that were recognised incorrectly by the yes-filter (that is, to recognise the false positives of the yes-filter). By querying the no-filter after an object has been recognised by the yes-filter, we get a chance to reject it, which improves the accuracy of data recognition in comparison with a standard Bloom filter of the same total length. A further increase in accuracy is possible if the objects included in the no-filter are chosen so that it recognises as many false positives as possible but no true positives, thus producing the most accurate yes-no Bloom filter among all yes-no Bloom filters. This paper studies how optimization techniques can be used to maximize the number of false positives recognised by the no-filter, under the constraint that it recognise no true positives. To achieve this aim, an Integer Linear Program (ILP) is proposed for the optimal selection of false positives. In practice the problem size is normally large, making the optimal solution intractable. Given the similarity of the ILP to the Multidimensional Knapsack Problem, an Approximate Dynamic Programming (ADP) model is developed, making use of a reduced ILP for the value function approximation. Numerical results show that the ADP model performs best compared with a number of heuristics as well as the CPLEX built-in branch-and-bound (B&B) solver, and it is what can be recommended for use in yes-no Bloom filters. In the wider context of the study of lossy compression algorithms, our research is an example of how the arsenal of optimization methods can be applied to improving the accuracy of compressed data.
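The two-filter query logic is straightforward to sketch. Below is a minimal Python illustration with a toy Bloom filter (sizes and hash construction are arbitrary choices); the ILP/ADP selection of which false positives to store in the no-filter is the hard part and is not shown:

```python
import hashlib

class BloomFilter:
    """Toy Bloom filter using salted SHA-256 as the k hash functions."""
    def __init__(self, size: int, num_hashes: int):
        self.size, self.num_hashes = size, num_hashes
        self.bits = [False] * size

    def _positions(self, item: str):
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).hexdigest()
            yield int(digest, 16) % self.size

    def add(self, item: str):
        for p in self._positions(item):
            self.bits[p] = True

    def __contains__(self, item: str) -> bool:
        return all(self.bits[p] for p in self._positions(item))

class YesNoBloomFilter:
    """Yes-filter stores the set; no-filter stores known false positives."""
    def __init__(self, yes: BloomFilter, no: BloomFilter):
        self.yes, self.no = yes, no

    def __contains__(self, item: str) -> bool:
        # Accept only if the yes-filter matches and the no-filter does not
        # veto the match as a known false positive.
        return item in self.yes and item not in self.no
```

Items are admitted to the no-filter only if they are known false positives of the yes-filter; choosing the subset that vetoes as many false positives as possible while matching no true positives is the optimization the paper formulates as an ILP.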
Abstract:
Most multidimensional projection techniques rely on distance (dissimilarity) information between data instances to embed high-dimensional data into a visual space. When data are endowed with Cartesian coordinates, an extra computational effort is necessary to compute the needed distances, making multidimensional projection prohibitive in applications dealing with interactivity and massive data. The novel multidimensional projection technique proposed in this work, called Part-Linear Multidimensional Projection (PLMP), has been tailored to handle multivariate data represented in Cartesian high-dimensional spaces, requiring only distance information between pairs of representative samples. This characteristic renders PLMP faster than previous methods when processing large data sets while still being competitive in terms of precision. Moreover, knowing the range of variation for data instances in the high-dimensional space, we can make PLMP a truly streaming data projection technique, a trait absent in previous methods.
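On one common reading of the abstract, the core of PLMP is to position only a small set of representative samples in the visual space (by any projection method) and then fit a linear map from the high-dimensional space to 2D by least squares, which can afterwards be applied to every instance in a single cheap pass. A minimal numpy sketch under that assumption (the PCA positioning of the samples is an illustrative stand-in for whatever method is actually used):

```python
import numpy as np

def fit_linear_projection(samples_hd: np.ndarray, samples_2d: np.ndarray) -> np.ndarray:
    """Least-squares fit of a (dim x 2) matrix P with samples_hd @ P ~= samples_2d."""
    P, *_ = np.linalg.lstsq(samples_hd, samples_2d, rcond=None)
    return P

rng = np.random.default_rng(1)
data = rng.normal(size=(100_000, 50))            # full high-dimensional data set
samples = data[rng.choice(len(data), 200, replace=False)]
mean = samples.mean(axis=0)
centered = samples - mean

# Position only the representative samples (illustrative: top-2 PCA scores).
_, _, vt = np.linalg.svd(centered, full_matrices=False)
samples_2d = centered @ vt[:2].T

P = fit_linear_projection(centered, samples_2d)
projected = (data - mean) @ P                    # cheap linear pass over all points
```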
Abstract:
Decision-support systems based on Data Warehouses (DW) are increasingly used by large companies and organisations. The multidimensional data-organisation model employed by these systems, together with on-line analytical processing (OLAP) techniques, allows complex analyses of business history through a simple and intuitive query interface. Although DWs store historical data by nature, the structures used to organise and classify these data, called dimensions, have no strict temporal representation, reflecting only the current structure. For a system intended for data analysis, the lack of a dimension history makes it impossible to query past data within the real context in which they were produced. Moreover, changes to multidimensional schemas need to be supported and managed by an evolution model, so as to guarantee the consistency and integrity of the multidimensional model without loss of relevant information. This work presents seventeen schema-change operations and seven instance-change operations for multidimensional DW models. A versioning model, based on associating validity intervals with schemas and instances, is proposed to manage these operations. The entire history of DW definitions and data is kept by this model, allowing complete analyses of past data and of the DW's evolution. Besides supporting historical queries over DW definitions and instances, the model also allows more than one schema to remain active simultaneously. That is, two or more schemas can continue to have their data updated periodically, so that applications can query recent data using different schema versions.
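A minimal sketch of the versioning idea described above: each schema version carries a validity interval, evolution closes the current version and opens a new one, and queries resolve against the version valid at a given date. All class and field names are hypothetical illustrations, not the thesis's actual model:

```python
from dataclasses import dataclass, field
from datetime import date

@dataclass
class DimensionVersion:
    """One version of a dimension's definition, valid over a closed-open interval."""
    attributes: tuple[str, ...]
    valid_from: date
    valid_to: date | None = None  # None means still current

@dataclass
class VersionedDimension:
    name: str
    versions: list[DimensionVersion] = field(default_factory=list)

    def evolve(self, attributes: tuple[str, ...], when: date) -> None:
        # Close the current version and open a new one: nothing is discarded.
        if self.versions:
            self.versions[-1].valid_to = when
        self.versions.append(DimensionVersion(attributes, valid_from=when))

    def version_at(self, when: date) -> DimensionVersion:
        for v in self.versions:
            if v.valid_from <= when and (v.valid_to is None or when < v.valid_to):
                return v
        raise KeyError(f"no version of {self.name} valid at {when}")

store = VersionedDimension("store")
store.evolve(("id", "city"), date(2004, 1, 1))
store.evolve(("id", "city", "region"), date(2006, 6, 1))  # schema change
assert store.version_at(date(2005, 3, 1)).attributes == ("id", "city")
```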
Abstract:
The increasing use of fossil fuels, in line with the demographic explosion of cities, produces huge environmental impacts on society. To mitigate these impacts, regulatory requirements have positively influenced the environmental consciousness of society as well as the strategic behaviour of businesses. Along with this environmental awareness, regulatory bodies have formulated new laws to control potentially polluting activities, notably in the gas-station sector. Seeking greater market competitiveness, this sector needs to respond quickly to internal and external pressures, adapting to the newly required standards in a strategic way in order to obtain the Green Badge. Gas stations have incorporated new strategies to attract and retain new customers, who present increasing social demands. In the social dimension, these projects help the local economy by generating jobs and distributing income. The present research aims to align the social, economic and environmental dimensions in order to establish sustainable performance indicators for the gas-station sector in the city of Natal/RN. A Sustainable Balanced Scorecard (SBSC) framework was created with a set of indicators for mapping the production process of gas stations; this mapping aimed at identifying operational inefficiencies through multidimensional indicators. To carry out this research, a system for evaluating sustainability performance was developed applying Data Envelopment Analysis (DEA), a quantitative approach to detect the system's efficiency level. To capture the systemic complexity, sub-organisational processes were analysed with the Network Data Envelopment Analysis (NDEA) technique, modelling their micro-activities to identify and diagnose the real causes of overall inefficiency. The sample comprised 33 gas stations, and the conceptual model included 15 indicators distributed across the three dimensions of sustainability: social, environmental and economic. These three dimensions were measured by means of classical input-oriented DEA-CCR models. To unify the performance scores of the individual dimensions, a single grouping index was designed based on two means: arithmetic and weighted. Another analysis was then performed to measure the four SBSC perspectives (learning and growth, internal processes, customers, and financial), unified by averaging the performance scores. The NDEA results showed that no company achieved excellence in sustainability performance, and some gas stations with high NDEA efficiency proved inefficient under certain SBSC perspectives. A comparative analysis of sustainable performance among the gas stations was then carried out, enabling entrepreneurs to evaluate their performance against market competitors. Diagnoses were also obtained to support entrepreneurs' decision-making in improving the management of organisational resources and to provide guidelines for regulators. Finally, the average sustainable-performance index was 69.42%, reflecting the sector's environmental-compliance efforts. This result points to significant awareness in this segment, but further action is still needed to enhance sustainability in the long term.
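The dimension scores mentioned above come from the classical input-oriented DEA-CCR (envelopment) model, which solves a small linear program per decision-making unit. A minimal sketch with scipy, where the toy data are invented and the paper's 15 indicators and 33 stations are not reproduced:

```python
import numpy as np
from scipy.optimize import linprog

def ccr_input_efficiency(X: np.ndarray, Y: np.ndarray, unit: int) -> float:
    """Input-oriented CCR efficiency of one decision-making unit (DMU).

    X: (n_inputs, n_dmus) input matrix; Y: (n_outputs, n_dmus) output matrix.
    Solves: min theta  s.t.  X @ lam <= theta * X[:, unit],
                             Y @ lam >= Y[:, unit],  lam >= 0.
    """
    n_dmus = X.shape[1]
    c = np.r_[1.0, np.zeros(n_dmus)]                    # minimise theta
    A_in = np.hstack([-X[:, [unit]], X])                # X @ lam - theta * x0 <= 0
    A_out = np.hstack([np.zeros((Y.shape[0], 1)), -Y])  # -Y @ lam <= -y0
    A_ub = np.vstack([A_in, A_out])
    b_ub = np.r_[np.zeros(X.shape[0]), -Y[:, unit]]
    res = linprog(c, A_ub=A_ub, b_ub=b_ub,
                  bounds=[(0, None)] * (n_dmus + 1))
    return res.fun  # theta in (0, 1]; a score of 1 means the DMU is efficient

# Toy data: 4 stations, 2 inputs, 1 output.
X = np.array([[4.0, 2.0, 3.0, 5.0], [3.0, 5.0, 4.0, 6.0]])
Y = np.array([[1.0, 1.0, 1.0, 1.0]])
scores = [ccr_input_efficiency(X, Y, j) for j in range(4)]
```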
Abstract:
This work has the general objective of carrying out a multidimensional analysis of the journalistic essay, a communicative genre that, besides being complex and multimodal, presents hybrid characteristics. Specifically, with the intention of proposing defining criteria for this genre, the research seeks to establish differences and similarities between the essay and other genres of the same sphere, based on the description and interpretation of the multimodal resources used. The analysis of the formal, schematic and rhetorical resources identified in the formatting of the journalistic essay shows that the analyses are grounded in socio-semiotic and socio-rhetorical approaches. In the formal dimension, we consider elements that constitute the design of the text, including forms of representation such as typography, colours and images, as well as communicative-linguistic aspects: modalisation indices, communicative operators and the category of time. In the schematic dimension, we present the organisational structure, considering the rhetorical moves postulated by Swales (1990); in the rhetorical dimension we observe the categories: who writes, for whom, about what, and where. The methodological postulates adopted are qualitative in nature and the procedure is documentary, since written texts of this genre serve as the object of analysis. The corpus consists of a sample of 14 texts extracted from a set of 173 journalistic essays published weekly by the magazine Veja between August 2004 and January 2008. The analysis of the data showed that the journalistic essay, the object of this study, materialises through multiple symbolic representations and multiple subjects, ranging from small episodes of daily life to facts of great social relevance in the present time, of a historical and cultural nature, national or international. First used by Montaigne in 1580 to designate, in the author's own words, quick writings on his life and on historical events "which could not be remembered later", the term 'essay' was enriched with other specifications, so as to encompass what are today called the scientific essay, the academic essay, the journalistic essay and other specific types of essay. These denominations have to do with the engagement of members of diverse communities of practice, owing to the multiplicity of activities carried out in these spheres. The conclusions we reached were the following: 1. the discursive genre is not a pure entity, given the multiplicity of situations in which genres are inserted in social actions; 2. institutions define the configuration of a given genre, including its very designation, since behind every discursive genre there is a disciplining voice, an institutional voice, and in the case of the essays we analysed the institutional voice presents itself, indeed, as a defining trait; 3. the journalistic essay, through its multiple symbolic representations and multiple subjects, and by conveying explicit or implicit opinions of its author, resembles other genres and can therefore be placed within a colony of opinionative genres.
Abstract:
Background. It has been suggested that the study of women who survive life-threatening complications related to pregnancy (maternal near-miss cases) may represent a practical alternative to surveillance of maternal morbidity/mortality, since the number of cases is higher and the woman herself is able to provide information on the difficulties she faced and on the long-term repercussions of the event. These repercussions, which may include sexual dysfunction, postpartum depression and posttraumatic stress disorder, may persist for prolonged periods of time, affecting women's quality of life and resulting in adverse effects for them and their babies. Objective. The aims of the present study are to create a nationwide network of scientific cooperation to carry out surveillance and estimate the frequency of maternal near-miss cases, to perform a multicenter investigation into the quality of care for women with severe complications of pregnancy, and to carry out a multidimensional evaluation of these women for up to six months after the event. Methods/Design. This project has two components: a multicenter, cross-sectional study to be implemented in 27 referral obstetric units in different geographical regions of Brazil, and a concurrent cohort study for the multidimensional analysis. Over 12 months, investigators will perform prospective surveillance to identify all maternal complications. The population of the cross-sectional component will consist of all women surviving potentially life-threatening conditions (severe maternal complications) or life-threatening conditions (the maternal near-miss criteria) and maternal deaths according to the new WHO definition and criteria. Data analysis will be performed for case subgroups according to the moment of occurrence and the determining cause. The frequencies of near-miss and other severe maternal morbidity and the association between organ dysfunction and maternal death will be estimated. A proportion of the cases identified in the cross-sectional study will comprise the cohort of women for the multidimensional analysis. Various aspects of the lives of women surviving severe maternal complications will be evaluated 3 and 6 months after the event and compared to a group of women who suffered no severe complications in pregnancy. Previously validated questionnaires will be used in the interviews to assess reproductive function, posttraumatic stress, functional capacity, quality of life, sexual function, postpartum depression and infant development. © 2009 Cecatti et al.