969 results for on-disk data layout


Relevance:

100.00% 100.00%

Publisher:

Abstract:

2000 Mathematics Subject Classification: 62F10, 62J05, 62P30

Relevance:

100.00% 100.00%

Publisher:

Abstract:

In this paper we evaluate and compare two representative and popular distributed processing engines for large-scale big data analytics: Spark and the graph-based engine GraphLab. We design a benchmark suite including representative algorithms and datasets to compare the performance of the computing engines in terms of running time, memory and CPU usage, and network and I/O overhead. The benchmark suite is tested on both a local computer cluster and virtual machines in the cloud. By varying the number of computers and the amount of memory, we examine the scalability of the computing engines with increasing computing resources (such as CPU and memory). We also run a cross-evaluation of generic and graph-based analytic algorithms over graph processing and generic platforms to identify the potential performance degradation if only one processing engine is available. It is observed that both computing engines show good scalability as computing resources increase. While GraphLab largely outperforms Spark for graph algorithms, its running time is close to Spark's for non-graph algorithms. Additionally, the running time with Spark for graph algorithms on cloud virtual machines is observed to increase by almost 100% compared to local computer clusters.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The representation of serial position in sequences is an important topic in a variety of cognitive areas including the domains of language, memory, and motor control. In the neuropsychological literature, serial position data have often been normalized across different lengths, and an improved procedure for this has recently been reported by Machtynger and Shallice (2009). Effects of length and a U-shaped normalized serial position curve have been criteria for identifying working memory deficits. We present simulations and analyses to illustrate some of the issues that arise when relating serial position data to specific theories. We show that critical distinctions are often difficult to make based on normalized data. We suggest that curves for different lengths are best presented in their raw form and that binomial regression can be used to answer specific questions about the effects of length, position, and linear or nonlinear shape that are critical to making theoretical distinctions. © 2010 Psychology Press.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

In our study we rely on a data mining procedure known as support vector machine (SVM) on the database of the first Hungarian bankruptcy model. The models constructed are then contrasted with the results of earlier bankruptcy models with the use of classification accuracy and the area under the ROC curve. In using the SVM technique, in addition to conventional kernel functions, we also examine the possibilities of applying the ANOVA kernel function and take a detailed look at data preparation tasks recommended in using the SVM method (handling of outliers). The results of the models assembled suggest that a significant improvement of classification accuracy can be achieved on the database of the first Hungarian bankruptcy model when using the SVM method as opposed to neural networks.
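The two comparison measures named in this abstract, classification accuracy and the area under the ROC curve, can be illustrated with a minimal sketch. It is not taken from the study: the function names, the 0/1 label encoding (1 = bankrupt) and the 0.5 threshold are assumptions made here for illustration, and the AUC is computed with the standard pairwise (Mann-Whitney) formulation.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the pairwise (Mann-Whitney) formulation.

    labels: 0/1 values (1 = positive class; a hypothetical encoding).
    scores: classifier scores, where higher means "more likely positive".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    # Count positive/negative pairs that are ranked correctly; ties count half.
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def accuracy(labels, scores, threshold=0.5):
    """Share of cases classified correctly at a fixed score threshold."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    hits = sum(1 for y, p in zip(labels, predictions) if y == p)
    return hits / len(labels)
```

Accuracy depends on the chosen threshold, whereas the AUC summarizes ranking quality over all thresholds, which is one reason the two are often reported together when models are contrasted.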

Relevance:

100.00% 100.00%

Publisher:

Abstract:

In 2004 and 2005 we collected samples of phytoplankton, zooplankton and macroinvertebrates in a small artificial pond in Budapest. We set up a simulation model predicting the abundance of the cyclopoids, Eudiaptomus zachariasi and Ischnura pumilio by considering only temperature as it affects the abundance of the population of the previous day. Phytoplankton abundance was simulated by considering not only temperature but also the abundance of the three groups mentioned. This discrete-deterministic model generated patterns similar to those observed, and testing it on historical data was successful. However, because the model overpredicted the abundances of Ischnura pumilio and Cyclopoida at the end of the year, these results were not considered. Running the model with the data series of climate change scenarios allowed us to predict the numbers of individuals for the period around 2050. If the model is run with the data series of the two scenarios UKHI and UKLO, which predict drastic global warming, we observe a decrease in abundance and a shift in the date of maximum abundance (excluding Ischnura pumilio, where the maximum abundance increases and occurs later), whereas under unchanged climatic conditions (BASE scenario) the change in abundance is negligible. According to the scenarios GFDL 2535, GFDL 5564 and UKTR, a transition could be noticed.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The study area lies in the Balaton Uplands National Park (Hungary). Five sample areas were examined in Badacsonytördemic: (1) a 32-hectare under-grazed pasture, (2) a 38-hectare overgrazed pasture, (3) a 34-hectare hayfield, (4) a trampled area, and (5) a beaten track. The livestock population in the monitored pastures was 118. Sampling was carried out along five 52 m long circular transects, within 5 cm × 5 cm interlocking quadrats. Based on the data, the species-area curve of the drinking area was the highest; however, weeds appeared because of degradation, which added species. According to the species-area examinations, overgrazed areas were richer in species than the other examined areas. Based on the diversity data, the drinking area was considered degraded, while the meadow and overgrazed areas were considered to be in proper condition. The diversity of the meadow was greater, but the dominance of economically useful species was lower. The amount of a less valuable species, Carex hirta, increased.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This article provides new insight into how the ambience and design of shopping environments impact on spending behaviour. Environmental cues in a retail area influence the emotional states of passers-by, which in turn influence spending levels. Past research suggested that this effect only applies to shops with a moderate arousal level. Also, several studies failed to confirm a relationship between emotions and spending levels. This is surprising, since high arousal environments (e.g., amusement parks, sports stadiums and airports) often feature a wide range of retail outlets. Based on survey data collected in a live airport shopping area, this study finds a relationship between pleasure emotions associated with the retail area and recalled consumer spending, but also the time available for shopping (which in an airport is constrained). Also, visitors’ emotional state was influenced by the ambience (e.g., cleanliness, noise levels, lighting) as well as the design (e.g., easy wayfinding, seating areas) of the retail area. Shoppers’ arousal levels did not explain variations in spending level. Implications for researchers and managers are discussed as well as suggestions for future research.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The research presented in this dissertation is comprised of several parts which jointly attain the goal of Semantic Distributed Database Management with Applications to Internet Dissemination of Environmental Data. Part of the research into more effective and efficient data management has been pursued through enhancements to the Semantic Binary Object-Oriented database (Sem-ODB) such as more effective load balancing techniques for the database engine, and the use of Sem-ODB as a tool for integrating structured and unstructured heterogeneous data sources. Another part of the research in data management has pursued methods for optimizing queries in distributed databases through the intelligent use of network bandwidth; this has applications in networks that provide varying levels of Quality of Service or throughput. The application of the Semantic Binary database model as a tool for relational database modeling has also been pursued. This has resulted in database applications that are used by researchers at the Everglades National Park to store environmental data and remotely sensed imagery. The areas of research described above have contributed to the creation of TerraFly, which provides for the dissemination of geospatial data via the Internet. The TerraFly research presented herein ranges from the development of TerraFly's back-end database and interfaces, through the features that are presented to the public (such as the ability to provide autopilot scripts and on-demand data about a point), to applications of TerraFly in the areas of hazard mitigation, recreation, and aviation.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This study examined the predictive merits of selected cognitive and noncognitive variables on the national Registry exam pass rate using 2008 graduates (n = 175) from community college radiography programs in Florida. The independent variables included two GPAs, final grades in five radiography courses, self-efficacy, and social support. The dependent variable was the first-attempt results on the national Registry exam. The design was a retrospective predictive study that relied on academic data collected from participants using the self-report method and on perceptions of students' success on the national Registry exam collected through a questionnaire developed and piloted in the study. All independent variables except self-efficacy and social support correlated with success on the national Registry exam (p < .01) using the Pearson Product-Moment Correlation analysis. The strongest predictor of national Registry exam success was the end-of-program GPA, r = .550, p < .001. The GPAs and scores for self-efficacy and social support were entered into a logistic regression analysis to produce a prediction model. The end-of-program GPA (p = .015) emerged as a significant variable. This model predicted 44% of the students who failed the national Registry exam and 97.3% of those who passed, explaining 45.8% of the variance. A second model included the final grades for the radiography courses, self-efficacy, and social support. Three courses significantly predicted national Registry exam success: Radiographic Exposures, p < .001; Radiologic Physics, p = .014; and Radiation Safety & Protection, p = .044, explaining 56.8% of the variance. This model predicted 64% of the students who failed the national Registry exam and 96% of those who passed. The findings support the use of in-program data as accurate predictors of success on the national Registry exam.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This study examines the impact of globalization and religious nationalism on the personal and professional lives of urban Hindu middle class media women. The research demonstrates how newly strengthened forces of globalization and Hindutva shape Indian womanhood. The research rests on various data that reveal how Indian women interpret and negotiate constructed identities. The study seeks to give voice to the objectified by scrutinizing and challenging the stereotypical modern faces of Indian womanhood seen in the narratives of globalization and Hindutva. Feminist open-ended interviewing was conducted in English and Hindi in New Delhi, the capital of India, with 23 Hindu women, employed by electronic and print media corporations. Accumulated data were analyzed and interpreted using feminist critical discourse analysis. Findings from the study indicate that while the Indian middle class women have embraced professional opportunities presented by globalization, they remain circumscribed by mutating gender politics. The research also finds that as academic and professional progress empower the women within their homes, their public lives have become fraught with increasing gender violence and decreasing recourse to justice. Therefore, women accept the power stratification of their lives as being dependent on spatial and temporal distinctions, and have learnt to engage and strategize with the public environment for physical safety and personal-professional progress. While the media women see systemic masculine domination as being symbiotic with tenets of religious nationalism, they exhibit an unquestioned embracing of capitalism/globalization as the means of empowerment. My research also strongly indicates the importance of the media’s role in shaping gender dynamics in a global context. 
In conclusion, my research shows the media women’s immense agency in pursuing academic and professional careers while being aware of deeply ingrained gender roles through their strong commitment towards their families. The findings of this study contribute to the literature on Third World nationalism, urban globalization and understandings of reworked-renewed masculine domination. Finally, the study also engages with recent scholarship on the Indian middle class (see Nanda 2010; Shenoy 2009; Lukose 2005; and Radhakrishnan 2006) while simultaneously addressing the notions of privilege and disengagement levied at the middle class woman, a symbiosis of idealization and imprisonment.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Online Social Network (OSN) services provided by Internet companies bring people together to chat and to share and enjoy information. Meanwhile, huge amounts of data are generated by those services (which can be regarded as social media) every day, every hour, even every minute and every second. Currently, researchers are interested in analyzing OSN data, extracting interesting patterns from it, and applying those patterns to real-world applications. However, because of the large scale of OSN data, it is difficult to analyze effectively. This dissertation focuses on applying data mining and information retrieval techniques to mine two key components of social media data: users and user-generated contents. Specifically, it aims at addressing three problems related to social media users and contents: (1) how does one organize the users and the contents? (2) how does one summarize the textual contents so that users do not have to go over every post to capture the general idea? (3) how does one identify the influential users in social media to benefit other applications, e.g., marketing campaigns? The contribution of this dissertation is briefly summarized as follows. (1) It provides a comprehensive and versatile data mining framework to analyze the users and user-generated contents from social media. (2) It designs a hierarchical co-clustering algorithm to organize the users and contents. (3) It proposes multi-document summarization methods to extract core information from the social network contents. (4) It introduces three important dimensions of social influence, and a dynamic influence model for identifying influential users.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The focus of this thesis is placed on text data compression based on the fundamental coding scheme referred to as the American Standard Code for Information Interchange or ASCII. The research objective is the development of software algorithms that result in significant compression of text data. Past and current compression techniques have been thoroughly reviewed to ensure proper contrast between the compression results of the proposed technique with those of existing ones. The research problem is based on the need to achieve higher compression of text files in order to save valuable memory space and increase the transmission rate of these text files. It was deemed necessary that the compression algorithm to be developed would have to be effective even for small files and be able to contend with uncommon words as they are dynamically included in the dictionary once they are encountered. A critical design aspect of this compression technique is its compatibility to existing compression techniques. In other words, the developed algorithm can be used in conjunction with existing techniques to yield even higher compression ratios. This thesis demonstrates such capabilities and such outcomes, and the research objective of achieving higher compression ratio is attained.
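The abstract does not say which algorithm the thesis develops, so the following is only a sketch of the general idea it mentions: a dictionary that starts from the ASCII characters and grows dynamically as new strings are encountered. The classic LZW scheme is used here as a stand-in, and the function names are illustrative.

```python
def lzw_compress(text):
    """Encode an ASCII string as a list of integer codes (classic LZW)."""
    # The dictionary starts with one entry per 7-bit ASCII character.
    dictionary = {chr(i): i for i in range(128)}
    next_code = 128
    current = ""
    codes = []
    for ch in text:
        if ord(ch) > 127:
            raise ValueError("input must be ASCII")
        candidate = current + ch
        if candidate in dictionary:
            current = candidate
        else:
            # Emit the longest known prefix, then learn the new string.
            codes.append(dictionary[current])
            dictionary[candidate] = next_code
            next_code += 1
            current = ch
    if current:
        codes.append(dictionary[current])
    return codes


def lzw_decompress(codes):
    """Rebuild the original string, reconstructing the dictionary on the fly."""
    if not codes:
        return ""
    dictionary = {i: chr(i) for i in range(128)}
    next_code = 128
    previous = dictionary[codes[0]]
    out = [previous]
    for code in codes[1:]:
        if code in dictionary:
            entry = dictionary[code]
        elif code == next_code:
            # Special case: the code refers to the entry being defined right now.
            entry = previous + previous[0]
        else:
            raise ValueError("invalid LZW code")
        out.append(entry)
        dictionary[next_code] = previous + entry[0]
        next_code += 1
        previous = entry
    return "".join(out)
```

Because the decoder rebuilds the same dictionary from the code stream, no dictionary needs to be stored alongside the compressed data. On very short inputs few repeated strings exist yet, so the gain is small, which is the small-file difficulty the abstract refers to.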

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The past variability of the South Asian Monsoon is mostly known from records of wind strength over the Arabian Sea, while high-resolution paleorecords from regions of strong monsoon precipitation are still lacking. Here, we present records of past monsoon variability obtained from sediment core SK 168/GC-1, which was collected at the Alcock Seamount complex in the Andaman Sea. We utilize the ecological habitats of different planktic foraminiferal species to reconstruct freshwater-induced stratification based on paired Mg/Ca and d18O analyses and to estimate seawater d18O (d18Osw). The difference between surface and thermocline temperatures (delta T) and d18Osw (delta d18Osw) is used to investigate changes in upper ocean stratification. Additionally, Ba/Ca in G. sacculifer tests is used as a direct proxy for riverine runoff and sea surface salinity (SSS) changes related to monsoon precipitation on land. Our delta d18Osw time series reveals that upper ocean salinity stratification did not change significantly throughout the last glacial, suggesting little influence of NH insolation changes. The strongest increase in temperature gradients between the mixed layer and the thermocline is recorded for the mid-Holocene and indicates the presence of a significantly shallower thermocline. In line with previous work, the d18Osw and Ba/Ca records demonstrate that monsoon climate during the LGM was characterized by a significantly weaker southwest monsoon circulation and strongly reduced runoff. Based on our data, the South Asian Summer Monsoon (SAM) over the Irrawaddy strengthened gradually after the LGM, beginning at ~18 ka. This is some 3 kyrs before an increase of the Ba/Ca record from the Arabian Sea and indicates that South Asian Monsoon climate dynamics are more complex than the simple N-S displacement of the ITCZ as generally described for other regions.
Minimum d18Osw values recorded during the mid-Holocene are in phase with Ba/Ca marking a stronger monsoon precipitation, which is consistent with model simulations.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

We present new Holocene century to millennial-scale proxies for the well-dated piston core MD99-2269 from Húnaflóadjúp on the North Iceland Shelf. The core is located in 365 mwd and lies close to the fluctuating boundary between Atlantic and Arctic/Polar waters. The proxies are alkenone-based and Mg/Ca-based SST (°C) estimates, and stable d13C and d18O values on planktonic and benthic foraminifera. The data were converted to 60 yr equi-spaced time-series. Significant trends in the data were extracted using Singular Spectrum Analysis and these accounted for between 50% and 70% of the variance. A comparison between these data and previously published climate proxies from MD99-2269 was carried out on a 14-variable data set covering the interval 400-9200 cal yr BP at 100 yr time steps. This analysis indicated that the first two PC axes accounted for 57% of the variability, with high loadings clustering primarily into "nutrient" and "temperature" proxies. Clustering on the 100 yr time-series indicated major changes in environment at ~6350 and ~3450 cal yr BP, which define early, mid- and late Holocene climatic intervals. We argue that a pervasive freshwater cap during the early Holocene resulted in warm SSTs, a stratified water column, and a depleted nutrient supply. The loss of the freshwater layer in the mid-Holocene resulted in high carbonate production, and the late Holocene/neoglacial interval was marked by significantly more variable sea surface conditions.
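The abstract states that the proxy data were converted to 60 yr equi-spaced time series but does not say how. One common way to do this is linear interpolation onto a regular age grid, sketched below as an assumption; the function name, its arguments and the sample values in the usage example are hypothetical.

```python
def resample_linear(ages, values, step, start=None, stop=None):
    """Linearly interpolate irregular (age, value) samples onto an even grid.

    ages must be distinct; the grid runs from start to stop (inclusive when it
    falls on a grid point) and must lie within the sampled age range.
    """
    if len(ages) != len(values) or len(ages) < 2:
        raise ValueError("need at least two (age, value) pairs")
    if step <= 0:
        raise ValueError("step must be positive")
    pairs = sorted(zip(ages, values))
    xs = [a for a, _ in pairs]
    ys = [v for _, v in pairs]
    if any(x1 == x0 for x0, x1 in zip(xs, xs[1:])):
        raise ValueError("ages must be distinct")
    start = xs[0] if start is None else start
    stop = xs[-1] if stop is None else stop
    if start < xs[0] or stop > xs[-1]:
        raise ValueError("grid must lie within the sampled age range")

    grid, out = [], []
    i = 0  # index of the left end of the current bracketing segment
    k = 0
    while start + k * step <= stop:
        t = start + k * step
        # Advance to the segment [xs[i], xs[i + 1]] that contains t.
        while i < len(xs) - 2 and xs[i + 1] < t:
            i += 1
        x0, x1 = xs[i], xs[i + 1]
        y0, y1 = ys[i], ys[i + 1]
        out.append(y0 + (y1 - y0) * (t - x0) / (x1 - x0))
        grid.append(t)
        k += 1
    return grid, out


# Usage with made-up numbers: three samples regridded at a 60 yr step.
grid, series = resample_linear([0, 100, 250], [1.0, 3.0, 0.0], step=60)
```

Interpolation smooths the record slightly and cannot add information where the original sampling is coarser than the grid, so the choice of step is usually guided by the typical spacing of the dated samples.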

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This study aims to investigate the impacts of the Madden-Julian Oscillation (MJO) on precipitation in the Northeast region of Brazil (NEB). To this end, daily precipitation data from 492 rain gauges distributed across the region and covering a 30-year period (1981–2010) were used. The analyses, based on composites of precipitation, longwave radiation and moisture flux anomalies, were obtained using the MJO index developed by Jones-Carvalho. To distinguish the MJO signal from other patterns of climate variability, all daily data were filtered on the 20–90 day scale; thus only days classified as MJO events were considered in the composites. A preliminary analysis based only on precipitation data was carried out for a small area in the semi-arid interior of the NEB known as Seridó. This microregion is one of the driest areas of the NEB and has been recognized by the United Nations Convention to Combat Desertification and Mitigate the Effects of Drought as particularly vulnerable to desertification. Precipitation anomaly composites were built for each of the eight MJO phases during February–May (the main rainy season of the microregion). The results showed significant variations in precipitation patterns (from excessive to deficient precipitation) associated with the propagation of the MJO. Combining the precipitation signals obtained during the wet and dry phases of the MJO showed that the difference corresponds to about 50–150% modulation of rainfall in the microregion. Next, a comprehensive investigation of the role of the MJO over the entire Northeast region was carried out, considering all four seasons. The results showed that the impacts of the MJO on intraseasonal precipitation in the NEB are strongly seasonal.
The greatest spatial coherence of the precipitation signals occurred during the austral summer, when about 80% of the rain-gauge stations showed positive precipitation anomalies during MJO phases 1–2 and negative precipitation anomalies during phases 5–6 of the oscillation. Although impacts of the MJO on intraseasonal precipitation were found at most locations and in all seasons, they varied in signal magnitude and depend on the phase of the oscillation. The observed precipitation anomalies in the NEB are explained by the interaction between convectively coupled Kelvin-Rossby waves and the predominant climatic characteristics over the region in each season. The increase in precipitation observed over most of the NEB during the austral summer and spring is associated with the westerly moisture flux (westerly regime), which favors convective activity over large areas of tropical South America. In contrast, the precipitation anomalies during the austral winter and autumn showed a more complex spatial variability. During these seasons, the precipitation anomalies observed at stations on the east coast of the NEB depend on the intensity of the South Atlantic anticyclone, which is largely modulated by Rossby waves. The topographic features of the NEB appear to play an important role in the observed precipitation variability, especially in these coastal areas. The intensification of the anticyclone increases the convergence of the trade winds on the coast, contributing to the precipitation observed windward of the Borborema plateau. Conversely, increased subsidence appears to be responsible for the precipitation deficits observed to leeward.
Such conditions proved typical when the easterly regime predominates over tropical South America and the NEB, during which the moisture flux from the Amazon decreases.