895 results for data types and operators
Abstract:
* The research was supported by INTAS 00-397 and 00-626 Projects.
Abstract:
The evaluation of geospatial data quality and trustworthiness presents a major challenge to geospatial data users when making a dataset selection decision. The research presented here therefore focused on defining and developing a GEO label, a decision support mechanism to assist data users in efficient and effective geospatial dataset selection on the basis of quality, trustworthiness and fitness for use. This thesis thus presents six phases of research and development conducted to: (1) identify the informational aspects upon which users rely when assessing geospatial dataset quality and trustworthiness; (2) elicit initial user views on the GEO label role in supporting dataset comparison and selection; (3) evaluate prototype label visualisations; (4) develop a Web service to support GEO label generation; (5) develop a prototype GEO label-based dataset discovery and intercomparison decision support tool; and (6) evaluate the prototype tool in a controlled human-subject study. The results of the studies revealed, and subsequently confirmed, eight geospatial data informational aspects that were considered important by users when evaluating geospatial dataset quality and trustworthiness, namely: producer information, producer comments, lineage information, compliance with standards, quantitative quality information, user feedback, expert reviews, and citation information. Following an iterative user-centred design (UCD) approach, it was established that the GEO label should visually summarise availability and allow interrogation of these key informational aspects. A Web service was developed to support generation of dynamic GEO label representations and integrated into a number of real-world GIS applications. The service was also utilised in the development of the GEO LINC tool, a GEO label-based dataset discovery and intercomparison decision support tool.
The results of the final evaluation study indicated that (a) the GEO label effectively communicates the availability of dataset quality and trustworthiness information and (b) GEO LINC successfully facilitates ‘at a glance’ dataset intercomparison and fitness for purpose-based dataset selection.
Abstract:
Mathematics Subject Classification: 47A56, 47A57, 47A63
Abstract:
The sheer volume of citizen weather data collected and uploaded to online data hubs is immense. However, as with any citizen data, it is difficult to assess the accuracy of the measurements. Within this project we quantify just how much data is available, where it comes from, the frequency at which it is collected, and the types of automatic weather stations being used. We also list the numerous possible sources of error and uncertainty within citizen weather observations before showing evidence of such effects in real data. A thorough intercomparison field study was conducted, testing popular models of citizen weather stations. From this study we were able to parameterise key sources of bias. Most significantly, the project develops a complete quality control system through which citizen air temperature observations can be passed. The structure of this system was heavily informed by the results of the field study. Using a Bayesian framework, the system learns and updates its estimates of the calibration and radiation-induced biases inherent to each station. We then show the benefit of correcting for these learnt biases over using the original uncorrected data. The system also attaches an uncertainty estimate to each observation, which would provide real-world applications that choose to incorporate such observations with a measure on which they may base their confidence in the data. The system relies on interpolated temperature and radiation observations from neighbouring professional weather stations, for which a Bayesian regression model is used. We recognise some of the assumptions and flaws of the developed system and suggest further work that needs to be done to bring it to an operational setting. Such a system will hopefully allow applications to leverage the additional value citizen weather data brings to longstanding professional observing networks.
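The idea of learning a station bias and attaching an uncertainty to each corrected observation can be illustrated with a minimal sketch. It assumes a single constant bias per station, a Gaussian prior, and a known observation noise variance; the function names and all numbers are hypothetical, and this is not the quality control system described in the abstract.

```python
def update_bias(prior_mean, prior_var, residual, noise_var):
    """One conjugate normal-normal update of a station's bias estimate.

    residual: citizen reading minus the reference temperature interpolated
    from professional stations (assumed to be available).
    noise_var: assumed known variance of the residual noise.
    """
    gain = prior_var / (prior_var + noise_var)
    post_mean = prior_mean + gain * (residual - prior_mean)
    post_var = (1.0 - gain) * prior_var
    return post_mean, post_var


def correct(reading, bias_mean, bias_var, noise_var):
    """Return the bias-corrected reading and a variance for that value."""
    return reading - bias_mean, bias_var + noise_var


# Hypothetical example: start from a vague prior and see two residuals of +1.0 degC.
mean, var = 0.0, 1.0
for r in (1.0, 1.0):
    mean, var = update_bias(mean, var, r, noise_var=1.0)

corrected, corrected_var = correct(21.0, mean, var, noise_var=1.0)
```

Each update pulls the bias estimate toward the observed residual and shrinks its variance, so later corrections carry a smaller uncertainty than early ones.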
Abstract:
Sentiment classification over Twitter is usually affected by the noisy nature (abbreviations, irregular forms) of tweet data. A popular procedure to reduce the noise of textual data is to remove stopwords by using pre-compiled stopword lists or more sophisticated methods for dynamic stopword identification. However, the effectiveness of removing stopwords in the context of Twitter sentiment classification has been debated in the last few years. In this paper we investigate whether removing stopwords helps or hampers the effectiveness of Twitter sentiment classification methods. To this end, we apply six different stopword identification methods to Twitter data from six different datasets and observe how removing stopwords affects two well-known supervised sentiment classification methods. We assess the impact of removing stopwords by observing fluctuations in the level of data sparsity, the size of the classifier's feature space and its classification performance. Our results show that using pre-compiled lists of stopwords negatively impacts the performance of Twitter sentiment classification approaches. On the other hand, the dynamic generation of stopword lists, by removing those infrequent terms appearing only once in the corpus, appears to be the optimal method for maintaining a high classification performance while reducing the data sparsity and substantially shrinking the feature space.
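The dynamic stopword method mentioned at the end of this abstract, removing terms that appear only once in the corpus, can be sketched as follows. The tokenisation (whitespace splitting), the toy tweets, and the function name are assumptions made for illustration and are not taken from the paper.

```python
from collections import Counter


def remove_singletons(tokenized_docs):
    """Drop every term that occurs exactly once in the whole corpus."""
    counts = Counter(tok for doc in tokenized_docs for tok in doc)
    singletons = {term for term, n in counts.items() if n == 1}
    filtered = [[t for t in doc if t not in singletons] for doc in tokenized_docs]
    return filtered, singletons


# Hypothetical toy corpus; real tweets would need a proper tokenizer.
tweets = [
    "gr8 match tonight".split(),
    "match was boring".split(),
    "tonight was gr8".split(),
]
filtered, dropped = remove_singletons(tweets)

# Feature-space size before and after removal.
vocab_before = {t for doc in tweets for t in doc}
vocab_after = {t for doc in filtered for t in doc}
```

Comparing `vocab_before` with `vocab_after` gives the reduction in feature-space size, one of the quantities the abstract reports observing.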
Abstract:
In light of the empirical evidence accumulated up to 2011, can one speak of a uniform crisis pattern that is generally characteristic of the developed industrial countries as a whole and that can also be discerned in the leading countries? Can universal changes be identified in output, labour markets, consumption and investment that fit well with earlier experience and, no less, with the predictions of the known macro models? The answer, at least at the time of writing, is no: neither in the characteristics of the course of the crisis and the pace at which macroeconomic performance deteriorated, nor in the extent and duration of the decline, are there clearly identifiable common features that fit well into the existing theoretical frameworks. The study reviews those works of the empirical literature on crises and macroeconomic shocks that are considered relevant in light of the interpretations of financial globalization. We then attempt, in an analysis spanning 60 years, to assess the performance of the US economy during recessionary periods, with the aim of making the assessment of the severity of the latest crisis sufficiently objective, at least with regard to the order of magnitude of the shifts in the more important macro variables. / === / Based on the empirical evidence accumulated until 2011, using official statistics from the OECD data bank and the US Commerce Department, the article addresses the question of whether one can speak of generally observable recession/crisis patterns that could be universally recognized in all major industrial countries (the G7). The answer to this question is a firm no. Changes and volatility in most major macroeconomic indicators, such as the output gap, labor market distortions and large deviations from trend in consumption and in investment, all exhibited wide differences in depth and breadth across the G7 countries.
The large deviations in output gaps and especially the strong distortions in labor market inputs and hours worked per capita over the crisis months can hardly be explained by the existing model classes of DSGE and those of the real business cycle. Especially troubling are the difficulties in fitting the data into any established model, whether a business cycle model or some other type, in which financial distress reduces economic activity. It is argued that standard business cycle models with financial market imperfections have no mechanism for generating deviation from standard theory; thus they do not shed light on the key factors underlying the 2007–2009 recession. That does not imply that the financial crisis is unimportant in understanding the recession, but it does indicate, however, that we do not fully understand the channels through which financial distress reduced labor input. Long historical trends in the privately held portion of the federal debt in the US economy indicate that the standard macro proposition of public debt crowding out private investment, and thus inhibiting growth, can be strongly challenged insofar as this ratio is a direct indicator neither of slowing growth nor of recession.
Abstract:
In the years 2002, 2003 and 2004 we collected samples of macroinvertebrates on a total of 36 occasions in Badacsony bay, in areas of open water (in the years 2003 and 2004 reed-grassy) as well as in areas populated by reed (Phragmites australis) and cattail (Typha angustifolia). Samples were taken using a stiff hand net. The sampling site includes three microhabitats differentiated only by the aquatic plants inhabiting these areas. Our data were gathered by processing 208 individual samples. The quantity of macroinvertebrates is represented by a biovolume value based on volume estimates. We can identify taxa found in abundant numbers in all water types and ooze, as well as groups associated with individual microhabitats with various aquatic plants. We can observe a notable difference between the years in the volume of invertebrate macrofauna, caused by the drop in water level and the proliferation of submerged macrophytes. There are smaller differences between the samples taken in reed and cattail stands. In the second half of 2003, which was a year of drought, Najas marina appeared in open waters and supported larger quantities of macroinvertebrates. In 2004, with higher water levels, Potamogeton perfoliatus occurring in the same area had an even more significant effect. This type of reed-grass may support the most macroinvertebrates during the summer. In terms of diversity relations we may suspect different characteristics. The reed sampling site proved to be the richest, with the cattail microhabitat close behind, while open water (with submerged macrophytes) is the least diverse microhabitat.
Abstract:
The correct modelling of long- and short-term seasonality is a very interesting issue. The choice between the deterministic and stochastic modelling of trend and seasonality and their implications are as relevant as the case of deterministic and stochastic trends itself. The study considers the special case when the stochastic trend and seasonality do not evolve independently and the usual differencing filters do not apply. The results are applied to the day-ahead (spot) trading data of some main European energy exchanges (power and natural gas).
Abstract:
This study is an exploratory analysis of an operational measure for resource development strategies, and an exploratory analysis of internal organizational contingencies influencing choices of these strategies in charitable nonprofit organizations. The study provides conceptual guidance for advancing understanding about resource development in the nonprofit sector. The statistical findings are, however, inconclusive without further rigorous examination. A three-category typology based on organization technology is initially presented to define the strategies. Three dimensions of internal organizational contingencies explored represent organization identity, professional staff, and boards of directors. Based on relevant literature and key informant interviews, an original survey was administered by mail to a national sample of nonprofit organizations. The survey collected data on indicators of the proposed strategy types and selected contingencies. Factor analysis extracted two of the initial categories in the typology. The Building Resource Development Infrastructure Strategy encompasses information technology, personnel, legal structures, and policies facilitating fund development. The Building Resource Development Infrastructure Strategy encompasses the mission, service niche, and type of service delivery forming the basis for seeking financial support. Linear regressions with each strategy type as the dependent variable identified distinct and common contingencies which may partly explain choices of strategies. Discriminant analysis suggests the potential predictive accuracy of the contingencies. Follow-up case studies with survey respondents provide additional criteria for operationalizing future measures of resource development strategies, and support and expand the analysis on contingencies. The typology offers a beginning framework for defining alternative approaches to resource development, and for exploring organization capacity specific to each approach.
Contingencies that may be integral components of organization capacity are funding, leadership frame, background and experience, staff and volunteer effort, board member support, and relationships in the external environment. Based on these findings, management questions are offered for nonprofit organization stakeholders to consider in planning for resource development. Lessons learned in designing and conducting this study are also provided to enhance future related research.
Abstract:
Computers have dramatically changed the way we live, conduct business, and deliver education. They have infiltrated the Bahamian public school system to the extent that many educators now feel the need for a national plan. The development of such a plan is a challenging undertaking, especially in developing countries where physical, financial, and human resources are scarce. This study assessed the situation with regard to computers within the Bahamian public school system, and provided recommended guidelines to the Bahamian government based on the results of a survey, the body of knowledge about trends in computer usage in schools, and the country's needs. This was a descriptive study for which an extensive review of literature in areas of computer hardware, software, teacher training, research, curriculum, support services and local context variables was undertaken. One objective of the study was to establish what should or could be relative to the state-of-the-art in educational computing. A survey was conducted involving 201 teachers and 51 school administrators from 60 randomly selected Bahamian public schools. A random stratified cluster sampling technique was used. This study used both quantitative and qualitative research methodologies. Quantitative methods were used to summarize the data about numbers and types of computers, categories of software available, peripheral equipment, and related topics through the use of forced-choice questions in a survey instrument. Results of these were displayed in tables and charts. Qualitative methods, data synthesis and content analysis, were used to analyze the non-numeric data obtained from open-ended questions on teachers' and school administrators' questionnaires, such as those regarding teachers' perceptions and attitudes about computers and their use in classrooms.
Also, interpretative methodologies were used to analyze the qualitative results of several interviews conducted with senior public school system officials. Content analysis was used to gather data from the literature on topics pertaining to the study. Based on the literature review and the data gathered for this study, a number of recommendations are presented. These recommendations may be used by the government of the Commonwealth of The Bahamas to establish policies with regard to the use of computers within the public school system.
Abstract:
Interpersonal conflicts have the potential for detrimental consequences if not managed successfully. Understanding the factors that contribute to conflict resolution has implications for interpersonal relationships and the workplace. Researchers have suggested that personality plays an important and predictable role in conflict resolution behaviors (Chanin & Schneer, 1984; Kilmann & Thomas, 1975; Mills, Robey & Smith, 1985). However, other investigators have contended that contextual factors are important contributors in triggering the behavioral responses (Shoda & Mischel, 2000; Mischel & Shoda, 1995). The purpose of this study was to investigate the relationships among personality types, demographic characteristics and contextual factors on the conflict resolution behaviors reported by graduate occupational therapy students (n = 125). The study design was correlational. The Myers Briggs Type Indicator (MBTI) and the Thomas-Kilmann (MODE) Instrument were used to establish the personality types and the context independent conflict resolution behaviors respectively. The effects of contextual factors of task vs. relationship and power were measured with the Conflict Case Scenarios Questionnaire (CCSQ). One-way ANOVA and linear regression procedures were used to test the relationships between personality types and demographic characteristics with the context independent conflict behaviors. Chi-Square procedures of the personality types by contextual conditions ascertained the effects of contexts in modifying the resolution modes. Descriptive statistics established a profile of the sample. The results of the hypotheses tests revealed significant relationships between the personality types of feeling-thinking and sensing-intuition with the conflict resolution behaviors. The contextual attributes of task vs. relationship orientation and of peer vs. supervisor relationships were shown to modify the conflict behaviors.
Furthermore, demographic characteristics of age, gender, GPA and educational background were shown to have an effect on the conflict resolution behaviors. The knowledge gained has implications for students' training, specifically understanding their styles and use of effective conflict resolution strategies. It also contributes to the knowledge on management approaches and interpersonal competencies and how this might facilitate the students' transition to the clinical role.
Abstract:
With advances in science and technology, computing and business intelligence (BI) systems are steadily becoming more complex with an increasing variety of heterogeneous software and hardware components. They are thus becoming progressively more difficult to monitor, manage and maintain. Traditional approaches to system management have largely relied on domain experts through a knowledge acquisition process that translates domain knowledge into operating rules and policies. This process is widely acknowledged to be cumbersome, labor-intensive, and error-prone, and it struggles to keep up with rapidly changing environments. In addition, many traditional business systems deliver primarily pre-defined historic metrics for a long-term strategic or mid-term tactical analysis, and lack the necessary flexibility to support evolving metrics or data collection for real-time operational analysis. There is thus a pressing need for automatic and efficient approaches to monitor and manage complex computing and BI systems. To realize the goal of autonomic management and enable self-management capabilities, we propose to mine system historical log data generated by computing and BI systems, and automatically extract actionable patterns from this data. This dissertation focuses on the development of different data mining techniques to extract actionable patterns from various types of log data in computing and BI systems. Four key problems are studied: log data categorization and event summarization, leading indicator identification, pattern prioritization by exploring the link structures, and a tensor model for three-way log data. Case studies and comprehensive experiments on real application scenarios and datasets are conducted to show the effectiveness of our proposed approaches.
Abstract:
This dissertation established a software-hardware integrated design for a multisite data repository in pediatric epilepsy. A total of 16 institutions formed a consortium for this web-based application. This innovative, fully operational web application allows users to upload and retrieve information through a unique human-computer graphical interface that is remotely accessible to all users of the consortium. A solution based on a Linux platform with MySQL and Personal Home Page (PHP) scripts has been selected. Research was conducted to evaluate mechanisms to electronically transfer diverse datasets from different hospitals and collect the clinical data in concert with their related functional magnetic resonance imaging (fMRI). What was unique in the approach considered is that all pertinent clinical information about patients is synthesized with input from clinical experts into 4 different forms, which were: Clinical, fMRI scoring, Image information, and Neuropsychological data entry forms. A first contribution of this dissertation was in proposing an integrated processing platform that was site and scanner independent in order to uniformly process the varied fMRI datasets and to generate comparative brain activation patterns. The data collection from the consortium complied with the IRB requirements and provides all the safeguards for security and confidentiality requirements. An fMRI-based software library was used to perform data processing and statistical analysis to obtain the brain activation maps. The lateralization indices (LI) of healthy control (HC) subjects were evaluated in contrast to those of localization-related epilepsy (LRE) subjects.
Over 110 activation maps were generated, and their respective LIs were computed, yielding the following groups: (a) strong right lateralization: (HC=0%, LRE=18%), (b) right lateralization: (HC=2%, LRE=10%), (c) bilateral: (HC=20%, LRE=15%), (d) left lateralization: (HC=42%, LRE=26%), (e) strong left lateralization: (HC=36%, LRE=31%). Moreover, nonlinear-multidimensional decision functions were used to seek an optimal separation between typical and atypical brain activations on the basis of the demographics as well as the extent and intensity of these brain activations. The intent was not to seek the highest output measures given the inherent overlap of the data, but rather to assess which of the many dimensions were critical in the overall assessment of typical and atypical language activations with the freedom to select any number of dimensions and impose any degree of complexity in the nonlinearity of the decision space.
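For orientation, a lateralization index is commonly computed as (L - R) / (L + R), where L and R measure activation in the left and right hemispheres. The abstract gives neither its exact formula nor the cut-offs behind the five groups, so the sketch below uses that common definition with hypothetical thresholds (0.2 and 0.6) and hypothetical voxel counts.

```python
def lateralization_index(left, right):
    """Common LI definition: +1 is fully left, -1 is fully right."""
    total = left + right
    if total == 0:
        raise ValueError("no activation in either hemisphere")
    return (left - right) / total


def classify(li, weak=0.2, strong=0.6):
    """Map an LI to one of five groups; both thresholds are assumptions."""
    if li >= strong:
        return "strong left"
    if li >= weak:
        return "left"
    if li > -weak:
        return "bilateral"
    if li > -strong:
        return "right"
    return "strong right"


# Hypothetical active-voxel counts (left, right) for three activation maps.
maps = [(80, 20), (50, 50), (20, 80)]
labels = [classify(lateralization_index(l, r)) for l, r in maps]
```

With real data, the inputs would come from thresholded activation maps within language regions of interest, and the thresholds would follow the study's own criteria.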
Abstract:
With the advent of peer-to-peer networks, and more importantly sensor networks, the desire to extract useful information from continuous and unbounded streams of data has become more prominent. For example, in tele-health applications, sensor-based data streaming systems are used to continuously and accurately monitor Alzheimer's patients and their surrounding environment. Typically, the requirements of such applications necessitate the cleaning and filtering of continuous, corrupted and incomplete data streams gathered wirelessly in dynamically varying conditions. Yet, existing data stream cleaning and filtering schemes are incapable of capturing the dynamics of the environment while simultaneously suppressing the losses and corruption introduced by uncertain environmental, hardware, and network conditions. Consequently, existing data cleaning and filtering paradigms are being challenged. This dissertation develops novel schemes for cleaning data streams received from a wireless sensor network operating under non-linear and dynamically varying conditions. The study establishes a paradigm for validating spatio-temporal associations among data sources to enhance data cleaning. To reduce the complexity of the validation process, the developed solution maps the requirements of the application on a geometrical space and identifies the potential sensor nodes of interest. Additionally, this dissertation models a wireless sensor network data reduction system by ascertaining that segregating data adaptation and prediction processes will augment the data reduction rates. The schemes presented in this study are evaluated using simulation and information theory concepts. The results demonstrate that dynamic conditions of the environment are better managed when validation is used for data cleaning. They also show that when a fast convergent adaptation process is deployed, data reduction rates are significantly improved.
Targeted applications of the developed methodology include machine health monitoring, tele-health, environment and habitat monitoring, intermodal transportation and homeland security.
Abstract:
Exclusionary school discipline results in students being removed from classrooms as a consequence of their disruptive behavior and may lead to subsequent suspension and/or expulsion. Literature documents that nondominant students, particularly Black males, are disproportionately impacted by exclusionary discipline, to the point that researchers from a variety of critical perspectives consider exclusionary school discipline an oppressive educational practice and condition. Little or no research examines specific teacher-student social interactions within classrooms that influence teachers’ decisions to use or not use exclusionary discipline. Therefore, this study set forth the central research question: In relation to classroom interactions in alternative education settings, what accounts for teachers’ use or non-use of exclusionary discipline with students? A critical social practice theory of learning served as the framework for exploring this question, and a critical microethnographic methodology informed the data collection and analysis. Criterion sampling was used to select four classrooms in the same alternative education school with two teachers who frequently and two who rarely used exclusionary discipline. Nine stages of data collection and reconstructive data analysis were conducted. Data collection involved video recorded classroom observations, digitally recorded interviews of teachers and students discussing selected video segments, and individual teacher interviews. Reconstructive data analysis procedures involved hermeneutic inferencing of possible underlying meanings, critical discourse analysis, interactive power analysis and role analysis, thematic analysis of the interactions in each classroom, and a final comparative analysis of the four classrooms.
Four predominant themes of social interaction (resistance, conformism, accommodation, and negotiation) emerged with terminology adapted from Giroux’s (2001) theory of resistance in education and Third Space theory (Gutiérrez, 2008). Four types of power (normative, coercive, interactively established contracts, and charm), based on Carspecken’s (1996) typology, were found in the interactions between teacher and students in varying degrees for different purposes. This research contributes to the knowledge base on teacher-student classroom interactions, specifically in relation to exclusionary discipline. Understanding how the themes and varying power relations influence their decisions and actions may enable teachers to reduce use of exclusionary discipline and remain focused on positive teacher-student academic interactions.